1 Introduction

The main metric of an ORAM’s performance, communication overhead, has improved by orders of magnitude over the last few years. However, at least one significant hurdle to actual adoption remains: security of modern ORAMs relies on there being only a single client at all times. While there already exist multi-client ORAMs secure in the face of a semi-honest server [18], existence of an ORAM secure against a fully malicious server is still an open question. The main challenge here stems from the fact that today’s ORAMs modify some of the data on the server after every access. If a malicious server “rewinds” the data and presents an old version to a client, further interactions may reveal details about the access pattern. In single client scenarios, this is typically solved by storing a small token on the client, such as the root of a hash tree [22]. This token authenticates and verifies freshness of all data retrieved from the server, ensuring that no such rewind attack is possible.

In this paper, we address the fundamental problem of how multiple clients can share data stored in a single ORAM. With multiple clients, an authentication token is not sufficient: data may fail one client’s authentication check simply because it has been modified by another client. If clients could communicate with each other over a secure out-of-band channel, it would be trivial to continually exchange and update the most recent token. However, secure out-of-band communication is often not a reasonable assumption for modern devices. As we will see, it is precisely the absence of out-of-band communication that makes multi-client ORAM technically challenging.

Current solutions for multi-client ORAM work only in the presence of semi-honest (honest-but-curious) adversaries, which do not perform rewind attacks on the clients. This is often not a satisfying model, since rewind attacks are easy for real-world adversaries to execute and difficult to detect. Consequently, we address fully malicious servers. Goodrich et al. [12], in their paper examining multi-client ORAM, recently posed as an open question whether an ORAM could be secure for multiple clients against a malicious server.

Technical Highlights. We introduce the first construction for a multi-client ORAM and prove access pattern indistinguishability, even if the server is fully malicious. Our contribution is twofold, specifically:

  • We start by focusing on two ORAM constructions that follow a “classical” approach, the square-root ORAM by Goldreich [8] and the hierarchical ORAM by Goldreich and Ostrovsky [9]. We adapt these ORAMs for multi-client security. Our approach is to separate client accesses into two parts: non-critical portions, which can be performed securely in the presence of a malicious server, and critical portions, which cannot, but which contain efficient integrity checks that reveal any malicious behavior and allow the client to terminate the protocol.

Table 1. Communication and storage worst-case complexity for existing single-client ORAMs and our new multi-client versions. \(\phi \) is the number of different clients supported by the ORAM. \(\hat{O}\) denotes amortized complexity.
  • The “classical” ORAM constructions have been largely overshadowed by more recent tree-based ORAMs [23, 25]. Consequently, we go on to demonstrate how a multi-client secure Path ORAM [25] can be constructed. We solve the key challenge of realizing a multi-client secure version of the read protocol by storing Path ORAM’s metadata using small “classical” ORAMs as building blocks. For block sizes in \(\varOmega (\log ^4{}N)\), this results in a multi-client ORAM which has overall communication complexity of \(O(\phi \cdot \log N)\).

Table 1 summarizes asymptotic behavior for our new multi-client ORAMs and compares them to their corresponding single-client ORAMs.

2 Motivation: Multi-client ORAM

Instead of a single ORAM accessed by a single client, we envision multiple clients securely exchanging or sharing data stored in a single ORAM. For example, imagine multiple employees of a company that read from and write into the same database stored at an untrusted server. Similar to standard ORAM security, sharing data and jointly working on the database should not leak the employees’ access patterns to the server. Alternatively, we can also envision a single person with multiple different devices (laptop, tablet, smartphone) accessing the same data hosted at an untrusted server (e.g., Dropbox). Again, working on the same data should not reveal access patterns. Throughout this paper, we consider the terms “multi-client” and “multi-user” to be equivalent. As suggested by Goodrich et al. [12], we assume that all clients trust each other and leave extending our results to more fine-grained access control as future work. In the multi-device scenario above, it is reasonable for clients to trust each other, since they all belong to a single user.

To provide security, ORAM protocols are stateful. Hiding client accesses to a certain data block is typically achieved by shuffling or reordering blocks, such that two accesses are not recognizable as being the same. An obvious attack for a malicious server is to undo or “rewind” that shuffling after the first access and present the same, original view (state) of the data to the client making the second access. If the client were to blindly execute their access, and it targeted the same block of data as the first access, it would result in the same pattern of interactions with the server as the first access did. The server would immediately have broken the security of the ORAM scheme. This straightforward attack is easily defeated in the case of a single client: as internal state, the client stores and updates a token for authentication and freshness, see Ren et al. [22].

However, with two or more clients sharing data in an ORAM, this attack becomes a new challenge. After watching one client retrieve some data, the adversary rewinds the ORAM’s state and presents the original view to the second client. The server can then recognize whether or not the second client accesses the same data as the first client did, violating security. Without a secure side-channel to exchange authentication tokens after every access, it is difficult for clients to detect such an attack.

2.1 Technical Challenges

A multi-client ORAM has to overcome a new technical challenge. Roughly speaking, the server is fully malicious and can present different ORAM states to different clients, i.e., to different devices of the same user. As the clients have no direct communication channel with which to synchronize, it is difficult for them to agree on a single ORAM state. We expand on this challenge below.

Adversary Model: This paper tackles the scenario of \(\phi \) trusted clients sharing storage on a fully malicious server (the adversary). Other works, such as Maffei et al. [18], have addressed the problem for a semi-honest server, but in many scenarios this is not sufficient. Real-world attacks that clients need to defend against include, e.g., insider attacks from a cloud provider hosting the server and outside hackers compromising the server. Such attacks allow for malicious adversarial behavior. In general, there is no clear line between the two adversarial models suggesting that one is more reasonable to defend against than the other. To cope with all possible adversaries, it is therefore important to protect against malicious adversaries, too.

No Out-of-Band Communication: We assume that beyond a single cryptographic key (possibly derived from a password) the clients do not share any long-term secrets and cannot communicate with each other except through the malicious server. This matches with existing cloud settings, since most consumer devices are behind NAT and cannot be directly contacted from the Internet. Major real-world cryptographic applications, for instance WhatsApp [26] and Semaphor [24], have all messages between clients relayed through the server for this reason.

We emphasize that in the malicious setting the server always has the option to simply stop responding or send purposefully wrong data as a denial-of-service attack. This cannot be avoided, but is also not a significant problem since it will be easily detected by clients. In contrast, the attacks we focus on in this paper are those where the server tries to compromise security without being detected.

2.2 Other Applications

A major application of this work is in supporting private cloud storage that is accessible from multiple devices. In reality, this is one of the most compelling use cases for cloud storage a la Dropbox, Google Drive, iCloud, etc. and is a good target for a privacy-preserving solution. Beyond simple storage, Oblivious RAM is used as a subroutine in other constructions, such as dynamic proofs of retrievability [3]. Any construction which uses ORAM and wishes to support multiple clients must rely on an ORAM that is secure for multiple clients.

Finally, an interesting application comes from the release of Intel’s new SGX-enabled processors. SGX enables a trusted enclave to run protected code which cannot be examined from the outside, even by the operating system or hypervisor running on the machine. The major remaining channel for leakage in this system is in the pattern of accesses that the enclave code makes to the untrusted RAM lying outside the processor. It has already been noted that Oblivious RAM may be a valuable tool to eliminate that leakage [5]. Furthermore, it is expected that systems may have multiple enclaves running at the same time which wish to share information through the untrusted RAM. This scenario corresponds exactly with our multi-client ORAM setting, so the solution here could be used to securely offer that functionality.

2.3 Related Work

Most existing work on Oblivious RAM assumes only a single client. Franz et al. [7] proposed a solution for multiple clients using the original square-root ORAM, but it relies on a semi-honest server. Goodrich et al. [12] extend that work to more modern tree-based ORAMs, still relying on a semi-honest server. Recent work by Maffei et al. [18] supports multiple clients with read/write access control, but again requires a semi-honest server. Other efficient solutions are possible in an even weaker security model by using trusted hardware on the server side [13, 17].

There is also a concurrent line of work on Parallel ORAMs, which target ORAMs running on multi-core or multi-processor systems [2, 4, 20]. These schemes either do not target malicious adversaries or require constant, continuous communication between clients to synchronize their state. As stated above, this is not a viable solution for clients that are not always online or sit on different networks.

Currently, no solution exists that supports multiple clients interacting with a malicious server without direct client-to-client communication or constant polling to the same effect.

3 Security Definition

We briefly recall standard ORAM concepts. An ORAM provides an interface to read from and write to blocks of a RAM (an array of storage blocks). It supports \(\mathsf {Read}(x)\), to read from the block at address x, and \(\mathsf {Write}(x, v)\) to write value v to block x. The ORAM allows storage of N blocks, each of size B.

To securely realize this functionality, an ORAM interacts with a malicious storage device. Below, we use \(\varSigma \) to represent the interface between a client and the actual storage device. A client with access to \(\varSigma \) issues \(\mathsf {Read}\) and \(\mathsf {Write}\) requests as needed. We stress that we make no assumptions about how the storage device responds to these requests, and allow for arbitrary malicious behavior. For instance, an adversary could respond to a \(\mathsf {Read}\) request with old or corrupt data, refuse to actually perform a \(\mathsf {Write}\) correctly, etc. As in related work, the (untrusted) storage device is part of a server, matching envisioned applications for a (multi-client) ORAM such as outsourced cloud storage.

Definition 1

(ORAM Operation \(\mathrm {OP}\) ). An operation \(\mathrm {OP}\) is defined as \(\mathrm {OP}=(o,x,v)\), where \(o \in \{\mathsf {Read}, \mathsf {Write}\}\), x is the virtual address of the block to be accessed, and v is the value to write to that block. \(v = \bot \) when \(o = \mathsf {Read}\).
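To make the tuple concrete, Definition 1 can be mirrored in a small data structure (an illustrative Python sketch; the class and field names are ours, not part of the paper):

```python
from dataclasses import dataclass
from typing import Optional

READ, WRITE = "Read", "Write"

@dataclass(frozen=True)
class Op:
    """ORAM operation OP = (o, x, v) as in Definition 1."""
    o: str                # operation type: READ or WRITE
    x: int                # virtual address of the block to access
    v: Optional[bytes]    # value to write; None models v = "bot" for reads

    def __post_init__(self):
        assert self.o in (READ, WRITE)
        # v is absent exactly when the operation is a read
        assert (self.v is None) == (self.o == READ)
```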

We now present our multi-client ORAM security definition which slightly augments the standard, single-client ORAM definition. We emphasize that clients only interact with the server by read or write operations to a memory location; there are no other messages sent. Therefore the “protocol” is fully defined by these patterns of accesses.

Definition 2

(Multi-client ORAM \(\varPi \) ). A multi-client ORAM \(\varPi = (\mathsf {Init},\mathsf {ClientInit}, \mathsf {Access})\) comprises the following three algorithms.

  1. 1.

    \(\mathsf {Init}(\lambda , N, B, \phi )\) initializes \(\varPi \). It takes as input security parameter \(\lambda \), total number of blocks N, block size B, and number of clients \(\phi \). \(\mathsf {Init}\) initializes the storage device represented by \(\varSigma \) and outputs a key \(\kappa \).

  2. 2.

    \(\mathsf {ClientInit}(\lambda , N, B, j, \kappa )\) uses security parameter \(\lambda \), number of blocks N, block size B, client index j, and a secret key \(\kappa \) to initialize client \(u_j\). It outputs a client state \(st_{u_j}\).

  3. 3.

    \(\mathsf {Access}(\mathrm {OP},\varSigma ,st_{u_j})\) performs operation \(\mathrm {OP}\) on the ORAM using client \(u_j\)’s state \(st_{u_j}\) and interface \(\varSigma \). \(\mathsf {Access}\) outputs a new state \(st_{u_j}\) for client \(u_j\).

In contrast to single-client ORAM, a multi-client ORAM introduces the notion of clients. This is modeled by different per-client states, \(st_{u_i}\) for client \(u_i\). After initializing the multi-client ORAM with \(\mathsf {Init}\), Algorithm \(\mathsf {ClientInit}\) is run by each client (with no communication between them) separately and outputs their initial local states \(st_{u_i}\). The \(\mathsf {ClientInit}\) function only requires that each client have a shared key \(\kappa \), which could be derived from a password. Whenever client \(u_i\) executes \(\mathsf {Access}\) on the multi-client ORAM, they can attempt to update the multi-client ORAM represented by \(\varSigma \), and update their own local state \(st_{u_i}\), but not the other clients’ local states.
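The per-client state flow can be summarized in an interface mirroring Definition 2 (a hypothetical Python sketch; method and parameter names are our own):

```python
from abc import ABC, abstractmethod

class MultiClientORAM(ABC):
    """The three algorithms of a multi-client ORAM Pi = (Init, ClientInit, Access)."""

    @abstractmethod
    def init(self, sec_param, n_blocks, block_size, n_clients):
        """Initialize the server-side storage Sigma and return the shared key kappa."""

    @abstractmethod
    def client_init(self, sec_param, n_blocks, block_size, client_id, kappa):
        """Run by each client separately; returns the client's initial local state."""

    @abstractmethod
    def access(self, op, sigma, state):
        """Perform op through interface sigma; returns the calling client's new state."""
```

A concrete scheme implements these three methods; crucially, `access` may update the server state and the calling client’s own state, but never another client’s local state.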

Finally, we define the security of a multi-client ORAM against malicious servers. Consider the game-based experiment \(\mathsf {Sec}^{\mathsf {ORAM}}_{\mathcal {A},\varPi }(\lambda )\) below. In this game, \(\mathcal {A}\) has complete control over the storage device and how it responds to client requests. For ease of exposition, we model this as \(\mathcal {A}\) outputting their own compromised version \(\varSigma _{\mathcal {A}}\) of an interface to the storage device. It is this malicious interface \(\varSigma _{\mathcal {A}}\) that clients will subsequently use for their \(\mathsf {Access}\) operations. Interface \(\varSigma _{\mathcal {A}}\) is controlled by the adversary and updates state \(st_{\mathcal {A}}\). That is, \(\mathcal {A}\) learns all clients’ calls to the interface, therewith the clients’ requested access pattern, and can adaptively react to clients’ interface requests. To initialize the ORAM, \(\phi \) time steps are used to do the setup for each client, then the game continues for \(\text {poly}(\lambda )\) additional steps where the adversary interactively specifies operations and maliciously modifies the storage.

Experiment 1

(Experiment \(\mathsf {Sec}^{\mathsf {ORAM}}_{\mathcal {A},\varPi }(\lambda )\) )

[Figure: pseudocode of Experiment \(\mathsf {Sec}^{\mathsf {ORAM}}_{\mathcal {A},\varPi }(\lambda )\) omitted]

In summary, \(\mathcal {A}\) gets oracle access to the ORAM and can adaptively query it during \(\text {poly}(\lambda )\) rounds. In each round, \(\mathcal {A}\) selects a client \(u_i\) and determines two operations \(\mathrm {OP}_{j,0}\) and \(\mathrm {OP}_{j,1}\). The oracle performs operation \(\mathrm {OP}_{j,b}\) as client \(u_i\) with state \(st_{u_i}\), interacting with the adversary-controlled \(\varSigma _{\mathcal {A}}\) using protocol \(\varPi \). Eventually, \(\mathcal {A}\) guesses b.

Definition 3

(Multi-client ORAM security). A multi-client ORAM \(\varPi =(\mathsf {Init}, \mathsf {ClientInit}, \mathsf {Access})\) is multi-client secure iff for all PPT adversaries \(\mathcal {A}\), there exists a function \(\epsilon (\lambda )\) negligible in security parameter \(\lambda \) such that

$$\begin{aligned} Pr[ \mathsf {Sec}^{\mathsf {ORAM} }_{\mathcal {A},\varPi }(\lambda ) = 1] < \frac{1}{2} + \epsilon (\lambda ). \end{aligned}$$

Our game-based definition is equivalent to ORAM’s standard security definition with two exceptions: we allow the adversary to arbitrarily change the state of the on-server storage \(\varSigma \), and we split the ORAM algorithm into \(\phi \) different “pieces” which cannot share state among themselves.

As discussed above, this work assumes that all clients trust each other and do not conspire. For ease of exposition, we assume that all clients share a key \(\kappa \), used for the encryption and MAC computations that we will introduce later.

Consistency: An orthogonal concern to security for multi-client schemes is consistency: whether each client sees the same version of the database when they access it. Because the clients in our model have no way of communicating except through the malicious adversary, it is possible for \(\mathcal {A}\) to “desynchronize” the clients so that their updates are not propagated to each other. Our multi-client ORAM guarantees that even in this case the clients retain complete security and access pattern privacy, but consistency cannot be guaranteed. This is a well-known problem; the strongest achievable guarantee is fork consistency [16], which our scheme provides.

4 Multi-client Security for Classical ORAMs

We start by transforming two classical ORAM constructions, the original square-root solution by Goldreich [8] and the hierarchical one by Goldreich and Ostrovsky [9], into multi-client secure versions, retaining the same communication complexity per client. Our exposition initially focuses on the multi-client square-root ORAM, as the hierarchical ORAM builds on the same ideas.

Recall that the square-root ORAM algorithm works by dividing the server storage into two parts: the main memory and the cache, which is of size \(O(\sqrt{N})\). The main memory is shuffled by a pseudo-random permutation \(\pi \). Every access reads the entire cache, plus one block in the main memory. If the block the client wants is found in the cache, a “dummy” location is read in the main memory; otherwise, the actual location of the target block is read, and the block is inserted into the cache for later accesses. After \(\sqrt{N}\) accesses, the cache is full, and the client must download the entire ORAM and reshuffle it into a fresh state.
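The access flow just described can be sketched in a toy, in-memory simulation (hypothetical Python; encryption and the real shuffle protocol are elided, and all names are ours):

```python
import math
import random

class ToySqrtORAM:
    """Toy single-client square-root ORAM illustrating the access flow (no crypto)."""

    def __init__(self, n, seed=0):
        self.n = n
        self.sqrt_n = math.isqrt(n)
        self.data = {x: None for x in range(n)}  # logical block contents
        self.rng = random.Random(seed)
        self._reshuffle()

    def _reshuffle(self):
        # New epoch: fresh permutation over main memory plus sqrt(n) dummy slots.
        self.pi = list(range(self.n + self.sqrt_n))
        self.rng.shuffle(self.pi)
        self.cache = {}
        self.access_ctr = 0

    def access(self, x, v=None):
        _ = list(self.cache.items())                 # client always scans the whole cache
        if x in self.cache:
            pos = self.pi[self.n + self.access_ctr]  # cache hit: touch a dummy slot
        else:
            pos = self.pi[x]                         # cache miss: real main-memory read
            self.cache[x] = self.data[x]
        if v is not None:                            # Write(x, v)
            self.cache[x] = v
            self.data[x] = v
        value = self.cache[x]
        self.access_ctr += 1
        if self.access_ctr >= self.sqrt_n:           # epoch over: download and reshuffle
            self._reshuffle()
        return value, pos                            # pos is all the server observes
```

Within one epoch, repeating an access hits the cache and touches a fresh dummy slot, so the server sees distinct main-memory positions for the two accesses.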

Specific Challenge: In a multi-client scenario, it becomes easy for a malicious server to break the security of the square-root ORAM. For example, client \(u_1\) can access a block x that is not in the cache, requiring \(u_1\) to read \(\pi (x)\) from main memory and insert it into the cache. The malicious server now restores the cache to the state it was in before \(u_1\)’s access added block x. If a second client \(u_2\) also attempts to access block x, the server will observe that both clients read from the same location \(\pi (x)\) in main memory, and so learn whether \(u_1\) and \(u_2\) have accessed the same block. Without a way for the clients to communicate directly with each other and pass information that allows them to verify the changes to the cache, the server can always “rewind” the cache back to a previous state. This will eventually force one client to leak information about their accesses.

Rationale: Our approach for multi-client security is based on the observation that the cache update part of the square-root solution is secure by itself. Updating the cache only involves downloading the cache, changing one element in it, re-encrypting it, and finally storing it back on the server. Downloading and later uploading the cache always “touches” the same \(\sqrt{N}\) blocks. This is independent of what the malicious server presents to a client as \(\varSigma \) and also independent of the block being updated by the client. Changing values inside the cache cannot leak any information to the server, as its content is always freshly IND-CPA encrypted. In short, like a trivial ORAM, updating a cache is automatically multi-client secure.

However, reading can leak information. Reading from the main ORAM is conditional on what the client finds in the cache. We call this part the critical part of the access, and the cache update correspondingly non-critical. To counteract this leakage, we implement the following changes to enable multiple clients for the square-root ORAM:

  1. 1.

    Separate ORAMs: Instead of a single ORAM, we use a sequence of ORAMs, \(\varSigma = \mathsf {ORAM}_1,\mathsf {ORAM}_2,\ldots {},\mathsf {ORAM}_\phi ,\) one for each client. Client \(u_i\) will perform the critical part of their access only on \(\mathsf {ORAM}_{i}\)’s main memory and cache. Thus, each client can guarantee they will not read the same address from their ORAM’s main memory twice. However, any change to the cache as part of ORAM \(\mathsf {Read}(x)\) or \(\mathsf {Write}(x,v)\) operations will be written to every ORAM’s cache. Updating the cache on any ORAM is already guaranteed to be multi-client secure and does not leak information.

  2. 2.

    Authenticated Caches: For each client \(u_i\) to guarantee that they will not repeat access to the main memory of \(\mathsf {ORAM}_i\), the cache is stored together with an encrypted access counter \(\chi \) on the server. Each client stores locally a MAC over both the cache and the encrypted access counter \(\chi \) of their own ORAM. Every access to their own cache increments the counter and updates the MAC. Since clients read only from their own ORAMs, and they can always verify the counter value for the last time that they performed a read, the server cannot roll back beyond that point. Two reads will never be performed with the cache in the same state.
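The counter-and-MAC mechanism of the authenticated cache can be sketched as follows (illustrative Python using HMAC-SHA256; the hard-coded MAC key stands in for material derived from the shared key \(\kappa \)):

```python
import hashlib
import hmac

KEY = b"mac-key-derived-from-kappa"  # placeholder for key material from kappa

def cache_mac(cache, counter):
    # Deterministic serialization of (cache, counter), then HMAC.
    msg = repr((sorted(cache.items()), counter)).encode()
    return hmac.new(KEY, msg, hashlib.sha256).digest()

class CacheOwner:
    """Client u_i's local view of its own ORAM's cache: counter chi_i plus MAC."""

    def __init__(self):
        self.counter = 0
        self.tag = cache_mac({}, 0)

    def verify(self, cache, counter, tag):
        # Reject forged caches and any rollback past our own last access.
        if not hmac.compare_digest(tag, cache_mac(cache, counter)):
            raise ValueError("MAC verification failed")
        if counter < self.counter:
            raise ValueError("rollback detected")

    def update(self, cache, x, value):
        cache[x] = value
        self.counter += 1
        self.tag = cache_mac(cache, self.counter)
        return cache, self.counter, self.tag
```

A stale-but-correctly-MACed cache passes the MAC check yet fails the counter comparison, which is exactly what defeats the rewind attack.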

[Algorithm 1: \(\mathsf {ClientInit}\) — pseudocode omitted]
[Algorithm 2: \(\mathsf {Access}\) — pseudocode omitted]

4.1 Details

We detail the above ideas in two algorithms: Algorithm 1 shows the per-client initialization procedure \(\mathsf {ClientInit}\), and Algorithm 2 describes how a client performs an \(\mathsf {Access}\) with our multi-client secure square-root ORAM. The \(\mathsf {Init}\) algorithm is trivial in our case, as it initializes \(\varSigma \) to \(\phi \) empty arrays \(\mathsf {ORAM}_j\). Each array consists of \(N+2\cdot \sqrt{N}\) blocks, each of size B bits.

Before explaining \(\mathsf {Access}\), we first introduce the notion of an epoch. In general, after \(\sqrt{N}\) accesses to a square-root ORAM, its cache is “full”, and the whole ORAM needs to be re-shuffled. Re-shuffling requires computing a new permutation \(\pi \). Per ORAM, a permutation can be used for \(\sqrt{N}\) operations, i.e., one epoch. The next \(\sqrt{N}\) operations, i.e., the next epoch, will use another permutation, and so on. In the two algorithms, we use an epoch counter \(\gamma _i\). Therewith, \(\pi _{i,\gamma _i}\) denotes the permutation of client \(u_i\) in \(\mathsf {ORAM}_i\)’s epoch \(\gamma _i\). So that any client can determine the current epoch of \(\mathsf {ORAM}_i\), we store \(\gamma _i\) together with the ORAM’s cache on the server.

On a side note, we point out that there are various ways to generate pseudo-random permutations \(\pi _{i,\gamma _i}\) on N elements in a deterministic fashion. For example, in a cloud context, one can use \(\mathrm {PRF}_\kappa (i||\gamma _i)\) as the seed of a PRG and therewith perform Knuth’s Algorithm P (Fisher-Yates shuffle) [14]. Alternatively, one can use the idea of random tags followed by oblivious sorting by Goldreich and Ostrovsky [9].
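The first option can be sketched as follows (our illustrative Python; HMAC-SHA256 stands in for \(\mathrm {PRF}_\kappa \), and Python’s built-in PRG stands in for a cryptographic PRG):

```python
import hashlib
import hmac
import random

def epoch_permutation(kappa, client_id, epoch, n):
    """Deterministic pseudo-random permutation pi_{i,gamma_i} on n elements."""
    # PRF_kappa(i || gamma_i) gives a seed any key holder can recompute.
    seed = hmac.new(kappa, f"{client_id}||{epoch}".encode(), hashlib.sha256).digest()
    rng = random.Random(seed)
    perm = list(range(n))
    # Knuth's Algorithm P (Fisher-Yates): swap each slot with a random earlier slot.
    for j in range(n - 1, 0, -1):
        k = rng.randrange(j + 1)
        perm[j], perm[k] = perm[k], perm[j]
    return perm
```

Any client holding \(\kappa \) and knowing i and \(\gamma _i\) reproduces the same permutation, with no communication between clients.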

In addition to the epoch counter, we also introduce a per-client cache counter \(\chi _i\). Using \(\chi _i\), client \(u_i\) counts their accesses to the main memory and cache of their own \(\mathsf {ORAM}_i\). After each access to \(\mathsf {ORAM}_i\) by client \(u_i\), \(\chi _i\) is incremented. Each client \(u_i\) keeps a local copy of \(\chi _i\) and therewith verifies freshness of the data presented by the server. As we will see below, this method ensures multi-client ORAM security. Note in Algorithm 2 that a client \(u_j\) never increases the counter \(\chi _i\) of another client \(u_i\); only \(u_i\) ever updates \(\chi _i\).

In our algorithms, \(\mathsf {Enc}_\kappa \) is an IND-CPA-secure encryption such as AES-CBC. For convenience, we simply write \(\mathsf {Enc}_\kappa (\text {main memory})\), although the main memory needs to be encrypted block by block to allow retrieval of specific blocks. Also, for the encryption of main memory blocks, \(\mathsf {Enc}_\kappa \) provides authenticated encryption, such as encrypt-then-MAC.

A client can determine whether a cache is full in Algorithm 2 by the convention that empty blocks in the cache decrypt to \(\bot \). As long as there are blocks in the cache remaining with value \(\bot \), the cache is not full.

\(\mathsf {ClientInit}\): Each client runs the \(\mathsf {ClientInit}\) algorithm to initialize their ORAM. The server stores the ORAMs (with MACs) computed with a single key \(\kappa \). Each client receives their state, comprising the cache counter, from the \(\mathsf {ClientInit}\) algorithm. Note that although not captured in the security definition, our scheme also allows clients to be added and removed dynamically. Removing a client is as simple as asking the server to delete one of the ORAMs; adding one can be done by running \(\mathsf {ClientInit}\), except that instead of initializing the blocks to be empty, the new client first downloads a copy of another client’s ORAM to obtain the most recent version of the database.

\(\mathsf {Access}\): After verifying the MAC for \(\mathsf {ORAM}_i\) and checking that its cache is not older than \(u_i\)’s last access, \(u_i\) performs a standard \(\mathsf {Read}\) or \(\mathsf {Write}\) operation for block x on \(\mathsf {ORAM}_i\). If the cache is full, \(u_i\) re-shuffles \(\mathsf {ORAM}_i\), updating \(\pi \). In addition, \(u_i\) also adds block x to all other clients’ ORAMs. Note that for this, \(u_i\) does not read from the other ORAMs, but only downloads and re-encrypts their caches in full.

Our scheme is effectively running \(\phi \) traditional square-root ORAMs in parallel, making the overall complexity \(O(\phi \sqrt{N})\). Due to limited space, see the full version of this paper [1] for a detailed security analysis.

Fork Consistency: When a client makes an access, they add an element to the cache for all \(\phi \) clients. Therefore, at any given timestep, if the server is not maliciously changing caches, all caches will have the same number of elements in them. Since each cache is verified by a MAC, the server cannot remove individual elements from a cache. The only viable attack is to present an old view of a cache which was at one point valid, but does not contain new updates that have been added by other clients. If the server chooses to do this, he creates a fork between the views of the clients which have seen the update and those that have not. Since the server can never “merge” caches together, but only present entire caches that have been verified with a MAC by a legitimate client, there is no way to reconcile two forks that were created without a client finding out. This achieves fork consistency for our scheme.

Complexity: Making the square-root solution multi-client secure does not induce any additional asymptotic complexity per client. Each access requires downloading the cache of size \(\sqrt{N}\) and accessing one block from the main memory. Every \(\sqrt{N}\) accesses, the main memory and cache must be shuffled, requiring O(N) communication if the client has enough storage to temporarily hold the database locally. If not, then, as Goldreich and Ostrovsky [9] observed, one can use a Batcher sorting network to obliviously shuffle the database with complexity \(O(N \log ^2 N)\), or the AKS algorithm with complexity \(O(N \log N)\). One can also reduce the hidden constant using the more efficient Zig-Zag sort [10]. In the first scenario, the amortized overall complexity is then \(O(\phi \sqrt{N})\), while in the second it is \(O(\phi \sqrt{N} \log N)\).
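To make the first amortization explicit (our arithmetic, per client and in blocks):

$$\begin{aligned} \underbrace{\sqrt{N}+1}_{\text {cache scan and one block}} \;+\; \underbrace{\frac{O(N)}{\sqrt{N}}}_{\text {amortized shuffle}} \;=\; O(\sqrt{N}), \end{aligned}$$

and since each access additionally rewrites the \(\sqrt{N}\)-sized caches of the other \(\phi -1\) clients, the overall amortized complexity is \(O(\phi \sqrt{N})\).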

Goodrich et al. [11] also propose a way to deamortize the classical square-root ORAM such that it obtains a worst-case overhead factor of \(\sqrt{N} \cdot \log ^2(N)\). Their method involves dividing the work of shuffling over the \(\sqrt{N}\) operations during an epoch such that when the cache is full there is a newly shuffled main memory to swap in right away. Since the shuffling is completely oblivious (does not depend on any pattern of data accesses) and memoryless (the clients only need to know what step of the shuffle they are on in order to continue the shuffle), it can be considered a “non-critical” portion of the algorithm and no special protections need to be added for malicious security.

Note on Computational Complexity: While Algorithm 1 returns the whole updated state \(\varSigma \), in practice a client only needs to update the other clients’ caches (up to \(\sqrt{N}\) times). In addition to the communication complexity involved, there is also computation the client must perform in our scheme. Fortunately, the computation is exactly proportional to the communication and easily quantifiable. Every block of data retrieved from the server has a MAC that must be verified and a layer of encryption that must be removed. Since modern ciphers and hash functions are very efficient, and can even be done in hardware on many computers, communication is the clear bottleneck. For comparison, encryption and MACs are common on almost every secure network protocol, so we consider only the communication overhead in our analysis.

Unified Cache: A natural optimization to this scheme is to have one single shared cache instead of a separate one per client. If the server behaves honestly, all caches contain the same blocks and are in the same state anyway, so a single cache can save some communication and storage. To still protect against a malicious server, one must be careful to store \(\phi \) different counters with the cache and have each client increment only their own counter when they perform an access. This ensures that if a client inserts a block into the cache, it cannot be “rolled back” past the point of that insertion without the client noticing. Since the cost of reshuffling the ORAMs dominates the complexity of our scheme, this optimization does not change asymptotic performance. As it yields only a small constant improvement while making the presentation and proof unnecessarily difficult, we omit a full discussion of this technique.

4.2 Hierarchical Construction

In addition to the square-root ORAM, Goldreich and Ostrovsky [9] also propose a generalization which achieves poly-logarithmic overhead. To achieve this, it uses a hierarchical series of caches instead of a single cache. Each cache level has \(2^j\) slots, for j from 1 to \(\log N\), where each slot is a bucket holding \(O(\log N)\) blocks. At the bottom of the hierarchy is the main memory, which has \(2\cdot {}N\) buckets.

The reader is encouraged to refer to the original paper [9] for full details, but the main idea is that each level of the cache is structured as a hash table. Up to \(2^{j-1}\) blocks can be stored in cache level j; half the space is reserved for dummies, as in the previous construction. After \(2^{j-1}\) accesses, the entire level is retrieved and shuffled into the next level. Shuffling involves generating a new hash function and rehashing all the blocks into their new locations in level \(j+1\), until the shuffling percolates all the way to the bottom and the client must shuffle main memory to start again. Level j must be shuffled after \(2^{j-1}\) accesses, resulting in an amortized poly-logarithmic cost.
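The resulting reshuffle cascade can be illustrated with a small schedule computation (a toy Python sketch that only counts which level is shuffled at each access; it moves no data):

```python
def shuffle_schedule(n_accesses, n_levels):
    """For each access t = 1..n_accesses, return the deepest cache level shuffled.

    Level j overflows every 2**(j-1) accesses and is shuffled into level j+1,
    so a shuffle cascades down to the deepest level j with 2**(j-1) dividing t.
    """
    schedule = []
    for t in range(1, n_accesses + 1):
        deepest = 0
        for j in range(1, n_levels + 1):
            if t % (2 ** (j - 1)) == 0:
                deepest = j
            else:
                break
        schedule.append(deepest)
    return schedule
```

Level j is touched only every \(2^{j-1}\) accesses, which is what yields the amortized poly-logarithmic cost.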

To actually access a block, a client queries the caches in order, using the unique hash function at each level. Once the block is found, the remaining queries are made on dummy blocks to hide that the block was already found. After reading the block, and potentially changing its value, the client adds it back into the first level of the cache and shuffles the caches as necessary.

Multi-client Security: As this scheme is a generalization of the square-root one, our modifications extend naturally to provide multi-client security. Again, each client has their own ORAM which they read from. Writing to another client’s ORAM is done by inserting the block into the top level of that client’s cache and then shuffling as necessary. The only difference this time is that each level of the cache must be independently authenticated. Since the cache levels are now hash tables, and computing a MAC over every level on each access would require downloading the whole data structure, we instead use a Merkle tree [19]. This allows for efficient verification and updating of pieces of the cache without downloading the entire structure, and it maintains poly-logarithmic communication complexity. The root of the Merkle tree contains the counter that is incremented by the ORAM owner when they perform an access.
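As a concrete illustration, the following sketch builds a Merkle root over the buckets of one cache level and folds the owner's counter into it. SHA-256 and the padding rule for odd-sized levels are arbitrary choices for the sketch, not part of the paper's construction:

```python
import hashlib

def _h(*parts):
    m = hashlib.sha256()
    for p in parts:
        m.update(p)
    return m.digest()

def merkle_root(buckets):
    # Hash the buckets of one cache level pairwise up to a single root,
    # duplicating the last node when a level has odd size.
    level = [_h(b) for b in buckets]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_h(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

def authenticated_root(buckets, owner_counter):
    # Binding the owner's counter into the root prevents the server from
    # replaying an older (but otherwise internally consistent) version
    # of the level.
    return _h(merkle_root(buckets), owner_counter.to_bytes(8, "big"))
```

Verifying or updating a single bucket then only requires the \(O(\log)\)-length authentication path to the root, which is what preserves the poly-logarithmic communication complexity.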

Deamortizing: Other authors have proposed deamortized versions of the hierarchical construction that achieve worst-case poly-logarithmic complexity, such as Kushilevitz et al. [15] and Ostrovsky and Shoup [21]. We will use as an example the “warm-up” construction from Kushilevitz et al. [15], Sect. 6.1, a direct deamortization of the original hierarchical scheme described above. They deamortize by using three separate hash tables at each level of the ORAM, labelled “active”, “inactive”, and “output”. Instead of shuffling everything at once after \(2^{j-1}\) accesses (which would lead to worst-case O(N) complexity), the work is spread out. When the cache fills up at level j, it is marked “inactive”, and the old “inactive” buffer is cleared and marked “active”. The “inactive” buffer is then shuffled incrementally with each ORAM access, so that no worst-case O(N) operations are required. As it is shuffled, the contents are copied into the “output” buffer. Accesses can continue while the “inactive” buffer is being shuffled, as long as a read operation searches both the “active” and “inactive” buffers (since a block could be in either one).

When the shuffle completes, the “output” buffer contains the newly shuffled contents that go into level \(j+1\). This buffer is marked as “active” for level \(j+1\), the “active” buffer on level j is marked “inactive”, and the “inactive” buffer is cleared and marked “active”, restarting the whole process. Since the shuffle is spread out over \(2^{j-1}\) accesses, and the shuffling was the only part that was worst-case O(N), the full construction now has worst-case \(O(\log ^3 N)\) communication complexity.
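The buffer bookkeeping can be modeled as a small state machine. This is only a sketch of the rotation logic in the warm-up construction; the oblivious shuffle itself, and its spreading over \(2^{j-1}\) accesses, are elided:

```python
class Level:
    def __init__(self):
        self.active = []    # receives newly inserted blocks
        self.inactive = []  # full; being shuffled in the background
        self.output = []    # shuffled contents destined for level j + 1

def level_filled(levels, j):
    # Level j's full "active" table becomes "inactive" and starts its
    # background shuffle; a fresh, empty "active" table takes over.
    levels[j].inactive = levels[j].active
    levels[j].active = []

def shuffle_done(levels, j):
    # The shuffled output is handed to level j + 1 as its new "active"
    # table, and level j's shuffled-out "inactive" table is cleared.
    levels[j + 1].active = levels[j].output
    levels[j].output = []
    levels[j].inactive = []

def lookup(levels, j, block_id):
    # While a shuffle is in flight, a read must search both tables,
    # since a block could be in either one.
    return block_id in levels[j].active or block_id in levels[j].inactive
```

Usage: after `level_filled(levels, 1)`, blocks previously in level 1's "active" table are still found via its "inactive" table until `shuffle_done(levels, 1)` moves them to level 2.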

In terms of multi-client security, the only important aspect of this process is that no elements may be removed from the “active” or “inactive” buffers that the owner of the ORAM has put there – until a shuffle is complete, starting a new epoch. The shuffling itself is automatically data oblivious and therewith “non-critical”, in the terms we have established in this paper. Using a Merkle tree and counters, as described in the amortized version, assures that the server cannot roll back the cache to any state prior to the last access by the owner, guaranteeing security.

Kushilevitz et al. [15] also propose an improved hierarchical scheme that achieves \(O(\log ^2 N / \log \log N)\) complexity, which is substantially more involved. As the deamortized hierarchical ORAM described above is sufficient for our main contribution in Sect. 5, we leave adapting Kushilevitz et al. [15]’s improved scheme for multi-client security to future research.

5 Tree-Based Construction

While they pioneered the field, classical ORAMs have since been outperformed by newer tree-based ORAMs, which achieve better average- and worst-case complexity with low constants in practice. We now proceed to show how these constructions can be modified to also support multiple clients. Our strategy will be similar to before, but with one major twist: in order to avoid linear worst-case complexity, tree-based ORAMs do only small local “shuffling,” which turns out to make separating a client access into critical and non-critical parts much more difficult. When writing, one must not only add a new version of the block to the ORAM, but also explicitly mark the old version as obsolete, requiring a conditional access. This is in contrast with our previous construction, where old versions of a block would simply be discarded during the shuffle.

5.1 Overview

For this section, we will use Path ORAM [25] as the basis for our multi-client scheme, but the concepts apply similarly to other tree-based schemes.

Although the interface Path ORAM exposes to the client is the same as in other ORAM protocols, it is easiest to understand the \(\mathsf {Access}\) operation as being broken down into three parts: \(\mathsf {ReadAndRemove}\), \(\mathsf {Add}\), and \(\mathsf {Evict}\) [23]. \(\mathsf {ReadAndRemove}\), as the name suggests, reads a block from the ORAM and removes it, while \(\mathsf {Add}\) adds it back to the ORAM, potentially with a different value. These two operations used together form the basis of the \(\mathsf {Access}\) operation, but they already illustrate the difficulty of making this scheme multi-client secure: changing the value of a block implicitly requires reading it, meaning that reading and writing are equally critical and not as easily separated as in our previous construction. The third operation, \(\mathsf {Evict}\), is a partial shuffling that is done after each access in order to maintain the integrity of the tree.

The RAM in Path ORAM is structured as a tree with N leaf nodes. Each node in the tree holds up to Z blocks, where Z is a small constant. Each block is tagged with a value uniform in the range [0, N). As an invariant, blocks are always located on the path from the root of the tree to the leaf node corresponding to their tag. Over the lifecycle of the tree, blocks enter at the root and filter their way down toward the leaves, making room for new blocks to in turn enter at the root. The client keeps a map storing, for every block, the leaf node that block is tagged for.
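The path invariant is easy to state over an array-backed complete binary tree; the following sketch uses one conventional (hypothetical) node numbering:

```python
def path_to_leaf(leaf, num_leaves):
    """Bucket indices from the root (index 0) down to the given leaf.

    With num_leaves a power of two, the leaves occupy the last
    num_leaves slots of an array-backed complete binary tree. By the
    Path ORAM invariant, a block tagged with leaf `leaf` must reside in
    one of these O(log N) buckets (or in the client's stash).
    """
    node = num_leaves - 1 + leaf
    path = [node]
    while node > 0:
        node = (node - 1) // 2  # parent of `node`
        path.append(node)
    return path[::-1]

# A tree with 4 leaves has nodes 0..6; leaf 0 is node 3.
print(path_to_leaf(0, 4))  # [0, 1, 3]
```

Every access therefore touches only one such root-to-leaf path of \(O(\log N)\) buckets, rather than the whole structure.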

\(\mathsf {ReadAndRemove}\): To retrieve block x, the client looks up in the map which leaf node it is tagged for and retrieves all nodes from the root to that leaf node, denoted \(\mathcal {P}(x)\). By the tree invariant, block x will be found somewhere on the path \(\mathcal {P}(x)\). The client then removes block x from the node it was found in, reencrypts all the nodes and puts them back in the RAM.

\(\mathsf {Add}\): To put a block back in the ORAM, the client simply retrieves the root node and inserts the block into one of its free slots, reencrypting and writing the node back afterwards. The map is updated with a new random tag for this block in the interval [0, N). If there is not enough room in the root node, the client keeps the block locally in a “stash” of size \(Y=O(\log {}N)\), waiting for a later opportunity to insert the block into the tree.

\(\mathsf {Evict}\): So that the stash does not become too large, after every operation the client also performs an eviction which moves blocks down the tree to free up space. Eviction consists of picking a path in the tree (using reverse lexicographic order [6]) and moving blocks on that path as far down the tree as they can go, without violating the invariant. Additionally, the client inserts any matching block from the stash into the path.
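How far down the eviction path a block may move is determined by where the path to its tagged leaf diverges from the eviction path. A sketch, treating leaves as height-bit labels read most-significant-bit first (an assumption about the labeling, made only for illustration):

```python
def deepest_shared_level(tag, evict_leaf, height):
    """Deepest tree level (root = level 0) shared by the root-to-leaf
    paths of leaves `tag` and `evict_leaf`.

    The two paths coincide for as long as the leaf labels agree on
    their most significant bits. Eviction may push a block tagged for
    leaf `tag` down to this level on the eviction path without
    violating the invariant that blocks stay on the path to their
    tagged leaf.
    """
    level = 0  # the root is always shared
    for i in range(height - 1, -1, -1):
        if (tag >> i) & 1 != (evict_leaf >> i) & 1:
            break
        level += 1
    return level

print(deepest_shared_level(0b00, 0b01, 2))  # 1: root and its left child
print(deepest_shared_level(0b00, 0b10, 2))  # 0: only the root is shared
```

The eviction routine greedily places each block (from the path and from the stash) as deep as this bound allows, subject to the Z-slot bucket capacity.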

\(\mathsf {Recursive}\) \(\mathsf {Map}\): Typically, the client’s map, which stores the tag for each block, has size \(O(N\cdot \log {}N)\) bit and is often too large to store locally. Yet, if block size B is at least \(2\cdot \log {N}\) bit, the map can itself be stored recursively in an ORAM on the server, inducing a total communication complexity of \(O(\log ^2{N})\) blocks. Additionally, Stefanov et al. [25] show that if \(B = \varOmega {}(\log ^2 N)\) bit, communication complexity can be reduced to \(O(\log {N})\) blocks.

Integrity: Because of its tree structure, it is straightforward to ensure integrity in Path ORAM. Similar to a Merkle tree, the client can store a MAC in every node of the tree, computed over the contents of that node and the respective MACs of its two children. Since the client accesses entire paths in the tree at once, verifying and updating the MAC values during an access incurs minimal overhead. This is a common strategy with tree-based ORAMs, and we will make integral use of it in our scheme. We will also include client \(u_i\)’s counter \(\chi _{u_i}\) in the root MAC as before, to prevent rollback attacks (see below).
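A sketch of this per-node MAC chaining, with the owner's counter folded into the root MAC. HMAC-SHA256 is an arbitrary stand-in for the MAC, and the encodings are hypothetical:

```python
import hashlib
import hmac

EMPTY = b"\x00" * 32  # placeholder MAC for absent children (leaves)

def node_mac(key, node_contents, left_mac=EMPTY, right_mac=EMPTY):
    # Each node's MAC covers its (encrypted) contents and the MACs of
    # its two children, so the root MAC transitively authenticates the
    # entire tree, Merkle-style.
    return hmac.new(key, node_contents + left_mac + right_mac,
                    hashlib.sha256).digest()

def root_mac(key, root_contents, left_mac, right_mac, owner_counter):
    # The ORAM owner's counter is additionally bound into the root MAC,
    # so a malicious server cannot replay a stale (but validly MACed)
    # version of the tree.
    msg = (root_contents + left_mac + right_mac
           + owner_counter.to_bytes(8, "big"))
    return hmac.new(key, msg, hashlib.sha256).digest()
```

Since an access already downloads a full root-to-leaf path, recomputing the MACs along that path (plus the siblings' stored MACs) verifies and updates integrity at no asymptotic extra cost.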

Challenge. Looking at Path ORAM, there exist several additional challenges when trying to add multi-client capabilities with our previous strategy. First, recursively storing the map in ORAMs poses a problem. To resolve a tag, each path accessed in the recursive ORAMs has to be different for each client. If we separate the map into \(\phi \) separate ORAMs (which we will do), the standard recursive lookup results in a large blowup in communication costs. At the top level of the recursion, we would have \(\phi \) ORAMs, one for each client. Yet, each of those fans out to \(\phi \) ORAMs to obliviously support the next level of recursion, each of which has \(\phi \) more, going down \(\log N\) levels. The overall communication complexity for the tag lookup would be \(\phi ^{\log N} \in \varOmega (N)\).

Second, an \(\mathsf {Add}\) in Path ORAM cannot be performed without \(\mathsf {ReadAndRemove}\), so we cannot easily split the access into critical and non-critical parts like before.

Rationale. To remedy these problems, we institute two major changes to Path ORAM:

  1.

    Unified Tagging: Instead of separately tagging blocks in each of the ORAMs and storing the tags recursively, we will have a unified tagging system where the tag for a block can be computed for any of the separate ORAMs from a common “base tag.” This is crucial to avoiding the O(N) communication overhead that would otherwise be induced by the recursive map as described above. For a block x, the map resolves to a base tag value t. This same tag value is stored in every client’s recursive ORAM. Let h be a PRF mapping from \([0,2^\lambda ) \times [1,\phi ]\) to [0, N). The idea is that the leaf toward which block x percolates in the recursive ORAM tree differs for every ORAM of every client \(u_i\) and is pseudo-randomly determined by the value \(h(t,i)\). This way, (1) the paths accessed in all recursive map ORAMs for all clients differ for the same block x, and (2) only one lookup is necessary at each level of the recursive map to get the leaf node tags for all \(\phi \) ORAMs.

  2.

    Secure Block Removal: The central problem with \(\mathsf {ReadAndRemove}\) is that it is required before every \(\mathsf {Add}\) so that the tree does not fill up with old, obsolete blocks which cannot be removed. Unlike in the square-root ORAM, the shuffling process (eviction) happens locally and cannot know about other versions of a block which exist on different paths. We solve this problem by including metadata on each bucket. For every node in the tree, we include an encrypted array which indicates the ID of every block in that node. Removing a block from the tree can then be performed by simply changing the metadata to indicate that the slot is empty. The slot will be overwritten by the eviction routine with a real block if it is ever needed. If B is large, this metadata is substantially smaller than the real blocks. We can then store it in a less efficient classical ORAM, as described above, which is itself multi-client secure. This allows us to take advantage of the better complexity provided by tree-based ORAMs for the majority of the data, while falling back on a simpler ORAM for the metadata, whose size is independent of B.
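The unified tagging of item 1 can be sketched in a few lines. HMAC-SHA256 stands in for the PRF h, and the key and byte encodings are hypothetical choices for illustration only:

```python
import hashlib
import hmac

def leaf_tag(prf_key, base_tag, client_id, num_leaves):
    """Derive client `client_id`'s leaf for a block from its base tag t.

    One lookup of t per recursion level suffices: every client's leaf
    h(t, i) is then computable locally from t, which avoids storing
    independent tags in every client's recursive map (and with it the
    phi^(log N) lookup blowup described above).
    """
    msg = base_tag.to_bytes(16, "big") + client_id.to_bytes(2, "big")
    digest = hmac.new(prf_key, msg, hashlib.sha256).digest()
    # Reducing mod num_leaves introduces a small bias; acceptable for a
    # sketch, avoidable in practice with rejection sampling.
    return int.from_bytes(digest, "big") % num_leaves
```

For a fixed base tag t, the derived leaves \(h(t,1),\dots ,h(t,\phi )\) are deterministic for each client yet pseudo-random across clients, so the per-client access paths differ for the same block x.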

We also note that Path ORAM’s stash concept cannot be used in a multi-client setting. Since the clients do not have a way of communicating with each other out of band, all shared state (which includes the stash) must be stored in the RAM. This has already been noted by Goodrich et al. [12], and since the size of the stash does not exceed \(\log N\), storing it in the RAM (encrypted and integrity protected) does not affect the overall complexity. As before, we also introduce an eviction counter e for each ORAM. Client \(u_i\) will verify whether, for each of their recursive ORAMs, this eviction counter is fresh.

5.2 Details

[Algorithms 3–5 (initialization, access, and eviction) are given as figures in the original paper.]

To initialize the multi-client ORAM (Algorithm 3), \(\phi \) separate ORAMs are created and the initial states (containing the shared key) are distributed to each client. For each client \(u_i\), the ORAM takes the form of a series of trees \(T_{j,i}\). The first tree stores the data blocks, while the remaining trees recursively store the map which relates block addresses to leaf nodes. In addition to this, as described above, each tree has its own sub-ORAM to keep track of block metadata. The stash of each (sub-)ORAM is called \(S_{j,i}\), and the metadata (classical) ORAM \(M_{j,i}\).

To avoid confusion between different ORAM initialization functions, \(\mathsf {MInit}\) is a reference to Algorithm 1, i.e., initialization of a multi-client secure classical ORAM.

For simplicity, we assume that \(\mathsf {Enc}_\kappa \) encrypts each node of a tree separately, therewith allowing individual node access. Also, we assume authenticated encryption, using the per node integrity protection previously mentioned.

As noted above, the functions \((\mathsf {ReadAndRemove}, \mathsf {Add})\) can be used to implement \((\mathsf {Read}, \mathsf {Write})\), which in turn can implement a simple interface \((\mathsf {Access})\). Because our construction introduces dependencies between \(\mathsf {ReadAndRemove}\) and \(\mathsf {Add}\), in Algorithm 4 we illustrate a unified \(\mathsf {Access}\) function for our scheme. The client starts with the root block and traverses the recursive map upwards, finds the address of block x, and finally retrieves it from the main tree. For each recursive tree, it retrieves a tag value t allowing it to locate the correct block in the next tree. After retrieving a block in each tree, the client marks that block as free in the metadata ORAM so that it can be overwritten during a future eviction. This is necessary to maintain the integrity of the tree and ensure that it does not overflow. At the same time, the client also marks that block free in the metadata of every other client and inserts the new block value into the root of their trees. This is analogous to the previous scheme, where a client reads from their own ORAM and writes back to the ORAMs of the other clients. We use a straightforward MAC technique for paths, \(\mathsf {MACPath}\), which we present in the extended version of this paper [1].

Again, we avoid confusion between different ORAM access operations by referring to the multi-client secure classical ORAM access operation of Algorithm 2 as \(\mathsf {MAccess}\).

Algorithm 5 illustrates the eviction procedure. Since eviction does not take any client access as input, it is non-critical. The client simply downloads, in its entirety, the path in the tree specified by the eviction counter e. The only modification we make to the original Path ORAM scheme is that we read block metadata from the sub-ORAM, which indicates which blocks in the path are free and can be overwritten by new blocks being pushed down the tree.

The overhead from the additional metadata ORAMs in our construction is fortunately not dependent on the block size B. Therefore, if B is large enough, we can match the overhead of single-client Path ORAM, \(O(\log N)\), for a total complexity of \(O(\phi \log N)\) for \(\phi \) users. However, this only applies if the block size B is sufficiently large, at least \(\varOmega (\log ^4 N)\). Otherwise, for smaller B, the complexity can be up to \(O(\phi \log ^5 N)\). Due to limited space, we refer to the extended version of this paper [1] for a detailed security analysis.

Complexity: The complexity of our scheme is dominated by the cost of an eviction. For a client to read a path in each of \(O(\log N)\) recursive trees, for each of the \(\phi \) different ORAMs, it takes \(O(\phi \cdot {}B \cdot \log ^2 N )\) bits of communication. Additionally, the client must make \(O(\phi \cdot \log ^2 N)\) accesses to a metadata ORAM. If \(\mu (N, B)\) denotes the cost of a single access in such a sub-ORAM, the overall communication complexity is \(O(\phi \cdot \log ^2 N\cdot [B + \mu (N, \log N)])\) bit. The deamortized hierarchical ORAM by Kushilevitz et al. [15] has \(O(\log ^3{N})\) blocks communication complexity, where each block is of size \(\log N\) bit (the metadata we need for our construction). Taking this hierarchical ORAM as a sub-ORAM, the total communication complexity computes to \(O(\phi \cdot \log ^2 N \cdot [B + \log ^4 N])\) bits. If \(B \in \varOmega (\log ^4 N)\), the communication complexity, in terms of blocks, is \(O(\phi \log ^2 N)\); otherwise it is at most \(O(\phi \log ^5 N)\), assuming \(B\in \varOmega (\log {N})\) (the minimal possible block size for Path ORAM to work).
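The regime split can be read off numerically from the stated bound. This sketch drops all constants, so only the growth behavior, not the absolute numbers, is meaningful:

```python
import math

def overhead_in_blocks(N, B, phi):
    # O(phi * log^2 N * (B + log^4 N)) bits, expressed in B-bit blocks
    # (constants dropped).
    log_n = math.log2(N)
    bits = phi * log_n ** 2 * (B + log_n ** 4)
    return bits / B

N = 2 ** 20
big_b = int(math.log2(N) ** 4)   # B in Omega(log^4 N): metadata absorbed
small_b = int(math.log2(N))      # minimal B for Path ORAM

print(round(overhead_in_blocks(N, big_b, 1)))    # 800 = 2 * log^2 N
print(round(overhead_in_blocks(N, small_b, 1)))  # 3200400, about log^5 N
```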

Additionally, if we use the recursive optimization trick from Stefanov et al. [25] to reduce the overhead from the Path ORAM part of the construction from \(O(\log ^2 N)\) to \(O(\log N)\), we can achieve a total complexity of \(O(\log N)\) per client for blocks of size \(\varOmega (\log ^4 N)\).

Although a complexity linear in \(\phi \) may seem at first to be expensive, we stress that this is a substantial improvement over naive solutions which achieve the same level of security. The only straightforward way to have multi-client security against malicious servers is for each client to append their updates to a master list, and for clients to scan this list to find the most updated version of a block during reads. This is not only linear in the size of the database, but in the number of operations performed over the entire life of the ORAM.

One notable difference in parameters from basic Path ORAM is that we require a block size of at least \(c \cdot \lambda \), where \(c \ge 2\). Path ORAM only needs \(c\cdot \log N\), and for security parameter \(\lambda \), \(\lambda >\log N\) holds. In our scheme, the map trees do not directly hold addresses, but t values which are of size \(\lambda \). In order for the map recursion to terminate in \(O(\log N)\) steps, blocks must be big enough to hold at least two t values of size \(\lambda \). If the block size is \(\varOmega (\lambda ^2)\), we can also take advantage of the asymmetric block optimization from Stefanov et al. [25] to reduce the complexity to \(O(\phi \cdot (\log ^6 N + B\cdot \log N))\). Then, if additionally \(B\in \varOmega (\log ^5 N)\), the total complexity is reduced to \(O(\log N)\) per client.

6 Conclusion

We have presented the first techniques that allow a multi-client ORAM secure against fully malicious servers. Our multi-client ORAMs are reasonably efficient, with communication complexity as low as \(O(\log N)\) per client. Future work will focus on efficiency improvements, including reducing worst-case complexity to sublinear in \(\phi \). Additionally, the question of whether tree-based constructions are more efficient than classical ones is not as clear in the multi-client setting as it is for a single client. Although tree ORAMs are more efficient for a number of parameter choices, they incur substantial overhead from using sub-ORAMs to hold tree metadata, which is not required for the classical constructions. Future research may focus on achieving a “pure” tree-based construction which does not depend on another ORAM as a subroutine. Finally, it may be interesting to investigate whether multiple clients can be supported with more fine-grained access control, secure against fully malicious servers.