Forward-Secure Searchable Encryption on Labeled Bipartite Graphs

Lai, Russell W. F.; Chow, Sherman S. M.

doi:10.1007/978-3-319-61204-1_24

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 10355)

Included in the following conference series:

International Conference on Applied Cryptography and Network Security

Abstract

Forward privacy is a trending security notion of dynamic searchable symmetric encryption (DSSE). It guarantees the privacy of newly added data against a server that has knowledge of previous queries. The notion was very recently formalized by Bost (CCS ’16) independently, yet the definition given is too imprecise to capture how forward secure a scheme is. We further the study of forward privacy by proposing a generalized definition parametrized by a set of updates and restrictions on them. We then construct two forward private DSSE schemes over labeled bipartite graphs, as a generalization of those supporting keyword search over text files. The first is a generic construction from any DSSE, and the other is a concrete construction from scratch. For the latter, we design a novel data structure called cascaded triangles, in which traversals can be performed in parallel while updates only affect the local regions around the updated nodes. Besides neighbor queries, our schemes support flexible edge additions and intelligent node deletions: the server can delete all edges connected to a given node without the client having to specify all the edges.

Sherman Chow is supported in part by General Research Fund (Grant No. 14201914) and the Early Career Award from Research Grants Council, Hong Kong; and Huawei Innovation Research Program (HIRP) 2015 (Project No. YB2015110147).

1 Introduction

In searchable symmetric encryption (SSE), an encrypted database can be queried with minimal leakage of information about the plaintext database to the hosting server. In dynamic SSE (DSSE), the client is additionally allowed to update the encrypted database without re-encrypting it from scratch. Since its introduction [16], many SSE schemes with different trade-offs between efficiency, security, and query expressiveness have been proposed [1]. Most earlier schemes were not dynamic. The first sublinear dynamic SSE scheme was proposed by Kamara et al. [11], but its query and update operations are inherently sequential. Some later schemes [8, 10, 17] feature parallelizable algorithms for queries and updates. Parallelism makes DSSE an attractive solution for outsourcing data to cloud platforms, which can fully leverage multiprocessors.

1.1 Security and Forward Privacy of SSE Schemes

Ideally, the knowledge of an encrypted database, together with a sequence of adaptively issued queries and updates, should not reveal any information about the plaintext database and the query results to the server. Although this can be achieved theoretically through techniques involving obfuscation [4] or oblivious RAM [7], the resulting solutions are not particularly efficient. Typically, a practical DSSE scheme tolerates the leakage of search and access patterns during queries, and some internal structure of the encrypted database during updates. Formally, the security is parametrized by a set of leakage functions describing these leakages. While some leakages seem to pose no harm, some have been exploited in attacks [9, 18].

Forward privacy, advocated by Stefanov et al. [17], requires that newly added data remains private against the server, who has knowledge about previous queries. The property is arguably essential to all DSSE schemes, for otherwise, the ability to update in DSSE is somewhat useless as future data are less protected. Indeed, one of the recent attacks by Zhang et al. [18] exploits the leakage during updates in non-forward-private schemes.

Only a limited number of solutions [2, 7, 15, 17] in the literature claimed to have forward privacy. The notion is not well understood in the earlier works [7, 15, 17], and is only formally defined recently by Bost [2]. However, we argue that this definition cannot precisely describe in what sense a DSSE scheme is forward private. (See discussion in Sect. 3.2 for details.)

1.2 Our Formulation

We consider DSSE over labeled bipartite graphs, where nodes can be partitioned into two disjoint subsets X and Y, such that edges never connect two nodes from the same partition, and each edge is labeled with data from the set W. A neighbor query on a node $x \in X$ (or $y \in Y$) returns the sequence of tuples $(x,y,w) \in X \times Y \times W$ such that (x, y) is an edge of the graph labeled with w.

This abstract setting captures typical DSSE queries such as keyword searches over files (considering X and Y as the sets of keywords and files respectively), and labeled subgraph queries over general graphs (considering X and Y as sets of nodes with outgoing and incoming edges respectively, and W as the set of edge labels). To generalize, we also consider neighbor queries over the entire bipartite graph, i.e., both X and Y, which enables interesting bi-directional searching applications. Bi-directional search is useful to efficiently support updates in DSSE [11] (since deleting a file in a DSSE supporting keyword searches implicitly requires finding all keywords the file contains). It also opens possibilities of interesting new queries such as related keyword search (which first searches for documents containing the queried keyword, then collects other keywords which are also contained in many of the matching documents).

1.3 Update Functionalities of DSSE Schemes

While the query functionality of SSE for keyword search is somewhat standard, the supported update types vary widely. Updates include additions and deletions, and can be edge-based or node-based. Schemes supporting only node additions are reasonable for some data types: e.g., x as keywords and y as text files. Yet, edge updates allow fine-grained modification of existing data. In particular, schemes supporting edge additions are superior to those supporting only node additions, as the latter can be simulated by the former.

The benefits of supporting only edge deletions are however questionable, as they require the client to know about the edge to be deleted. This is unrealistic for the motivating application of SSE for keyword search: the client would need to know all keywords of a given file to completely remove the file from the server. It is desirable for a DSSE scheme to support node deletions: given a node y by the client, the server can intelligently remove all edges connected to y.

To the best of our knowledge, most existing schemes only support either edge-based updates or node-based updates^{Footnote 1}. Supporting edge additions and node deletions simultaneously, while confining leakage, poses some technical challenges.

1.4 SSE as a Data Structure Problem

With edge additions and node deletions in mind, it is not an easy task to devise a parallel and dynamic (let alone forward private) SSE scheme. Intuitively, for data structures supporting parallel traversal, maintaining the traversal efficiency after an update often requires some global adjustment of the data structure. Consider a balanced binary search tree. A series of deletions can degenerate the tree into a linked list which requires sequential access; and balancing the tree may require the rotation of multiple tree nodes. (That may explain why the first parallel DSSE [10] utilizes a red-black tree which stores all the files only at the leaf level, resulting in an efficiency loss when compared with storing some of them in internal nodes.) In the context of DSSE, delegating the maintenance work to the server often implies excessive leakage of the internal data structure.

A notable approach for (non-dynamic) SSE schemes is the inverted index used by Curtmola et al. [5, 6], in which the encrypted database consists of an index mapping hashed keywords to sets of files containing the keywords. This inverted index allows the server to search in time linear in the number of matching files, which is optimal. Many subsequent works follow this framework (explicitly or implicitly), utilizing some data structure to represent the sets of data pointed to by the (hashed) queries in the index. The efficiency of queries and updates corresponds to the efficiency of traversing and updating the sets respectively. On the other hand, the leakage of the internal structure of the encrypted database during updates corresponds to the amount of information required or changed to update the data structure storing the sets. Most of the effort in designing (D)SSE schemes is dedicated to choosing or designing this data structure.
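To make the index shape concrete, the following is a minimal sketch of an inverted index keyed by hashed keywords. It is not the construction of Curtmola et al.: it assumes HMAC-SHA256 as the keyed hash, leaves the file identifiers unencrypted, and uses the hypothetical names `token`, `build_index`, and `search`. It only illustrates why a lookup costs time linear in the number of matching files.

```python
import hashlib
import hmac


def token(key: bytes, keyword: str) -> bytes:
    """Keyed hash of a keyword (assumption: HMAC-SHA256 stands in for the hash)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()


def build_index(key: bytes, files: dict) -> dict:
    """Map each hashed keyword to the list of file identifiers containing it."""
    index = {}
    for file_id, keywords in files.items():
        for kw in set(keywords):
            index.setdefault(token(key, kw), []).append(file_id)
    return index


def search(index: dict, tok: bytes) -> list:
    """Server-side lookup: one dictionary access, then a walk over the matches."""
    return index.get(tok, [])
```

In this toy version the server sees the file identifiers in the clear; an actual SSE scheme would additionally protect the stored sets, and the choice of the data structure holding them is exactly the design question discussed above.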

1.5 Our Results

This work furthers the study of forward privacy of DSSE schemes over labeled bipartite graphs. We present three technical results. First, we give another formal, generalized definition of forward privacy. Specifically, our definition is parametrized by a set of updates and a restriction function on these updates. It generalizes the only existing one by Bost [2], by increasing the number of classes of leakage functions allowed, yet making each class more specific. Our definition still captures the essence of forward privacy even though the leakage in different classes might vary substantially. Since different existing SSE schemes implicitly assumed different flavors of forward privacy, we believe that our generalized, parameterized definition of forward privacy is of particular interest.

Second, we propose a simple generic construction of forward private DSSE from any DSSE, which preserves the efficiency of the base scheme. The forward privacy obtained is for edge additions, such that the addition of an edge (x, y) does not leak both x and y, hence hiding the edge. This generic transformation provides insights into what constitutes forward privacy in DSSE. Since the result applies to any DSSE, again we believe it is of independent interest.

Lastly, we construct a DSSE scheme from scratch which achieves a stronger forward privacy for edge additions, such that the addition of an edge (x, y) does not leak either x or y. Our construction utilizes a specially crafted data structure named cascaded triangles^{Footnote 2}, which supports parallel queries and updates, and has the property that adding or deleting data only affects a constant amount of existing data. Thanks to cascaded triangles, our construction features minimal leakage, and optimal query and update complexity in terms of both computation and communication up to a constant factor.

Both of our constructions support flexible edge additions and intelligent node deletions: the server can delete all edges connected to a given node without the client having to specify all the edges. Our schemes are among the few in the literature that do so (see footnote 1).

2 Definitions

We present the necessary definitions for data representation and DSSE. For more detailed explanations, we refer the readers to the full version.

2.1 Data Representation

Let $\mathcal {X} $, $\mathcal {Y} $, and $\mathcal {W} $ be sets, where $\mathcal {X} $ and $\mathcal {Y} $ are disjoint, i.e., $\mathcal {X} \cap \mathcal {Y} = \emptyset $. We denote by $\mathcal {G} = \mathcal {G}(\mathcal {X},\mathcal {Y},\mathcal {W})$ a set of labeled bipartite graphs specified by these sets. For a labeled bipartite graph $G \in \mathcal {G}$, its edges are labeled with $w \in W \subseteq \mathcal {W} $, and run between $X \subseteq \mathcal {X} $ and $Y \subseteq \mathcal {Y} $. Each edge can be uniquely represented by the tuple $(x,y,w) \in \mathcal {X} \times \mathcal {Y} \times \mathcal {W} $. The neighbor query function ${\mathsf {Qry}}$ maps a node $q = x$ (or $q = y$) and a graph G to the set of all edges incident to node x, denoted by $(x,*,*)$ (or the set of all edges incident to node y, denoted by $(*,y,*)$). Similarly, we use $(x,y,*)$ to denote the set of edges in the form of $(x,y,\cdot )$, which should be a singleton. The update function ${\mathsf {Udt}} $ maps an update u and a graph G to a new graph $G'$. The update $u = ({\mathsf {Op}},\cdot ,\cdot ,\cdot )$ where ${\mathsf {Op}} = {\mathsf {Add}} $ or ${\mathsf {Op}} = {\mathsf {Del}} $ takes one of the following forms:

1. Edge Addition: $u = ({\mathsf {Add}},x,y,w)$ adds the edge (x, y, w) to G.
2. Node Deletion: $u = ({\mathsf {Del}},q)$ deletes the set of edges $(q,*,*)$ (or $(*,q,*)$) from G. Since $\mathcal {X} $ and $\mathcal {Y} $ are disjoint, there is no ambiguity.
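As an unencrypted reference point, the neighbor query and the two update forms can be sketched as follows. The class name `LabeledBipartiteGraph` and its two-dictionary layout are illustrative assumptions, not part of the scheme; the sketch relies on $\mathcal {X} $ and $\mathcal {Y} $ being disjoint so that a node q identifies its side unambiguously.

```python
class LabeledBipartiteGraph:
    """Plaintext labeled bipartite graph with edges (x, y, w)."""

    def __init__(self):
        self.by_x = {}  # x -> {y: w}, i.e. the set (x, *, *)
        self.by_y = {}  # y -> {x: w}, i.e. the set (*, y, *)

    def add_edge(self, x, y, w):
        """Edge addition u = (Add, x, y, w)."""
        self.by_x.setdefault(x, {})[y] = w
        self.by_y.setdefault(y, {})[x] = w

    def query(self, q):
        """Neighbor query: all edges incident to q, whichever side q is on."""
        if q in self.by_x:
            return {(q, y, w) for y, w in self.by_x[q].items()}
        if q in self.by_y:
            return {(x, q, w) for x, w in self.by_y[q].items()}
        return set()

    def delete_node(self, q):
        """Node deletion u = (Del, q): remove every edge incident to q."""
        for y in self.by_x.pop(q, {}):
            self.by_y[y].pop(q)
            if not self.by_y[y]:
                del self.by_y[y]
        for x in self.by_y.pop(q, {}):
            self.by_x[x].pop(q)
            if not self.by_x[x]:
                del self.by_x[x]
```

Keeping both directions indexed is what lets a node deletion find all affected edges without the caller listing them.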

2.2 Dynamic Searchable Symmetric Encryption (DSSE)

We present a definition of DSSE for the labeled bipartite graphs defined above, and briefly describe its security against adaptive chosen query attack (CQA2).

Definition 1

A dynamic searchable symmetric encryption (DSSE) scheme for the space of labeled bipartite graphs specified by $\mathcal {G}$ is a tuple of algorithms and interactive protocols ${\mathsf {DSSE}}.({\mathsf {Setup}},{\mathsf {Qry}}_e,{\mathsf {Udt}} _e)$ such that:

$(K, {\mathsf {EDB}}) \leftarrow {\mathsf {Setup}} (1^{\lambda })$: In the setup algorithm, the user inputs the security parameter $\lambda $. It outputs a secret key K, and an (initially empty) encrypted database ${\mathsf {EDB}} $ to be outsourced to the server. Alternatively, one can define a setup algorithm which takes as input the security parameter $\lambda $, and a graph G. In this case, the algorithm outputs a key K and an encrypted database ${\mathsf {EDB}} $ encrypting G.
$((K', R), ({\mathsf {EDB}} ', R)) \leftarrow {\mathsf {Qry}}_e((K,q),{\mathsf {EDB}})$: In the query protocol, the user inputs a secret key K and a query $q \in \mathcal {X} \cup \mathcal {Y} $. The server inputs the encrypted database ${\mathsf {EDB}} $. The user outputs a possibly updated key $K'$, while the server outputs a possibly updated encrypted database ${\mathsf {EDB}} '$. Both the user and the server output a sequence of responses R. For non-interactive schemes, the user first runs $(K',\tau _q) \leftarrow {\mathsf {QryTkn}}(K,q)$ to generate a query token $\tau _q$. The server then runs $({\mathsf {EDB}} ', R) \leftarrow {\mathsf {Qry}}_e(\tau _q, {\mathsf {EDB}})$ and outputs the query results R.
$(K', {\mathsf {EDB}} ') \leftarrow {\mathsf {Udt}} _e((K,u), {\mathsf {EDB}})$: In the update protocol, the user inputs a secret key K and an update $u \in \{{\mathsf {Add}},{\mathsf {Del}} \} \times (\mathcal {X} \cup \mathcal {Y})$. The server inputs the encrypted database ${\mathsf {EDB}} $. The user outputs a possibly updated key $K'$. The server outputs an updated encrypted database ${\mathsf {EDB}} '$. For non-interactive schemes, the user first runs $(K',\tau _u) \leftarrow {\mathsf {UdtTkn}}(K,u)$ to generate an update token $\tau _u$. The server then runs ${\mathsf {EDB}} ' \leftarrow {\mathsf {Udt}} _e(\tau _u, {\mathsf {EDB}})$ and outputs the updated database ${\mathsf {EDB}} '$.

A ${\mathsf {DSSE}} $ scheme for the space $\mathcal {G}$ is said to be correct if, for all $\lambda \in \mathbb {N}$, all K and ${\mathsf {EDB}} $ output by ${\mathsf {Setup}} (1^{\lambda })$, and all sequences of queries and updates, the responses to the plaintext queries equal those to the corresponding encrypted queries.

Let ${\mathcal {L}}_{e}$, ${\mathcal {L}}_{q}$, and ${\mathcal {L}}_{u}$ be stateful leakage algorithms. A DSSE scheme is said to be (${\mathcal {L}}_{e}$, ${\mathcal {L}}_{q}$, ${\mathcal {L}}_{u}$)-secure against adaptive dynamic chosen-query attacks (CQA2), if there exists a simulator which, when given the leakages specified by the leakage algorithms, indistinguishably simulates the encrypted database, the query results, and the updates. The formal definition can be found in the full version.

3 Forward Privacy in DSSE

Intuitively, forward privacy ensures that newly added data remains hidden from a server that might have learned some secrets during previous queries, until it must be revealed by a later query. To formalize, instead of tinkering with the CQA2-security definition of DSSE, we define forward privacy based on the property of the leakage function ${\mathcal {L}}_{u}$. This is more convenient since the information leaked by ${\mathcal {L}}_{u}$ is sufficient for the simulator in the CQA2-security definition to simulate the updates. Similar to the semantic security of encryption schemes, we require that the leakages ${\mathcal {L}}_{u}$ on a pair of updates are indistinguishable, capturing the idea that not even a single bit about the update is leaked to the server. Note that the definition does not limit the types of the update u. Indeed, we can consider forward privacy for not only additions, but also deletions. In layman's terms, suppose that the server learns during the query protocol about the association of q with some data, which is then deleted by the client. The server should not notice that the data are deleted until q is queried again: by that time the server can compare the query results and discover the deletion.

3.1 Our Definition

We first give a general definition of forward privacy parametrized by a set of updates and a restriction function, then discuss useful ways to parameterize it.

Definition 2

(Forward Privacy). Let ${\mathsf {DSSE}} $ be a $({\mathcal {L}}_{e},{\mathcal {L}}_{q},{\mathcal {L}}_{u})$-CQA2 secure DSSE scheme for labeled bipartite graphs specified by $\mathcal {G}$. Let $\mathcal {U} $ be a set of updates and restriction $p: \mathcal {U} ^2 \rightarrow \{0,1\}$ be a predicate function. We say that ${\mathsf {DSSE}} $ is $(\mathcal {U}, p)$-forward private if, for any $u_b \in \mathcal {U} $ where $b \in \{0,1\}$ such that $p(u_0, u_1) = 1$, and any PPT distinguisher $\mathcal {D}$, it holds that

$$\begin{aligned} |\Pr [\mathcal {D}({\mathcal {L}}_{u}(u_{0})) = 1] - \Pr [\mathcal {D}({\mathcal {L}}_{u}({u_{1}})) = 1]| \le \mathsf {negl}(\lambda ). \end{aligned}$$

Table 1 lists some useful combinations of $\mathcal {U} $ and p, denoted by $\mathcal {U} _i$ and $p_i$ respectively for $i \in [6]$. One may also consider a set $\mathcal {U} $ which is a union of some of the (disjoint) $\mathcal {U} _i$’s, and a restriction p which is a composition of the corresponding $p_i$’s: For $p_i: \mathcal {U} ^2_i \rightarrow \{0,1\}$, define $p = p_i + p_j: (\mathcal {U} _i \cup \mathcal {U} _j)^2 \rightarrow \{0,1\}$ such that $(p_i+p_j)(u) = 1$ if $u \in \mathcal {U} _i$ and $p_i(u) = 1$, or $u \in \mathcal {U} _j$ and $p_j(u) = 1$.

Note that there might exist schemes which are $(\mathcal {U} _i, p_i)$- and $(\mathcal {U} _j, p_j)$-forward private but not $(\mathcal {U} _i \cup \mathcal {U} _j, p_i+p_j)$-forward private. For example, if $\mathcal {U} _i$ and $\mathcal {U} _j$ are sets of additions and deletions respectively, while the distinguisher cannot tell which addition or deletion is chosen, it can separate additions from deletions.

Table 1. Useful combinations of $\mathcal {U} $ and p as parameters for forward privacy. (See Sect. 2.1 and the full version for details about the update operations.)

3.2 Bost’s Definition

The forward privacy definition of Bost [2] requires that the leakage function of $u = ({\mathsf {Op}}, x, y, w)$ can be written as a function of the operation ${\mathsf {Op}} $, the node y, and $|(*,y,*)|$, i.e., the number of edges connected to y, where ${\mathsf {Op}} $ can be addition or deletion. We argue that the range of leakage functions which satisfy this requirement is so wide that the definition does not precisely describe in what sense a DSSE scheme is forward secure. At one extreme, consider a scheme of which the leakage function is given by ${\mathcal {L}}_{u}({\mathsf {Op}}, x, y, w) = ({\mathsf {Op}}, y, |(*,y,*)|)$. If ${\mathcal {L}}_{u}$ accurately (not overly) captures the leakage, then adding or removing edges to or from a node y leaks the identity of the node y itself. Such a scheme is vulnerable to frequency attacks: the attacker keeps a table mapping each y to the number of times y is updated. With the aid of external information, it can possibly extract information about y, such as its importance. Similar attacks have been demonstrated using search patterns [14]. At another extreme, Bost’s construction [2] leaks nothing during updates, i.e., ${\mathcal {L}}_{u}({\mathsf {Op}}, x, y, w) = \emptyset $. On the other hand, Bost’s definition is also not general enough. There are other types of leakage functions, e.g., ${\mathcal {L}}_{u}({\mathsf {Op}}, x, y, w) = ({\mathsf {Op}}, x, |(x,*,*)|)$, which intuitively capture forward privacy, but are not covered by this definition. In contrast, our definition of $(\mathcal {U}, p)$-forward privacy classifies different types of update in a more fine-grained manner.
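The frequency table kept by the attacker in the first extreme can be sketched in a few lines. The function names and the explicit `degree_of_y` argument are hypothetical; the leakage tuple is the one written above, $({\mathsf {Op}}, y, |(*,y,*)|)$.

```python
from collections import Counter


def leak_update(op, x, y, w, degree_of_y):
    """Leakage of the first extreme: (Op, y, |(*, y, *)|). Neither x nor w leaks."""
    return (op, y, degree_of_y)


def update_frequencies(leakages):
    """Attacker's table: how many times each node y has been updated."""
    return Counter(y for _op, y, _degree in leakages)
```

Although x and w never appear in the leakage, the table still ranks the nodes y by update activity, which is the information the attack combines with external knowledge.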

From the perspective of our model, Bost’s definition can be regarded as special cases of $(\mathcal {U} _1, p_1)$- (achieved by our generic construction in Sect. 4) and $(\mathcal {U} _4, p_4)$-forward privacy. Intuitively, the restrictions $p_1$ and $p_4$ mean that an update $u = ({\mathsf {Op}}, x, y, w)$ is protected by hiding one end of the connection between x and y. Therefore, similar to the above, it might be the case that the updates $u_0$ and $u_1$, where $u_b = ({\mathsf {Add}}, x_b, y, w)$ for $b \in \{0,1\}$, are linkable as they correspond to the same node y, making the scheme vulnerable to the same frequency attack. On the other hand, $(\mathcal {U} _1, 1)$- (achieved by our concrete construction in Sect. 5) and $(\mathcal {U} _4, 1)$-forward privacy completely hide the relation (x, y), making the addition and deletion of an edge (x, y, w) oblivious respectively (since w can be hidden simply by symmetric key encryption). Whether or not the stronger forward privacy is needed depends on the specific application scenarios.

3.3 Forward Privacy for Deletions

In the rest of this work, we focus on $\mathcal {U} = \mathcal {U} _1 = \{({\mathsf {Add}}, x, y, w)\}$, with and without restrictions, i.e., $p = p_1$ and $p \equiv 1$, respectively. In other words, we do not consider forward privacy for deletions. To argue for this design decision, we observe that while the scheme of Bost [2] performs “lazy edge deletion” (adding “deleted” edges rather than actually deleting), ours performs actual node deletion. The former increases the size of the encrypted database (by one edge), hence it is possible to make (edge) additions and deletions indistinguishable. The latter allows immediate space reclamation. This makes edge additions (which usually increase the size of the encrypted database) and node deletions easily distinguishable. Furthermore, since actual deletions of different nodes may shrink the database by varying amounts, they are also easily distinguishable from one another. In effect, we trade “forward privacy for (lazy) deletions” for efficiency. If forward privacy for deletions is a concern and lazy deletions are acceptable, then, using a similar technique of maintaining another instance of the encrypted database for lazy deletions [2], schemes which are forward private for edge additions can be generically transformed to provide forward privacy for both edge additions and node deletions (simultaneously): to delete a node $y \in Y$, the client adds an edge connecting y to a special “deleted” node in X. We leave the details to the full version of this paper. In this sense, forward privacy for additions is a key property. We also believe that it is sufficient for practical applications by itself.

4 Forward Privacy from Any DSSE

In this section, we will show that a DSSE scheme with $(\mathcal {U} _1,p_1)$-forward privacy can be constructed from any DSSE scheme $\mathcal {E} $, where $\mathcal {U} _1$ and $p_1$ are defined in Table 1. For simplicity of our description below, we assume the base scheme to be non-interactive, so that the resulting scheme is also non-interactive^{Footnote 3}. Our transformation can be easily adapted to interactive schemes.

Our construction is inspired by that of Rizomiliotis and Gritzalis [15] which uses fresh keys for newly added data. The main idea is to locally maintain a table $\gamma $ of pseudorandom function (PRF) keys $K_x$ and counters $c_x$ for each query $q = x$, so that adding an edge (x, y, w) is translated to adding another edge $(F(K_x,c_x),y,w)$. Our scheme also adopts the technique of Hahn and Kerschbaum [8], who observe that when the set $(x,*,*)$ is leaked upon querying on x, there is no need to protect the set by encryption any longer. To speed up subsequent queries, the server should thus transfer the set encrypted in the scheme to a plaintext bipartite graph ${\hat{G}}$. We assume an efficient data structure for representing the graph ${\hat{G}}$, so that neighbor queries, edge additions, and node deletions in ${\hat{G}}$ are parallelizable and have time complexity linear in the number of affected nodes only. There might be many ways to construct such a data structure. Cascaded triangles introduced in Sect. 5 is one example.

4.1 Our Construction

Let $\mathcal {E} $ be a DSSE scheme for $\mathcal {G} = \mathcal {G}(\{0,1\}^{\lambda },\mathcal {Y},\mathcal {W})$, and $F: \{0,1\}^{\lambda } \times \mathbb {N} \rightarrow \{0,1\}^{\lambda } $ be a pseudorandom function (PRF), keyed by its first argument and evaluated on a counter as in Sect. 4.2. We construct a DSSE scheme for $\mathcal {G}' = \mathcal {G}'(\mathcal {X},\mathcal {Y},\mathcal {W})$. The resulting scheme supports queries over $\mathcal {X} $, assuming the base scheme supports queries over $\{0,1\}^{\lambda } $. Figures 1 and 2 formally describe the construction.

The setup algorithm initializes the base scheme $\mathcal {E} $, which yields a secret key ${\tilde{K}}$ and encrypted database ${\tilde{{\mathsf {EDB}}}}$. It also initializes an empty dictionary $\gamma $ and an empty bipartite graph ${\hat{G}} \in \mathcal {G}'$. The new secret key K consists of ${\tilde{K}}$ and $\gamma $, while ${\tilde{{\mathsf {EDB}}}}$ and ${\hat{G}}$ are outsourced to the server. The dictionary $\gamma $ maps a query $q = x \in \mathcal {X} $ to a PRF key $K_x$ and a counter $c_x$.

To perform an update $u = ({\mathsf {Add}},x,y,w)$, the client increments $c_x \leftarrow c_x + 1$, and transforms the update into ${\tilde{u}} = ({\mathsf {Add}},F(K_x,c_x),y,w)$ of the base scheme. To perform an update $u = ({\mathsf {Del}},x)$, the client removes the x-th row of $\gamma $, and sends to the server the update tokens for $({\mathsf {Del}},F(K_x,i))$ for $i \in [c_x]$. The update $u = ({\mathsf {Del}},y)$ is processed as in the base scheme.

Finally, to query $q = x \in \mathcal {X} $, the client removes the x-th row of $\gamma $, and sends to the server the query tokens of $F(K_x,i)$ for $i \in [c_x]$. Given these tokens, the server retrieves the intended response R. Additionally, the client also sends the update tokens for $({\mathsf {Del}},F(K_x,i))$ for $i \in [c_x]$. The server uses these tokens to collect and remove the set of edges $R = (x,*,*)$ from the base scheme, and merges R into the plaintext graph ${\hat{G}}$.

The correctness follows directly from that of the base scheme $\mathcal {E} $.
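The translation of additions and queries described in this section can be sketched as follows. This is an illustration under several assumptions rather than the construction of Figures 1 and 2: HMAC-SHA256 stands in for the PRF F, `ToyBase` is a placeholder for the base scheme $\mathcal {E} $ with no encryption at all, client and server are collapsed into one object, and deletions are omitted. All class and method names are hypothetical.

```python
import hashlib
import hmac
import secrets


def prf(key: bytes, counter: int) -> bytes:
    """F(K_x, c_x); assumption: HMAC-SHA256 over an 8-byte big-endian counter."""
    return hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()


class ToyBase:
    """Placeholder for the base DSSE scheme E (stores data in the clear)."""

    def __init__(self):
        self.store = {}

    def add(self, x_tilde, y, w):
        self.store.setdefault(x_tilde, []).append((y, w))

    def query(self, x_tilde):
        return list(self.store.get(x_tilde, []))

    def delete(self, x_tilde):
        self.store.pop(x_tilde, None)


class ForwardPrivateSketch:
    def __init__(self, base):
        self.base = base
        self.gamma = {}  # client state: x -> (K_x, c_x)
        self.plain = {}  # server-side plaintext graph G-hat: x -> [(y, w)]

    def add(self, x, y, w):
        # Increment c_x and insert the edge under the fresh node F(K_x, c_x).
        key, count = self.gamma.get(x) or (secrets.token_bytes(32), 0)
        count += 1
        self.gamma[x] = (key, count)
        self.base.add(prf(key, count), y, w)

    def query(self, x):
        # Remove the row of x, then query and delete F(K_x, i) for i in [c_x].
        key, count = self.gamma.pop(x, (None, 0))
        for i in range(1, count + 1):
            x_tilde = prf(key, i)
            self.plain.setdefault(x, []).extend(self.base.query(x_tilde))
            self.base.delete(x_tilde)
        return [(x, y, w) for y, w in self.plain.get(x, [])]
```

Because the row of x is dropped on every query, the next addition for x draws a fresh key, so edges added after a query are stored under nodes that the previously revealed tokens cannot be linked to.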

4.2 Analysis

Efficiency. Our generic transformation almost preserves the efficiency of the underlying DSSE scheme. For most algorithms, the preservation is apparent. We highlight the slightly more complicated cases, namely, the query on x and the update $({\mathsf {Del}},x)$. In the former, $c_x$ queries on $F(K_x, i)$ for $i \in [c_x]$ are executed, while in both cases $c_x$ deletions $({\mathsf {Del}}, F(K_x, i))$ for $i \in [c_x]$ are required. We analyze their efficiency assuming that each of the following operations of the underlying DSSE scheme takes constant time: the computation of the query token for each $F(K_x, i)$; the server computation for querying on each $F(K_x, i)$ (since by the pseudorandomness of F, the query should only return a single edge); the computation of each delete token; and the server computation for deleting each set $(F(K_x, i), *, *)$ (since each set is actually a singleton). Overall, the resulting scheme incurs $O(c_x)$ computation and communication costs for both the client and the server, where $c_x$ is the number of newly matched data items. These are extra costs on top of the costs for retrieving the previously matched data (in plaintext), i.e., constant computation cost of the client, sublinear computation cost of the server, and sublinear communication cost of both. Since $c_x$ is reset to zero whenever x is queried, the amortized extra costs of both the client and the server are low.

Security. Let $\mathcal {E} $ be an $({\tilde{{\mathcal {L}}_{e}}}, {\tilde{{\mathcal {L}}_{q}}}, {\tilde{{\mathcal {L}}_{u}}})$-CQA2 secure DSSE scheme for labeled bipartite graphs specified by $\mathcal {G}$; and $F: \{0,1\}^{\lambda } \times \mathbb {N} \rightarrow \{0,1\}^{\lambda } $ be a pseudorandom function. Our construction is CQA2-secure and forward private with respect to the following leakage functions. The proof can be found in the full version.

${\mathcal {L}}_{e}(\mathcal {G}') = {\tilde{{\mathcal {L}}_{e}}}(\mathcal {G})$,
${\mathcal {L}}_{u}({\mathsf {Add}},x,y,w) = ({\tilde{x}}, {\tilde{{\mathcal {L}}_{u}}}({\mathsf {Add}},{\tilde{x}},y,w))$ for dummy node ${\tilde{x}} \leftarrow \{0,1\}^{\lambda } $,
${\mathcal {L}}_{u}({\mathsf {Del}},x) = (x, \{{\tilde{{\mathcal {L}}_{u}}}({\mathsf {Del}},{\tilde{x}}_i)\}_{i=1}^{c_x})$,
${\mathcal {L}}_{u}({\mathsf {Del}},y) = (y, {\tilde{{\mathcal {L}}_{u}}}({\mathsf {Del}},y))$,
${\mathcal {L}}_{q}(x) = (x, {\mathsf {AP}}_t(x), \{{\tilde{{\mathcal {L}}_{q}}}({\tilde{x}}_i), {\tilde{{\mathcal {L}}_{u}}}({\mathsf {Del}},{\tilde{x}}_i)\}_{i=1}^{c_x})$, for dummy nodes ${\tilde{x}}_i$ defined by ${\mathcal {L}}_{u}({\mathsf {Add}},x,y,\cdot )$ for updates after the previous query on x.

Theorem 1

Assume that $\mathcal {E} $ is $({\tilde{{\mathcal {L}}_{e}}}, {\tilde{{\mathcal {L}}_{q}}}, {\tilde{{\mathcal {L}}_{u}}})$-CQA2 secure, and F is pseudorandom, then the above construction is $({\mathcal {L}}_{e}, {\mathcal {L}}_{q}, {\mathcal {L}}_{u})$-CQA2 secure. Furthermore, let $p'_1$ be a function such that $p'_1(u_0,u_1) = 1$ if and only if $y_0 = y_1$ and $w_0 = w_1$ ^{Footnote 4}. Then the construction is $(\mathcal {U} _1,p'_1)$-forward private, where $\mathcal {U} _1$ is defined in Table 1.

5 Forward Privacy from Scratch

We next construct an (interactive) DSSE scheme which achieves forward privacy directly. First, a new data structure, named cascaded triangles, is designed to represent labeled bipartite graphs and to support neighbor queries, edge additions, and node deletions efficiently. We then transform it into its encrypted version.

The construction of cascaded triangles is motivated by the following. Since the neighbor queries and node deletions require traversing the sets $(x,*,*)$ and $(*,y,*)$, their data structure representations are critical for the efficiency, and later the security, of the resulting DSSE scheme. In particular, they determine whether the desired operations can be executed in parallel, and how much information has to be leaked to the server for performing such operations. For example, linked list [11] traversal and updates are inherently sequential, yet updating only has a local effect (on the previous and next nodes). In contrast, a random binary search tree [12] supports parallel traversal and updates, yet updating affects (or leaks) the subtree rooted at the altered node. Thus, cascaded triangles is designed to support parallel traversal and local updates simultaneously.

5.1 Warm Up: Plaintext Cascaded Triangles

Overview. Our goal is to store the bipartite graph G so that neighbor queries and deletions over X or Y can be executed in sublinear time. We can do so by pre-computing the set of edges connected to each x (and y), and storing the set by a data structure which allows efficient traversal. For any x, consider the set of edges connecting x. We pack this set into multiple perfect binary trees, called triangles, by first forming the largest triangle possible, subtracting the edges which are already packed, then continuing to form the next largest triangle. The resulting triangles thus have strictly decreasing (cascading) heights, except for the last two which may have equal heights. This invariant is to be maintained in any later updates. To add an edge connecting x, we check if the two shortest triangles have the same height. If so, we add a new node representing the new edge on top of the two triangles, merging them into one larger triangle. Otherwise, the new node is added as a new triangle of height 1. To delete an edge, we delete the node representing this edge by replacing it with the root of the shortest triangle, splitting the latter into two smaller triangles. We can see that the invariant is still maintained after each addition and each deletion. Finally, to traverse the data structure, one may use any (parallel) tree traversal algorithms.
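The effect of additions and deletions on the triangle heights can be checked with a small sketch that tracks only the heights, not the nodes themselves. The list is kept in decreasing order (shortest triangle last), and the function names are illustrative assumptions.

```python
def add_edge(heights):
    """Add one edge: merge the two shortest triangles if they have equal height,
    otherwise start a new triangle of height 1."""
    hs = list(heights)
    if len(hs) >= 2 and hs[-1] == hs[-2]:
        h = hs.pop()
        hs.pop()
        hs.append(h + 1)  # new node becomes the root above both triangles
    else:
        hs.append(1)
    return hs


def delete_edge(heights):
    """Delete one edge (assumes at least one triangle): the root of the shortest
    triangle fills the hole, splitting that triangle into its two subtrees."""
    hs = list(heights)
    h = hs.pop()
    if h > 1:
        hs.extend([h - 1, h - 1])
    return hs


def invariant_holds(hs):
    """Heights strictly decrease, except that the last two may be equal."""
    pairs = list(zip(hs, hs[1:]))
    return all(a > b for a, b in pairs[:-1]) and all(a >= b for a, b in pairs[-1:])


def edge_count(hs):
    """A perfect binary tree of height h holds 2**h - 1 nodes (edges of G)."""
    return sum(2**h - 1 for h in hs)
```

For instance, under this sketch heights [2, 1, 1] become [2, 2] after one addition, and [3, 2] becomes [3, 1, 1] after one deletion; in both cases only the shortest triangles are touched.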

Setup. Concretely, cascaded triangles consists of dictionaries $\gamma $, $\delta $, and $\eta $. We can think of $\gamma $ as local states stored at the client side, while $\delta $ and $\eta $ are outsourced to the server. The dictionary $\eta $ is the one to store the actual data. It maps an address ${\mathsf {addr}}$ to a tuple (a, b), where $b = ({\mathsf {chd}}_0, {\mathsf {chd}}_1)$ specifies the addresses of the left and right child respectively. b is maintained so that nodes in $\eta $ form perfect binary trees (triangles). This means either both addresses $({\mathsf {chd}}_0, {\mathsf {chd}}_1)$ are empty ($\bot $) or both are valid addresses occupied in $\eta $. To store an edge (x, y, w) into $\eta $, the edge is copied twice into ${a}^{\vartriangle } = {a}^{\triangledown } = (x,y,w)$. The tuples $({a}^{\vartriangle }, {b}^{\vartriangle })$ and $({a}^{\triangledown }, {b}^{\triangledown })$ are stored at random addresses ${{\mathsf {addr}}}^{\vartriangle }$ and ${{\mathsf {addr}}}^{\triangledown }$ in $\eta $ respectively. The addresses ${{\mathsf {addr}}}^{\vartriangle }$ and ${{\mathsf {addr}}}^{\triangledown }$ are said to be duals of each other, and are registered in $\delta $, i.e., $\delta [{{\mathsf {addr}}}^{\vartriangle }] = {{\mathsf {addr}}}^{\triangledown }$ and $\delta [{{\mathsf {addr}}}^{\triangledown }] = {{\mathsf {addr}}}^{\vartriangle }$.

Globally, we describe how the nodes in $\eta $ are connected to each other via the addresses stored in b. We collect all $\eta [{\mathsf {addr}}] = (a,b)$ corresponding to the edges in $(q,*,*)$ (or $(*,q,*)$). Let $n_q=|(q,*,*)|$ (or $|(*,q,*)|$). We pack these tuples into triangles of cascading heights $h_1 \le h_2< h_3< \ldots < h_k$, where $k \le \lceil \lg n_q \rceil + 1$. Note the possible equality between $h_1$ and $h_2$ but not the others. The ordering of the nodes is implicitly determined by the update algorithms, but does not matter here. Using the procedures described in the overview, given the size $n_q$, the heights $h_1,\ldots ,h_k$ are uniquely determined. It is possible to represent the heights compactly by a trinary string $h \in \{0,1,2\}^{\lceil \lg n_q \rceil }$, such that the i-th trit (trinary digit) is set to t, if there are t triangles of height i. Due to the constraints on the heights, only the least significant non-zero trit can be set to 2. Finally, the addresses of the roots and the heights of these triangles are stored in $\gamma [q] = ({\mathsf {addr}}_1,\ldots ,{\mathsf {addr}}_k, h)$.

Queries and Traversal. Traversing the sets $(q,*,*)$ and $(*,q,*)$ is straightforward with the above structure: First, retrieve the roots of the triangles from $\gamma [q] = ({\mathsf {addr}}_1,\ldots ,{\mathsf {addr}}_k,h)$. Then, use parallel tree-traversal algorithms to traverse the trees rooted at each of these addresses. Notice that the neighbor query function ${\mathsf {Qry}}$ on x and y is supported by traversing the sets $(x,*,*)$ and $(*,y,*)$, whose roots are stored in $\gamma [x]$ and $\gamma [y]$ respectively.

Add. To add a new edge (x, y, w), we first retrieve $\gamma [x] = ({{\mathsf {addr}}}^{\vartriangle }_1,\ldots ,{{\mathsf {addr}}}^{\vartriangle }_k, {h}^{\vartriangle })$ and check whether the triangles rooted at ${{\mathsf {addr}}}^{\vartriangle }_1$ and ${{\mathsf {addr}}}^{\vartriangle }_2$ have the same height.

To do so, we take a detour to describe the $+1$ operation in $h + 1$. Recall that only the least significant non-zero trit, say the i-th trit, in h can be set to 2. If the i-th trit equals 2, the operation $h+1$ adds 1 to the i-th trit (instead of the least significant trit as in normal addition), which sets the i-th trit to 0 and carries 1 to the $(i+1)$-th trit. Denote this event by $\mathsf {Carry}(h+1) = 1$. Otherwise, $h+1$ simply adds 1 to the least significant trit as in normal addition, denoted by $\mathsf {Carry}(h+1) = 0$. Later, for deletion, we would need the $-1$ operation which is the “reverse” of $+1$. Concretely, $h-1$ subtracts 1 from the least significant non-zero trit, say the i-th trit, and sets the $(i-1)$-th trit to 2 if $i > 1$.
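The $+1$ and $-1$ operations can be made concrete with a small sketch. It assumes that $h$ is kept as a variable-length Python list of trits whose index 0 holds the trit for height 1 (the text uses a fixed-length string instead), and the function names are hypothetical.

```python
def lowest_nonzero(h):
    """Index of the least significant non-zero trit, or None if h is all zero."""
    for i, t in enumerate(h):
        if t:
            return i
    return None


def carry(h):
    """Carry(h + 1): True iff the least significant non-zero trit equals 2."""
    i = lowest_nonzero(h)
    return i is not None and h[i] == 2


def plus_one(h):
    """Return h + 1 as a new list (index 0 is the trit for height 1)."""
    h = list(h)
    i = lowest_nonzero(h)
    if i is not None and h[i] == 2:
        # Two triangles of equal height merge into one of the next height.
        h[i] = 0
        if i + 1 == len(h):
            h.append(0)
        h[i + 1] += 1
    else:
        # Ordinary increment: one more triangle of height 1.
        if not h:
            h.append(0)
        h[0] += 1
    return h


def minus_one(h):
    """Return h - 1, the reverse of plus_one."""
    h = list(h)
    i = lowest_nonzero(h)
    if i is None:
        raise ValueError("cannot decrement an empty structure")
    h[i] -= 1          # the smallest triangle loses its root ...
    if i > 0:
        h[i - 1] = 2   # ... and splits into two triangles one level lower
    while h and h[-1] == 0:
        h.pop()        # drop unused high-order trits
    return h
```

Under this encoding, a structure holding 13 edges is represented by `[0, 2, 1]` (two triangles of height 2 and one of height 3), and adding one more edge triggers a carry that yields `[0, 0, 2]`.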

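With the above procedures, checking the heights of the first two triangles can be done by simply checking whether the least significant non-zero trit in ${h}^{\vartriangle }$ equals 2, or equivalently whether $\mathsf {Carry}({h}^{\vartriangle }+1) = 1$. If that is the case, we add $({a}^{\vartriangle },{b}^{\vartriangle })$ where ${a}^{\vartriangle } = (x,y,w)$ and ${b}^{\vartriangle } = ({{\mathsf {addr}}}^{\vartriangle }_1,{{\mathsf {addr}}}^{\vartriangle }_2)$ to a random address ${{\mathsf {addr}}}^{\vartriangle }$ in $\eta $. We then update $\gamma [x] \leftarrow ({{\mathsf {addr}}}^{\vartriangle }, {{\mathsf {addr}}}^{\vartriangle }_3,\ldots ,{{\mathsf {addr}}}^{\vartriangle }_k, {h}^{\vartriangle }+1)$, where the two addresses ${{\mathsf {addr}}}^{\vartriangle }_1$ and ${{\mathsf {addr}}}^{\vartriangle }_2$ are replaced by the new ${{\mathsf {addr}}}^{\vartriangle }$. Otherwise, we add $({a}^{\vartriangle },{b}^{\vartriangle })$ where ${a}^{\vartriangle } = (x,y,w)$ and ${b}^{\vartriangle } = (\bot ,\bot )$ to a random address ${{\mathsf {addr}}}^{\vartriangle }$ in $\eta $, and update $\gamma [x] \leftarrow ({{\mathsf {addr}}}^{\vartriangle }, {{\mathsf {addr}}}^{\vartriangle }_1,\ldots ,{{\mathsf {addr}}}^{\vartriangle }_k, {h}^{\vartriangle }+1)$. The two cases differ only in the child addresses ${b}^{\vartriangle }$ and in whether ${{\mathsf {addr}}}^{\vartriangle }_1$ and ${{\mathsf {addr}}}^{\vartriangle }_2$ are retained in $\gamma [x]$.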

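Similarly, we retrieve $({{\mathsf {addr}}}^{\triangledown }_1,\ldots ,{{\mathsf {addr}}}^{\triangledown }_k, {h}^{\triangledown }) \leftarrow \gamma [y]$ and check whether $\mathsf {Carry}({h}^{\triangledown }+1) = 1$. If so, we add $((x,y,w), ({{\mathsf {addr}}}^{\triangledown }_1, {{\mathsf {addr}}}^{\triangledown }_2))$ to a random address ${{\mathsf {addr}}}^{\triangledown }$ in $\eta $, and update $\gamma [y] \leftarrow ({{\mathsf {addr}}}^{\triangledown }, {{\mathsf {addr}}}^{\triangledown }_3,\ldots ,{{\mathsf {addr}}}^{\triangledown }_k, {h}^{\triangledown }+1)$. Otherwise, we add $((x,y,w), (\bot ,\bot ))$ to ${{\mathsf {addr}}}^{\triangledown }$, and update $\gamma [y] \leftarrow ({{\mathsf {addr}}}^{\triangledown }, {{\mathsf {addr}}}^{\triangledown }_1,\ldots ,{{\mathsf {addr}}}^{\triangledown }_k, {h}^{\triangledown }+1)$. In both cases, the duals are registered as $\delta [{{\mathsf {addr}}}^{\vartriangle }] = {{\mathsf {addr}}}^{\triangledown }$ and $\delta [{{\mathsf {addr}}}^{\triangledown }] = {{\mathsf {addr}}}^{\vartriangle }$.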
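The plaintext addition and traversal procedures can be sketched with three Python dictionaries standing in for $\gamma $, $\delta $, and $\eta $. Several simplifications are assumptions of this sketch rather than part of the construction: $\gamma $ stores an explicit list of (root address, height) pairs instead of the compact trit string, keys are tagged with `"X"` or `"Y"` so that labels from the two sides cannot collide, and the traversal is a sequential stack walk rather than a parallel one.

```python
import secrets

eta, delta, gamma = {}, {}, {}  # node store, dual addresses, per-node roots


def attach(q, edge):
    """Insert one copy of `edge` into the triangles of q; return its address."""
    roots = gamma.setdefault(q, [])  # [(addr, height)], smallest triangle first
    addr = secrets.token_hex(16)     # fresh random address
    if len(roots) >= 2 and roots[0][1] == roots[1][1]:
        # The two smallest triangles have equal height: the new node
        # becomes their common root, merging them into a larger triangle.
        (a1, h), (a2, _) = roots[0], roots[1]
        eta[addr] = (edge, (a1, a2))
        roots[:2] = [(addr, h + 1)]
    else:
        # Otherwise the new node is a triangle of height 1 on its own.
        eta[addr] = (edge, (None, None))
        roots.insert(0, (addr, 1))
    return addr


def add(x, y, w):
    """Store the edge twice, once per endpoint, and register the duals."""
    up = attach(("X", x), (x, y, w))
    down = attach(("Y", y), (x, y, w))
    delta[up], delta[down] = down, up


def traverse(q):
    """Return all edges stored in the triangles of q."""
    out = []
    stack = [addr for addr, _ in gamma.get(q, [])]
    while stack:
        edge, (c0, c1) = eta[stack.pop()]
        out.append(edge)
        if c0 is not None:  # perfect trees: both children exist or neither
            stack += [c0, c1]
    return out
```

After 13 calls such as `add("x0", "y3", 3)` with distinct neighbors, `gamma[("X", "x0")]` holds three roots with heights 2, 2, and 3, and `traverse(("X", "x0"))` returns all 13 edges.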

Delete. To delete node x, or equivalently the set of edges $(x,*,*)$, we first traverse the set $(x,*,*)$ using the above traversal algorithm. We delete all the traversed nodes in $\eta $ as well as the row $\gamma [x]$. It remains to delete the dual nodes of the traversed nodes. To do so, for each traversed address ${{\mathsf {addr}}}^{\vartriangle }$ with $a = (x,y,w)$, look up $\delta [{{\mathsf {addr}}}^{\vartriangle }] = {{\mathsf {addr}}}^{\triangledown }$ and $\gamma [y] = ({{\mathsf {addr}}}^{\triangledown }_1,\ldots ,{{\mathsf {addr}}}^{\triangledown }_k, {h}^{\triangledown })$. We wish to replace the content of $\eta [{{\mathsf {addr}}}^{\triangledown }] = ({a}^{\triangledown }, {b}^{\triangledown })$ located in the middle of some triangle by the content of $\eta [{{\mathsf {addr}}}^{\triangledown }_1]$, the root of the smallest triangle, which splits the smallest triangle into two smaller ones. In this way, the heights of the resulting triangles still satisfy the required constraints.

Concretely, we perform the following steps. (1) Look up $\delta [{{\mathsf {addr}}}^{\triangledown }_1] = {{\mathsf {addr}}}^{\vartriangle }_1$. (2) Delete $\delta [{{\mathsf {addr}}}^{\vartriangle }]$ and $\delta [{{\mathsf {addr}}}^{\triangledown }_1]$. (3) Update $\delta [{{\mathsf {addr}}}^{\vartriangle }_1] \leftarrow {{\mathsf {addr}}}^{\triangledown }$ and $\delta [{{\mathsf {addr}}}^{\triangledown }] \leftarrow {{\mathsf {addr}}}^{\vartriangle }_1$. (4) Look up $\eta [{{\mathsf {addr}}}^{\triangledown }_1] = ({a}^{\triangledown }_1, {b}^{\triangledown }_1)$, where ${b}^{\triangledown }_1 = ({{\mathsf {addr}}}^{\blacktriangledown }_0, {{\mathsf {addr}}}^{\blacktriangledown }_1)$. (5) Update $\eta [{{\mathsf {addr}}}^{\triangledown }] \leftarrow ({a}^{\triangledown }_1, {b}^{\triangledown })$, i.e., overwrite the edge while keeping the child addresses ${b}^{\triangledown }$ of the replaced node so that its triangle stays intact, and delete $\eta [{{\mathsf {addr}}}^{\triangledown }_1]$. (6) Update $\gamma [y] \leftarrow ({{\mathsf {addr}}}^{\blacktriangledown }_0, {{\mathsf {addr}}}^{\blacktriangledown }_1,{{\mathsf {addr}}}^{\triangledown }_2,\ldots ,{{\mathsf {addr}}}^{\triangledown }_k, {h}^{\triangledown } - 1)$, where ${h}^{\triangledown }-1$ is the reverse of ${h}^{\triangledown }+1$. We omit the deletion of $(*,y,*)$ which is similar to the above.
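The deletion steps can be exercised end to end with the following sketch. It repeats a compact version of the addition so that it is self-contained, and it makes several assumptions that are not spelled out in the text: $\gamma $ stores explicit (root address, height) pairs, only the edge data of the smallest root is moved while the child pointers of the overwritten node are kept (as the overview's "replacing it with the root of the shortest triangle" requires), and the case where the node to delete is itself the smallest root is handled separately. All function names are hypothetical.

```python
import secrets

eta, delta, gamma = {}, {}, {}  # node store, dual addresses, per-node roots


def attach(q, edge):
    """Add one copy of `edge` to q's triangles (same rule as in Add)."""
    roots = gamma.setdefault(q, [])  # [(addr, height)], smallest first
    addr = secrets.token_hex(16)
    if len(roots) >= 2 and roots[0][1] == roots[1][1]:
        eta[addr] = (edge, (roots[0][0], roots[1][0]))
        roots[:2] = [(addr, roots[0][1] + 1)]
    else:
        eta[addr] = (edge, (None, None))
        roots.insert(0, (addr, 1))
    return addr


def add(x, y, w):
    up, down = attach(("X", x), (x, y, w)), attach(("Y", y), (x, y, w))
    delta[up], delta[down] = down, up


def traverse(q):
    """Return [(addr, edge)] for every node in q's triangles."""
    out, stack = [], [a for a, _ in gamma.get(q, [])]
    while stack:
        addr = stack.pop()
        edge, (c0, c1) = eta[addr]
        out.append((addr, edge))
        if c0 is not None:
            stack += [c0, c1]
    return out


def del_dual(q, victim):
    """Remove the node at address `victim` from q's triangles (steps 1-6)."""
    roots = gamma[q]
    r1, h1 = roots[0]                  # root of the smallest triangle
    r1_dual = delta.pop(r1)            # steps (1)-(2): its dual; drop delta[r1]
    edge1, (c0, c1) = eta.pop(r1)      # steps (4)-(5): read it, then delete it
    if victim == r1:
        del delta[r1_dual]             # assumed special case: victim is that root
    else:
        del delta[delta[victim]]       # step (2): forget the victim's old dual
        delta[victim], delta[r1_dual] = r1_dual, victim  # step (3)
        eta[victim] = (edge1, eta[victim][1])  # step (5): move edge, keep children
    # Step (6): the smallest triangle splits into its two subtrees (if any).
    roots[:1] = [(c0, h1 - 1), (c1, h1 - 1)] if c0 is not None else []
    if not roots:
        del gamma[q]


def delete_x(x):
    """Delete node x, i.e. all edges (x, *, *), together with their duals."""
    q = ("X", x)
    for addr, (_, y, _) in traverse(q):
        del_dual(("Y", y), delta[addr])
        del eta[addr]
    del gamma[q]
```

A quick check is to add every edge between `"a"`, `"b"`, `"c"` and seven nodes on the other side, call `delete_x("a")`, and confirm that each remaining node still reaches exactly its surviving edges and that $\delta $ still pairs every surviving address with a node carrying the same edge.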

Efficiency. The storage cost of cascaded triangles is $O(|X|+|Y|+|G|) = O(|G|)$. The complexities of querying (or deleting) x and y are $O(|(x,*,*)|)$ and $O(|(*,y,*)|)$ respectively. Addition of an edge can be computed in constant time.

5.2 Our Construction: Encrypted Cascaded Triangles

We now transform the plaintext cascaded triangles into its encrypted version. Recall that the goal of the client is to encrypt a labeled bipartite graph G into an encrypted database ${\mathsf {EDB}} $ which still supports neighbor queries, edge additions, and node deletions. To do so, the client represents G using cascaded triangles, stores $\gamma $ locally, and outsources $\delta $ and the encrypted $\eta $ to the server. The encryption should be non-committing, such that there exists a simulator which can simulate the ciphertexts for new data without any leakage. When some data is to be returned upon queries or is deleted, the simulator is given enough leakage so that it can “explain” the dummy ciphertexts. Furthermore, when the sets $(x,*,*)$ and $(*,y,*)$ are leaked upon querying x and y respectively, there is no need to protect the sets by encryption any longer [8]. To speed up subsequent queries, the server should transfer the set from the encrypted $\eta $ to a plaintext labeled bipartite graph ${\hat{G}}$. Thus, we can conceptually split G into two disjoint subgraphs $G = {\tilde{G}} \cup {\hat{G}}$, where ${\tilde{G}}$ is encrypted and ${\hat{G}}$ is in plaintext.

Encrypted cascaded triangles are similar to the plaintext counterparts. We highlight the differences and omit the identical parts.

Setup and Overview. Let ${\mathsf {NCE}}.({\mathsf {KGen}},{\mathsf {Enc}},{\mathsf {Dec}})$ be a symmetric-key non-committing encryption scheme (see Note 5). The correctness of our scheme will follow directly from that of ${\mathsf {NCE}} $. For each edge (x, y, w) in the encrypted subgraph ${\tilde{G}}$, $\eta [{{\mathsf {addr}}}^{\vartriangle }]$ stores the tuple $({c}^{\vartriangle }_a, {c}^{\vartriangle }_b)$, where ${c}^{\vartriangle }_a$ and ${c}^{\vartriangle }_b$ are non-committing ciphertexts of ${a}^{\vartriangle } = (x,y,w)$ and ${b}^{\vartriangle } = ({\mathsf {chd}}_0, {\mathsf {chd}}_1)$ under the keys ${{\mathsf {ak}}}^{\vartriangle }$ and ${{\mathsf {bk}}}^{\vartriangle }$ respectively. The keys ${{\mathsf {ak}}}^{\vartriangle }$ and ${{\mathsf {bk}}}^{\vartriangle }$ are independently generated for each x. Similarly, $\eta [{{\mathsf {addr}}}^{\triangledown }]$ stores the tuple $({c}^{\triangledown }_a, {c}^{\triangledown }_b)$ encrypted under ${{\mathsf {ak}}}^{\triangledown }$ and ${{\mathsf {bk}}}^{\triangledown }$ respectively. The keys ${{\mathsf {ak}}}^{\triangledown }$ and ${{\mathsf {bk}}}^{\triangledown }$ are independently generated for each y. For each x, $\gamma [x]$ additionally stores secret keys ${{\mathsf {ak}}}^{\vartriangle }$ and ${{\mathsf {bk}}}^{\vartriangle }$ associated to x. Similarly, for each y, $\gamma [y]$ additionally stores secret keys ${{\mathsf {ak}}}^{\triangledown }$ and ${{\mathsf {bk}}}^{\triangledown }$ associated to y. Formally, the setup protocol in Fig. 3 shows the initialization of these dictionaries.
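The footnote attached to ${\mathsf {NCE}}$ suggests computing a ciphertext as $c = (r, {\mathsf {PRF}}(K,r) \oplus m)$, with an HMAC in place of ${\mathsf {PRF}}$ in practice. A minimal sketch of that shape is given below. The choice of HMAC-SHA256, the 16-byte randomness, and the counter-based expansion of the pad for messages longer than one digest are assumptions of this sketch; the code illustrates correctness only, provides no integrity protection, and does not by itself establish the non-committing property, which relies on modeling the PRF as a random oracle.

```python
import hashlib
import hmac
import secrets


def kgen(nbytes: int = 32) -> bytes:
    """Sample a fresh secret key K."""
    return secrets.token_bytes(nbytes)


def _pad(key: bytes, r: bytes, n: int) -> bytes:
    """Derive n pad bytes from PRF(K, r); the counter suffix is an assumption."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hmac.new(key, r + ctr.to_bytes(4, "big"), hashlib.sha256).digest()
        ctr += 1
    return out[:n]


def enc(key: bytes, m: bytes):
    """Return c = (r, PRF(K, r) XOR m) for fresh randomness r."""
    r = secrets.token_bytes(16)
    return r, bytes(a ^ b for a, b in zip(m, _pad(key, r, len(m))))


def dec(key: bytes, c) -> bytes:
    """Recover m from c = (r, body) by recomputing the pad."""
    r, body = c
    return bytes(a ^ b for a, b in zip(body, _pad(key, r, len(body))))
```

In the construction, one such key would play the role of ${\mathsf {ak}}$ (encrypting the edge) and an independent one the role of ${\mathsf {bk}}$ (encrypting the child addresses), so that revealing ${\mathsf {bk}}$ alone lets the server traverse a triangle without learning the edges.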

Queries. Queries are similar to the plaintext case. Apart from the addresses, the client also sends ${\mathsf {ak}}$ and ${\mathsf {bk}}$ retrieved from $\gamma [q]$ to the server. Using ${\mathsf {bk}}$, the server decrypts all $b = ({\mathsf {chd}}_0, {\mathsf {chd}}_1)$ and traverses the sub-trees. Using ${\mathsf {ak}}$, it decrypts all $a = (x,y,w)$ which are returned as query results. The server also returns the previous query results ${\hat{R}}$ stored in ${\hat{G}}$.

As mentioned before, for the efficiency of subsequent queries, the server should remove revealed entries from $\eta $ and add them to the plaintext subgraph ${\hat{G}}$. Thus, the client and the server cooperate to delete the set $(q,*,*)$ or $(*,q,*)$ from $\eta $. This conceptually removes the set from the encrypted subgraph ${\tilde{G}}$. Finally, the server adds the set to ${\hat{G}}$. Formally, the query protocol is shown in Fig. 4, which utilizes the subroutine ${\mathsf {Trav}}$ and subprotocol $\mathsf {DelDual}$ shown in Figs. 3 and 5 respectively.

Add. Instead of sending $({a}^{\vartriangle }, {b}^{\vartriangle })$ in the clear, the client sends their ciphertexts $({c}^{\vartriangle }_a, {c}^{\vartriangle }_b)$, encrypted under ${{\mathsf {ak}}}^{\vartriangle }$ and ${{\mathsf {bk}}}^{\vartriangle }$ retrieved from $\gamma [x]$, to the server respectively. In the case where $\gamma [x] = \bot $, the client generates new secret keys ${{\mathsf {ak}}}^{\vartriangle }$ and ${{\mathsf {bk}}}^{\vartriangle }$ using the key generation algorithm of the non-committing encryption scheme. Sending $({a}^{\triangledown }, {b}^{\triangledown })$ requires a similar treatment. Figure 6 formally describes the addition protocol.

Delete. Deletion of $(x,*,*)$ is almost identical to querying x, except that the client does not send out ${{\mathsf {ak}}}^{\vartriangle }$. Instead, the server returns all ${c}^{\vartriangle }_a$ so that the client can decrypt them locally. After obtaining all the edges (x, y, w), the client and the server cooperate to delete $(x,*,*)$ from $\eta $ as in the query algorithm. This conceptually removes $(x,*,*)$ from the encrypted subgraph ${\tilde{G}}$. Finally, the server also removes $(x,*,*)$ from the plaintext subgraph ${\hat{G}}$. Deletion of $(*,y,*)$ is done similarly. Figure 7 shows the deletion protocol, which also utilizes the subroutines ${\mathsf {Trav}}$ and $\mathsf {DelDual}$ in Figs. 3 and 5 respectively.

5.3 Analysis

Storage Cost. For each x, if $n_x = |(x,*,*) \cap {\tilde{G}}| > 0$, the client stores two $\lambda $-bit keys of non-committing encryption, one $\lceil \lg n_x \rceil $-trit string, and at most $\lceil \lg n_x \rceil + 1$ $\lambda $-bit addresses. Similar storage is required for each y. In the extreme case where the client adds all possible data and never queries or deletes, the storage cost is $O({\mathsf {poly}}(\lambda )\cdot (|\mathcal {X} |\lg |\mathcal {Y} | + |\mathcal {Y} | \lg |\mathcal {X} |))$. However, querying and deleting x (or y) removes $(x,*,*)$ (or $(*,y,*)$) from ${\tilde{G}}$, which waives the client local storage for x (or y). Thus, the storage of a reasonable client would be much smaller.

The storage cost of the server is linear in the number of edges (x, y, w) added to the server, which is optimal. Furthermore, if an edge (x, y, w) is revealed due to a previous query, it is stored in plaintext instead of ciphertext, where the former is much better in terms of locality [3].

Computation and Communication Cost. Both the client and the server perform essentially no work during setup: They just initialize empty dictionaries.

During a query on q, the client first looks up its dictionary $\gamma [q]$, which consists of two $\lambda $-bit keys of ${\mathsf {NCE}} $, one $\lceil \lg n_q \rceil $-trit string, and at most $(\lceil \lg n_q \rceil + 1)$ $\lambda $-bit addresses. The query q, the keys, and the addresses are sent to the server. The server traverses the set $(x,*,*)$ if $q = x \in \mathcal {X} $, or the set $(*,y,*)$ if $q = y \in \mathcal {Y} $. In the former case, the server needs to perform $O(|(x,*,*) \cap {\tilde{G}}|)$ decryptions, and execute ${\mathsf {Qry}}(q,{\hat{G}})$, which takes time $O(|(x,*,*) \cap {\hat{G}}|)$. It then returns the query result of size $|(x,*,*)| = |(x,*,*) \cap {\tilde{G}}| + |(x,*,*) \cap {\hat{G}}|$ to the client. The client looks up and sends $|(x,*,*) \cap {\tilde{G}}|$ addresses to the server, which performs the same amount of I/O tasks, and returns $O(|(x,*,*) \cap {\tilde{G}}|)$ ciphertexts to the client. The client finalizes by performing $O(|(x,*,*) \cap {\tilde{G}}|)$ decryptions and the same amount of I/O tasks. Overall, both computation and communication complexities for both the client and the server are in the order of $O({\mathsf {poly}}(\lambda )\cdot |(x,*,*)|)$, which is optimal for the server. In contrast to the presentation in the formal construction, the number of rounds can be compressed to 4.

Since deletion is almost identical to querying except for local decryption, their overall computation and communication complexities are identical.

For addition, the client performs a constant amount of I/O tasks to update $\gamma $, while sending 2 $\lambda $-bit addresses and 4 ciphertexts to the server. The server simply writes the ciphertexts to the specified addresses. Therefore, the overall computation, round, and communication complexities for both the client and the server are constant, which is again optimal.

Note that our scheme supports batch operations (querying, addition, and deletion) straightforwardly. In such cases, the computation and communication complexities increase linearly in the batch size while the round complexity remains unchanged.

Security. In the full version, we prove that our scheme is secure against adaptive chosen query attack with minimal leakage. In particular, our scheme achieves $(\mathcal {U} _1,1)$-forward privacy, where $\mathcal {U} _1$ is defined in Table 1. We begin by defining the leakage functions. For setup, ${\mathcal {L}}_{e}$ only leaks the sizes of the spaces, i.e., $|\mathcal {X} |$, $|\mathcal {Y} |$, and $|\mathcal {W} |$. For addition, ${\mathcal {L}}_{u}$ only leaks the update type and the time ${\mathsf {Time}}(x,y)$ of the addition, as the new addresses are truly random and the ciphertexts can be simulated by the simulator of the non-committing encryption scheme. For deletion of $(x,*,*)$, ${\mathcal {L}}_{u}$ leaks the update type, x, and the time ${\mathsf {Time}}(x,y)$ when each of the edges $(x,y,w) \in (x,*,*)$ is added. It also leaks, for each y such that $(x,y,w) \in (x,*,*)$, the time ${\mathsf {Time}}(*,y)$ when the last edge in $(*,y,*)$ is added. Leakage for deleting $(*,y,*)$ is defined similarly. Finally, for queries on q, ${\mathcal {L}}_{q}$ leaks all information leaked by ${\mathcal {L}}_{u}$ upon deletion of $(x,*,*)$ if $q = x \in \mathcal {X} $, or $(*,y,*)$ if $q = y \in \mathcal {Y} $, and the access patterns ${\mathsf {AP}}_t(q)$ of q, assuming it is sorted by the time each response is added. Formally, we define:

${\mathcal {L}}_{e}(\mathcal {G}) = (|\mathcal {X} |, |\mathcal {Y} |, |\mathcal {W} |)$
${\mathcal {L}}_{u}({\mathsf {Add}},x,y,w) = ({\mathsf {Add}}, {\mathsf {Time}}(x,y))$
${\mathcal {L}}_{u}({\mathsf {Del}},x) = ({\mathsf {Del}}, x, \{({\mathsf {Time}}(x,y),{\mathsf {Time}}(*,y)): (x,y,\cdot ) \in (x, *, *)\})$
${\mathcal {L}}_{u}({\mathsf {Del}},y) = ({\mathsf {Del}}, y, \{({\mathsf {Time}}(x,y),{\mathsf {Time}}(x,*)): (x,y,\cdot ) \in (*, y, *)\})$
${\mathcal {L}}_{q}(x) = (x, {\mathsf {AP}}_t(x), \{({\mathsf {Time}}(x,y),{\mathsf {Time}}(*,y)): (x,y,\cdot ) \in (x, *, *)\})$
${\mathcal {L}}_{q}(y) = (y, {\mathsf {AP}}_t(y), \{({\mathsf {Time}}(x,y),{\mathsf {Time}}(x,*)): (x,y,\cdot ) \in (*, y, *)\})$

The simulation is sketched as follows. The simulator first initializes empty dictionaries $\delta $ and $\eta $, and the empty plaintext graph ${\hat{G}}$. It maintains a table T mapping time t to time-address tuples $((t_0, {\mathsf {addr}}), (t_1, {{\mathsf {addr}}}^{\triangledown }))$.

For addition at time t, it samples random addresses ${\mathsf {addr}}$ and ${{\mathsf {addr}}}^{\triangledown }$, and registers them in $\delta $. It sets $T[t] \leftarrow ((t, {\mathsf {addr}}), (t, {{\mathsf {addr}}}^{\triangledown }))$. It simulates the ciphertexts in $\eta [{\mathsf {addr}}]$ and $\eta [{{\mathsf {addr}}}^{\triangledown }]$ using the simulator of ${\mathsf {NCE}} $.

For deletion of $(x,*,*)$, it is given x, which allows it to delete $(x,*,*)$ from ${\hat{G}}$. It is also given the time ${\mathsf {Time}}(x,y)$ when each of the edges $(x,y,w) \in (x,*,*)$ is added, and for each y such that $(x,y,w) \in (x,*,*)$ the time ${\mathsf {Time}}(*,y)$ when the last edge in $(*,y,*)$ is added. It recalls from the table T all ${\mathsf {addr}}$ and ${{\mathsf {addr}}}^{\triangledown }$ pairs which are created at time ${\mathsf {Time}}(x,y)$ and ${\mathsf {Time}}(*,y)$. With the knowledge of these addresses, the simulator can maintain $\delta $ and $\eta $ as in the real scheme. It must also output simulated ${\mathsf {bk}}$ and explain the ciphertexts $c_b$ encrypting the addresses. To achieve this, the simulator passes the ciphertexts and the corresponding addresses to the simulator of the non-committing encryption scheme, where the latter outputs the simulated ${\mathsf {bk}}$. Deletion of $(*,y,*)$ is simulated similarly.

Queries are simulated almost identically to deletions, except that the simulator must now also output simulated ${\mathsf {ak}}$ and explain the ciphertexts $c_a$ encrypting the query result. Similar to the above, this can be done by calling the simulator of the non-committing encryption scheme.

Theorem 2

Assume that ${\mathsf {NCE}}.({\mathsf {KGen}},{\mathsf {Enc}},{\mathsf {Dec}})$ is a symmetric-key non-committing encryption scheme with message space $\{0,1\}^{\max (\lg |\mathcal {X} | + \lg |\mathcal {Y} | + \lg |\mathcal {W} |, 2\lambda )} $. Then the above construction is $({\mathcal {L}}_{e}, {\mathcal {L}}_{q}, {\mathcal {L}}_{u})$-CQA2 secure. Furthermore, the above construction is $(\mathcal {U} _1,1)$-forward private, where $\mathcal {U} _1$ is defined in Table 1.

Notes

1. A few exceptions include Lai-Chow [13] and a modified version of the one by Kamara et al. [11]. However, these schemes leak substantial information during updates.
2. While the design of cascaded triangles is original, we do not rule out the possibility that there are similar data structures outside the literature of SSE. We are unaware of any commonly used similar data structure. There are superficially similar structures such as fractional cascading, which solves entirely different problems.
3. Candidate base schemes include [13] and a modified version of [11].
4. We can drop the restriction $w_0 = w_1$ by simply encrypting w during additions.
5. For example, a ciphertext for message m with randomness r can be computed as $c = (r, {\mathsf {PRF}}(K,r) \oplus m)$, where ${\mathsf {PRF}}$ is a pseudorandom function with secret key K, modeled as a random oracle. In practice, one may substitute ${\mathsf {PRF}}$ with an HMAC.

References

1. Bösch, C., Hartel, P., Jonker, W., Peter, A.: A survey of provably secure searchable encryption. ACM Comput. Surv. 47(2), 18:1–18:51 (2014)
2. Bost, R.: Σoφoς: forward secure searchable encryption. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016, pp. 1143–1154 (2016)
3. Cash, D., Tessaro, S.: The locality of searchable symmetric encryption. In: Nguyen, P.Q., Oswald, E. (eds.) EUROCRYPT 2014. LNCS, vol. 8441, pp. 351–368. Springer, Heidelberg (2014). doi:10.1007/978-3-642-55220-5_20
4. Chen, Y.-C., Chow, S.S.M., Chung, K.-M., Lai, R.W.F., Lin, W.-K., Zhou, H.-S.: Cryptography for parallel RAM from indistinguishability obfuscation. In: Sudan, M. (ed.) ITCS 2016, Cambridge, MA, USA, 14–16 January 2016, pp. 179–190. ACM (2016)
5. Curtmola, R., Garay, J.A., Kamara, S., Ostrovsky, R.: Searchable symmetric encryption: improved definitions and efficient constructions. In: Juels, A., Wright, R.N., De Capitani di Vimercati, S. (eds.) ACM CCS 2006, Alexandria, Virginia, USA, 30 October–3 November 2006, pp. 79–88. ACM Press (2006)
6. Curtmola, R., Garay, J.A., Kamara, S., Ostrovsky, R.: Searchable symmetric encryption: improved definitions and efficient constructions. J. Comput. Secur. 19(5), 895–934 (2011)
7. Garg, S., Mohassel, P., Papamanthou, C.: TWORAM: efficient oblivious RAM in two rounds with applications to searchable encryption. In: Robshaw, M., Katz, J. (eds.) CRYPTO 2016. LNCS, vol. 9816, pp. 563–592. Springer, Heidelberg (2016). doi:10.1007/978-3-662-53015-3_20
8. Hahn, F., Kerschbaum, F.: Searchable encryption with secure and efficient updates. In: Ahn, G.-J., Yung, M., Li, N. (eds.) ACM CCS 2014, Scottsdale, AZ, USA, 3–7 November 2014, pp. 310–320. ACM Press (2014)
9. Islam, S.M., Kuzu, M., Kantarcioglu, M.: Access pattern disclosure on searchable encryption: ramification, attack and mitigation. In: NDSS 2012, San Diego, CA, USA, 5–8 February 2012. The Internet Society (2012)
10. Kamara, S., Papamanthou, C.: Parallel and dynamic searchable symmetric encryption. In: Sadeghi, A.-R. (ed.) FC 2013. LNCS, vol. 7859, pp. 258–274. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39884-1_22
11. Kamara, S., Papamanthou, C., Roeder, T.: Dynamic searchable symmetric encryption. In: Yu, T., Danezis, G., Gligor, V.D. (eds.) ACM CCS 2012, Raleigh, NC, USA, 16–18 October 2012, pp. 965–976. ACM Press (2012)
12. Lai, R.W.F., Chow, S.S.M.: Structured encryption with non-interactive updates and parallel traversal. In: 35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015, Columbus, OH, USA, 29 June–2 July 2015, pp. 776–777 (2015)
13. Lai, R.W.F., Chow, S.S.M.: Parallel and dynamic structured encryption. In: SECURECOMM 2016 (2016, to appear)
14. Liu, C., Zhu, L., Wang, M., Tan, Y.: Search pattern leakage in searchable encryption: attacks and new construction. Inf. Sci. 265, 176–188 (2014)
15. Rizomiliotis, P., Gritzalis, S.: ORAM based forward privacy preserving dynamic searchable symmetric encryption schemes. In: Proceedings of the 2015 ACM Workshop on Cloud Computing Security Workshop, CCSW 2015, Denver, Colorado, USA, 16 October 2015, pp. 65–76 (2015)
16. Song, D.X., Wagner, D., Perrig, A.: Practical techniques for searches on encrypted data. In: 2000 IEEE Symposium on Security and Privacy, Oakland, CA, USA, pp. 44–55. IEEE Computer Society Press, May 2000
17. Stefanov, E., Papamanthou, C., Shi, E.: Practical dynamic searchable encryption with small leakage. In: NDSS 2014, San Diego, CA, USA, 23–26 February 2014. The Internet Society (2014)
18. Zhang, Y., Katz, J., Papamanthou, C.: All your queries are belong to us: the power of file-injection attacks on searchable encryption. In: 25th USENIX Security Symposium, USENIX Security 16, Austin, TX, USA, 10–12 August 2016, pp. 707–720 (2016)

Author information

Authors and Affiliations

Department of Information Engineering, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Russell W. F. Lai & Sherman S. M. Chow

Corresponding author

Correspondence to Russell W. F. Lai .

Editor information

Editors and Affiliations

Hamburg University of Technology, Hamburg, Germany
Dieter Gollmann
Graduate School of Engineering, Osaka University, Suita, Osaka, Japan
Atsuko Miyaji
Department of Frontier Media Science, Meiji University, Tokyo, Japan
Hiroaki Kikuchi

About this paper

Cite this paper

Lai, R.W.F., Chow, S.S.M. (2017). Forward-Secure Searchable Encryption on Labeled Bipartite Graphs. In: Gollmann, D., Miyaji, A., Kikuchi, H. (eds) Applied Cryptography and Network Security. ACNS 2017. Lecture Notes in Computer Science(), vol 10355. Springer, Cham. https://doi.org/10.1007/978-3-319-61204-1_24

DOI: https://doi.org/10.1007/978-3-319-61204-1_24
Published: 26 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61203-4
Online ISBN: 978-3-319-61204-1
eBook Packages: Computer Science, Computer Science (R0)
