Private large-scale databases with distributed searchable symmetric encryption

Y Ishai, E Kushilevitz, S Lu, R Ostrovsky - Cryptographers' Track at the …, 2016 - Springer
Cryptographers' Track at the RSA Conference, 2016 - Springer
Abstract
With the growing popularity of remote storage, the ability to outsource a large private database yet still search the encrypted data is critical. Searchable symmetric encryption (SSE) is a practical method of encrypting data so that natural operations such as searching can be performed on it. It can be viewed as an efficient private-key alternative to powerful tools such as fully homomorphic encryption, oblivious RAM, or secure multiparty computation. The main drawbacks of existing SSE schemes are the limited types of search they support and their leakage. In this paper, we present a construction of a private outsourced database in the two-server model (e.g., two cloud services) which can be thought of as an SSE scheme on a B-tree that allows for a wide variety of search features such as range queries, substring queries, and more. Our solution can hide all leakage due to access patterns (“metadata”) between queries and features a tunable parameter that provides a smooth tradeoff between privacy and efficiency. This allows us to implement a solution that supports databases that are terabytes in size and contain millions of records with only a slowdown compared to MySQL when the query result size is around 10% of the database, though the fixed costs dominate smaller queries resulting in over relative slowdown (under 1 s actual).
In addition, our solution also provides a mechanism for allowing data owners to set filters that prevent prohibited queries from returning any results, without revealing the filtering terms. Finally, we also present the benchmarks of our prototype implementation.