CryptDB: processing queries on an encrypted database

RA Popa, CMS Redfield, N Zeldovich… - Communications of the …, 2012 - dl.acm.org
Communications of the ACM, 2012 - dl.acm.org
Theft of private information is a significant problem for online applications. For example, a recent investigation found that at least eight million people’s medical records were stolen as a result of data breaches between 2009 and 2011 [13], and in a recent attack on the Sony Playstation Network, attackers apparently gained access to about 77 million personal user profiles, some of which included credit card information [20]. Such large-scale data thefts make the popular press, but smaller-scale compromises occur on a nearly daily basis, according to organizations devoted to studying consumer and data privacy (e.g., the Privacy Rights Clearinghouse).
Sensitive data can leak from online data repositories for a variety of reasons: an adversary can exploit software vulnerabilities to gain unauthorized access to servers [15], curious or malicious administrators at a hosting provider can snoop on private data [3], and attackers with physical access to servers can steal data from disk and memory [11].
One approach to reducing the damage caused by server compromises is to encrypt all sensitive data stored on the servers. However, many important applications, including database-backed Web services that process SQL queries and analytic applications that compute results over large quantities of data, require servers not just to store data but also to perform computations on it. One solution could be to store the data encrypted at the server and perform all computation on plaintext at a trusted client, downloading and decrypting all needed data for every computation. This approach, however, is usually untenable, either because there might be too much data to move around or because clients may have significantly fewer computation or storage resources than the server.
An ideal solution to satisfying the dual goals of protecting data confidentiality and running computations is to enable a server to compute over encrypted data, without the server ever decrypting the data to plaintext. The server would produce results in an encrypted form, decryptable only by a trusted client. This approach would preserve the architecture of running much of the application’s computation at the server.
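To make the idea of computing over encrypted data concrete, the following is a minimal sketch of a server summing values it cannot read. This excerpt does not name any particular encryption scheme, so the choice of the additively homomorphic Paillier cryptosystem is an assumption made purely for illustration, as are the function names (`keygen`, `encrypt`, `decrypt`, `server_sum`) and the toy key size, which is far too small for real use.

```python
import math
import secrets

# Toy parameters (assumption for illustration): two known Mersenne primes.
# Real deployments need much larger, randomly generated primes.
P = 2**31 - 1
Q = 2**61 - 1

def keygen(p=P, q=Q):
    """Return (public modulus n, private key (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Trusted client: encrypt integer m (0 <= m < n) under public modulus n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Trusted client: recover the plaintext from ciphertext c."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def server_sum(n, ciphertexts):
    """Untrusted server: multiply ciphertexts, which adds the hidden plaintexts.
    The server only needs the public modulus n and never sees a plaintext."""
    n2 = n * n
    total = 1  # 1 is a valid encryption of 0
    for c in ciphertexts:
        total = (total * c) % n2
    return total

# Client encrypts, server aggregates, client decrypts the encrypted result.
n, priv = keygen()
stored = [encrypt(n, m) for m in [52000, 61000, 47500]]
encrypted_total = server_sum(n, stored)
total = decrypt(n, priv, encrypted_total)
```

Here the server holds only `stored` and `n`, and returns `encrypted_total` in encrypted form; only the client, which holds `priv`, can decrypt it. A scheme like this supports addition only, so other operations (equality, ordering, joins) would need different techniques.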