The case for adding privacy-related offloading to smart storage

Published: 14 June 2021

Abstract

Personally identifiable information (PII) must be protected within large distributed systems and used only for its intended purposes. This is challenging to achieve. Several techniques have been proposed for privacy-preserving analytics, but they typically focus on the end hosts only. We argue that future storage solutions should offer privacy-related operators in addition to emerging compute offload. Many privacy operators, such as perturbation and anonymization, are applied as the very first step, before any other computation. Offloading queries to a Smart Storage device may therefore be feasible in the future only if privacy-related operators can be offloaded as well.
In this work we demonstrate that privacy-preserving operators can be implemented in hardware without reducing read bandwidth. We focus on perturbations and extend an FPGA-based, network-attached Smart Storage solution to show that these operations can be provided at 10 Gbps line rate while using only a small amount of additional FPGA real estate. We also discuss what future, faster smart storage nodes should look like in light of these additional requirements.
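
As an illustration of what a perturbation operator does, the following is a minimal software sketch of additive noise perturbation applied to values as they are streamed out of storage. It is not the hardware design described in the paper: the function name `perturb_stream`, the choice of zero-mean Gaussian noise, and the `scale` and `seed` parameters are all assumptions made for this example.

```python
import random
from typing import Iterable, Iterator, Optional


def perturb_stream(
    values: Iterable[float],
    scale: float,
    seed: Optional[int] = None,
) -> Iterator[float]:
    """Yield each input value with independent zero-mean Gaussian noise added.

    Hypothetical helper for illustration only. `scale` is the standard
    deviation of the noise; `seed` makes the noise sequence reproducible.
    """
    rng = random.Random(seed)
    for value in values:
        # One noise sample per value, so the operator works record by
        # record and never needs to buffer the whole data set.
        yield value + rng.gauss(0.0, scale)


if __name__ == "__main__":
    ages = [23.0, 35.0, 41.0, 58.0, 62.0]
    noisy = list(perturb_stream(ages, scale=2.0, seed=7))
    print(noisy)
    # Individual values change, while aggregates such as the mean
    # remain approximately usable for analytics.
    print(sum(ages) / len(ages), sum(noisy) / len(noisy))
```

Because the operator handles one value at a time and keeps no state beyond the random number generator, it has the streaming shape that makes such operators candidates for placement in the read path of a storage device.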

Published In

SYSTOR '21: Proceedings of the 14th ACM International Conference on Systems and Storage
June 2021
226 pages
ISBN:9781450383981
DOI:10.1145/3456727
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • Technion: Israel Institute of Technology
  • USENIX Assoc: USENIX Assoc

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Acceptance Rates

SYSTOR '21 Paper Acceptance Rate 18 of 63 submissions, 29%;
Overall Acceptance Rate 108 of 323 submissions, 33%
