DOI: 10.1145/3375708.3380316
Research article

Privacy-preserving SVM on Outsourced Genomic Data via Secure Multi-party Computation

Published: 16 March 2020

Abstract

Machine learning methods are widely used in areas such as medical data research because of their efficient and powerful data mining ability. However, submitting unprotected data to a third party that trains a machine learning model risks data leakage and privacy violations if that party is compromised by an adversary. A protocol for encrypted computation is therefore needed. To address this problem, we propose protocols based on secure multi-party computation that train a support vector machine model privately. Built on oblivious transfer and designed for the semi-honest adversary model, the proposed protocols enable the training of a non-linear support vector machine on the combined data from various sources without sacrificing the privacy of individuals. We apply the protocols to train a support vector machine with the radial basis function kernel on HIV sequence data to predict the efficacy of a certain antiviral drug, which works only if the viruses can use only the human CCR5 coreceptor for cell entry. Benchmarked on synthesized data from 10 data sources, each containing 100 labeled samples of randomly generated integers, the protocol required an average online time of 2991.386 ms with arithmetic circuits and 166.912 ms with boolean circuits. With the optimized parameters, cross-validation reached an average F1-score of 0.5819 on the training data, and the resulting model reached an F1-score of 0.7058 on the test data set, which consists of protein sequences of CCR5 and its subtypes. The complete training and testing process on the real data, which contains 766 samples in total with 924 features after encoding, took on average 43.75 seconds with arithmetic circuits and 15.84 seconds with boolean circuits, which shows the effectiveness and efficiency of our protocols compared to some of the existing studies in the literature.
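The abstract names secure dot products and the radial basis function kernel but does not spell out the protocol. The Python sketch below is an illustration only, not the authors' protocol: it assumes two parties holding additive secret shares over a hypothetical ring of size 2^64, and it replaces oblivious transfer with a trusted dealer that hands out multiplication (Beaver) triples. All function names are invented for this example. It shows how a dot product can be computed on shares and how an RBF kernel value follows from three dot products via ||x - y||^2 = x.x + y.y - 2 x.y.

```python
import math
import secrets

MOD = 2 ** 64  # assumed share ring Z_{2^64}; a choice made for this sketch


def share(x):
    """Split integer x into two additive shares modulo MOD."""
    r = secrets.randbelow(MOD)
    return r, (x - r) % MOD


def reveal(s0, s1):
    """Recombine two shares and map the result back to a signed integer."""
    v = (s0 + s1) % MOD
    return v - MOD if v >= MOD // 2 else v


def beaver_triple():
    """Trusted-dealer stand-in (assumption): shares of random a, b and a*b."""
    a, b = secrets.randbelow(MOD), secrets.randbelow(MOD)
    return share(a), share(b), share(a * b % MOD)


def secure_mul(x, y):
    """Multiply two secret-shared values with one Beaver triple."""
    (a0, a1), (b0, b1), (c0, c1) = beaver_triple()
    # The parties open d = x - a and e = y - b; a and b mask x and y.
    d = (x[0] - a0 + x[1] - a1) % MOD
    e = (y[0] - b0 + y[1] - b1) % MOD
    # x*y = a*b + d*b + e*a + d*e; only party 0 adds the public d*e term.
    z0 = (c0 + d * b0 + e * a0 + d * e) % MOD
    z1 = (c1 + d * b1 + e * a1) % MOD
    return z0, z1


def secure_dot(xs, ys):
    """Dot product of two vectors of shared values, returned as shares."""
    acc0 = acc1 = 0
    for x, y in zip(xs, ys):
        z0, z1 = secure_mul(x, y)
        acc0 = (acc0 + z0) % MOD
        acc1 = (acc1 + z1) % MOD
    return acc0, acc1


def rbf_from_dots(xx, yy, xy, gamma):
    """RBF kernel value from the dot products x.x, y.y and x.y."""
    return math.exp(-gamma * (xx + yy - 2 * xy))


# Toy integer feature vectors, shared element by element.
u = [3, -1, 4]
v = [2, 5, -6]
su = [share(t) for t in u]
sv = [share(t) for t in v]

uv = reveal(*secure_dot(su, sv))  # -23
uu = reveal(*secure_dot(su, su))  # 26
vv = reveal(*secure_dot(sv, sv))  # 65
k = rbf_from_dots(uu, vv, uv, gamma=0.01)
```

In this sketch the dot products are revealed before the exponential is applied in the clear; whether the paper reveals them at that point is not stated in the abstract. The value of `gamma` is arbitrary and only serves the example.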


    Published In

    IWSPA '20: Proceedings of the Sixth International Workshop on Security and Privacy Analytics
    March 2020
    84 pages
    ISBN:9781450371155
    DOI:10.1145/3375708
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. HIV co-receptor prediction
    2. privacy preserving machine learning
    3. secure dot product computation
    4. secure multi-party computation
    5. support vector machine

    Conference

    CODASPY '20
    Acceptance Rates

    Overall Acceptance Rate 18 of 58 submissions, 31%
