DOI: 10.1145/3474123.3486759
research-article

Secure Featurization and Applications to Secure Phishing Detection

Published: 15 November 2021

Abstract

Secure inference allows a server holding a machine learning (ML) inference algorithm with private weights, and a client with a private input, to obtain the output of the inference algorithm, without revealing their respective private inputs to one another. While this problem has received plenty of attention, existing systems are not applicable to a large class of ML algorithms (such as in the domain of Natural Language Processing) that perform featurization as their first step. In this work, we address this gap and make the following contributions:
  • We initiate the formal study of secure featurization and its use in conjunction with secure inference protocols.
  • We build secure featurization protocols in the one/two/three-server settings that provide a tradeoff between security and efficiency.
  • Finally, we apply our algorithms in the context of secure phishing detection and evaluate our end-to-end protocol on models that are commonly used for phishing detection.
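
To make the notion of featurization as a first step concrete, the following is a minimal plaintext sketch, not the paper's protocol: it provides no security and only shows the two-stage pipeline (featurize, then infer) that a secure protocol would need to evaluate without exposing the client's URL or the server's weights. The bag-of-words featurizer, the vocabulary, the weights, and all function names are illustrative assumptions.

```python
import re
from typing import List

# Hypothetical vocabulary; in a secure setting it may itself be private.
VOCAB = {"login": 0, "secure": 1, "account": 2, "verify": 3, "example": 4}
# Hypothetical private server-side weights (one per vocabulary entry) and bias.
WEIGHTS = [1.5, 0.5, 1.0, 2.0, -1.0]
BIAS = -2.0


def featurize(url: str) -> List[int]:
    """Map the client's URL to a bag-of-words count vector over VOCAB."""
    tokens = re.findall(r"[a-z0-9]+", url.lower())
    features = [0] * len(VOCAB)
    for token in tokens:
        index = VOCAB.get(token)
        if index is not None:  # tokens outside the vocabulary are ignored
            features[index] += 1
    return features


def score(features: List[int]) -> float:
    """Linear model applied to the feature vector (the inference step)."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


def is_phishing(url: str) -> bool:
    """End-to-end pipeline: featurization followed by inference."""
    return score(featurize(url)) > 0
```

In this insecure baseline, whoever runs `featurize` sees the raw URL and whoever runs `score` sees the weights; the contributions above concern computing the same composition while keeping both private.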

Supplementary Material

MP4 File (CCSW-fp60.mp4)
We present our paper titled "Secure Featurization and Applications to Secure Phishing Detection". We initiate the formal study of secure featurization and its use in conjunction with secure inference protocols. We build secure featurization protocols in the one/two/three-server settings that provide a tradeoff between security and efficiency.


Published In

CCSW '21: Proceedings of the 2021 on Cloud Computing Security Workshop
November 2021
161 pages
ISBN: 9781450386531
DOI: 10.1145/3474123

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. phishing detection
  2. private set intersection
  3. secure multi-party computation

Qualifiers

  • Research-article

Conference

CCS '21
Acceptance Rates

Overall Acceptance Rate 37 of 108 submissions, 34%
