InfoCensor: An Information-Theoretic Framework against Sensitive Attribute Inference and Demographic Disparity

Published: 30 May 2022

Abstract

Deep learning sits at the forefront of ongoing advances in a variety of learning tasks. Despite its high accuracy in benign environments, deep learning suffers from adversarial vulnerability and privacy leakage (e.g., sensitive attribute inference) in adversarial environments, and many deep learning systems exhibit discriminatory behavior against certain groups of subjects (e.g., demographic disparity). In this paper, we propose a unified information-theoretic framework that defends against sensitive attribute inference and mitigates demographic disparity in deep learning under the model-partitioning scenario by minimizing two mutual information terms. We prove that as one mutual information term decreases, an upper bound on any adversary's chance of inferring the sensitive attribute from model representations also decreases, and that the extent of demographic disparity is bounded by the other term. Since direct optimization of the mutual information is intractable, we also propose a tractable Gaussian-mixture-based method and a Gumbel-Softmax-trick-based method for estimating the two mutual information terms. Extensive evaluations across a variety of application domains, including computer vision and natural language processing, demonstrate that our framework overall outperforms the existing baselines.
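The Gumbel-Softmax trick mentioned in the abstract (Jang et al., 2016) draws differentiable, approximately one-hot samples from a categorical distribution, which is what makes discrete mutual-information terms amenable to gradient-based training. Below is an illustrative NumPy sketch of the generic reparameterization, not the paper's actual estimator; the function name and temperature value are placeholders chosen for the example.

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature=0.5, rng=None):
    """Differentiable, approximately one-hot sample from a categorical
    distribution parameterized by `logits` (Gumbel-Softmax relaxation)."""
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1)
    u = rng.uniform(low=1e-12, high=1.0, size=np.shape(logits))
    g = -np.log(-np.log(u))
    y = (np.asarray(logits, dtype=float) + g) / temperature
    # Softmax over the last axis yields a relaxed one-hot vector
    e = np.exp(y - y.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# As temperature -> 0 the sample approaches a hard one-hot vector;
# higher temperatures give smoother, more uniform samples.
sample = gumbel_softmax_sample(np.array([2.0, 0.5, -1.0]), temperature=0.1)
```

In a deep-learning framework the same computation would be applied to the encoder's logits so that gradients flow through the sampling step.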

Supplementary Material

MP4 File (ASIA-CCS22-fp204.mp4)
This presentation establishes theoretical connections between mutual information and both sensitive attribute inference and demographic disparity. Building on these results, it introduces InfoCensor, an information-theoretic framework against sensitive attribute inference and demographic disparity, and discusses its advantages over previous baselines. Extensive evaluations on varied datasets and network architectures demonstrate the effectiveness of InfoCensor.
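The theoretical connection between mutual information and attribute inference alluded to above is typically a Fano-type bound; the sketch below shows the generic form, where $s$ is the sensitive attribute over a finite set $\mathcal{S}$, $z$ the model representation, and $\hat{s}$ any adversarial estimator of $s$ from $z$. The paper's exact constants and notation may differ.

```latex
% Fano's inequality for any estimator \hat{s} = f(z):
%   H(P_e) + P_e \log(|\mathcal{S}| - 1) \ge H(s \mid z) = H(s) - I(z; s),
% where P_e = \Pr[\hat{s} \ne s]. Since H(P_e) \le \log 2, rearranging gives
\[
  \Pr[\hat{s} \ne s] \;\ge\; \frac{H(s) - I(z; s) - \log 2}{\log\bigl(|\mathcal{S}| - 1\bigr)},
\]
% so driving I(z; s) down raises the error floor of every inference
% adversary, i.e., it caps the adversary's success probability.
```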


Cited By

  • (2024) Learning to Prevent Input Leakages in the Mobile Cloud Inference. IEEE Transactions on Mobile Computing 23(7), 7650–7663. DOI: 10.1109/TMC.2023.3340338. Online publication date: July 2024.
  • (2024) Protecting Activity Sensing Data Privacy Using Hierarchical Information Dissociation. 2024 IEEE Conference on Communications and Network Security (CNS), 1–9. DOI: 10.1109/CNS62487.2024.10735551. Online publication date: 30 September 2024.


        Published In

        ASIA CCS '22: Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security, May 2022, 1291 pages. ISBN: 9781450391405. DOI: 10.1145/3488932


        Publisher

        Association for Computing Machinery, New York, NY, United States


        Author Tags

        1. attribute inference
        2. demographic disparity
        3. information theory

        Qualifiers

        • Research-article

        Conference

        ASIA CCS '22

        Acceptance Rates

        Overall Acceptance Rate 418 of 2,322 submissions, 18%


        Article Metrics

        • Downloads (last 12 months): 28
        • Downloads (last 6 weeks): 2
        Reflects downloads up to 23 Dec 2024
