SMINT: Toward Interpretable and Robust Model Sharing for Deep Neural Networks

Published: 03 May 2020
Abstract

    Sharing a pre-trained machine learning model, particularly a deep neural network via prediction APIs, is becoming common practice on machine learning as a service (MLaaS) platforms. Although deep neural networks (DNNs) have shown remarkable success in many tasks, they are also criticized for their lack of interpretability and transparency. Interpreting a shared DNN model faces two additional challenges compared with interpreting a general model: (1) only limited training data can be disclosed to users, and (2) the internal structure of the model may not be available. These two challenges impede the application of most existing interpretability approaches, such as saliency maps or influence functions, to DNN models. Case-based reasoning methods have been used for interpreting decisions; however, how to select and organize the data points under the constraints of shared DNN models has not been addressed. Moreover, simply providing cases as explanations may not be sufficient to support instance-level interpretability. Meanwhile, existing interpretation methods for DNN models generally lack the means to evaluate the reliability of the interpretation. In this article, we propose a framework named Shared Model INTerpreter (SMINT) to address the above limitations. We propose a new data structure, called a boundary graph, that organizes training points so as to mimic the predictions of DNN models. We integrate local features, such as saliency maps and interpretable input masks, into the data structure to help users infer the model's decision boundaries. We show that the boundary graph is able to address the reliability issues in many local interpretation methods. We further design an algorithm named the hidden-layer-aware p-test to measure the reliability of the interpretations. Our experiments show that SMINT achieves above 99% fidelity to the corresponding DNN models on both MNIST and ImageNet while sharing only a tiny fraction of the training data to make these models interpretable. A human pilot study demonstrates that SMINT provides better interpretability than existing methods. Moreover, we demonstrate that SMINT is able to assist model tuning for better performance on different user data.
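
    To make the boundary-graph idea above concrete, the sketch below shows one plausible reading of such a case store: a training point is retained only when the current graph would mislabel it, so the stored cases concentrate near the shared model's decision boundaries, and every prediction is returned together with its supporting case. This is a minimal sketch under assumptions: the class name BoundaryGraph, the embed_fn hook, the greedy nearest-neighbour traversal, and the insert-on-mistake rule are illustrative, borrowed from generic boundary-tree-style constructions, and are not the paper's actual SMINT algorithm (which additionally attaches saliency maps and interpretable input masks to the stored cases).

        import numpy as np

        class BoundaryGraph:
            """Toy case store. Nodes hold training examples; an edge links a new case
            to the existing case that mislabelled it, so edges tend to straddle the
            shared model's decision boundaries."""

            def __init__(self, embed_fn):
                self.embed_fn = embed_fn  # maps a raw input to a hidden representation
                self.nodes = []           # list of (embedding, label, raw example)
                self.edges = {}           # node index -> set of neighbour indices

            def _greedy_nearest(self, z):
                # Walk from the root node, always moving to a strictly closer neighbour;
                # the walk terminates because the distance to z strictly decreases.
                if not self.nodes:
                    return None
                current = 0
                while True:
                    d_cur = np.linalg.norm(self.nodes[current][0] - z)
                    closer = [(np.linalg.norm(self.nodes[j][0] - z), j)
                              for j in self.edges[current]]
                    closer = [t for t in closer if t[0] < d_cur]
                    if not closer:
                        return current
                    current = min(closer)[1]

            def add(self, x, label):
                # Store x only if the graph would currently mislabel it; points already
                # explained by an existing case are discarded, keeping the store small.
                z = self.embed_fn(x)
                hit = self._greedy_nearest(z)
                if hit is not None and self.nodes[hit][1] == label:
                    return False
                self.nodes.append((z, label, x))
                new_idx = len(self.nodes) - 1
                self.edges[new_idx] = set()
                if hit is not None:
                    self.edges[hit].add(new_idx)   # edge crosses a decision boundary
                    self.edges[new_idx].add(hit)
                return True

            def predict(self, x):
                # Label a query with its nearest stored case and return that case so it
                # can be shown to the user as the explanation.
                hit = self._greedy_nearest(self.embed_fn(x))
                return (None, None) if hit is None else (self.nodes[hit][1], self.nodes[hit][2])

    Under this sketch, fidelity to the shared model could be estimated as the fraction of held-out inputs on which predict agrees with the DNN's own output, and the labels passed to add would come from the DNN's predictions rather than ground truth, so that the graph mimics the model being shared.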



    Published In

    ACM Transactions on the Web, Volume 14, Issue 3
    August 2020
    126 pages
    ISSN:1559-1131
    EISSN:1559-114X
    DOI:10.1145/3398019
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 03 May 2020
    Accepted: 01 February 2020
    Revised: 01 October 2019
    Received: 01 April 2019
    Published in TWEB Volume 14, Issue 3


    Author Tags

    1. Deep neural networks
    2. decision boundary
    3. interpretability
    4. model sharing

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • National High-level Personnel for Defense Technology Program
    • NSF China
    • D2DCRC
    • ARC DPs
    • HUNAN Province Science Foundation


    Cited By

    • (2023) Flood hazard potential evaluation using decision tree state-of-the-art models. Risk Analysis 44, 2 (2023), 439–458. DOI: 10.1111/risa.14179
    • (2022) Do Auto-Regressive Models Protect Privacy? Inferring Fine-Grained Energy Consumption From Aggregated Model Parameters. IEEE Transactions on Services Computing 15, 6 (2022), 3198–3209. DOI: 10.1109/TSC.2021.3100498
    • (2022) A systematic literature review of intrusion detection systems in the cloud-based IoT environments. Concurrency and Computation: Practice and Experience 34, 10 (2022). DOI: 10.1002/cpe.6822
    • (2021) Finding Representative Interpretations on Convolutional Neural Networks. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 1325–1334. DOI: 10.1109/ICCV48922.2021.00138
    • (2021) Intrusion detection systems in the cloud computing: A comprehensive and deep literature review. Concurrency and Computation: Practice and Experience 34, 4 (2021). DOI: 10.1002/cpe.6646
