DOI: 10.1145/3485447.3512023
Open access

Explainable Neural Rule Learning

Published: 25 April 2022

Abstract

    Although neural networks have achieved great success in various machine learning tasks, their black-box nature makes it hard to know what they learn from data. This lack of explainability limits neural networks in domains that demand transparency and accountability, such as healthcare and finance. Moreover, explainability helps guide a neural network toward the causal patterns that extrapolate to out-of-distribution (OOD) data, which is critical in real-world applications and has become an active research topic.
    To improve the explainability of neural networks, we propose a novel method, Explainable Neural Rule Learning (ENRL), which aims to combine the expressiveness of neural networks with the explainability of rule-based systems. Specifically, we first design several operator modules and guide them, via self-supervised learning, to behave as relational operators. With input feature fields and learnable context values as arguments, these operator modules serve as predicates that constitute atomic propositions. We then employ neural logical operations to combine atomic propositions into a collection of rules. Finally, we design a voting mechanism so that the rules collaboratively make up our predictive model. Rule learning is thus transformed into neural architecture search: choosing appropriate arrangements of feature fields and operator modules. After searching for a specific architecture and learning the involved modules, the resulting neural network explicitly expresses rules and therefore possesses explainability. We can then predict for each input instance according to the rules it satisfies, which at the same time explains how the network makes that decision. We conduct a series of experiments on both synthetic and real-world datasets to evaluate ENRL. Compared with conventional neural networks, ENRL achieves competitive in-distribution performance while providing the extra benefit of explainability. Meanwhile, ENRL significantly alleviates the performance drop on OOD test data, demonstrating the effectiveness of rule learning. Code is available at https://github.com/Shuriken13/ENRL.
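    The pipeline the abstract describes (relational predicates over feature fields with learnable context values, neural conjunction of atomic propositions, and a vote across rules) can be sketched in plain Python. This is a minimal illustration, not the paper's exact formulation: the sigmoid predicate, the product t-norm conjunction, the averaging vote, and the hand-written rules over hypothetical fields "age" and "income" are all assumptions for exposition.

```python
import math

def greater_than(x, v, tau=1.0):
    # Soft ">" predicate: sigmoid((x - v) / tau) approaches 1 when x >> v
    # and 0 when x << v; tau controls how sharp the decision boundary is.
    return 1.0 / (1.0 + math.exp(-(x - v) / tau))

def soft_and(truths):
    # Neural conjunction approximated by the product t-norm:
    # a rule fires only if every one of its atomic propositions holds.
    result = 1.0
    for t in truths:
        result *= t
    return result

def rule_vote(rules, features):
    # Each rule is a list of atomic propositions (field, predicate,
    # context value); the rules vote by averaging their truth degrees.
    scores = [
        soft_and(pred(features[field], value) for field, pred, value in rule)
        for rule in rules
    ]
    return sum(scores) / len(scores)

# Two hand-written rules for illustration. In a trained ENRL model the
# fields, operators, and context values would be chosen by architecture
# search and gradient learning, not written by hand.
rules = [
    [("age", greater_than, 30.0)],
    [("age", greater_than, 30.0), ("income", greater_than, 50.0)],
]
score = rule_vote(rules, {"age": 45.0, "income": 80.0})
```

    Because each rule is an explicit conjunction of readable predicates, a prediction can be explained by listing which rules the instance satisfied, which is the explainability property the method targets.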


    Cited By

    • (2023) Sequential Recommendation with Probabilistic Logical Reasoning. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2432–2440. DOI: 10.24963/ijcai.2023/270. Online publication date: 19 Aug 2023.



        Published In

        WWW '22: Proceedings of the ACM Web Conference 2022
        April 2022
        3764 pages
        ISBN:9781450390965
        DOI:10.1145/3485447
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery, New York, NY, United States


        Author Tags

        1. explainable neural networks
        2. out of distribution
        3. rule learning

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Funding Sources

        • Natural Science Foundation of China

        Conference

        WWW '22
        Sponsor:
        WWW '22: The ACM Web Conference 2022
        April 25 - 29, 2022
        Virtual Event, Lyon, France

        Acceptance Rates

        Overall acceptance rate: 1,899 of 8,196 submissions (23%)

        Article Metrics

        • Downloads (last 12 months): 542
        • Downloads (last 6 weeks): 50
        Reflects downloads up to 27 Jul 2024
