DOI: 10.1145/3589883.3589884
research-article

AI2: a novel explainable machine learning framework using an NLP interface

Published: 27 June 2023

Abstract

This paper proposes a novel machine learning framework that addresses two current concerns of the data science community: accessibility and explainability. The framework, called AI2, offers a natural language interface that makes it usable even by non-experts. Traditionally, machine learning frameworks are accessed through a programming language, and Python is one of the most common languages for coding machine learning methods. Although AI2 is implemented as Python scripts, it is designed to be accessed in a natural language, namely English. The first contribution therefore concerns accessibility: a non-data scientist can use a machine learning framework without knowing how to code. For decades, the data science community has recognized the black-box problem as one of the drawbacks of machine learning, and data scientists have had to devise separate methods to explain their results. The second contribution is to build explainability into the framework, which systematically provides not only the results but also explanations of those results for every included machine learning algorithm.
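The two ideas in the abstract, an English-language entry point and an explanation returned with every result, can be illustrated with a minimal sketch. This is not the AI2 implementation: the names `Answer`, `fit_line`, and `handle_request` are hypothetical, simple keyword matching stands in for the NLP component, and a one-variable least-squares fit stands in for the framework's algorithms.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Answer:
    result: dict        # what the algorithm computed
    explanation: str    # plain-English account of the result


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def handle_request(text, xs, ys):
    """Map an English request to an algorithm (hypothetical keyword rule)."""
    words = text.lower().split()
    if "regression" in words or "predict" in words:
        slope, intercept = fit_line(xs, ys)
        direction = "increases" if slope > 0 else "decreases"
        explanation = (
            f"For each unit increase in x, y {direction} by about "
            f"{abs(slope):.2f}; the fitted line crosses x = 0 at {intercept:.2f}."
        )
        return Answer({"slope": slope, "intercept": intercept}, explanation)
    raise ValueError(f"Request not understood: {text!r}")


answer = handle_request("Please run a linear regression", [1, 2, 3, 4], [3, 5, 7, 9])
print(answer.result)
print(answer.explanation)
```

The design point is that the caller never receives a bare number: every code path that returns a result also returns a sentence describing it, which is the property the abstract attributes to the framework.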

Cited By

  • (2024) Explainable Machine Learning Method for Aesthetic Prediction of Doors and Home Designs. Information 15:4 (203). DOI: 10.3390/info15040203. Online publication date: 5-Apr-2024
  • (2023) : the next leap toward native language-based and explainable machine learning framework. Automated Software Engineering 30:2. DOI: 10.1007/s10515-023-00399-5. Online publication date: 24-Sep-2023

    Published In

    ICMLT '23: Proceedings of the 2023 8th International Conference on Machine Learning Technologies
    March 2023
    293 pages
    ISBN:9781450398329
    DOI:10.1145/3589883

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. NLP
    2. accessibility
    3. explainability
    4. framework
    5. machine learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICMLT 2023
