AI2: a novel explainable machine learning framework using an NLP interface

Authors:

Jean-Sébastien Dessureault,

Daniel Massicotte

ICMLT '23: Proceedings of the 2023 8th International Conference on Machine Learning Technologies

Pages 1 - 7

https://doi.org/10.1145/3589883.3589884

Published: 27 June 2023

Abstract

This paper proposes a novel machine learning framework that addresses two recent concerns of the data science community: accessibility and explainability. The framework, called AI2, offers a natural language interface that makes it accessible even to non-experts. Traditionally, machine learning frameworks are accessed through a programming language, and Python is one of the most common languages for coding machine learning methods. The AI2 framework, although built from Python scripts, is designed to be accessed in a natural language, namely English. Hence, the first contribution concerns accessibility: a non-data scientist can use a machine learning framework without knowing how to code. For decades, the data science community has known that one of the drawbacks of machine learning is the black-box problem, and data scientists have had to create separate methods to explain their results. The second contribution of this paper is to build the principle of explainability into the framework, so that every included machine learning algorithm systematically returns not only its results but also an explanation of those results.
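
The two ideas in the abstract, an English-language entry point and results that always come with an explanation, can be illustrated with a toy sketch. This is not the AI2 implementation: the keyword table `INTENTS`, the functions `parse_request`, `fit_line`, and `run`, and the choice of one-feature least squares are all hypothetical and chosen only to keep the example self-contained. A real system would presumably use a proper NLP model rather than keyword matching.

```python
from statistics import mean

# Hypothetical keyword table: NOT the AI2 grammar, only an illustration.
INTENTS = {
    "regression": ("regression", "predict", "trend"),
    "summary": ("summary", "summarize", "describe"),
}


def parse_request(text):
    """Map an English request to an intent name by simple keyword matching."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    for intent, keywords in INTENTS.items():
        if any(word in keywords for word in words):
            return intent
    return None


def fit_line(xs, ys):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def run(text, xs, ys):
    """Return a dict that always pairs a result with a plain-English explanation."""
    intent = parse_request(text)
    if intent == "regression":
        slope, intercept = fit_line(xs, ys)
        explanation = (
            f"Each one-unit increase in x changes the prediction by {slope:.2f}; "
            f"the baseline at x = 0 is {intercept:.2f}."
        )
        return {
            "intent": intent,
            "result": {"slope": slope, "intercept": intercept},
            "explanation": explanation,
        }
    if intent == "summary":
        avg = mean(ys)
        return {
            "intent": intent,
            "result": {"mean_y": avg},
            "explanation": f"The average of y is {avg:.2f}.",
        }
    return {
        "intent": None,
        "result": None,
        "explanation": "The request was not understood.",
    }
```

Calling `run("Please run a regression on my data", [1, 2, 3, 4], [3, 5, 7, 9])` returns the fitted coefficients together with a sentence describing them, which is the "result plus explanation" pairing the abstract describes, reduced to its simplest form.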

Cited By

Dessureault J, Clément F, Ba S, Meunier F, Massicotte D (2024). Explainable Machine Learning Method for Aesthetic Prediction of Doors and Home Designs. Information 15:4 (203). Online publication date: 5-Apr-2024.
https://doi.org/10.3390/info15040203
Dessureault J, Massicotte D (2023). AI2: the next leap toward native language-based and explainable machine learning framework. Automated Software Engineering 30:2. Online publication date: 24-Sep-2023.
https://dl.acm.org/doi/10.1007/s10515-023-00399-5

Index Terms

AI2: a novel explainable machine learning framework using an NLP interface
1. Computing methodologies
  1. Machine learning

Published In

ICMLT '23: Proceedings of the 2023 8th International Conference on Machine Learning Technologies

March 2023

293 pages

ISBN:9781450398329

DOI:10.1145/3589883

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Canadian Network for Research and Innovation in Machining Technology, Natural Sciences and Engineering Research Council of Canada

Conference

ICMLT 2023

ICMLT 2023: 2023 8th International Conference on Machine Learning Technologies

March 10 - 12, 2023

Stockholm, Sweden
