
Manifold: A Model-Agnostic Framework for Interpretation and Diagnosis of Machine Learning Models

Published: 01 January 2019

Abstract

Interpretation and diagnosis of machine learning models have gained renewed interest in recent years with breakthroughs in new approaches. We present Manifold, a framework that utilizes visual analysis techniques to support interpretation, debugging, and comparison of machine learning models in a more transparent and interactive manner. Conventional techniques usually focus on visualizing the internal logic of a specific model type (e.g., deep neural networks) and lack the ability to extend to more complex scenarios where different model types are integrated. To this end, Manifold is designed as a generic framework that does not rely on or access the internal logic of the model and solely observes the input (i.e., instances or features) and the output (i.e., the predicted result and probability distribution). We describe the workflow of Manifold as an iterative process consisting of three major phases that are commonly involved in model development and diagnosis: inspection (hypothesis), explanation (reasoning), and refinement (verification). The visual components supporting these tasks include a scatterplot-based visual summary that gives an overview of the models' outcomes and a customizable tabular view that reveals feature discrimination. We demonstrate current applications of the framework on classification and regression tasks and discuss other potential machine learning use scenarios where Manifold can be applied.
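To make the model-agnostic interface described above concrete, the following minimal sketch (not the authors' implementation; the dataset, the two model types, and the symmetric KL disagreement measure are illustrative assumptions) shows how a framework could observe only per-instance inputs and predicted probability distributions from two arbitrary classifiers, then rank instances by how strongly the models disagree as candidates for the inspection (hypothesis) phase.

    # Minimal sketch, assuming scikit-learn and two arbitrary classifiers.
    # The framework never inspects model internals: it only consumes inputs
    # and each model's predicted probability distribution.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Two different model types; any model exposing predict_proba would do.
    models = {
        "random_forest": RandomForestClassifier(random_state=0).fit(X_train, y_train),
        "logistic_regression": LogisticRegression(max_iter=1000).fit(X_train, y_train),
    }

    # Collect per-instance class-probability distributions from each model.
    probs = {name: m.predict_proba(X_test) for name, m in models.items()}

    def symmetric_kl(p, q, eps=1e-12):
        """Symmetric KL divergence between two probability vectors."""
        p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
        return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

    # Rank test instances by model disagreement; highly contested instances
    # are natural starting points for hypothesis generation and debugging.
    disagreement = [
        symmetric_kl(probs["random_forest"][i], probs["logistic_regression"][i])
        for i in range(len(X_test))
    ]
    top = np.argsort(disagreement)[::-1][:5]
    print("Most-contested test instances:", top.tolist())

In a visual-analytics setting, these per-instance disagreement scores would feed a scatterplot-style overview, while the underlying feature values would populate a tabular view for closer inspection.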



Published In

IEEE Transactions on Visualization and Computer Graphics, Volume 25, Issue 1
Jan. 2019
1266 pages

Publisher

IEEE Educational Activities Department

United States


Qualifiers

  • Research-article
