research-article

Open access

Neo: Generalizing Confusion Matrix Visualization to Hierarchical and Multi-Output Labels

Authors: Jochen Görtler, Dominik Moritz, Kanit Wongsuphasawat, Kayur Patel

CHI '22: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems

Article No.: 408, Pages 1 - 13

https://doi.org/10.1145/3491102.3501823

Published: 29 April 2022

Abstract

The confusion matrix, a ubiquitous visualization for helping people evaluate machine learning models, is a tabular layout that compares predicted class labels against actual class labels over all data instances. We conduct formative research with machine learning practitioners at Apple and find that conventional confusion matrices do not support more complex data structures found in modern-day applications, such as hierarchical and multi-output labels. To express such variations of confusion matrices, we design an algebra that models confusion matrices as probability distributions. Based on this algebra, we develop Neo, a visual analytics system that enables practitioners to flexibly author and interact with hierarchical and multi-output confusion matrices, visualize derived metrics, renormalize confusions, and share matrix specifications. Finally, we demonstrate Neo’s utility with three model evaluation scenarios that help people better understand model performance and reveal hidden confusions.
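
The idea of treating a confusion matrix as a probability distribution can be illustrated with a minimal Python sketch. This is not Neo's algebra or API, which the abstract does not specify. The label data, the "/"-separated hierarchy encoding, and the function names `joint`, `condition_on_actual`, and `collapse` are all hypothetical, chosen only to show how normalizing, renormalizing by actual class, and collapsing a label hierarchy might look.

```python
from collections import Counter

# Hypothetical (actual, predicted) pairs; "/" encodes an assumed label hierarchy.
pairs = [
    ("fruit/apple", "fruit/apple"),
    ("fruit/apple", "fruit/pear"),
    ("fruit/pear", "fruit/pear"),
    ("fruit/pear", "vegetable/carrot"),
    ("vegetable/carrot", "vegetable/carrot"),
    ("vegetable/carrot", "vegetable/carrot"),
    ("vegetable/carrot", "fruit/apple"),
    ("fruit/apple", "fruit/apple"),
]


def joint(pairs):
    """Estimate the joint distribution P(actual, predicted) from counts."""
    counts = Counter(pairs)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def condition_on_actual(dist):
    """Renormalize each row so that it reads as P(predicted | actual)."""
    row_totals = Counter()
    for (actual, _), p in dist.items():
        row_totals[actual] += p
    return {(a, q): p / row_totals[a] for (a, q), p in dist.items()}


def collapse(dist, depth=1):
    """Sum probabilities of cells that share the same top `depth` hierarchy levels."""
    def parent(label):
        return "/".join(label.split("/")[:depth])

    out = Counter()
    for (a, q), p in dist.items():
        out[(parent(a), parent(q))] += p
    return dict(out)


matrix = joint(pairs)                    # cells sum to 1
by_actual = condition_on_actual(matrix)  # each row sums to 1
coarse = collapse(matrix)                # fruit vs. vegetable only

for cell, p in sorted(coarse.items()):
    print(cell, round(p, 3))
```

In this sketch the diagonal of `by_actual` gives per-class recall, and `coarse` shows which confusions cross the top-level categories and which stay within one.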

Supplementary Material

MP4 File (3491102.3501823-video-preview.mp4): Video Preview, 16.59 MB

MP4 File (3491102.3501823-talk-video.mp4): Talk Video, 258.37 MB

Index Terms

Neo: Generalizing Confusion Matrix Visualization to Hierarchical and Multi-Output Labels
1. Computing methodologies
  1. Artificial intelligence
  2. Machine learning
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interactive systems and tools
  2. Visualization
    1. Visualization application domains
      1. Visual analytics

Information

Published In

CHI '22: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems

April 2022

10459 pages

ISBN:9781450391573

DOI:10.1145/3491102

Editors: Simone Barbosa (PUC-Rio, Brazil), Cliff Lampe (University of Michigan, USA), Caroline Appert (Université Paris-Saclay, France), David A. Shamma (Toyota Research Institute, USA), Steven Drucker (Microsoft Research, USA), Julie Williamson (University of Glasgow, UK), Koji Yatani (University of Tokyo, Japan)

Copyright © 2022 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Best Paper

Qualifiers

Research-article
Research
Refereed limited

Conference

CHI '22

Sponsor:

SIGCHI

CHI '22: CHI Conference on Human Factors in Computing Systems

April 29 - May 5, 2022

New Orleans, LA, USA

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Bibliometrics

Total Citations: 22
Total Downloads: 3,130
Downloads (Last 12 months): 1,375
Downloads (Last 6 weeks): 135

Reflects downloads up to 27 Jul 2024
