DOI: 10.1145/3397481.3450658
short-paper

Model LineUpper: Supporting Interactive Model Comparison at Multiple Levels for AutoML

Published: 14 April 2021

Abstract

Automated Machine Learning (AutoML) is a rapidly growing set of technologies that automate the model development pipeline by searching model space and generating candidate models. A critical, final step of AutoML is human selection of a final model from dozens of candidates. In current AutoML systems, selection is supported only by performance metrics. Prior work has shown that in practice, people evaluate ML models based on additional criteria, such as the way a model makes predictions. Comparison may happen at multiple levels, from types of errors, to feature importance, to how the model makes predictions of specific instances. We developed Model LineUpper to support interactive model comparison for AutoML by integrating multiple Explainable AI (XAI) and visualization techniques. We conducted a user study in which we both evaluated the system and used it as a technology probe to understand how users perform model comparison in an AutoML system. We discuss design implications for utilizing XAI techniques for model comparison and supporting the unique needs of data scientists in comparing AutoML models.

        Published In

        IUI '21: Proceedings of the 26th International Conference on Intelligent User Interfaces
        April 2021
        618 pages
        ISBN:9781450380171
        DOI:10.1145/3397481

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • Short-paper
        • Research
        • Refereed limited

        Conference

        IUI '21
        Acceptance Rates

        Overall Acceptance Rate 746 of 2,811 submissions, 27%
