Investigating Explainability of Generative AI for Code through Scenario-based Design

Authors:

Michael Muller,

Mayank Agarwal,

Stephanie Houde,

Kartik Talamadupula,

Justin D. Weisz

IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces

Pages 212 - 228

https://doi.org/10.1145/3490099.3511119

Published: 22 March 2022

Abstract

What does it mean for a generative AI model to be explainable? The emergent discipline of explainable AI (XAI) has made great strides in helping people understand discriminative models. Less attention has been paid to generative models that produce artifacts, rather than decisions, as output. Meanwhile, generative AI (GenAI) technologies are maturing and being applied to domains such as software engineering. Using scenario-based design and question-driven XAI design approaches, we explore users’ explainability needs for GenAI in three software engineering use cases: natural language to code, code translation, and code auto-completion. We conducted 9 workshops with 43 software engineers in which real examples from state-of-the-art generative AI models were used to elicit users’ explainability needs. Drawing from prior work, we also proposed 4 types of XAI features for GenAI for code and gathered additional design ideas from participants. Our work explores explainability needs for GenAI for code and demonstrates how human-centered approaches can drive the technical development of XAI in novel domains.

Index Terms

1. Human-centered computing
2. Software and its engineering
  1. Software creation and management
    1. Software development process management
  2. Software notations and tools

Index terms have been assigned to the content through auto-classification.

Published In

IUI '22: Proceedings of the 27th International Conference on Intelligent User Interfaces

March 2022

888 pages

ISBN:9781450391443

DOI:10.1145/3490099

Copyright © 2022 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

IUI '22

IUI '22: 27th International Conference on Intelligent User Interfaces

March 22 - 25, 2022

Helsinki, Finland

Acceptance Rates

Overall Acceptance Rate 746 of 2,811 submissions, 27%
