Trustworthiness Perceptions in Code Review: An Eye-tracking Study

Authors:

Westley Weimer,

Zohreh Sharafi

ESEM '20: Proceedings of the 14th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

Article No.: 31, Pages 1 - 6

https://doi.org/10.1145/3382494.3422164

Published: 23 October 2020

Abstract

Background: Automated program repair and other bug-fixing approaches are gaining attention in the software engineering community. Automation shows promise in reducing bug-fixing costs. However, many developers express reluctance about accepting machine-generated patches into their codebases.

Aims: To contribute to the scientific understanding and the empirical investigation of human trust and perception with regard to automation in software maintenance.

Method: We design and conduct an eye-tracking study investigating how developers perceive trust as a function of code provenance (i.e., author or source). We systematically vary provenance while controlling for patch quality.

Results: In our study of ten participants, overall visual code scanning and the distribution of attention differed across identical code patches labeled as human- vs. machine-written. Participants looked more at the source code for human-labeled patches and looked more at tests for machine-labeled patches. Participants judged human-labeled patches to have better readability and coding style. However, participants were more comfortable giving a critical task to an automated program repair tool.

Conclusion: We find that there are significant differences in code review behavior based on trust as a function of patch provenance. Further, we find that important differences can be revealed by eye tracking. Our results may inform the subsequent design and analysis of automated repair techniques to increase developers' trust and, consequently, their deployment.
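
As an illustration of what "distribution of attention" can mean in practice, the following is a minimal sketch of one way to compare the share of fixation time spent on source code versus tests under each provenance label. It is not the authors' analysis pipeline: the Fixation record, its field names, the area-of-interest names ("code", "tests"), and the toy values are all assumptions introduced here for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Fixation(NamedTuple):
    """One fixation record (hypothetical schema, not the study's data format)."""
    provenance: str     # label shown to the participant: "human" or "machine"
    aoi: str            # area of interest the fixation landed in: "code" or "tests"
    duration_ms: float  # fixation duration in milliseconds


def attention_share(fixations: Iterable[Fixation]) -> Dict[str, Dict[str, float]]:
    """Return, per provenance label, the fraction of fixation time spent in each AOI."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for f in fixations:
        totals[f.provenance][f.aoi] += f.duration_ms

    shares: Dict[str, Dict[str, float]] = {}
    for label, per_aoi in totals.items():
        label_total = sum(per_aoi.values())
        shares[label] = {
            aoi: (ms / label_total if label_total else 0.0)
            for aoi, ms in per_aoi.items()
        }
    return shares


# Made-up toy values for illustration only; these are not results from the study.
toy = [
    Fixation("human", "code", 300.0),
    Fixation("human", "tests", 100.0),
    Fixation("machine", "code", 100.0),
    Fixation("machine", "tests", 300.0),
]
shares = attention_share(toy)
print(shares["human"]["code"])     # 0.75
print(shares["machine"]["tests"])  # 0.75
```

Normalizing by total fixation time per label keeps the comparison independent of how long each patch was viewed overall; a real analysis would also need a statistical test suited to the study design.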

Index Terms

Trustworthiness Perceptions in Code Review: An Eye-tracking Study
1. Software and its engineering
  1. Software creation and management

Published In

ESEM '20: Proceedings of the 14th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

October 2020

412 pages

ISBN:9781450375801

DOI:10.1145/3382494

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ESEM '20

Sponsor:

SIGSOFT

ESEM '20: ACM / IEEE International Symposium on Empirical Software Engineering and Measurement

October 5 - 9, 2020

Bari, Italy

Acceptance Rates

ESEM '20 Paper Acceptance Rate 26 of 123 submissions, 21%;

Overall Acceptance Rate 130 of 594 submissions, 22%
