research-article

Nudge: Accelerating Overdue Pull Requests toward Completion

Authors: Chandra Maddila, Sai Surya Upadrasta, Nachiappan Nagappan, Georgios Gousios, Arie van Deursen

ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 2

Article No.: 35, Pages 1 - 30

https://doi.org/10.1145/3544791

Published: 30 March 2023

Abstract

Pull requests are a key part of the collaborative software development and code review process today. However, pull requests can also slow down the software development process when the reviewer(s) or the author do not actively engage with the pull request. In this work, we design an end-to-end service, Nudge, for accelerating overdue pull requests toward completion by reminding the author or the reviewer(s) to engage with their overdue pull requests. First, we use models based on effort estimation and machine learning to predict the completion time for a given pull request. Second, we use activity detection to filter out pull requests that may be overdue but for which sufficient action is taking place nonetheless. Last, we use actor identification to understand who the blocker of the pull request is and nudge the appropriate actor (author or reviewer(s)). The key novelty of Nudge is that it succeeds in reducing pull request resolution time, while ensuring that developers perceive the notifications sent as useful, at the scale of thousands of repositories. In a randomized trial on 147 repositories in use at Microsoft, Nudge was able to reduce pull request resolution time by 60% for 8,500 pull requests, when compared to overdue pull requests for which Nudge did not send a notification. Furthermore, developers receiving Nudge notifications resolved 73% of these notifications as positive. We observed similar results when scaling up the deployment of Nudge to 8,000 repositories at Microsoft, for which Nudge sent 210,000 notifications during a full year. This demonstrates Nudge’s ability to scale to thousands of repositories. Last, our qualitative analysis of a selection of Nudge notifications indicates areas for future research, such as taking dependencies among pull requests and developer availability into account.
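The three stages named in the abstract (completion-time prediction, activity detection, actor identification) can be illustrated with a minimal sketch. This is not the authors' implementation: the `PullRequest` fields, the 24-hour quiet window, and the "whoever did not act last is the blocker" rule are all hypothetical simplifications chosen for illustration, and the predicted completion time is assumed to come from some external model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a pull request."""
    created_at: datetime
    predicted_hours: float      # assumed output of a completion-time model
    last_activity_at: datetime  # most recent comment, push, or vote
    last_actor: str             # "author" or "reviewer"


def is_overdue(pr: PullRequest, now: datetime) -> bool:
    """Stage 1: the PR has been open longer than its predicted completion time."""
    return now - pr.created_at > timedelta(hours=pr.predicted_hours)


def has_recent_activity(pr: PullRequest, now: datetime, quiet_hours: float) -> bool:
    """Stage 2: skip PRs that are overdue but still being worked on."""
    return now - pr.last_activity_at < timedelta(hours=quiet_hours)


def blocking_actor(pr: PullRequest) -> str:
    """Stage 3 (assumed heuristic): the party that did not act last is the blocker."""
    return "reviewer" if pr.last_actor == "author" else "author"


def decide_nudge(pr: PullRequest, now: datetime, quiet_hours: float = 24.0) -> Optional[str]:
    """Return who to notify ("author" or "reviewer"), or None if no nudge is due."""
    if not is_overdue(pr, now):
        return None
    if has_recent_activity(pr, now, quiet_hours):
        return None
    return blocking_actor(pr)
```

Under these assumptions, a pull request that was predicted to take 48 hours, has been open for 100 hours, and was last touched by its author 30 hours ago would yield a nudge to the reviewer, while the same pull request with activity two hours ago would yield no nudge.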

Index Terms

Nudge: Accelerating Overdue Pull Requests toward Completion
1. Software and its engineering
  1. Software notations and tools

Published In

ACM Transactions on Software Engineering and Methodology, Volume 32, Issue 2

March 2023

946 pages

ISSN: 1049-331X

EISSN: 1557-7392

DOI: 10.1145/3586025

Editor: Mauro Pezzè, USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology, Switzerland

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 March 2023

Online AM: 25 June 2022

Accepted: 20 May 2022

Revised: 20 February 2022

Received: 16 March 2021

Published in TOSEM Volume 32, Issue 2

Bibliometrics

Total Citations: 8

Total Downloads: 719

Downloads (Last 12 months): 246

Downloads (Last 6 weeks): 25

Reflects downloads up to 01 Nov 2024

Citations

Cited By

Yang L, Xu J, Zhang H, Wu F, Lyu J, Li Y, Bacchelli A, Filkov V, Ray B, Zhou M. (2024). GPP: A Graph-Powered Prioritizer for Code Review Requests. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 104-116. Online publication date: 27-Oct-2024.
https://dl.acm.org/doi/10.1145/3691620.3694990

Khatoonabadi S, Abdellatif A, Costa D, Shihab E. (2024). Predicting the First Response Latency of Maintainers and Contributors in Pull Requests. IEEE Transactions on Software Engineering 50:10, 2529-2543. Online publication date: 1-Oct-2024.
https://dl.acm.org/doi/10.1109/TSE.2024.3443741

Joshi R, Kahani N. (2024). Comparative Study of Reinforcement Learning in GitHub Pull Request Outcome Predictions. 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 489-500. Online publication date: 12-Mar-2024.
https://doi.org/10.1109/SANER60148.2024.00057

Yang L, Zhang H, Xu J, Lyu J, Zhou X, Shao D, Gao S, Bacchelli A. (2024). A preliminary investigation on using multi-task learning to predict change performance in code reviews. Empirical Software Engineering 29:6. Online publication date: 28-Sep-2024.
https://doi.org/10.1007/s10664-024-10526-9

Berntzen M, Engdal S, Gellein M, Moe N. (2024). Coordination in Agile Product Areas: A Case Study from a Large FinTech Organization. Agile Processes in Software Engineering and Extreme Programming, 36-52. Online publication date: 31-May-2024.
https://doi.org/10.1007/978-3-031-61154-4_3

Arakawa R, Yakura H, Goto M. (2023). CatAlyst: Domain-Extensible Intervention for Preventing Task Procrastination Using Large Generative Models. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-19. Online publication date: 19-Apr-2023.
https://dl.acm.org/doi/10.1145/3544548.3581133

Kudrjavets G, Rastogi A. (2023). Does code review speed matter for practitioners? Empirical Software Engineering 29:1. Online publication date: 22-Nov-2023.
https://dl.acm.org/doi/10.1007/s10664-023-10401-z

Shan Q, Sukhdeo D, Huang Q, Rogers S, Chen L, Paradis E, Rigby P, Nagappan N, Roychoudhury A, Cadar C, Kim M. (2022). Using nudges to accelerate code reviews at scale. Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 472-482. Online publication date: 7-Nov-2022.
https://dl.acm.org/doi/10.1145/3540250.3549104
