research-article, ICSE Conference Proceedings
DOI: 10.1145/2568225.2568303

Data-guided repair of selection statements

Published: 31 May 2014

Abstract

Database-centric programs form the backbone of many enterprise systems. Fixing defects in such programs takes much human effort due to the interplay between imperative code and database-centric logic. This paper presents a novel data-driven approach for automated fixing of bugs in the selection condition of database statements (e.g., the WHERE clause of SELECT statements), a common form of bug in such programs. Our key observation is that in real-world data, there is information latent in the distribution of data that can be useful to repair selection conditions efficiently. Given a faulty database program and input data, only a part of which induces the defect, our novelty is in determining the correct behavior for the defect-inducing data by taking advantage of the information revealed by the rest of the data. We accomplish this by employing semi-supervised learning to predict the correct behavior for defect-inducing data and by patching up any inaccuracies in the prediction by a SAT-based combinatorial search. Next, we learn a compact decision tree for the correct behavior, including the correct behavior on the defect-inducing data. This tree suggests a plausible fix to the selection condition. We demonstrate the feasibility of our approach on seven real-world examples.
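
To make the last step of the abstract concrete, the following is a minimal sketch of how a small decision tree learned over labeled rows can be read back as a candidate selection condition. It is not the authors' implementation: the greedy Gini-based learner, the function names (`gini`, `best_split`, `learn_tree`, `to_condition`), the columns `amount` and `year`, and the sample rows are all assumptions made for illustration. The labels are assumed to already encode the correct behavior for every row; the semi-supervised prediction and SAT-based search that the paper uses to obtain them are not shown.

```python
def gini(labels):
    """Gini impurity of a list of boolean labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels, columns):
    """Return the (column, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for col in columns:
        values = sorted({r[col] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[col] <= thr]
            right = [y for r, y in zip(rows, labels) if r[col] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (col, thr), score
    return best

def learn_tree(rows, labels, columns, depth=3):
    """Greedy tree: a leaf is a bool, an inner node is (col, thr, left, right)."""
    split = best_split(rows, labels, columns) if depth > 0 else None
    if split is None:
        return sum(labels) * 2 >= len(labels)  # leaf: majority label
    col, thr = split
    left = [(r, y) for r, y in zip(rows, labels) if r[col] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[col] > thr]
    return (col, thr,
            learn_tree([r for r, _ in left], [y for _, y in left], columns, depth - 1),
            learn_tree([r for r, _ in right], [y for _, y in right], columns, depth - 1))

def to_condition(tree, path=()):
    """Each root-to-True-leaf path becomes one conjunction of comparisons."""
    if isinstance(tree, bool):
        return [" AND ".join(path) or "TRUE"] if tree else []
    col, thr, left, right = tree
    return (to_condition(left, path + (f"{col} <= {thr}",)) +
            to_condition(right, path + (f"{col} > {thr}",)))

# Hypothetical table rows; labels say whether each row SHOULD be selected.
rows = [
    {"amount": 50, "year": 2010}, {"amount": 90, "year": 2012},
    {"amount": 120, "year": 2011}, {"amount": 150, "year": 2010},
    {"amount": 250, "year": 2012}, {"amount": 300, "year": 2011},
]
labels = [False, False, True, True, True, True]

tree = learn_tree(rows, labels, ["amount", "year"])
print(" OR ".join(to_condition(tree)))  # prints: amount > 105.0
```

The disjunction of the printed conjunctions plays the role of the suggested WHERE condition. Limiting the tree depth keeps the suggestion compact, in the spirit of the "compact decision tree" the abstract mentions; the depth limit of 3 is an arbitrary choice for this sketch.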

Published In

ICSE 2014: Proceedings of the 36th International Conference on Software Engineering
May 2014
1139 pages
ISBN: 9781450327565
DOI: 10.1145/2568225
Sponsors

In-Cooperation

  • TCSE: IEEE Computer Society's Tech. Council on Software Engin.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ABAP
  2. Databases
  3. Machine Learning
  4. Program Repair
  5. SAT
  6. Support Vector Machines
  7. data-centric programs

Qualifiers

  • Research-article

Conference

ICSE '14
Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Cited By

  • (2024) A survey on machine learning techniques applied to source code. Journal of Systems and Software 209:C. DOI: 10.1016/j.jss.2023.111934. Online publication date: 14-Mar-2024
  • (2023) Automatic SQL Error Mitigation in Oracle. Proceedings of the VLDB Endowment 16:12 (3835-3847). DOI: 10.14778/3611540.3611568. Online publication date: 1-Aug-2023
  • (2022) Predictive Models in Software Engineering: Challenges and Opportunities. ACM Transactions on Software Engineering and Methodology 31:3 (1-72). DOI: 10.1145/3503509. Online publication date: 9-Apr-2022
  • (2021) An Integrated Approach for Column-Oriented Database Application Evolution Using Conceptual Models. Advances in Conceptual Modeling (26-32). DOI: 10.1007/978-3-030-88358-4_3. Online publication date: 18-Oct-2021
  • (2020) A study of the learnability of relational properties: model counting meets machine learning (MCML). Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (1098-1111). DOI: 10.1145/3385412.3386015. Online publication date: 11-Jun-2020
  • (2019) History-driven build failure fixing: how far are we? Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (43-54). DOI: 10.1145/3293882.3330578. Online publication date: 10-Jul-2019
  • (2019) DeepFL: integrating multiple fault diagnosis dimensions for deep fault localization. Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (169-180). DOI: 10.1145/3293882.3330574. Online publication date: 10-Jul-2019
  • (2018) Automated model repair for Alloy. Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (577-588). DOI: 10.1145/3238147.3238162. Online publication date: 3-Sep-2018
  • (2018) Automatic Software Repair. ACM Computing Surveys 51:1 (1-24). DOI: 10.1145/3105906. Online publication date: 23-Jan-2018
  • (2018) Fixing Defects in Integrity Constraints via Constraint Mutation. 2018 11th International Conference on the Quality of Information and Communications Technology (QUATIC) (74-82). DOI: 10.1109/QUATIC.2018.00020. Online publication date: Sep-2018
