
An Empirical Study to Determine if Mutants Can Effectively Simulate Students' Programming Mistakes to Increase Tutors' Confidence in Autograding

Published: 05 March 2021

Abstract

Automated grading often relies on automated test suites to identify students' faults. However, tests may not detect some faults, limiting feedback and producing inaccurate grades. This issue can be mitigated by first ensuring that the tests can detect faults. Mutation analysis is a technique that generates artificial faulty variants of a program, called mutants, for this purpose. Mutants that are not detected by the tests reveal their inadequacies, providing knowledge of how the tests can be improved. By using mutants to improve test suites, tutors can gain confidence that: a) generated grades will not be biased by unidentified faults, and b) students will receive appropriate feedback on their mistakes. Existing work has shown that mutants are suitable substitutes for faults in real-world software, but no work has shown that this holds for students' faults. In this paper, we investigate whether mutants are capable of replicating mistakes made by students. We conducted a quantitative study of 197 Java classes written by students across three introductory programming assignments, and of mutants generated from the assignments' model solutions. We found that the generated mutants capture the observed faulty behaviour of students' solutions. We also found that mutants assess test adequacy better than code coverage in some cases. Our results indicate that tutors can use mutants to identify and remedy deficiencies in grading test suites.
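The core idea of the abstract, seeding an artificial fault into the model solution and checking whether the grading tests notice, can be sketched in a few lines of Python. This is a toy illustration only: the study itself concerns Java assignments and dedicated mutation tools, and the assignment, mutation operator, and test suites below are invented for this sketch. It also illustrates the coverage claim: a suite can execute every line yet still let a mutant survive.

```python
import ast

# Toy "model solution" for an introductory assignment: sum a list.
# (Hypothetical example; not taken from the paper's assignments.)
SOLUTION = """
def total(xs):
    t = 0
    for x in xs:
        t = t + x
    return t
"""

class SwapAddSub(ast.NodeTransformer):
    """One mutation operator: replace the first '+' with '-'."""
    def __init__(self):
        self.mutated = False

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if not self.mutated and isinstance(node.op, ast.Add):
            node.op = ast.Sub()   # t = t + x  ->  t = t - x
            self.mutated = True
        return node

def load_total(tree):
    """Compile a (possibly mutated) AST and return its `total` function."""
    ns = {}
    exec(compile(ast.fix_missing_locations(tree), "<mutant>", "exec"), ns)
    return ns["total"]

# A weak grading suite: it executes every line of `total`
# (full statement coverage) but only ever sums zeros.
weak_suite = [lambda f: f([0]) == 0]

mutant = load_total(SwapAddSub().visit(ast.parse(SOLUTION)))
survived = all(t(mutant) for t in weak_suite)
print("mutant survived weak suite:", survived)   # True: coverage missed the fault

# One extra assertion kills the mutant, exposing the suite's inadequacy
# and showing the tutor exactly which behaviour was untested.
strong_suite = weak_suite + [lambda f: f([1, 2]) == 3]
print("mutant survived strong suite:",
      all(t(mutant) for t in strong_suite))       # False: mutant killed
```

A surviving mutant tells the tutor which faulty behaviour the suite cannot distinguish from a correct solution; adding a test that kills it improves both grading accuracy and the feedback students receive.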

Cited By

View all
  • (2023) A Tool for Mutation Analysis in Racket. 2023 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 308--313. DOI: 10.1109/ICSTW58534.2023.00061. Online publication date: April 2023.
  • (2022) On the use of mutation analysis for evaluating student test suite quality. Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 263--275. DOI: 10.1145/3533767.3534217. Online publication date: 18 July 2022.

Published In

SIGCSE '21: Proceedings of the 52nd ACM Technical Symposium on Computer Science Education
March 2021
1454 pages
ISBN:9781450380621
DOI:10.1145/3408877
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. autograding
  2. introductory programming
  3. mutation analysis

Qualifiers

  • Research-article

Funding Sources

  • Institute of Coding (Office for Students)

Conference

SIGCSE '21
Acceptance Rates

Overall Acceptance Rate 1,787 of 5,146 submissions, 35%
