Edit Based Grading of SQL Queries

Authors:

Bikash Chandra,

Ananyo Banerjee,

S. Sudarshan

CODS-COMAD '21: Proceedings of the 3rd ACM India Joint International Conference on Data Science & Management of Data (8th ACM IKDD CODS & 26th COMAD)

Pages 56 - 64

https://doi.org/10.1145/3430984.3431012

Published: 02 January 2021

Abstract

Grading student SQL queries manually is a tedious and error-prone process. Earlier work, such as the XData system, can be used to test the correctness of a student SQL query. However, when a student query is found to be incorrect, there is currently no way to automatically assign partial marks. Partial marking is important so that small errors are penalized less than large errors. Manually awarding partial marks does not scale to classes with a large number of students, especially MOOCs, and is also prone to human error.

In this paper, we discuss techniques to find a minimum-cost set of edits to a student query that would make it correct. Such edits can help assign partial marks and help students understand exactly where they went wrong. Given the limitations of current formal methods for checking equivalence, our approach is based on finding the nearest query, from a set of instructor-provided correct queries, that is found to be equivalent based on query canonicalization. We show that exhaustive techniques are expensive, and propose a greedy heuristic approach that works well in terms of both runtime and accuracy on queries in real-world datasets. Our system can also be used in a learning mode, where query edits are suggested as feedback to guide students towards a correct query. Our partial marking system has been successfully used in courses at IIT Bombay and IIT Dharwad.
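To make the idea of edit-based partial marking concrete, the following is a minimal Python sketch. It is not the algorithm from the paper: it assumes a toy canonical form in which each clause of a query is a set of normalized strings, it counts only insertions and removals of whole components, and it simply compares the student query against every instructor query. The names Query, EDIT_COST, edits, nearest, and partial_marks, as well as the cost and mark values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Toy canonical form (an assumption): each clause is a set of strings."""
    select: frozenset
    tables: frozenset
    where: frozenset


# Hypothetical cost of inserting or removing one component of each clause.
EDIT_COST = {"select": 1.0, "tables": 2.0, "where": 1.5}


def edits(student, correct):
    """List the (action, clause, item) edits that turn student into correct."""
    out = []
    for clause in EDIT_COST:
        have, want = getattr(student, clause), getattr(correct, clause)
        out += [("remove", clause, item) for item in sorted(have - want)]
        out += [("insert", clause, item) for item in sorted(want - have)]
    return out


def cost(edit_list):
    """Total cost of a list of edits under EDIT_COST."""
    return sum(EDIT_COST[clause] for _, clause, _ in edit_list)


def nearest(student, correct_queries):
    """Return the cheapest edit list over all instructor-provided queries."""
    return min((edits(student, c) for c in correct_queries), key=cost)


def partial_marks(student, correct_queries, max_marks=10.0):
    """Deduct the edit cost from max_marks, never going below zero."""
    edit_list = nearest(student, correct_queries)
    return max(0.0, max_marks - cost(edit_list)), edit_list


# Two instructor-provided formulations of the same question.
correct = [
    Query(frozenset({"name"}),
          frozenset({"student", "takes"}),
          frozenset({"student.id = takes.id", "takes.year = 2020"})),
    Query(frozenset({"name"}),
          frozenset({"student", "takes", "course"}),
          frozenset({"student.id = takes.id",
                     "takes.course_id = course.id",
                     "course.year = 2020"})),
]

# A student query that differs only in the year constant.
student = Query(frozenset({"name"}),
                frozenset({"student", "takes"}),
                frozenset({"student.id = takes.id", "takes.year = 2019"}))

marks, edit_list = partial_marks(student, correct)
for action, clause, item in edit_list:
    print(f"{action} {item!r} in {clause} clause")
print("marks:", marks)
```

In this sketch the returned edit list doubles as feedback for the student, in the spirit of the learning mode mentioned above. A realistic system would need a richer edit vocabulary and a proper canonicalization step, both of which are outside the scope of this illustration.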

Published In

CODS-COMAD '21: Proceedings of the 3rd ACM India Joint International Conference on Data Science & Management of Data (8th ACM IKDD CODS & 26th COMAD)

January 2021

453 pages

ISBN:9781450388177

DOI:10.1145/3430984

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Tata Consultancy Services

Conference

CODS COMAD 2021

CODS COMAD 2021: 8th ACM IKDD CODS and 26th COMAD

January 2 - 4, 2021

Bangalore, India

Acceptance Rates

Overall Acceptance Rate 197 of 680 submissions, 29%
