research-article

Open access

Framework for SQL Error Message Design: A Data-Driven Approach

Authors:

Toni Taipalus,

Hilkka GrahnAuthors Info & Claims

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 1

Article No.: 9, Pages 1 - 50

https://doi.org/10.1145/3607180

Published: 23 November 2023 Publication History

PDF eReader

Abstract

Software developers use a significant amount of time reading and interpreting error messages. However, error messages have often been based on either anecdotal evidence or expert opinion, disregarding novices, who arguably are the ones who benefit the most from effective error messages. Furthermore, the usability aspects of Structured Query Language (SQL) error messages have not received much scientific attention. In this mixed-methods study, we coded a total of 128 error messages from eight database management systems (DBMS), and using data from 311 participants, analysed 4,796 queries using regression analysis to find out if and how acknowledged error message qualities explain SQL syntax error fixing success rates. Additionally, we performed a conventional content analysis on 1,505 suggestions on how to improve SQL error messages, and based on the analysis, formulated a framework consisting of nine guidelines for SQL error message design. The results indicate that general error message qualities do not necessarily explain query fixing success in the context of SQL syntax errors and that even some novel NewSQL systems fail to account for basic error message design guidelines. The error message design framework and examples of its practical applications shown in this study are applicable in educational contexts as well as by DBMS vendors in understanding novice perspectives in error message design.

1 Introduction

A common view is that computer error messages are confusing and unhelpful, even for professionals [9, 27], and that the difficulty of reading error messages is similar to the difficulty of reading source code [5]. As developers use a significant amount of time reading and interpreting error messages [5], it seems natural that both the software industry and academia are interested in the qualities of error messages. Consequently, several studies have shown that a more usable software development environment results in increased productivity [36, 38], and that end-users benefit from increased usability in general [17]. From a pedagogical point of view, feedback—or formative assessment—during the learning process is a powerful way to improve novices’ sense of achievement and motivation [e.g., 13]. Furthermore, feedback is effective especially in focusing on treating mistakes or errors as learning opportunities. This view of seeing mistakes as opportunities means that novices are shown regularly occurring mistakes and errors they make and then, utilizing feedback, are advised how to fix these mistakes [13]. Explaining mistakes and errors encourages learning [23].

Several studies agree that learning Structured Query Language (SQL) can be challenging, possibly due to the declarative nature [44, 63]. Since feedback is seen vital in learning, the database management systems (DBMSs) that novices use to learn SQL should provide constructive and useful feedback in their error messages. Unfortunately, SQL compiler error messages are rarely clear or helpful from a novice viewpoint, although there are differences between DBMSs [53, 54]. In contrast to a professional, a novice often sees a particular error message for the first time, which makes the quality of the error message even more important [60]. Hence, previous literature has made attempts to examine [7] and enhance [40] error messages of programming language compilers, or system error messages in general [47]. However, literature regarding SQL error messages is still almost entirely lacking. Additionally, research dealing with programming language compiler error messages is often based on either anecdotal evidence, or expert opinions [7], yet it has been questioned whether experts can reliably understand novice viewpoints [44]. This study attempts to fulfill the needs for both the data-driven approach and the novice perspective for SQL error message design.

In this study, we explore if and how general error message guidelines apply in the context of SQL error messages and query fixing success rates. This study also presents a framework specifically for SQL error message design. The guidelines in the framework are based on an analysis of novice feedback on the 16 most common SQL syntax errors and corresponding syntax error messages from 8 DBMSs. Additionally, we apply our framework to present examples of modified SQL error messages.

The structure of this study is as follows. In the next section, we discuss the theoretical background behind SQL query formulation, error messages in the context of general system messages as well as programming language compilers, and error message qualities. In Section 3, we describe our research goals (RG), methods, and data collection. Sections 4 and 5 as well as Appendix A present the results of the study, i.e., results from the statistical analyses, the error message design framework, and some applications of said framework. In Section 6, we discuss the implications of our results in light of previous studies, as well as the implications for industry and education. Section 7 concludes the study.

2 Theoretical Background

2.1 Query Execution

A query writer usually communicates with a relational database via a DBMS. Depending on the scenario, a query may be written, e.g., by using a DBMS interface, or by embedding SQL into a host language such as Java. When the DBMS receives the query, the query is processed and executed, and then a set of data or some form of feedback is returned. How the query is processed and executed is dependent on a particular DBMS internals [21, 22]. Because of these different internals, a query may be deemed erroneous by one DBMS, while executed by another [42]. It follows that the feedback the DBMS provides to the query writer is dependent on the DBMS.

After the query is sent, the query processing done by the DBMS is largely a black box to a novice query writer [61]. After the query writer receives an error message instead of a result table or a result table that does not meet the query writer’s expectations, the process typically repeats. This is called a feedback loop between the query writer and the DBMS. Apart from the query written, the DBMS has limited means to help the query writer to formulate the query they want to formulate. Although the query writer usually cannot communicate with the DBMS in natural language, the DBMS can relate natural language messages to the query writer, bridging the gap between the two. This is typically done in one of two ways. First, if the DBMS deems the query syntactically incorrect, the DBMS outputs a natural language syntax error message [1, 11], which usually helps the query writer to pinpoint the error and even fix it. Second, if the query is deemed syntactically correct, the query writer may obtain more information on how the query was executed through a query execution plan [58]. As reading query execution plans require both that the query is syntactically correct as well as considerable knowledge on DBMS internals and physical database design [20, 61], they are typically used by developers for query optimization and not by novices in undergraduate database courses [58]. For these reasons, this study focuses on syntax error messages as the means of communication between the user and the DBMS.

2.2 SQL Syntax Errors

Different SQL errors have received growing scientific attention, especially in computing education research [56]. Current research [11, 57] divides SQL errors regarding data retrieval into (i) syntax errors, which are identified by the DBMS, and which result in a syntax error message, (ii) semantic errors, which are typically not identified by the DBMS, and which result in incorrect data in the result table regardless of what the query is supposed to retrieve, (iii) logical errors, which are not identified by the DBMS, and which result in incorrect data in the result table when the intent of the query is considered, and (iv) complications, which may be identified by the DBMS, and which do not result in incorrect data in the result table, but unnecessarily complicate the query.

Syntax errors have been shown as the most common errors, especially in novice query formulation [1, 55]. Depending on the study, common syntax errors are caused by e.g., references to undefined tables and columns [49], problems with grouping [43, 45], data type mismatches [1], misspellings [62], omitting mandatory clauses [49], as well as illegal aggregate function placement and duplicate clauses [57]. The identification of the most common SQL syntax errors is challenging due to the fact that, with a few exceptions, studies categorize SQL syntax errors depending on the DBMS used, making results incomparable to studies with a different DBMS. Furthermore, it has been shown that, rather intuitively, different SQL concepts such as grouping, joins or ordering invite different types of syntax errors [55], making, e.g., syntax errors common for queries involving a table join uncommon for queries involving grouping and a single table.

Despite their commonness over other types of errors, and probably because syntax errors are caught by the DBMS, syntax errors have been shown to be easier to fix than other types of errors [2, 55]. This is arguably intuitive, as a syntax error halts the execution of a query and prevents the query writer from receiving a result table, while e.g., a logical error does not. A query with a logical error may return a result table with even seemingly correct data, yet this data do not adhere to the query writer’s intent. Causes behind query formulation errors in general have been explained by both human factors such as cognitive load theory [49], different misconceptions regarding the language or generalizations [28], self-induced complexity [30], procedural fixedness [51], and simple sloppiness [29, 49], as well as environmental considerations such as database structure complexity [50] and database normal form [10].

2.3 Error Message Qualities

As syntax errors are typically the only type of errors identified by a DBMS, it follows that a natural way to enhance the communication between the query writer and the DBMS is with more effective syntax error messages. Despite the scientific attention given to SQL errors and programming language error messages, SQL error messages have not received much scientific attention. For this reason, we discuss programming language error message research here. It is worth noting that the declarative nature, as well as the purpose of SQL, is different from programming languages such as C# or Python, which poses challenges to the comparison of the results of programming language compiler error messages and SQL error messages.

Previous studies have shown that programming language compiler error messages are often considered confusing and unhelpful [7, 8], and that users are likely to feel inadequate and anxious when encountering error messages [48]. The role of error messages in the feedback loop is even more crucial when the user is a novice, yet as the quality of error messages affects the overall user experience, better error messages benefit professionals as well [25].

For the reasons above, previous literature has formed guidelines for designing error messages [7, 47, 60]. These guidelines have focused on either general system error messages or programming language compiler error messages, but not on SQL or query language error messages. However, since the domains are related, and previous SQL error message guidelines are unavailable in scientific literature, it is reasonable to inspect those related guidelines here.

The renowned guidelines for designing computer system messages by Shneiderman in 1982 [47] consists of five suggestions that system messages should meet: (i) be brief, (ii) be positive, (iii) be constructive, (iv) be specific, and (v) be comprehensible. Later, other authors have formed guidelines for programming language compiler error messages as well. For instance, Traver [60] suggested eight programming language error message qualities, reflected against Nielsen’s heuristics [34, 35] and his own experience as a programmer and an educator, for error message design. In addition, Becker et al. [7] published a comprehensive review of papers on programming language error messages, categorizing the studies to historical, anecdotal, or empirical research, and presenting a compilation of 10 guidelines for programming language error message design.

These three guidelines, understandably, share similarities. Shneiderman’s first suggestion, be brief, is discussed by both Traver [60] as clarity and brevity as well as Becker et al. [7] as reduce cognitive load. This guideline is, depending on the source, effectively realised by aesthetic and minimalist design [34, 35], meaning that error messages should not be cryptic, long, or hard to interpret [60], and also by enhancing simplicity in the error messages [7]. It also has many other meanings in the literature, such as removing jargon, using complete sentences, and using simple vocabulary [16].

Second, be positive, more broadly expressed as proper phrasing [60] is related to Nielsen’s heuristic of match between system and the real world, which refers to the positive tone of the error message, guidance to help fix the error, use of simple language, as well as using similar words in the system when referring to similar concepts in the real world. In the compiled guidelines [7], the counterpart is phrased as use a positive tone, referring to avoiding negative words, such as illegal, incorrect, or invalid.

Third, Shneiderman’s [47] be constructive is included under proper phrasing as constructive guidance in Traver’s guidelines [60], and broadly interpreted counterparts in Becker et al. [7] are called provide scaffolding and show solutions or hints. Effectively, these guidelines suggest that the error message should provide explanations to the user on why they received the error message and giving support on how to proceed, rather than simply stating that there is an error. The error message should also suggest solutions for how the error can be fixed.

Fourth, be specific is formulated as specificity by Traver [60], and is related to two of Nielsen’s [34, 35] heuristics, recognition rather than recall, and help user to recognize diagnose, and recover from errors. Specifically, the guideline means that error messages should not be too general, since a general error message makes locating the erroneous position difficult. Traver’s [60] locality is related to this guideline, suggesting that the error message should indicate the true origin of the error. Related to this guideline, Becker et al. [7] suggest that the error message should provide context, meaning that there should be information about the programming code relevant to the error which helps understand and address the error.

Fifth, the guideline be comprehensible can be mapped to clarity and brevity [60], and increase readability [7]. This is perhaps one of the more subjective guidelines, as comprehensibility is closely dependent on the user reading the message. In addition to Shneiderman’s [47] five guidelines and their broadly interpreted counterparts, subsequent studies have suggested additional or more precise guidelines for programming language error messages. For example, it has been suggested that error messages should be context-insensitive, meaning that the same error results in the same error message [60], and that the error message conveys a logical train of thought to the user of why the error occurred by using logical argumentation [7] and nonanthropomorphic messages [60]. That is, the message should not use language which implies that the syntax was checked by a human-like actor.

Furthermore, Traver [60] suggests that the error message is divided into three levels: a short message first, then—if the user needs—a brief explanation or examples regarding the error message, and finally, possible corrective actions. Additionally, the environment in general should use colors and fonts to notify the user of errors as early in the writing process as possible [7, 60]. Finally, the message should show the user examples of similar errors to improve the understanding of why the error occurred [7]. As can be seen, all these above guidelines share similarities with Shneiderman’s [47] renowned guidelines regarding computer system messages, as well as with each other.

3 Research Setting

3.1 Research Scope and Goals

Based on the research gaps identified in the previous sections, we formulated four RG and collected both quantitative and qualitative data to reach these goals. We chose to limit the scope of our study to the 16 most common syntax errors reported in a previous study [57] (cf. Table 1).

Table 1.

Test	Syntax error name	Test	Syntax error name
T01	ambiguous column	T09	failure to specify column name twice
T02	omitting quotes around character data	T10	using an aggregate function outside SELECT or HAVING
T03	IS where not applicable	T11	grouping error: extraneous grouping column
T04	confusing the syntax of keywords	T12	nonstandard operators
T05	confusing the logic of keywords	T13	using WHERE twice
T06	too many columns in subquery	T14	nonstandard keywords or standard keywords in wrong context
T07	undefined column	T15	synonyms
T08	misspellings	T16	curly, square or unmatched brackets

Table 1. Sixteen Most Common Syntax Errors [57] and Corresponding Tests [54]

We chose four “traditional” RDBMS as well as four NewSQL [39] RDBMS syntax error messages. NewSQL systems are RDBMSs built from the ground up in the 2010s to account for the innovations and technical development introduced by NoSQL systems, while also catering to the needs of RDBMS users by, e.g., using SQL and a strong consistency model [39]. The reason for including these eight DBMSs was that we wanted to inspect the state of error messages in long-running DBMSs as well as DBMSs developed in the 2010s, while limiting the study to DBMSs that can execute SQL to the degree the test suite requires. The test suite and the selected DBMSs are detailed in the next section.

As discussed in Section 2.3, scientific literature has made many recommendations and guidelines on how user interfaces and system messages in general, or programming language error messages in particular should be formulated. We chose to use the guidelines for system messages proposed by Shneiderman [47], because they are general, as opposed to guidelines proposed for programming language error messages [e.g., 7, 60], which are particular for a different purpose. Furthermore, with our choice of older guidelines, we wished to highlight how even some modern DBMSs disregard guidelines proposed in the early 1980s.

RG1: Find out if and how previously identified system message qualities (i.e., whether the message is brief, positive, constructive, specific, and comprehensible) affect SQL query fixing success in the error messages of the eight selected DBMSs. The results are presented in Section 4.

RG2: Formulate an SQL error message design framework consisting of guidelines derived from the collected data. The framework is presented in Section 5.1.

RG3: Investigate how query writers would improve syntax error messages of eight relational DBMSs both in general and specific to each of the most common syntax errors. The results are introduced in Section 5.2 and presented in detail in Appendix A.

RG4: Based on the formulated framework, propose examples of error messages pertaining to the 16 most common syntax errors. The examples are presented in Appendix A.

3.2 Data Collection

We created our data collection form around a previously reported syntax error test suite [54], which was, in turn, based on the 16 most common SQL syntax errors detailed in Table 1. This test suite provided us with a concrete database structure, 16 SQL data demands, and the corresponding erroneous queries. We then ran the erroneous queries on eight DBMSs (MySQL 8.0.12 with InnoDB storage engine, Oracle Database 19c Enterprise Edition 19.5.0.0, PostgreSQL 12.1, SQL Server 2019 Developer, CockroachDB 19.2.2, SingleStore 7.0.10 with InnoDB storage engine, NuoDB 4.0.4-2, and VoltDB Community 9.2.2) to capture corresponding syntax error messages. These versions were the most recent, stable available versions at the time of data collection. Based on the error messages, we created eight data collection forms—one for each DBMS—consisting of the database schema, a data demand, a corresponding erroneous SQL query, and the corresponding syntax error message, and two free text input fields in which the participant was asked to write a fixed SQL query, and with their own words describe how to improve the error message. All eight data collection forms were the same with the exception of the syntax error messages, and each data collection form consisted of 16 pages, one for each syntax error detailed in Table 1.

We recruited study participants from a database course given in the authors’ university. Prior to participation, the participants were given lectures and mandatory exercises on topics recommended in AIS/ACM curriculum guidelines [59] for an undergraduate database course. Each topic (separated by semicolons) included approximately 4 hours of lectures and 7 hours of exercises: conceptual modeling; the relational model; relational calculus; data manipulation language using SQLite with simple operators, inner and outer joins, and ordering; data manipulation language using SQLite with aggregate functions, grouping, and correlated and uncorrelated subqueries; and data definition language using SQLite. After these topics, this study was introduced. After this study, the course continued with topics such as transaction management and normalization theory. When a participant decided to participate in the study, they were randomly presented with one of the eight data collection forms, e.g., a participant assigned to a data collection form with syntax errors from SQL Server filled out the input fields based on their perceptions of SQL Server error messages. For each participant, the 16 pages were shown in random order. Participation was voluntary, and the participants were shown a data privacy statement prior to their decision on whether to participate. Out of the 363 respondents, 311 (86%) chose to participate.

3.3 Data Preparation

We coded the 128 syntax error messages (16 tests \(\times\) 8 DBMSs) according to the five system message guidelines described by Shneiderman [47], according to which a system message should be brief, positive, constructive, specific, and comprehensible. Because the message qualities given by Shneiderman are general regarding the scope of our study (system messages, as opposed to SQL error messages), and because Shneiderman gives no example on, e.g., what is a brief message, we defined a rubric (Table 2) according to which we coded the SQL error messages. We coded the same subset (20%) of the error messages individually using the rubric and compared our results. All the codings were similar. The first author then coded the rest of the error messages. The original error messages of the eight DBMSs and respective coding are reported in Appendix B (Figures 17–32).

Table 2.

Quality	Code	Description
brief	○	More than 171 characters, not counting the possibly replicated query or parts thereof, or parts replicated from environmental variables.
	◐	Between 100 and 171 characters, not counting the possibly replicated query or parts thereof, or parts replicated from environmental variables.
	●	99 characters or less, not counting the possibly replicated query or parts thereof, or parts replicated from environmental variables.
positive	○	Tone is negative, aggressive, dramatic or discouraging, using words such as illegal, invalid, error, or incorrect.
positive	●	Tone is positive or neutral, the message contains no negative words.
constructive	○	Offers no advice.
	◐	Offers advice on what causes the error.
	●	Offers constructive advice on how to fix the error. General advice such as “refer to manual” or “see help on SELECT” was not considered constructive advice.
specific	○	Provides no error position.
	◐	Replicates a relatively large part of the erroneous query or replicates the query in full, or otherwise provides an approximate error position.
	●	Specifically shows the position of the error.
comprehensible	○	The error message is almost incomprehensible or generally unhelpful.
	◐	The error message contains unnecessary jargon.
	●	Not including error codes, SQL keywords, or common relational database terms, the message reads closely to plain English.

Table 2. The Rubric Created for Error Message Coding; the Numerical Codes Were Converted to Empty (White), Half, and Full Circles (Black) for Readability; the Cutoffs for the Brevity of the Messages Were Determined Based on Tertiles of the Length of the Messages Studied

After data collection, we executed the 4,976 SQL queries (311 participants \(\times\) 16 tests) the participants had attempted to fix on the corresponding DBMS, e.g., if a participant had been shown VoltDB error messages, and therefore attempted to fix erroneous queries based on VoltDB error messages, we executed their fixed queries on VoltDB. Additionally, the syntactically correct queries were manually checked to determine if they were also logical equivalents to the corresponding data demand. MySQL tolerated the syntax errors in tests T05 and T09, and SingleStore tolerated the syntax errors in tests T09 and T11. In these data collection forms, the error messages were made up by us, and the corresponding participant answers were omitted from the analyses. This left a total of 124 error messages for coding, and 4,796 SQL queries for statistical analyses.

3.4 Data Analyses

In order to analyze the effects of error message qualities, we constructed binomial logistic regression models for each of the 16 tests. The data contained 311 answers per test, with the limitations concerning MySQL and SingleStore described in Section 3.3. The independent variables in the model were the error message qualities, i.e., whether the error message was brief, positive, constructive, specific, and comprehensible. The dependent variable in the model was query fixing success, which was binomial, 0 meaning that the query was not fixed, and 1 meaning that the participant succeeded in fixing the query.

Participants also suggested improvements for the error messages using a free text input field. Some participants did not suggest improvements for all error messages, while others suggested several improvements for each of the 16 error messages they were shown. In total, the participants gave 1,505 answers, which we analysed using conventional content analysis [24]. Effectively, the method groups similar content, or themes, into groups that are derived from the data rather than theory or prior literature. The goal of the method is to generalize or reduce data to a form that is easier to interpret. The participant answers were interpreted as is, resulting in both general themes that span across most error messages, and themes that are specific to a given error message.

We first collectively analyzed approximately 5% of the 1,505 answers, and derived example codes from the data, such as error message should show line number and error message should suggest a fix. We then chose a portion of 10% which we both then coded individually. After the individual coding, we compared our results and deemed that all our codings were similar. These agreements most likely stemmed from the fact that all the answers were relatively short, typically containing one or two sentences. The first author then proceeded to code the rest of the data. After this step, we convened to discuss whether the more specific categories should be merged, and categorized the most frequent codings into a higher-level framework for SQL error message design. That is, our results show three levels of abstraction. On the lowest level, we show suggestions for improvements per error message for the 16 tests. On the middle level, we show suggested improvements regarding all error messages. On the highest level, we categorize these suggested improvements regarding all error messages into themes.

4 Factors Affecting Error Message Effectiveness

Only in three tests, the binomial logistic regression model was statistically significant (alpha level .05). In test T01 (\(\chi ^{2}\)(3) = 13.339, p = .004), the model explained 8.0% (Nagelkerke R²) of the variance of fixing the query and correctly classified 87.8% of cases. Of the five predictor variables (i.e., error message qualities), only one was statistically significant: specificity of the error message (p = .012). Error messages being specific had 2.03 higher odds of being fixed successfully.

In test T11 (\(\chi ^{2}\)(5) = 14.003, p = .016), the model explained 7.3% (Nagelkerke R²) of the variance of fixing the query and correctly classified 82.6% of cases. Of the five predictor variables, only one was statistically significant: error message being brief (p = .020). The errors with long error messages had 1.79 higher odds of being fixed successfully.

In test T16 (\(\chi ^{2}\)(4) = 10.243, p = .037), the model explained 6.9% (Nagelkerke R²) of the variance of fixing the query and correctly classified 90.4% of cases. Of the five predictor variables, only one was statistically significant: error message being positive (p = .010). The errors with positive error messages had 1.95 lower odds of being fixed successfully. In the rest of the tests, the binomial logistic regression models did not identify the result as statistically significant (significance levels ranging from p = .101 to p = .969).

5 Error Message Design Framework and Modified Error Messages

5.1 Error Message Design Framework

Using conventional content analysis, and regardless of the test, we identified nine recurring suggestions for error message improvements in the data. Five suggested, general improvements received more than one hundred mentions in the 1,505 answers: specify the line number of the erroneous part (191 mentions, approximately 13%), suggest how to fix the error (181, 12%), remove unnecessary information (141, 9%), explain what causes the error and why (141, 9%), and specify the error position (111, 7%). It is worth noting that these mentions are only general mentions, not counting, e.g., more specific suggestions on how to fix an error (e.g., “suggest single quotes around character strings” or “suggest replacing ISwith =”). Among others, these abstracted suggestions comprise the SQL error message design framework presented in Table 3. The following list shows some selected quotations from the data.

Table 3.

Where	Provide line number: as accurately as possible, show the user on which line the error is.
	Specify the error position: point to the position of the error on the erroneous line.
What	Explain what causes the error: describe what is missing, extraneous, ill-placed, or incorrect.
	Explain why the error occurs: describe what principle is violated.
	Place the most important information first: let the user choose whether to read further.
How	Provide suggestions on how to fix the error: use reserved wording, as the intent of the user is unknown.
	Provide working examples of similar query concepts: show how a query concept is used as a part of a query.
Remove unnecessary elements: remove error codes, host names etc., or move them to the end of the message.
Use plain English: use well-understood terms, or explain complex terms using simple natural language words.

Table 3. The SQL Error Message Design Framework Consists of Three High-Level Themes Consisting of a Total of Nine Guidelines

–

[on CockroachDB, T04] “The error message tries to say that the LIKEoperator does not understand lists, but it says this in a very difficult way. A typical those-who-know-just-know type of message. Isn’t the point of error messages to help us, rather than further separate us from professional users?”

–

[on MySQL, T08] “It is unnecessary to state multiple times, or even once, that the query is erroneous. If it weren’t erroneous, I would receive results instead of a message.”

–

[on MySQL, T08] “It is astounding how the message cannot pinpoint such an obvious typo.”

–

[on VoltDB, T09] “Simply providing a line number would have been more helpful than this long error message.”

–

[on NuoDB, T10] “The error was easy to locate, but fixing it just requires skill which I do not have, and the error message is not helpful in this regard.”

–

[on NuoDB, T12] “The error message was so comprehensive it almost fixed the error for me.”

–

[on Oracle Database, T13] “Perhaps the message is technically the correct way to describe the error, but from a human perspective, this seems incomprehensible.”

First, two guidelines in the framework are closely related to where the error occurs. The data suggest that providing the line number (and providing it correctly) was one of the most frequent suggestions for improvements. Additionally, and while line breaks are not something SQL enforces, the data suggest that a line number is not always enough to specify the error position accurately. One solution to such suggestion (also suggested by the data) is to replicate the erroneous query or parts thereof, and specify the erroneous position using a free-standing circumflex like some DBMSs already do (e.g., PostgreSQL in Figure 17).

Second, three guidelines are closely associated with what elements should form the body of the error message. The data suggest that novices want to know what causes the compiler to halt the interpretation of the query, and providing the position of the error is not informative enough. Additionally, a frequent suggestion for improvements was that the error message should explain why the error occurs. For example, placing a semicolon in the middle of a query should result in an error message showing the line number and exact position of the error, stating that there is a semicolon in the middle of a query (the what), and explaining that a semicolon is used to terminate an SQL statement, and that placing a semicolon in the middle of a query is against this principle (the why). Finally, the error message should be structured in a way that the most important parts for the user are placed first. This arguably helps professional users, who are arguably not as interested in hints or SQL examples as novices.

Third, two guidelines concern how the user could or should proceed in fixing the erroneous query. The error message should suggest how the user could fix the error. As the DBMS does not understand the intent of the user, the wording in these suggestions should be reserved, as a suggested fix may point the user toward a fix that is not the correct fix for the particular intent. Next, the error message should show examples of correctly using the SQL concepts associated with the error, e.g., an erroneous expression should result in an error message showing examples of correctly written expressions.

Finally, general themes in the data were the removal of unnecessary elements and the use of plain English (or whatever the language is if the error messages are provided in some other language). For example, error codes, host names, and the statement that the query is erroneous were seen as unnecessary information. The use of plain English was also frequently suggested, as the environment is complex even without seemingly unnecessarily convoluted sentences.

5.2 Suggested Improvements and Modifications

We present the results from the conventional content analysis for each of the 16 tests in Appendix A (Figures 1–16) due to their length. Each of these figures consists of subfigures illustrating (a) the erroneous query with the erroneous part highlighted for readability, (b) respective fixed query, (c) suggested improvements derived with conventional content analysis, with the number of occurrences, and (d) an example of how an error message could be reformulated based on the proposed error message design framework. It is worth noting that suggestions with fewer than two mentions are not reported, as we deemed that one mention did not constitute a category. Despite the highlights in Appendix A, the erroneous parts were not highlighted for the study participants, and that the typographic details concerning, e.g., line breaks may differ from the tests proper due to horizontal space limitations here. The example queries shown in the modified error messages are intended as static examples, i.e., we do not intend the DBMS to generate dynamic examples based on the underlying database schema, although with recent advances in large language models, this might be a feature to consider. All corresponding error messages are presented in Appendix B.

6 Discussion

6.1 General Discussion

This study pursued to examine what qualities of error messages explain the rate the participants succeeded to correct the query they were shown and how would they improve the error messages. Utilizing the participants’ suggestions, we modified the error messages used in this study and compiled a set of guidelines for error message design (Table 3).

As can be observed, the queries with common syntax errors are simple in the test suite. Additionally, as the test suite is based on previously identified common syntax errors, the empirical observations underneath also show that novices commit simple syntax errors [57]. Despite this, the error messages do not reflect the simple nature of the errors. Based on the error messages listed in Appendix B, it seems justified to argue that many error messages fail to identify the nature of the error correctly, identify the error position incorrectly, or both. This arguably highlights the rather unfortunate state of error messages in many modern DBMSs, instead of begging the question of why the test suite only considers simple syntax errors. Despite what a reader thinks about the error message guidelines presented in this study or of those presented previously in scientific literature, we argue that all the DBMSs subject to this study have error messages that contain at least some elements that seem unintuitive in facilitating query fixing. Table 4 lists characteristics typical to each DBMS. The table arguably shows that many (if not all) of the design guidelines presented in this study have been implemented in at least one of the DBMSs studied.

Table 4.

DBMS	Characteristics of SQL error messages
MySQL (with InnoDB)	Sometimes contain error codes at the beginning of the message; both brief and wordy messages; general suggestions to check the manual; line numbers sometimes present; sometimes replicates a part of the query; non-uniform error messages.
Oracle Database	Error codes at the beginning of the message; brief messages; no line numbers; general messages.
PostgreSQL	No error codes; line numbers; specific error position is indicated by a free-standing circumflex; sometimes provides hints; replicates the erroneous line; complete sentences.
SQL Server	Error codes and additional environmental variable information at the beginning of the message; line numbers; a single message may identify multiple errors; replicates the erroneous position.
CockroachDB	No error codes; does not explicitly state that there is an error; both brief and wordy messages; no line numbers; sometimes the specific error position is indicated by a free-standing circumflex; sometimes provides general hints; sometimes replicates the query up to the position of the error.
SingleStore (with InnoDB)	Error codes at the beginning of the message; general suggestions to check the manual; line numbers sometimes present; sometimes replicates a part of the query.
NuoDB	Error codes at the beginning of the message; sometimes replicates parts of the query; sometimes the specific error position is indicated by a free-standing circumflex; sometimes explains what was expected at the erroneous position.
VoltDB	No error codes; no line numbers; replicates the whole query; sometimes provides hints; sometimes explains what was expected at the erroneous position.

Table 4. Typical Characteristics of the SQL Error Messages of Eight DBMSs; It is Worth Noting that These are Typical Characteristics Based Only on the 16 Types of Errors Studied

RG1, presented in Section 3.1, was concerned with previously identified system message qualities and how they affect SQL query fixing success. This was analyzed with binomial logistic regression. The results of the regression analyses presented in Section 4, with three exceptions, failed to reject the null hypothesis. This may indicate, at least with the data available, that general error message qualities do not explain SQL error fixing, i.e., the general guidelines fail to particularize. In addition, it should be noted that all the percentages of how much the three statistically significant models explained the success rate were very low. The results from the qualitative analysis, however, suggest that the participants value the error message qualities proposed by Shneiderman [47], with the exception that error messages should be positive. In a sense, the results from the quantitative analyses are not in line with the results from the qualitative analyses. That is, the regression analyses suggest that general system error message qualities do not affect query fixing, but nevertheless the results from the content analyses rather uniformly suggest that if the error message qualities tested in the regression model were not present in the error messages, the participants suggested adding these qualities.

RG2 was to formulate an SQL error message design framework derived from the data. Table 5 compares the guidelines presented in this study to those presented by Shneiderman [47], Traver [60], and Becker et al. [7]. The table shows that most of our guidelines map to most of the guidelines presented in previous studies, indicating that our study participants suggest improvements for error messages presented in previous studies unknown to them. The only clear omission in our guidelines is that the error message should be positive. According to our coding of the eight DBMS error messages, all DBMSs had at least one error message that was not positive, meaning that all 311 participants were exposed to at least one non-positive error message. Our data contained only three mentions (all from a single participant, approximately 0.3% of all participants) that the error message could be rephrased without the use of dramatic words. This observation may be biased due to the fact that the participants were recruited from a single university. Becker et al. [7] summarize that in the context of programming language error messages, the effects of a positive tone have been empirically tested by merely two studies, while 14 studies on the subject are of historical or anecdotal nature. A closer inspection of the two studies reveals that in the first study [31], 29% of the 77 participants observed that the word illegal may intimidate the user. In the second study [32], 6 of the 13 participants claimed that the error messages under observation were friendly, yet it was not further discussed whether friendly was considered helpful. Both our quantitative and qualitative results contest the recommendation that error messages should be positive. In fact, in test T16, a positive tone even reduced the odds of successful query fixing. Our interpretation of this result is that the positive error messages in test T16 simply had some other quality which hindered query fixing, rather than positive tone being detrimental to query fixing.

Table 5.

SQL error message design guidelines (this study)	General system message guidelines described by Shneiderman [47]	Programming language error message guidelines presented by Traver [60]	Programming language error message guidelines compiled by Becker et al. [7]
Provide line number	Specific	Specificity; Locality	Provide context
Specify the error position	Specific	Specificity; Locality	Provide context
Explain what causes the error	Constructive	(not present)	Provide scaffolding
Explain why the error occurs	(not present)	(not present)	Allow dynamic interaction; Provide scaffolding; Use logical argumentation
Place the most important information first	(not present)	Extensible help	Reduce cognitive load
Provide suggestions on how to fix the error	Constructive	Constructive guidance	Show solutions or hints
Provide working examples of similar query concepts	(not present)	Extensible help	Show examples
Remove unnecessary elements	Brief	Clarity and brevity	Reduce cognitive load
Use plain English	Comprehensible	Clarity and brevity; Programmer language	Increase readability
(not present)	Positive	Positive tone	Use a positive tone
(not present)	(not present)	(not present)	Provide errors at the right time
(not present)	(not present)	Context-insensitivity; Nonanthropomorphism; Consistency; Visual design	(not present)

Table 5. The SQL Error Message Design Framework Guidelines and Their Broadly Interpreted Counterparts in Previous Literature

RG3 was to investigate how the error messages should be improved according to the participants, and RG4 was to propose examples of modified error messages based on the suggested improvements. The modified error messages are presented in Appendix A. We listed specify the error position as one of the guidelines. It seems both crucial and needless to expand that the error message must provide the error position correctly, something that both programming language error messages [60] as well as SQL error messages (e.g., VoltDB in Figure 24 in Appendix B) sometimes fail to do. In cases when reliably pinpointing the error is not possible, the error message should provide a near position like some DBMSs already do, although even these positions are not always accurate, or replicate a part of the query which does not contain the error (e.g., SingleStore in Figure 24 in Appendix B). As demonstrated with a quotation in Section 5.1, at least one participant raised a concern that simply providing the line number of the erroneous part would be more informative than the error message provided. Although the consensus view seems to be that feedback with examples is more useful than binary feedback (correct/incorrect) [14], even this might not hold true when the error message provides unnecessary or incorrect feedback. For example, the error message of VoltDB shown in Figure 22 in Appendix B replicates the erroneous query in its entirety, yet does not provide the error position, demonstrating an error message that is neither brief nor specific. Also from the perspective of specificity and comprehensiveness, the error messages of VoltDB and SQL Server in Figure 22 in Appendix B are different. Considering that the error in the query in Figure 22(a) is that the subquery returns too many columns for the IN operator which is in this case only expecting values from one column, the cause of the error in VoltDB in Figure 22 in Appendix B is ambiguous, stating “row column count mismatch”, which seems to imply that the error is somehow related to rows. The error can be fixed by making sure that there is the same number of columns in the upper-level query’s expression concerning IN, as in the subquery’s SELECT clause. The error message of SQL Server in Figure 22 in Appendix B, however, uses a complete sentence and conveys the cause of the error more accurately.

Regarding suggestions on how to fix errors, Marceau et al. [26] suggest that novices can follow suggestions on how to fix an error without understanding what causes the error and whether the suggestion is even the right fix. This may be problematic from an educational perspective, as the goal is arguably not to fix an error per se but to learn how to write queries. Although making mistakes is part of any learning process, simply committing errors for the sake of receiving hints and suggestions on how to correctly write a query seems counterproductive to learning. We believe that by explaining what causes the error and why the error occurs, the error message can provide a more deep-rooted understanding to a novice, as opposed to merely providing a suggestion on how to fix the error.

Both Shneiderman [47] and Traver [60] have suggested that error messages should be brief, and the need for brevity is usually argued with the need to reduce cognitive load [46, 51]. Four of our guidelines, provide line number, specify the error position, place the most important information first, and remove unnecessary elements can be viewed as means to reduce cognitive load, yet it is worth noting that our data suggest very few observations on error message brevity per se. Based on the data, it seems reasonable to argue that brevity in itself is not a desirable goal for an SQL error message, as it can be seen to contradict guidelines such as using plain English with complete sentences, or with the guidelines of providing hints and suggestions. Additionally, some errors arguably cannot be described with both clarity and brevity, as the situations in which the errors occur, or why the error occurs may be complex. Therefore, we present that cognitive load should be reduced with other means such as the removal of unnecessary elements and ordering of information, rather than with brief error messages.

Previous results from scientific efforts toward more effective programming language error messages have been inconclusive, or not implemented by the industry, which somewhat diminishes the framework presented in this study. For example, some studies have tested enhanced error messages with novices, yet concluded that there are no positive effects [15, 40]. In contrast, at least one study has shown that the utilization of enhanced error messages results in fewer errors, and fewer repetitions of a similar error [6]. Nevertheless, it has been criticized that over several decades, error message guidelines revolve around similar themes, and despite the rise of new programming languages, the same problems persist [7]. In our opinion, there have also been exceptions, such as the error messages presented in the programming language Rust. From a critical perspective, this study is also yet another one proposing guidelines for error message design, albeit in the novel context of SQL. Despite the criticism presented about programming language error messages, Table 5 shows that there are fundamental differences between the error messages in different DBMSs and that the effects of these differences have also been shown to affect the perceived usefulness for error finding and fixing [54]. Therefore, it seems reasonable to argue that enhancing SQL error messages is a desirable goal for both industry and education, even though our quantitative analysis does not support the view that some error message qualities affect query fixing success rates. In other words, the effects of enhanced error messages may be explained with other variables besides success, not captured in our data or regression model.

Additionally, online learning environments have been proposed for SQL learning for decades [12], and several learning environments that provide enhanced error messages have been studied in programming education [8, 40]. Implementing enhanced error messages into learning environments seems like a natural and relatively fast way of helping novices, as well as acquiring empirical findings on the effectiveness of said error messages. The potential problems with such learning environments are the maintenance overhead and the fact that many of such environments are closed or proprietary, and to our knowledge not widely utilized in industry. Furthermore, in terms of SQL, which can be implemented by one of several different DBMSs with different internals, either the maintenance of the learning environment is even more laborious than that of programming language learning environments, or the SQL learning environment only supports a small subset of DBMSs. One workaround to maintenance could be to check the query syntax on the learning environment’s side, which, in turn, risks the situation where the learning environment evaluates the query syntax error-free, yet the underlying DBMS detects a syntax error (or vice versa). Given these considerations, we deem it beneficial for education that the DBMS vendors undertake the task of enhancing SQL error messages to consider design guidelines. That being said, the task is not as straightforward as changing the character strings provided by the compiler [4], and arguably requires refactoring of the query parsing process with, e.g., reclassification [64] or perhaps by utilizing the work done on automatic error correction in SQL [37], or with large language models. However, it is unclear how DBMSs identify error messages, how different these implementations are between different DBMSs [52], and how technically difficult it is to implement modified error messages. Finally, industry may be (rightfully) concerned about investing in enhancing error messages which have not been scientifically shown to affect the general user experience.

6.2 Limitations and Threats to Validity

The limitations concerning the scope of this study are that the error message design framework is solely based on data retrieval, and not on other types of SQL statements such as data insertion or updating and that we only considered the 16 most common syntax errors. The main reason behind limiting the scope of this study is the extent of previous studies. In terms of different SQL statements, data retrieval is the most well-studied in human-centered contexts [56], and provided us with a limited yet scientifically justified starting point with reports on which errors are the most common [57], and how these common errors can be tested [54]. Another limitation is that many of the syntax errors can be interpreted in multiple ways (e.g., T08 misspellings), yet we only tested each syntax error with one query. Furthermore, all the syntax errors were tested with relatively simple queries. Arguably, more complex queries emphasize the error message qualities even more, e.g., specifying the error position in the error message is more valuable in a query spanning 50 lines when compared to a query spanning 5 lines. Finally, we only tested the syntax errors using novice participants. This could be seen as a limitation affecting the generalizability of the results, yet given that many previously reported guidelines have been based on expert opinion [47, 60], and that this expert opinion has been critiqued [7], we believe the use of novices is a justified approach toward filling an identified research gap, rather than a limitation. The use of appropriate study participants has been argued for in detail in several studies [18, 19].

Regarding the regression analyses, it is possible that there may be a hidden factor or factors (i.e., predictors) not present in the data that affect query fixing success. Furthermore, the dependent variable (i.e., query fixing success) may not be a fitting metric for error message effectiveness. In the context of programming languages, it has been speculated that time taken to fix an error might be such a metric [3, 41], but time was not measured in this study due to the shortcomings of our data collection instrument. In hindsight, measuring time would have been informative and should be taken into account in further studies if the effects of the modified error messages are studied. Finally, it is possible that our coding of the error messages (Appendix B) captures Shneiderman’s [47] error message qualities incorrectly. We have explained the general nature of said guidelines in Section 3.3, and based our coding on a rubric reported in the same section to mitigate this threat and make the coding more transparent.

Another threat to validity is the unnatural environment in which the participants fixed their queries. As explained in Section 2.1, the user typically engages in a feedback loop with the compiler. In this study, however, the participants fixed queries written by someone other than themselves (i.e., us), and received no feedback on whether their fixes were at least syntactically correct. As we wanted to base the error message design framework on previously identified common syntax errors, designing the research setting in another way would have introduced other threats to validity. Nevertheless, the results should be interpreted while taking the environment into account.

6.3 Future Directions

Although our framework was constructed based on empirical findings, this study provides no empirical evidence if these modified error messages actually facilitate, e.g., error fixing success rate, the time required to fix errors or user experience in general. For this reason, we have refrained from calling the new error message examples enhanced or improved. An intuitive topic for future research is to test the effectiveness of these messages using several metrics and iterate the messages based on empirical findings.

A potential—and to our knowledge little studied—topic is the suggestions given by compilers. In our data, several participants criticized the error messages for giving misleading suggestions, or identifying the erroneous position incorrectly. Although we did not systematically examine such situations in our data, it seems justifiable to speculate whether certain error messages are even detrimental to error fixing. Future research could categorize the queries the participants had fixed as, e.g., more incorrect, still incorrect, incorrect in a different way, more correct, and correct, and examine whether the error message plays a part in the evolution of the originally erroneous query.

Finally, both Becker et al. [7] and Traver [60] briefly discuss the interaction between the human user and the compiler through a more interactive user interface than plain text. For example, such a simple modification as hyperlinks in the error message pointing to more extensive documentation is something our participants also suggested. External online documentation would also make fixes and updates to error messages more effortless, without requiring updating the DBMS. Additionally, the error messages may be provided in a form other than textual, if the environment allows [33]. While more rich feedback may arguably present problems in development contexts (as opposed to learning contexts) when, e.g., the DBMS error message is replicated in a plain text error stack, from an educational and human–computer interaction perspective, such richer error messages are an interesting future topic.

7 Conclusion

In this study, we set out to investigate if and how general system error message qualities explain SQL syntax error fixing success rates. The results indicate, at least in the scope of this study, that the general error messages qualities do not explain query fixing success. We also analyzed qualitative data regarding suggestions on how to improve SQL error messages and formulated a framework for SQL error message design. The framework guides error message design toward specifying where the error occurs, what causes the error and why, providing suggestions on how to fix the error, and showing examples of similar query concepts. Additionally, the framework emphasizes the ordering of information in the error message, the removal of unnecessary elements, and the use of plain English. Finally, and based on the formulated framework, we applied the framework and showed examples of how to design error messages for the 16 most common SQL syntax errors. We suggest the industry to follow either these or previously published general system message guidelines.

Acknowledgments

The authors thank all who participated in the study, as well as the anonymous reviewers for their insights on how to improve the paper.

A Suggested Improvements and Modified Error Messages

Fig. 1.

Fig. 2.

Fig. 3.

Fig. 4.

Fig. 5.

Fig. 6.

Fig. 7.

Fig. 8.

Fig. 9.

Fig. 10.

Fig. 11.

Fig. 12.

Fig. 13.

Fig. 14.

Fig. 15.

Fig. 16.

B Error Message Coding

Fig. 17.

Fig. 18.

Fig. 19.

Fig. 20.

Fig. 21.

Fig. 22.

Fig. 23.

Fig. 24.

Fig. 25.

Fig. 26.

Fig. 27.

Fig. 28.

Fig. 29.

Fig. 30.

Fig. 31.

Fig. 32.

References

[1]

Alireza Ahadi, Vahid Behbood, Arto Vihavainen, Julia Prior, and Raymond Lister. 2016. Students’ syntactic mistakes in writing seven different types of SQL queries and its application to predicting students’ success. In Proceedings of the 47th ACM Technical Symposium on Computing Science Education. ACM, New York, NY, 401–406. DOI:

Abstract

1 Introduction

2 Theoretical Background

2.1 Query Execution

2.2 SQL Syntax Errors

2.3 Error Message Qualities

3 Research Setting

3.1 Research Scope and Goals

3.2 Data Collection

3.3 Data Preparation

3.4 Data Analyses

4 Factors Affecting Error Message Effectiveness

5 Error Message Design Framework and Modified Error Messages

5.1 Error Message Design Framework

5.2 Suggested Improvements and Modifications

6 Discussion

6.1 General Discussion

6.2 Limitations and Threats to Validity

6.3 Future Directions

7 Conclusion

Acknowledgments

A Suggested Improvements and Modified Error Messages

B Error Message Coding

References

Cited By

Index Terms

Recommendations

Building Blocks Towards More Effective SQL Error Messages

Error messages in relational database management systems: A comparison of effectiveness, usefulness, and user confidence

SQL: From Traditional Databases to Big Data

Comments

Information

Published In

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

View options

PDF

eReader

Get Access

Login options

Full Access

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations