SQL Interview Questions
1. What is Database?
2. What is DBMS?
RDBMS stands for Relational Database Management System. The key difference,
compared to DBMS, is that RDBMS stores data in the form of a collection of tables,
and relations can be defined between the common fields of these tables. Most modern
database management systems like MySQL, Microsoft SQL Server, Oracle, IBM DB2,
and Amazon Redshift are based on RDBMS.
4. What is SQL?
SQL stands for Structured Query Language. It is the standard language for relational
database management systems. It is especially useful in handling organized data
comprised of entities (variables) and relations between different entities of the data.
A table is an organized collection of data stored in the form of rows and columns.
Columns can be categorized as vertical and rows as horizontal. The columns in a
table are called fields while the rows can be referred to as records.
• NOT NULL - Restricts NULL value from being inserted into a column.
• CHECK - Verifies that all values in a field satisfy a condition.
• DEFAULT - Automatically assigns a default value if no value has been specified
for the field.
• UNIQUE - Ensures that only unique values are inserted into the field.
• INDEX - Indexes a field providing faster retrieval of records.
• PRIMARY KEY - Uniquely identifies each record in a table.
• FOREIGN KEY - Ensures referential integrity for a record in another table.
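The constraints listed above can be sketched in a single table definition. The tables `Departments` and `Employees`, their columns, and the index name are hypothetical and exist only for illustration; exact syntax and enforcement (for example of CHECK) can vary between database systems.

```sql
-- Hypothetical parent table, needed so the FOREIGN KEY below has a target.
CREATE TABLE Departments (
    dept_id INT NOT NULL,
    PRIMARY KEY (dept_id)
);

-- Hypothetical table that uses each constraint once.
CREATE TABLE Employees (
    emp_id  INT NOT NULL,                  -- NOT NULL: a value is required
    email   VARCHAR(255) UNIQUE,           -- UNIQUE: no two rows share an email
    age     INT CHECK (age >= 18),         -- CHECK: value must satisfy a condition
    country VARCHAR(64) DEFAULT 'Unknown', -- DEFAULT: used when no value is given
    dept_id INT,
    PRIMARY KEY (emp_id),                  -- PRIMARY KEY: identifies each row
    FOREIGN KEY (dept_id) REFERENCES Departments (dept_id) -- FOREIGN KEY: must match a department
);

-- INDEX: created with its own statement rather than inside the column list.
CREATE INDEX idx_employees_country ON Employees (country);
```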
The PRIMARY KEY constraint uniquely identifies each row in a table. It must contain
UNIQUE values and has an implicit NOT NULL constraint.
A table in SQL is strictly restricted to have one and only one primary key, which is
comprised of single or multiple fields (columns).
CREATE TABLE Students ( /* Create table with a single field as primary key */
ID INT NOT NULL,
Name VARCHAR(255),
PRIMARY KEY (ID)
);
CREATE TABLE Students ( /* Create table with multiple fields as primary key */
ID INT NOT NULL,
LastName VARCHAR(255),
FirstName VARCHAR(255) NOT NULL,
CONSTRAINT PK_Student
PRIMARY KEY (ID, FirstName)
);
Write a SQL statement to add primary key 't_id' to the table 'teachers'.
Write a SQL statement to add primary key constraint 'pk_a' for table 'table_a' and
fields 'col_b, col_c'.
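One possible answer to the two exercises above, as a sketch. It assumes the tables already exist, that the named columns are declared NOT NULL, and that the database supports adding a primary key through ALTER TABLE (some systems, such as SQLite, do not).

```sql
-- Add a primary key on a single column.
ALTER TABLE teachers
ADD PRIMARY KEY (t_id);

-- Add a named primary key constraint spanning two columns.
ALTER TABLE table_a
ADD CONSTRAINT pk_a PRIMARY KEY (col_b, col_c);
```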
A UNIQUE constraint ensures that all values in a column are different. This provides
uniqueness for the column(s) and helps identify each row uniquely. Unlike primary
key, there can be multiple unique constraints defined per table. The code syntax for
UNIQUE is quite similar to that of PRIMARY KEY, but the two constraints are not
interchangeable: a UNIQUE column may still allow NULL values unless it is also
declared NOT NULL.
Write a SQL statement to add a FOREIGN KEY 'col_fk' in 'table_y' that references
'col_pk' in 'table_x'.
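One possible answer to the exercise above, as a sketch. It assumes both tables already exist, that 'col_pk' is a primary key (or unique) column of 'table_x', and that 'col_fk' has a compatible data type.

```sql
-- Make table_y.col_fk reference table_x.col_pk.
ALTER TABLE table_y
ADD FOREIGN KEY (col_fk)
REFERENCES table_x (col_pk);
```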
11. What is a Join? List its different types.
The SQL Join clause is used to combine records (rows) from two or more tables in a
SQL database based on a related column between the two.
• (INNER) JOIN: Retrieves records that have matching values in both tables
involved in the join. This is the most widely used join for queries.
SELECT *
FROM Table_A A
JOIN Table_B B
ON A.col = B.col;
SELECT *
FROM Table_A A
INNER JOIN Table_B B
ON A.col = B.col;
• LEFT (OUTER) JOIN: Retrieves all the records/rows from the left and the
matched records/rows from the right table.
SELECT *
FROM Table_A A
LEFT JOIN Table_B B
ON A.col = B.col;
• RIGHT (OUTER) JOIN: Retrieves all the records/rows from the right and the
matched records/rows from the left table.
SELECT *
FROM Table_A A
RIGHT JOIN Table_B B
ON A.col = B.col;
• FULL (OUTER) JOIN: Retrieves all the records from both tables, combining rows
where there is a match and filling the columns of the unmatched side with NULL.
SELECT *
FROM Table_A A
FULL JOIN Table_B B
ON A.col = B.col;
A self JOIN is a case of regular join where a table is joined to itself based on some
relation between its own column(s). Self-join uses the INNER JOIN or LEFT JOIN
clause and a table alias is used to assign different names to the table within the query.
SELECT A.emp_id AS "Emp_ID",A.emp_name AS "Employee",
B.emp_id AS "Sup_ID",B.emp_name AS "Supervisor"
FROM employee A, employee B
WHERE A.emp_sup = B.emp_id;
Cross join can be defined as a cartesian product of the two tables included in the join.
The result contains as many rows as the product of the row counts of the two tables.
If a WHERE clause that relates the two tables is used with a cross join, the query
works like an INNER JOIN.
Write a SQL statement to CROSS JOIN 'table_1' with 'table_2' and fetch 'col_1' from
table_1 & 'col_2' from table_2 respectively. Do not use alias.
Write a SQL statement to perform SELF JOIN for 'Table_X' with alias 'Table_1' and
'Table_2', on columns 'Col_1' and 'Col_2' respectively.
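One possible answer to the two exercises above, as a sketch. The self join is written as an INNER JOIN and selects all columns from both aliases; both choices are assumptions, since the exercise does not specify them.

```sql
-- CROSS JOIN without aliases: every row of table_1 paired with every row of table_2.
SELECT table_1.col_1, table_2.col_2
FROM table_1
CROSS JOIN table_2;

-- SELF JOIN: Table_X is joined to itself under two different aliases.
SELECT Table_1.*, Table_2.*
FROM Table_X Table_1
INNER JOIN Table_X Table_2
ON Table_1.Col_1 = Table_2.Col_2;
```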
A database index is a data structure that provides a quick lookup of data in a column
or columns of a table. It enhances the speed of operations accessing data from a
database table at the cost of additional writes and memory to maintain the index data
structure.
There are different types of indexes that can be created for different purposes:
Unique indexes are indexes that help maintain data integrity by ensuring that no two
rows of data in a table have identical key values. Once a unique index has been
defined for a table, uniqueness is enforced whenever keys are added or changed
within the index.
Non-unique indexes, on the other hand, are not used to enforce constraints on the
tables with which they are associated. Instead, non-unique indexes are used solely to
improve query performance by maintaining a sorted order of data values that are used
frequently.
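The difference between unique and non-unique indexes can be sketched as follows. The table `Customers`, its columns, and the index names are hypothetical.

```sql
-- Hypothetical table for illustration.
CREATE TABLE Customers (
    customer_id INT NOT NULL,
    email       VARCHAR(255),
    last_name   VARCHAR(255),
    PRIMARY KEY (customer_id)
);

-- Unique index: rejects a second row with the same email value.
CREATE UNIQUE INDEX idx_customers_email ON Customers (email);

-- Non-unique index: duplicates are allowed; it only speeds up lookups by last name.
CREATE INDEX idx_customers_last_name ON Customers (last_name);
```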
Clustered indexes are indexes in which the order of the rows in the table corresponds
to the order of the keys in the index. This is why only one clustered index can exist in
a given table, whereas multiple non-clustered indexes can exist in the table.
The only difference between clustered and non-clustered indexes is that the database
manager attempts to keep the data in the database in the same order as the
corresponding keys appear in the clustered index.
Clustering indexes can improve the performance of most query operations because
they provide a linear-access path to data stored in the database.
As explained above, the differences can be broken down into three small factors -
• Clustered index modifies the way records are stored in a database based on the
indexed column. A non-clustered index creates a separate structure that
references the rows of the original table.
• Clustered index is used for easy and speedy retrieval of data from the database,
whereas, fetching records from the non-clustered index is relatively slower.
• In SQL, a table can have a single clustered index whereas it can have multiple
non-clustered indexes.
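A minimal sketch of the two index kinds, written in Microsoft SQL Server syntax because the CLUSTERED and NONCLUSTERED keywords are specific to that system. The table `Orders`, its columns, and the index names are hypothetical; the table is deliberately created without a primary key so that the clustered index can be chosen explicitly.

```sql
-- Hypothetical table without a primary key (SQL Server syntax throughout).
CREATE TABLE Orders (
    order_id    INT NOT NULL,
    customer_id INT NOT NULL,
    order_date  DATE
);

-- Only one clustered index is allowed: the table rows are kept in order_id order.
CREATE CLUSTERED INDEX ix_orders_order_id ON Orders (order_id);

-- Any number of non-clustered indexes can be added as separate lookup structures.
CREATE NONCLUSTERED INDEX ix_orders_customer ON Orders (customer_id);
CREATE NONCLUSTERED INDEX ix_orders_date ON Orders (order_date);
```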
Data Integrity is the assurance of accuracy and consistency of data over its entire life-
cycle and is a critical aspect of the design, implementation, and usage of any system
which stores, processes, or retrieves data. It also defines integrity constraints to
enforce business rules on the data when it is entered into an application or a
database.
A subquery is a query within another query, also known as a nested query or inner
query. It is used to restrict or enhance the data to be queried by the main query, thus
restricting or enhancing the output of the main query respectively. For example, here
we fetch the contact information for students who have enrolled for the maths subject:
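A minimal sketch of such a query, assuming hypothetical tables `contacts` (with columns `roll_no`, `name`, `address`, `email`) and `students` (with columns `roll_no`, `subject`); these names are illustrative and are not taken from a real schema.

```sql
-- Outer query: contact details. Inner query: roll numbers of students enrolled in Maths.
SELECT roll_no, name, address, email
FROM contacts
WHERE roll_no IN (
    SELECT roll_no
    FROM students
    WHERE subject = 'Maths'
);
```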
Write a SQL query to update the field "status" in table "applications" from 0 to 1.
Write a SQL query to select the field "app_id" in table "applications" where "app_id"
is less than 1000.
Write a SQL query to fetch the field "app_name" from "apps" where "apps.id" is equal
to the above collection of "app_id".
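One possible set of answers to the three exercises above, as a sketch that assumes the named tables and columns exist with numeric types.

```sql
-- Update "status" from 0 to 1.
UPDATE applications
SET status = 1
WHERE status = 0;

-- Select "app_id" values below 1000.
SELECT app_id
FROM applications
WHERE app_id < 1000;

-- Use the previous query as a subquery to pick the matching apps.
SELECT app_name
FROM apps
WHERE id IN (
    SELECT app_id
    FROM applications
    WHERE app_id < 1000
);
```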
The SELECT statement in SQL is used to select data from a database. The data
returned is stored in a result table, called the result-set.
20. What are some common clauses used with SELECT query in SQL?
Some common SQL clauses used in conjunction with a SELECT query are as follows:
• WHERE clause in SQL is used to filter records that are necessary, based on
specific conditions.
• ORDER BY clause in SQL is used to sort the records based on some field(s) in
ascending (ASC) or descending order (DESC).
SELECT *
FROM myDB.students
WHERE graduation_year = 2019
ORDER BY studentID DESC;
• GROUP BY clause in SQL is used to group records with identical data and can
be used in conjunction with some aggregation functions to produce summarized
results from the database.
• HAVING clause in SQL is used to filter records in combination with the GROUP
BY clause. It is different from WHERE, since the WHERE clause cannot filter
aggregated records.
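GROUP BY and HAVING can be sketched with the same `myDB.students` table used above. The alias `student_count` and the threshold of 10 are illustrative choices.

```sql
-- Count students per graduation year and keep only years with more than 10 students.
SELECT graduation_year, COUNT(studentID) AS student_count
FROM myDB.students
GROUP BY graduation_year          -- one output row per graduation year
HAVING COUNT(studentID) > 10;     -- filter applied to the aggregated rows
```

A WHERE clause could not express the last condition, because WHERE is evaluated before the rows are grouped.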
The UNION operator combines and returns the result-set retrieved by two or more
SELECT statements, removing duplicate rows (UNION ALL keeps them).
The MINUS operator in SQL (called EXCEPT in standard SQL and in several database
systems) removes the rows returned by the second SELECT query from the result-set
of the first SELECT query and then returns the remaining rows from the first.
The INTERSECT clause in SQL combines the result-set fetched by the two SELECT
statements where records from one match the other and then returns this intersection
of result-sets.
Certain conditions need to be met before executing either of the above statements in
SQL -
• Each SELECT statement within the clause must have the same number of
columns
• The columns must also have similar data types
• The columns in each SELECT statement must be in the same order
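The three set operators can be sketched as follows. The tables `customers` and `suppliers` and the column `city` are hypothetical; EXCEPT is used for the difference, which Oracle spells MINUS, and older MySQL versions may not support EXCEPT or INTERSECT.

```sql
-- Hypothetical tables, each with one column of the same type.
CREATE TABLE customers (city VARCHAR(64));
CREATE TABLE suppliers (city VARCHAR(64));

-- Cities that appear in either table (duplicates removed).
SELECT city FROM customers
UNION
SELECT city FROM suppliers;

-- Cities that have customers but no suppliers.
SELECT city FROM customers
EXCEPT
SELECT city FROM suppliers;

-- Cities that appear in both tables.
SELECT city FROM customers
INTERSECT
SELECT city FROM suppliers;
```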
Write a SQL query to fetch "names" that are present in either table "accounts" or in
table "registry".
Write a SQL query to fetch "names" that are present in "accounts" but not in table
"registry".
Write a SQL query to fetch "names" from table "contacts" that are neither present in
"accounts.name" nor in "registry.name".
A database cursor is a control structure that allows for the traversal of records in a
database. Cursors, in addition, facilitate processing after traversal, such as retrieval,
addition, and deletion of database records. They can be viewed as a pointer to one
row in a set of rows.
1. DECLARE a cursor after any variable declaration. The cursor declaration must
always be associated with a SELECT Statement.
2. Open cursor to initialize the result set. The OPEN statement must be called
before fetching rows from the result set.
3. Use the FETCH statement to retrieve the current row and move to the next row in
the result set.
4. Call the CLOSE statement to deactivate the cursor.
5. Finally use the DEALLOCATE statement to delete the cursor definition and
release the associated resources.
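The five steps above can be sketched in Microsoft SQL Server (T-SQL) syntax, which is the dialect that uses DEALLOCATE in this way. The cursor name `student_cursor` and the variable `@name` are hypothetical, and the sketch assumes a `Students` table with a `Name` column.

```sql
-- 1. Declare a variable, then the cursor with its SELECT statement.
DECLARE @name VARCHAR(255);
DECLARE student_cursor CURSOR FOR
    SELECT Name FROM Students;

-- 2. Open the cursor to initialize the result set.
OPEN student_cursor;

-- 3. Fetch rows one at a time until no row is left.
FETCH NEXT FROM student_cursor INTO @name;
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;
    FETCH NEXT FROM student_cursor INTO @name;
END;

-- 4. Close the cursor, then 5. release its definition and resources.
CLOSE student_cursor;
DEALLOCATE student_cursor;
```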
Entity: An entity can be a real-world object, either tangible or intangible, that can be
easily identifiable. For example, in a college database, students, professors, workers,
departments, and projects can be referred to as entities. Each entity has some
associated properties that provide it an identity.
• One-to-One - This can be defined as the relationship between two tables where
each record in one table is associated with at most one record in the
other table.
• One-to-Many & Many-to-One - This is the most commonly used relationship
where a record in a table is associated with multiple records in the other table.
• Many-to-Many - This is used in cases when multiple instances on both sides are
needed for defining a relationship.
• Self-Referencing Relationships - This is used when a table needs to define a
relationship with itself.
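The relationship types above can be sketched with foreign keys. All table and column names here are hypothetical, apart from the `employee` columns, which follow the self join example earlier in this document.

```sql
-- One-to-many: one professor teaches many courses.
CREATE TABLE Professors (
    professor_id INT NOT NULL,
    PRIMARY KEY (professor_id)
);
CREATE TABLE Courses (
    course_id    INT NOT NULL,
    professor_id INT NOT NULL,
    PRIMARY KEY (course_id),
    FOREIGN KEY (professor_id) REFERENCES Professors (professor_id)
);

-- Many-to-many: a junction table links learners and courses.
CREATE TABLE Learners (
    learner_id INT NOT NULL,
    PRIMARY KEY (learner_id)
);
CREATE TABLE Enrollments (
    learner_id INT NOT NULL,
    course_id  INT NOT NULL,
    PRIMARY KEY (learner_id, course_id),
    FOREIGN KEY (learner_id) REFERENCES Learners (learner_id),
    FOREIGN KEY (course_id) REFERENCES Courses (course_id)
);

-- Self-referencing: each employee may point to a supervisor in the same table.
CREATE TABLE employee (
    emp_id   INT NOT NULL,
    emp_name VARCHAR(255),
    emp_sup  INT,
    PRIMARY KEY (emp_id),
    FOREIGN KEY (emp_sup) REFERENCES employee (emp_id)
);
```

A one-to-one relationship would look like the one-to-many case with an additional UNIQUE constraint on the foreign key column.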
An alias is represented explicitly by the AS keyword but in some cases, the same can
be performed without it as well. Nevertheless, using the AS keyword is always a good
practice.
Write an SQL statement to select all from table "Limited" with alias "Ltd".
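One possible answer to the exercise above. Note, as a caveat, that some systems (Oracle, for example) accept table aliases only without the AS keyword.

```sql
-- Select every column from Limited, referring to the table as Ltd.
SELECT Ltd.*
FROM Limited AS Ltd;
```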
A view in SQL is a virtual table based on the result-set of an SQL statement. A view
contains rows and columns, just like a real table. The fields in a view are fields from
one or more real tables in the database.
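A minimal sketch of a view, built on the `myDB.students` table and columns used earlier in this document. The view name `graduates_2019` is hypothetical.

```sql
-- Define the view: it stores the query, not a copy of the rows.
CREATE VIEW graduates_2019 AS
SELECT studentID, graduation_year
FROM myDB.students
WHERE graduation_year = 2019;

-- Query the view exactly like a table.
SELECT *
FROM graduates_2019
ORDER BY studentID;
```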
27. What is Normalization?
Normal Forms are used to eliminate or reduce redundancy in database tables. The
different forms are as follows:
Students Table
As we can observe, the Books Issued field has more than one value per record, and to
convert it into 1NF, this has to be resolved into separate individual records for each
book issued. Check the following table in 1NF form -
A relation is in second normal form if it satisfies the conditions for the first normal form
and does not contain any partial dependency. A relation in 2NF has no partial
dependency, i.e., it has no non-prime attribute that depends on any proper subset of
any candidate key of the table. Often, specifying a single column Primary Key is the
solution to the problem. Examples -
Example 1 - Consider the above example. As we can observe, the Students Table in
the 1NF form has a candidate key in the form of [Student, Address] that can uniquely
identify all records in the table. The field Books Issued (non-prime attribute) depends
partially on the Student field. Hence, the table is not in 2NF. To convert it into the 2nd
Normal Form, we will partition the tables into two while specifying a new Primary
Key attribute to identify the individual records in the Students table. The Foreign
Key constraint will be set on the other table to ensure referential integrity.
Here, WX is the only candidate key and there is no partial dependency, i.e., any
proper subset of WX doesn’t determine any non-prime attribute in the relation.
A relation is said to be in the third normal form, if it satisfies the conditions for the
second normal form and there is no transitive dependency between the non-prime
attributes, i.e., all non-prime attributes are determined only by the candidate keys of
the relation and not by any other non-prime attribute.
Example 1 - Consider the Students Table in the above example. As we can observe,
the Students Table in the 2NF form has a single candidate key Student_ID (primary
key) that can uniquely identify all records in the table. The field Salutation (non-prime
attribute), however, depends on the Student Field rather than the candidate key.
Hence, the table is not in 3NF. To convert it into the 3rd Normal Form, we will once
again partition the tables into two while specifying a new Foreign Key constraint to
identify the salutations for individual records in the Students table. The Primary
Key constraint for the same will be set on the Salutations table to identify each record
uniquely.
Salutation_ID    Salutation
1                Ms.
2                Mr.
3                Mrs.
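The 3NF decomposition described above can be sketched as two table definitions. Only the columns named in the text are included; the data types and the omission of the remaining Students columns are assumptions.

```sql
-- Salutations gets its own table with a primary key.
CREATE TABLE Salutations (
    Salutation_ID INT NOT NULL,
    Salutation    VARCHAR(10) NOT NULL,
    PRIMARY KEY (Salutation_ID)
);

-- Students refers to a salutation through a foreign key instead of repeating the text.
CREATE TABLE Students (
    Student_ID    INT NOT NULL,
    Student       VARCHAR(255) NOT NULL,
    Salutation_ID INT,
    PRIMARY KEY (Student_ID),
    FOREIGN KEY (Salutation_ID) REFERENCES Salutations (Salutation_ID)
);

-- The three salutations from the table above.
INSERT INTO Salutations (Salutation_ID, Salutation)
VALUES (1, 'Ms.'), (2, 'Mr.'), (3, 'Mrs.');
```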
For the above relation to exist in 3NF, all possible candidate keys in the above relation
should be {P, RS, QR, T}.
A relation is in Boyce-Codd Normal Form if it satisfies the conditions for third normal
form and, for every functional dependency, the left-hand side is a super key. In other
words, every non-trivial functional dependency in a BCNF relation has the form
X –> Y, where X is a super key. For example - In the above example, Student_ID serves
as the sole unique identifier for the Students Table and Salutation_ID for the
Salutations Table, thus these tables exist in BCNF. The same cannot be said for the
Books Table, because there can be several books with common Book Names and the
same Student_ID.
The TRUNCATE command is used to delete all the rows from the table and free the
space containing the table.
Write a SQL query to remove first 1000 records from table 'Temporary' based on 'id'.
Write a SQL statement to delete the table 'Temporary' while keeping its relations
intact.
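One possible answer to the two exercises above, as a sketch. The first statement assumes that 'id' starts at 1 and has no gaps, so the first 1000 records are exactly those with id up to 1000; if that does not hold, the rows would have to be selected by ordering on 'id' instead.

```sql
-- Remove the first 1000 records, assuming ids 1..1000 identify them.
DELETE FROM Temporary
WHERE id <= 1000;

-- Remove all rows but keep the table definition, constraints, and privileges.
TRUNCATE TABLE Temporary;
```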
If a table is dropped, all things associated with the table are dropped as well. This
includes the relationships defined on the table with other tables, the integrity checks
and constraints, access privileges and other grants that the table has. To create and
use the table again in its original form, all these relationships, checks, constraints,
and privileges need to be redefined. However, if a table is truncated,
none of the above problems exist and the table retains its original structure.
The TRUNCATE command is used to delete all the rows from the table and free the
space containing the table.
The DELETE command deletes only the rows from the table based on the condition
given in the WHERE clause or deletes all the rows from the table if no condition is
specified. But it does not free the space containing the table.
A scalar function returns a single value based on the input value. Following are the
widely used SQL scalar functions:
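A short sketch using three scalar functions that are available in most database systems: UPPER, LOWER, and ROUND. The column aliases are hypothetical, and Oracle would additionally require `FROM DUAL`.

```sql
-- Each function takes one input value and returns one output value.
SELECT UPPER('sql')      AS upper_text,  -- 'SQL'
       LOWER('SQL')      AS lower_text,  -- 'sql'
       ROUND(3.14159, 2) AS rounded;     -- 3.14
```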
The user-defined functions in SQL are like functions in any other programming
language that accept parameters, perform complex calculations, and return a value.
They are written to use the logic repetitively whenever required. There are two types
of SQL user-defined functions:
OLAP stands for Online Analytical Processing, a class of software programs that
are characterized by the relatively low frequency of online transactions. Queries are
often complex and involve many aggregations. For OLAP systems, the
effectiveness measure relies highly on response time. Such systems are widely used
for data mining or maintaining aggregated, historical data, usually in multi-dimensional
schemas.
Collation refers to a set of rules that determine how data is sorted and compared.
Rules defining the correct character sequence are used to sort the character data. It
incorporates options for specifying case sensitivity, accent marks, kana character
types, and character width. Below are the different types of collation sensitivity:
DELIMITER $$
CREATE PROCEDURE FetchAllStudents()
BEGIN
SELECT * FROM myDB.students;
END $$
DELIMITER ;
A stored procedure that calls itself until a boundary condition is reached is called a
recursive stored procedure. Recursion lets programmers reuse the same set of code
several times as and when required. Some SQL programming languages limit the
recursion depth to prevent an infinite loop of procedure calls from causing a stack
overflow, which slows down the system and may lead to system crashes.
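A minimal sketch of a recursive stored procedure in MySQL syntax, matching the DELIMITER style used above. The procedure name `Factorial` and its parameters are hypothetical. MySQL disables procedure recursion by default, so the sketch assumes the session variable `max_sp_recursion_depth` is raised before the call.

```sql
DELIMITER $$
CREATE PROCEDURE Factorial(IN n INT, OUT result BIGINT)
BEGIN
    DECLARE sub_result BIGINT;
    IF n <= 1 THEN
        -- Boundary condition: stop recursing.
        SET result = 1;
    ELSE
        -- Recursive step: the procedure calls itself with a smaller input.
        CALL Factorial(n - 1, sub_result);
        SET result = n * sub_result;
    END IF;
END $$
DELIMITER ;

-- Allow up to 10 nested calls in this session, then run the procedure.
SET max_sp_recursion_depth = 10;
CALL Factorial(5, @f);
SELECT @f;
```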
40. How to create empty tables with the same structure as another
table?
Creating empty tables with the same structure can be done smartly by fetching the
records of one table into a new table using the INTO operator while fixing a WHERE
clause to be false for all records. Hence, SQL prepares the new table with a duplicate
structure to accept the fetched records but since no records get fetched due to the
WHERE clause in action, nothing is inserted into the new ta