SQL Interview Cheat Sheet
1. Basic Functions
2. Complex Queries
3. Conceptual Questions
This summary walks you through all three types of questions you should prepare for in a SQL interview.
This quick reference lists common pitfalls and important details to keep in mind when working with
SQL, making it easier for you to avoid mistakes and write bug-free queries.
Quick Checks
Keywords
Order of Statements
Table and Column Names
Parentheses
Aliases
Single Quotes
Carefully use Indentation & White spaces
Go for the ANSI-92 JOIN Syntax (explicit join)
Basic Functions
SELECT
CASE WHEN
ROUND
JOIN
Multiple Filters
LIKE
<>
GROUP BY
Quick Checks
Keywords
Check if the query contains all the necessary keywords
1. SELECT
2. FROM
3. WHERE
4. GROUP BY
5. HAVING
6. ORDER BY
Order of Statements
Clauses must appear in the following order. Mixing up the order results in a syntax error.
1. SELECT
2. FROM
3. WHERE
4. GROUP BY
5. HAVING
6. ORDER BY
Correct:
SELECT
    COUNT(*)
FROM table1
WHERE date = '2022-01-01'
GROUP BY col2;
Wrong (GROUP BY placed before WHERE):
SELECT
    COUNT(*)
FROM table1
GROUP BY col2
WHERE date = '2022-01-01';
Table and Column Names
Pro tip: copy and paste the table and column names to avoid typos.
Table: dropoff_records
id   dropoff_date
1    2022-01-01
2    2022-01-02
…    …
// Wrong table and column names
SELECT dropoff_data
FROM dropoff_record;
Parentheses
Check if ALL parentheses are paired.
Aliases
Don’t forget to create aliases when tables have the same names, i.e. when self-joining tables.
Correct:
SELECT
    *
FROM
    table1 AS today,
    table1 AS yesterday;
Wrong (no aliases):
SELECT
    *
FROM
    table1,
    table1;
Single Quotes
Use single quotes in queries for strings.
[S]ingle quotes are for [S]tring Literals (date literals are also strings).
Correct:
SELECT *
FROM table1
WHERE date = '2022-01-01'
AND status = 'ready';
Wrong (double quotes):
SELECT *
FROM table1
WHERE date = "2022-01-01"
AND status = "ready";
Basic Functions
SELECT
Add commas between columns. No comma after the last column.
Correct:
SELECT
    col1,
    col2,
    col3
FROM table1;
Wrong (missing comma after col1, extra comma after col3):
SELECT
    col1
    col2,
    col3,
FROM table1;
CASE WHEN
Don’t forget the END keyword at the end.
Correct:
SELECT
    CASE
        WHEN score > 95 THEN 'Excellent'
        WHEN score > 80 THEN 'Good'
        WHEN score > 60 THEN 'Fair'
        ELSE 'Poor'
    END
FROM table1;
Wrong (missing END):
SELECT
    CASE
        WHEN score > 95 THEN 'Excellent'
        WHEN score > 80 THEN 'Good'
        WHEN score > 60 THEN 'Fair'
        ELSE 'Poor'
FROM table1;
Syntax Details
A CASE expression evaluates its conditions sequentially and stops at the first condition
that is satisfied.
ROUND
Round ratio results with ROUND to make them easier to read.
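As a minimal sketch of rounding a ratio, the Python snippet below runs the SQL against an in-memory SQLite database. The orders table, its columns, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical sample data: three orders, two of them shipped.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "shipped"), (2, "shipped"), (3, "ready")],
)

# Multiply by 1.0 to force floating-point division, then round to 2 decimals.
query = """
SELECT
    ROUND(1.0 * SUM(CASE WHEN status = 'shipped' THEN 1 ELSE 0 END) / COUNT(*), 2)
        AS shipped_ratio
FROM orders;
"""
shipped_ratio = conn.execute(query).fetchone()[0]
print(shipped_ratio)  # 2/3 rounded to two decimals: 0.67
```

Without the 1.0 factor, some dialects perform integer division and return 0 here.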
JOIN
(INNER) JOIN : Returns records that have matching values in both tables
LEFT (OUTER) JOIN : Returns all records from the left table, and the matched records from the right table
RIGHT (OUTER) JOIN : Returns all records from the right table, and the matched records from the left table
FULL (OUTER) JOIN : Returns all records from both tables, matching rows where possible
Don’t forget the ON clause unless you are using an implicit join.
Correct:
SELECT
    *
FROM table1 AS t1
JOIN table2 AS t2
ON t1.col1 = t2.col1;
Wrong (missing ON):
SELECT
    *
FROM table1 AS t1
JOIN table2 AS t2;
Multiple Filters
Be careful when combining OR with AND conditions: AND is evaluated before OR. Use parentheses () to
group filters when necessary.
Example:
Select records with the status ‘ready’ or ‘shipped’ on 2022-01-01.
Correct:
SELECT
    *
FROM table1
WHERE date = '2022-01-01'
AND (status = 'ready'
OR status = 'shipped');
Wrong (no parentheses):
SELECT
    *
FROM table1
WHERE date = '2022-01-01'
AND status = 'ready'
OR status = 'shipped';
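To see why the parentheses matter, the sketch below runs both versions of the filter against an in-memory SQLite database. The id column and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, date TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "2022-01-01", "ready"),
        (2, "2022-01-01", "shipped"),
        (3, "2022-01-02", "shipped"),    # wrong date, should be excluded
        (4, "2022-01-01", "cancelled"),  # wrong status, should be excluded
    ],
)

with_parens = conn.execute(
    "SELECT id FROM table1 "
    "WHERE date = '2022-01-01' AND (status = 'ready' OR status = 'shipped') "
    "ORDER BY id"
).fetchall()

# AND binds tighter than OR, so this is read as
# (date = '2022-01-01' AND status = 'ready') OR status = 'shipped'.
without_parens = conn.execute(
    "SELECT id FROM table1 "
    "WHERE date = '2022-01-01' AND status = 'ready' OR status = 'shipped' "
    "ORDER BY id"
).fetchall()

print(with_parens)     # [(1,), (2,)]
print(without_parens)  # [(1,), (2,), (3,)]
```

The version without parentheses also returns the shipped record from 2022-01-02.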
LIKE
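The percent sign (%) represents zero, one, or multiple characters. The underscore (_) represents exactly one character.
Correct:
SELECT
    *
FROM table1 AS t1
WHERE address LIKE '%94401%';
Wrong (matches only addresses with exactly one character on each side of 94401):
SELECT
    *
FROM table1 AS t1
WHERE address LIKE '_94401_';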
<>
Example: to find the best-selling item combinations, use <> to filter out pairs of the same item within an order.
Both != and <> mean "not equal" and behave the same, but <> is
preferred because it is the SQL-92 standard operator.
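One way the item-combination query could look is sketched below, run against an in-memory SQLite database. The order_items table, its columns, and the sample orders are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_items (order_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?)",
    [
        (1, "burger"), (1, "fries"),
        (2, "burger"), (2, "fries"), (2, "soda"),
        (3, "burger"), (3, "soda"),
    ],
)

# Self-join on the order, and use <> so an item is never paired with itself.
query = """
SELECT a.item AS item_a, b.item AS item_b, COUNT(*) AS times_together
FROM order_items AS a
JOIN order_items AS b
  ON a.order_id = b.order_id
 AND a.item <> b.item
GROUP BY a.item, b.item
ORDER BY times_together DESC, item_a, item_b;
"""
pairs = conn.execute(query).fetchall()
for row in pairs:
    print(row)
```

With <>, every pair appears in both directions, for example (burger, fries) and (fries, burger). Replacing <> with < keeps a single row per pair.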
GROUP BY
When mixing aggregate functions (SUM, AVG, COUNT, etc.) with an unaggregated column, ensure the column is in the
GROUP BY clause.
When there are multiple unaggregated columns, ensure ALL are in the GROUP BY clause.
Correct:
SELECT
    COUNT(*),
    city,
    country
FROM table1
GROUP BY city, country;
Wrong (country is missing from GROUP BY):
SELECT
    COUNT(*),
    city,
    country
FROM table1
GROUP BY city;
ORDER BY
If you use ORDER BY without any keywords, it is ascending order by default. To sort the records in
descending order, use the DESC keyword.
Example:
Return score from highest to lowest:
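A minimal sketch of the descending sort, run against an in-memory SQLite database. The name column and the sample scores are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("a", 72), ("b", 95), ("c", 88)],
)

# DESC sorts from highest to lowest; omitting it would sort ascending.
query = "SELECT score FROM table1 ORDER BY score DESC;"
scores_desc = [row[0] for row in conn.execute(query)]
print(scores_desc)  # [95, 88, 72]
```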
LIMIT N
Example:
Return top 5 scores
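The top-5 query can be sketched as follows, again against an in-memory SQLite database with made-up scores. LIMIT is available in MySQL, PostgreSQL, and SQLite; SQL Server uses TOP instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (score INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [(60,), (95,), (88,), (72,), (91,), (55,), (80,)],
)

# Sort first, then keep only the first 5 rows of the sorted result.
query = "SELECT score FROM table1 ORDER BY score DESC LIMIT 5;"
top_five = [row[0] for row in conn.execute(query)]
print(top_five)  # [95, 91, 88, 80, 72]
```

Without ORDER BY, LIMIT returns an arbitrary set of 5 rows rather than the top 5.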
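UNION vs. UNION ALL
UNION ALL keeps duplicate rows; UNION removes them.
Table a: 1, 2
Table b: 2, 3
UNION ALL (returns 1, 2, 2, 3):
SELECT *
FROM a
UNION ALL
SELECT *
FROM b;
UNION (returns 1, 2, 3):
SELECT *
FROM a
UNION
SELECT *
FROM b;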
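The difference can be checked with tables a and b holding the values 1, 2 and 2, 3, loaded into an in-memory SQLite database. The column name val is an assumption, and ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (val INTEGER)")
conn.execute("CREATE TABLE b (val INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

# UNION ALL simply appends the second result to the first.
union_all = [
    row[0]
    for row in conn.execute(
        "SELECT val FROM a UNION ALL SELECT val FROM b ORDER BY val"
    )
]

# UNION additionally removes duplicate rows.
union = [
    row[0]
    for row in conn.execute(
        "SELECT val FROM a UNION SELECT val FROM b ORDER BY val"
    )
]

print(union_all)  # [1, 2, 2, 3]
print(union)      # [1, 2, 3]
```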
Complex Queries
DATE Format
DATE()
Check the date column format before using it. If it is TIMESTAMP or VARCHAR , you need to
convert it to DATE first. Otherwise, the result will not be aggregated to the date level.
Example:
Correct:
SELECT
    DATE(timestamp),
    SUM(sales)
FROM table1
GROUP BY 1;
Wrong (groups by the raw timestamp instead of the date):
SELECT
    timestamp,
    SUM(sales)
FROM table1
GROUP BY 1;
CURDATE()
Example:
Find the page visitor count for the last 7 days.
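The sketch below shows one possible shape of such a query. It runs on SQLite, which has no CURDATE(), so a fixed reference date is passed in as a parameter to keep the result reproducible; in MySQL the lower bound would be written as CURDATE() - INTERVAL 7 DAY. The page_visits table, its columns, and the choice to count distinct visitors over the 7 days ending today are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_visits (visitor_id INTEGER, visit_date TEXT)")
conn.executemany(
    "INSERT INTO page_visits VALUES (?, ?)",
    [
        (1, "2022-01-10"),
        (2, "2022-01-10"),
        (1, "2022-01-08"),  # repeat visitor, counted once
        (3, "2022-01-04"),
        (4, "2022-01-03"),  # exactly 7 days back, outside the window
        (5, "2021-12-25"),
    ],
)

# :today stands in for CURDATE(); the window covers the 7 days ending today.
query = """
SELECT COUNT(DISTINCT visitor_id)
FROM page_visits
WHERE visit_date > date(:today, '-7 day')
  AND visit_date <= :today;
"""
visitor_count = conn.execute(query, {"today": "2022-01-10"}).fetchone()[0]
print(visitor_count)  # 3
```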
DATEDIFF()
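The argument order depends on the dialect. In MySQL, DATEDIFF(date1, date2) returns date1 minus date2, so the later date goes first for a positive result. In SQL Server, DATEDIFF(unit, start, end) returns end minus start, so the earlier date goes first.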
Window function
ROW_NUMBER
ROW_NUMBER always returns unique rankings.
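RANK/DENSE_RANK
RANK gives tied rows the same rank and leaves gaps after ties. DENSE_RANK gives tied rows the same rank without gaps.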
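The three ranking functions can be compared side by side on a small table with a tie. This sketch uses an in-memory SQLite database (window functions require SQLite 3.25 or later); the scores table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 90), ("B", 90), ("C", 80), ("D", 70)],  # A and B are tied
)

query = """
SELECT
    name,
    score,
    ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
    RANK()       OVER (ORDER BY score DESC)       AS rnk,
    DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
FROM scores
ORDER BY row_num;
"""
ranking = conn.execute(query).fetchall()
for row in ranking:
    print(row)
# ('A', 90, 1, 1, 1)
# ('B', 90, 2, 1, 1)
# ('C', 80, 3, 3, 2)
# ('D', 70, 4, 4, 3)
```

ROW_NUMBER breaks the tie (here by name), RANK skips from 1 to 3 after the tie, and DENSE_RANK continues with 2.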
LEAD/LAG
LAG pulls from previous rows and LEAD pulls from following rows.
Example:
Compute the month-over-month (MoM) growth rate from a monthly sales table.
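A possible solution is sketched below using LAG inside a CTE, run against an in-memory SQLite database (3.25 or later). The monthly_sales table, its columns, and the figures are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_sales (month TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?)",
    [("2022-01", 100), ("2022-02", 120), ("2022-03", 90)],
)

# LAG(sales, 1) fetches the previous month's sales; the first month has none,
# so its growth rate is NULL.
query = """
WITH with_prev AS (
    SELECT
        month,
        sales,
        LAG(sales, 1) OVER (ORDER BY month) AS prev_sales
    FROM monthly_sales
)
SELECT
    month,
    ROUND(100.0 * (sales - prev_sales) / prev_sales, 1) AS mom_growth_pct
FROM with_prev
ORDER BY month;
"""
growth = conn.execute(query).fetchall()
print(growth)  # [('2022-01', None), ('2022-02', 20.0), ('2022-03', -25.0)]
```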
CTE (WITH clause)
WITH my_cte AS
(
    SELECT col1, col2 FROM table1
)
SELECT * FROM my_cte;
Conceptual Questions
How to write efficient SQL queries
Use indexes
Indexes are used to quickly locate rows in a table based on the values in a specific column. Make
sure that you have proper indexes on the columns used in your WHERE clause and JOIN
conditions, as this can greatly improve the performance of your queries.
Subqueries can be slow, especially when they are used in the WHERE clause. Instead, try to use
JOINs or derived tables to achieve the same result.
Functions like UPPER(), LOWER(), and CONVERT() applied to a column in the WHERE clause can prevent the
use of indexes, because the index stores the original column values, not the function results. Instead,
try to use these functions in the SELECT list, after the data has been filtered.
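The effect can be observed in a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN on a hypothetical users table with an index on email; other databases expose similar information through their own EXPLAIN output, and the exact wording of the plan differs between engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")


def plan(sql):
    """Return the human-readable detail of each step in SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Filtering on the bare column lets SQLite search the index.
plain_filter = plan("SELECT * FROM users WHERE email = 'a@example.com'")

# Wrapping the column in a function forces a scan of every row.
wrapped_filter = plan("SELECT * FROM users WHERE UPPER(email) = 'A@EXAMPLE.COM'")

print(plain_filter)    # a SEARCH step that uses idx_users_email
print(wrapped_filter)  # a SCAN step over the whole table
```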
When using JOINs, make sure to specify the correct JOIN conditions. Using a wrong join type or
missing join conditions can result in a large number of unnecessary rows being returned, which
can negatively impact performance.
If you only need a limited number of rows from your query, use the LIMIT and OFFSET clauses to
specify the number of rows to return. This can help reduce the amount of data returned, improving
performance.
Choose the appropriate data type for each column in your table. For example, an INT column cannot
store numbers with two decimal places, so a DECIMAL type is the better choice there, and a type that is
larger than the data requires uses more storage space than necessary.
Primary Key vs. Foreign Key
Primary key:
To ensure data in the specific column is unique.
Uniquely identifies a record in the relational database table.
Only one primary key is allowed in a table.
Foreign key:
A column or group of columns in a relational database table that provides a link between data in two tables.
It refers to the field in a table which is the primary key of another table.
More than one foreign key is allowed in a table.
SQL vs. pandas
SQL: select * from df
pandas: df
SQL: row_number() over(partition by seller_id order by order_date)
pandas: df.groupby(['seller_id'])['order_date'].rank(method='first')
SQL: rank() over(partition by seller_id order by order_date)
pandas: df.groupby(['seller_id'])['order_date'].rank(method='min')
SQL: dense_rank() over(partition by seller_id order by order_date)
pandas: df.groupby(['seller_id'])['order_date'].rank(method='dense')
SQL: sum(amount) over(partition by seller_id, order_month order by order_date rows unbounded preceding)
pandas (df sorted by order_date): df.groupby(['seller_id', 'order_month'])['amount'].cumsum()
SQL: avg(amount) over(partition by seller_id, order_month)
pandas: df.groupby(['seller_id', 'order_month'])['amount'].transform('mean').round(1)
SQL: lag(sales, 1) over(partition by seller order by date)
pandas (df sorted by date): df.groupby('seller')['sales'].shift(1)
SQL: avg(sales) over(order by date rows between 6 preceding and current row)
pandas (df sorted by date): df['sales'].rolling(7, min_periods=1).mean().round(1)