SQL Subqueries Guide Standardized
A subquery (or inner query) is a query nested inside another query, known as the
outer query. Subqueries allow you to perform more complex SQL operations by using
the result of one query inside another. They can appear in several clauses,
including SELECT, FROM, WHERE, and HAVING.
Step 1: Setting Up the Sample Database
To practice with subqueries, first create a sample database called company_db with
two tables: employees and departments.
Create Database:
```sql
CREATE DATABASE company_db;
```
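The guide names the two tables but does not show their definitions. The following is a minimal sketch of a schema that would support the queries in this guide. The column names come from those queries, while the data types, the employee_id key, and the sample rows are assumptions made purely for illustration.

```sql
-- Connect to company_db first (USE company_db; in MySQL, \c company_db in psql).

-- Hypothetical schema inferred from the columns the guide's queries use.
CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    name          VARCHAR(50) NOT NULL,
    region_id     INT
);

CREATE TABLE employees (
    employee_id   INT PRIMARY KEY,      -- assumed key, not used by the guide's queries
    name          VARCHAR(50) NOT NULL,
    salary        DECIMAL(10, 2),
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES departments (department_id)
);

-- Made-up sample rows so that the queries have something to return.
INSERT INTO departments (department_id, name, region_id) VALUES
    (1, 'Sales', 1),
    (2, 'Engineering', 2),
    (3, 'Support', 2);

INSERT INTO employees (employee_id, name, salary, department_id) VALUES
    (1, 'Alice', 60000, 1),
    (2, 'Bob',   45000, 1),
    (3, 'Carol', 85000, 2),
    (4, 'Dave',  70000, 2),
    (5, 'Eve',   40000, 3);
```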
---
Step 2: Using Subqueries in the WHERE Clause
A subquery can be used within a WHERE clause to filter rows based on the result of
another query.
Example:
```sql
SELECT name, salary
FROM employees
WHERE department_id = (SELECT department_id FROM departments WHERE name = 'Sales');
```
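In this example, the subquery finds the department_id of the 'Sales' department,
and the outer query returns the employees who work in that department. Because the
comparison uses =, the subquery must return at most one row; most databases raise
an error if it returns more.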
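One way to be sure a subquery used with = yields a single value is to make it an aggregate without GROUP BY. As a small sketch, assuming the same employees table, the query below finds whoever earns the top salary.

```sql
-- MAX() without GROUP BY always produces exactly one row,
-- so comparing it with = is always valid.
SELECT name, salary
FROM employees
WHERE salary = (SELECT MAX(salary) FROM employees);
```

If several employees tie for the highest salary, the outer query returns all of them.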
---
Step 3: Subqueries That Return Multiple Rows
You can use subqueries that return multiple rows, typically with operators like IN,
ANY, ALL, or EXISTS.
Example:
```sql
SELECT name, salary
FROM employees
WHERE department_id IN (SELECT department_id FROM departments WHERE region_id = 2);
```
Here, the subquery returns multiple department_ids, and the outer query retrieves
employees in any of those departments.
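The IN example covers only one of the multi-row operators. As a rough sketch of ANY and ALL, the queries below compare each salary against the salaries in one department. The department_id value 2 is an arbitrary placeholder, and some engines (SQLite, for example) do not support ANY and ALL.

```sql
-- Employees who earn more than at least one employee in department 2.
SELECT name, salary
FROM employees
WHERE salary > ANY (SELECT salary
                    FROM employees
                    WHERE department_id = 2);

-- Employees who earn more than every employee in department 2
-- (assuming no NULL salaries in that department).
SELECT name, salary
FROM employees
WHERE salary > ALL (SELECT salary
                    FROM employees
                    WHERE department_id = 2);
```

Note that = ANY (...) behaves like IN (...). If the subquery returns no rows, > ALL is true for every outer row and > ANY is true for none.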
---
Step 4: Correlated Subqueries
A correlated subquery references columns from the outer query, so it is conceptually
evaluated once for each row the outer query processes.
Example:
```sql
SELECT name, salary
FROM employees e
WHERE salary > (SELECT AVG(salary)
                FROM employees
                WHERE department_id = e.department_id);
```
This query compares each employee's salary with the average salary of their own
department and returns only the employees who earn more than that average.
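The same result can usually be obtained without correlation by computing every department's average once and joining to it. The following is a sketch of that alternative, not necessarily a faster one; the alias dept_avg and the column name avg_salary are illustrative choices.

```sql
-- Compute each department's average once, then join employees to it.
SELECT e.name, e.salary
FROM employees e
JOIN (SELECT department_id, AVG(salary) AS avg_salary
      FROM employees
      GROUP BY department_id) AS dept_avg
  ON dept_avg.department_id = e.department_id
WHERE e.salary > dept_avg.avg_salary;
```

Which form runs faster depends on the database and the data, so it is worth comparing execution plans rather than assuming.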
---
Step 5: Using Subqueries in the FROM Clause
A subquery in the FROM clause is often called a derived table. Many databases
require it to have an alias (high_salaries in the example below).
Example:
```sql
SELECT department_id, AVG(salary)
FROM (SELECT department_id, salary
      FROM employees
      WHERE salary > 50000) AS high_salaries
GROUP BY department_id;
```
In this query, the subquery creates a derived table of employees with salaries
greater than 50,000, and the outer query calculates the average of those salaries
for each department.
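A derived table can also be written as a common table expression (CTE), which names the intermediate result up front. This is a sketch of the same query in that style; it assumes an engine with WITH support (MySQL, for instance, added it in version 8.0), and the column alias avg_high_salary is an illustrative addition.

```sql
-- Same logic as the derived-table query, expressed as a CTE.
WITH high_salaries AS (
    SELECT department_id, salary
    FROM employees
    WHERE salary > 50000
)
SELECT department_id, AVG(salary) AS avg_high_salary
FROM high_salaries
GROUP BY department_id;
```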
---
Step 6: Using Subqueries in the SELECT Clause
Subqueries can also be used in the SELECT clause to compute a value for each row.
Example:
```sql
SELECT name,
       (SELECT MAX(salary)
        FROM employees
        WHERE department_id = e.department_id) AS max_salary_in_dept
FROM employees e;
```
In this example, for each employee, the subquery returns the highest salary in
their department.
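Where window functions are available, a per-row value like this can often be computed without a subquery. The sketch below assumes window-function support (for example MySQL 8.0 or later, or PostgreSQL) and produces the same column for employees that have a department.

```sql
-- MAX() as a window function: highest salary within each department,
-- repeated on every row of that department.
SELECT name,
       MAX(salary) OVER (PARTITION BY department_id) AS max_salary_in_dept
FROM employees;
```

One difference to be aware of: for employees whose department_id is NULL, the subquery version returns NULL, while the window version treats all NULL department_id rows as one partition.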
---
Step 7: EXISTS vs IN
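EXISTS and IN can often express the same filter. As a sketch of the contrast, assuming the employees and departments tables used throughout this guide, the queries below list departments that have at least one employee in both styles, followed by the negated form where the two operators differ most.

```sql
-- EXISTS: true as soon as the correlated subquery finds one matching row.
SELECT d.name
FROM departments d
WHERE EXISTS (SELECT 1
              FROM employees e
              WHERE e.department_id = d.department_id);

-- IN: compares d.department_id against the values the subquery returns.
SELECT d.name
FROM departments d
WHERE d.department_id IN (SELECT e.department_id
                          FROM employees e);

-- Negation: departments with no employees. An equivalent NOT IN query
-- would return no rows at all if any employees.department_id is NULL,
-- so NOT EXISTS is usually the safer choice.
SELECT d.name
FROM departments d
WHERE NOT EXISTS (SELECT 1
                  FROM employees e
                  WHERE e.department_id = d.department_id);
```

For the positive forms, many optimizers produce the same plan for both, so the choice is often a matter of readability.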
---
Step 8: Correlated vs. Uncorrelated Subqueries
- Correlated Subquery: A subquery that depends on the outer query (the inner query
uses columns from the outer query).
Example:
```sql
SELECT name
FROM employees e
WHERE EXISTS (SELECT 1
              FROM departments d
              WHERE d.department_id = e.department_id);
```
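- Uncorrelated Subquery: A subquery that does not reference the outer query, so it
can be evaluated once on its own.
Example: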
```sql
SELECT name
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
```
---
Step 9: Scalar Subqueries
A scalar subquery returns a single value (one row, one column). It can be used
wherever a single value is expected, such as in the SELECT clause or WHERE clause.
Example:
```sql
SELECT name,
       (SELECT COUNT(*)
        FROM employees
        WHERE department_id = e.department_id) AS employee_count
FROM employees e;
```
This scalar subquery counts the number of employees in each employee's department.
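Because a scalar subquery yields one value, it can also take part in an ordinary expression. As a brief sketch, the query below subtracts the company-wide average from each salary; the alias diff_from_avg is an illustrative name.

```sql
-- The uncorrelated scalar subquery returns one number (the overall average),
-- which is then used in arithmetic for every row.
SELECT name,
       salary,
       salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees;
```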
---
Step 10: Using Subqueries in the HAVING Clause
You can use subqueries within the HAVING clause to filter grouped results.
Example:
```sql
SELECT department_id, AVG(salary)
FROM employees
GROUP BY department_id
HAVING AVG(salary) > (SELECT AVG(salary) FROM employees);
```
This query returns departments where the average salary is higher than the overall
company average.
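The subquery in a HAVING clause can itself be built on a derived table. The following sketch, using the same employees table, returns departments whose headcount is above the average headcount per department; the aliases headcount, dept_count, and per_dept are illustrative.

```sql
-- Inner derived table: one row per department with its employee count.
-- Middle subquery: the average of those counts (a single value).
-- Outer query: keep only departments above that average.
SELECT department_id, COUNT(*) AS headcount
FROM employees
GROUP BY department_id
HAVING COUNT(*) > (SELECT AVG(dept_count)
                   FROM (SELECT COUNT(*) AS dept_count
                         FROM employees
                         GROUP BY department_id) AS per_dept);
```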
---
Step 11: Performance Considerations
- Use indexed columns: Index the columns that the subquery filters or correlates on
so that lookups are faster.
- Avoid correlated subqueries if possible: These can be slower because the database
may re-execute the subquery for each row of the outer query.
- Consider JOIN over subqueries: In many cases, rewriting a subquery as a JOIN gives
the optimizer more freedom in how it combines the tables. Many modern optimizers
already rewrite simple subqueries as joins, so compare execution plans before
assuming one form is faster.
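The points above can be sketched as follows, assuming the employees and departments tables from this guide. The index name is hypothetical, and the JOIN rewrite returns the same rows as the earlier IN query only if department_id is unique in departments.

```sql
-- Index the column that the subqueries in this guide filter and correlate on.
CREATE INDEX idx_employees_department_id ON employees (department_id);

-- JOIN form of the earlier IN (...) query on region_id = 2.
SELECT e.name, e.salary
FROM employees e
JOIN departments d ON d.department_id = e.department_id
WHERE d.region_id = 2;

-- Inspect the plan the database chooses (EXPLAIN is available in MySQL and
-- PostgreSQL; other engines use different syntax).
EXPLAIN
SELECT name, salary
FROM employees
WHERE department_id IN (SELECT department_id
                        FROM departments
                        WHERE region_id = 2);
```

Comparing the EXPLAIN output of both forms is a more reliable guide than a general rule, since the result depends on the engine, the indexes, and the data volume.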
---
Conclusion
Subqueries are a powerful SQL tool that let you express complex logic within a
single statement. They are particularly useful when you need to filter data based on
the results of another query or compare a row against an aggregate. Understanding
when and how to use the different types of subqueries (single-row, multiple-row,
correlated, uncorrelated) will help you write correct and efficient SQL queries.