PostgreSQL (The Version of SQL We're Using)
For example, you can filter text records such as title. The following code
returns all films with the title 'Metropolis':
SELECT title
FROM films
WHERE title = 'Metropolis';
Notice that the WHERE clause always comes after the FROM clause!
Important: in PostgreSQL (the version of SQL we're using), you must
use single quotes around text values in a WHERE clause. The string inside
the quotes is case sensitive and must match the stored value exactly:
uppercase and lowercase letters are treated as different, and leading or
trailing spaces count as part of the string.
Note that in this course we will use <> and not != for the not equal
operator, as per the SQL standard.
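As a minimal sketch of how exact string matching behaves, the statements below build a small temporary table and compare = with <>. The table name sample_titles and its rows are made up for illustration and are not part of the course database.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_titles (title TEXT, release_year INTEGER);
INSERT INTO sample_titles VALUES
    ('Metropolis', 1927),
    ('metropolis', 2001),    -- differs only in case
    ('Metropolis ', 2010);   -- note the trailing space

-- Returns only the 1927 row: the lowercase title and the title with a
-- trailing space are different strings.
SELECT title, release_year
FROM sample_titles
WHERE title = 'Metropolis';

-- <> is the "not equal" operator: returns the other two rows.
SELECT title, release_year
FROM sample_titles
WHERE title <> 'Metropolis';
```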
The following query selects all details for films with a budget over 10,000 dollars
and a duration over 100 minutes:
SELECT *
FROM films
WHERE budget > 10000 AND duration > 100;
Get the title and release year of films released after 2000.
What if you want to select rows based on multiple conditions where some but
not all of the conditions need to be met? For this, SQL has the OR operator.
For example, the following returns all films released in either 1994 or 2000:
SELECT title
FROM films
WHERE release_year = 1994
OR release_year = 2000;
Note that you need to specify the column for every OR condition, so the
following is invalid:
SELECT title
FROM films
WHERE release_year = 1994 OR 2000; (This is invalid.)
When combining AND and OR, be sure to enclose the individual clauses in
parentheses, like so:
SELECT title
FROM films
WHERE (release_year = 1994 OR release_year = 1995)
AND (certification = 'PG' OR certification = 'R');
Otherwise, due to SQL's precedence rules, you may not get the results you're
expecting!
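To see why the parentheses matter, here is a small sketch that runs the same conditions with and without them. AND binds more tightly than OR, so the two queries return different rows. The table sample_certifications and its rows are hypothetical.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_certifications (
    title TEXT,
    release_year INTEGER,
    certification TEXT
);
INSERT INTO sample_certifications VALUES
    ('Film A', 1994, 'PG'),
    ('Film B', 1994, 'G'),
    ('Film C', 1995, 'R'),
    ('Film D', 1995, 'G'),
    ('Film E', 2000, 'R');

-- With parentheses: year must be 1994 or 1995 AND the certification
-- must be PG or R. Returns Film A and Film C.
SELECT title
FROM sample_certifications
WHERE (release_year = 1994 OR release_year = 1995)
AND (certification = 'PG' OR certification = 'R');

-- Without parentheses this is read as
--   release_year = 1994
--   OR (release_year = 1995 AND certification = 'PG')
--   OR certification = 'R'
-- and returns Film A, Film B, Film C and Film E.
SELECT title
FROM sample_certifications
WHERE release_year = 1994 OR release_year = 1995
AND certification = 'PG' OR certification = 'R';
```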
SELECT DISTINCT
Often your results will include many duplicate values. If you want to select all
the unique values from a column, you can use the DISTINCT keyword.
This might be useful if, for example, you're interested in knowing which
languages are represented in the films table:
SELECT DISTINCT language
FROM films;
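The effect of DISTINCT is easiest to see next to the same query without it. This sketch uses a hypothetical table, sample_languages, with made-up rows.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_languages (title TEXT, language TEXT);
INSERT INTO sample_languages VALUES
    ('Film A', 'English'),
    ('Film B', 'English'),
    ('Film C', 'French'),
    ('Film D', 'French'),
    ('Film E', 'Japanese');

-- Without DISTINCT: five rows, one per film, with repeated languages.
SELECT language
FROM sample_languages;

-- With DISTINCT: three rows (English, French, Japanese).
-- The order of the rows is not guaranteed without ORDER BY.
SELECT DISTINCT language
FROM sample_languages;
```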
BETWEEN
As you've learned, you can use the following query to get titles of all films
released between 1994 and 2000, inclusive:
SELECT title
FROM films
WHERE release_year >= 1994
AND release_year <= 2000;
Checking for ranges like this is very common, so in SQL the BETWEEN keyword
provides a useful shorthand for filtering values within a specified range. This
query is equivalent to the one above:
SELECT title
FROM films
WHERE release_year
BETWEEN 1994 AND 2000;
It's important to remember that BETWEEN is inclusive, meaning the beginning and end values
are included in the results!
We can get the names of all kids between the ages of 2 and 12 from the United
States:
SELECT name
FROM kids
WHERE age BETWEEN 2 AND 12
AND nationality = 'USA';
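The inclusive behaviour of BETWEEN can be checked with rows that sit exactly on the boundaries. The sketch below uses a hypothetical table, sample_years, with made-up rows.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_years (title TEXT, release_year INTEGER);
INSERT INTO sample_years VALUES
    ('Film A', 1993),
    ('Film B', 1994),   -- lower boundary
    ('Film C', 1997),
    ('Film D', 2000),   -- upper boundary
    ('Film E', 2001);

-- Returns Film B, Film C and Film D: both boundary years are included.
SELECT title
FROM sample_years
WHERE release_year BETWEEN 1994 AND 2000;

-- The long form with >= and <= returns the same three rows.
SELECT title
FROM sample_years
WHERE release_year >= 1994
AND release_year <= 2000;
```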
Enter the IN operator! The IN operator allows you to specify multiple values in
a WHERE clause, making it easier and quicker to specify multiple OR conditions!
Get the title and language of all films which were in English, Spanish, or French.
SELECT title, language
FROM films
WHERE language IN ('English', 'Spanish', 'French');
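As a quick check that IN is shorthand for a chain of OR conditions, the sketch below runs both forms against a hypothetical table, sample_film_languages, with made-up rows.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_film_languages (title TEXT, language TEXT);
INSERT INTO sample_film_languages VALUES
    ('Film A', 'English'),
    ('Film B', 'German'),
    ('Film C', 'French'),
    ('Film D', 'Spanish');

-- Using IN: returns Film A, Film C and Film D.
SELECT title, language
FROM sample_film_languages
WHERE language IN ('English', 'Spanish', 'French');

-- The equivalent query written with OR returns the same rows.
SELECT title, language
FROM sample_film_languages
WHERE language = 'English'
OR language = 'Spanish'
OR language = 'French';
```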
The % wildcard will match zero, one, or many characters in text. For
example, the following query matches companies like 'Data', 'DataC',
'DataCamp', 'DataMind', and so on:
SELECT name
FROM companies
WHERE name LIKE 'Data%';
The _ wildcard will match a single character. For example, the following
query matches companies like 'DataCamp', 'DataComp', and so on:
SELECT name
FROM companies
WHERE name LIKE 'DataC_mp';
You can also use the NOT LIKE operator to find records that don't match
the pattern you specify.
Get the names of people whose names have 'r' as the second letter. The
pattern you need is '_r%'.
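The '_r%' pattern and NOT LIKE can be sketched as follows. The table sample_people and the names in it are made up for illustration.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_people (name TEXT);
INSERT INTO sample_people VALUES
    ('Bruce Lee'),
    ('Ariana'),
    ('Craig'),
    ('Tom');

-- '_r%' means: any single character, then 'r', then anything.
-- Returns Bruce Lee, Ariana and Craig.
SELECT name
FROM sample_people
WHERE name LIKE '_r%';

-- NOT LIKE keeps the rows that do not match the pattern.
-- Returns Bruce Lee, Craig and Tom (every name not starting with 'A').
SELECT name
FROM sample_people
WHERE name NOT LIKE 'A%';
```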
Aggregate functions
Often, you will want to perform some calculation on the data in a database. SQL
provides a few functions, called aggregate functions, to help you out with this.
For example,
SELECT AVG(budget)
FROM films;
gives you the average value from the budget column of the films table.
Similarly, the MAX function returns the highest budget:
SELECT MAX(budget)
FROM films;
The SUM function returns the result of adding up the numeric values in a column:
SELECT SUM(budget)
FROM films;
You can probably guess what the MIN function does!
Combining aggregate functions with WHERE
Aggregate functions can be combined with the WHERE clause to gain further
insights from your data.
For example, to get the total budget of movies made in the year 2010 or later:
SELECT SUM(budget)
FROM films
WHERE release_year >= 2010;
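Here is a small sketch of aggregates combined with WHERE. The table sample_budgets and its values are hypothetical. One row has a missing (NULL) budget to show that aggregate functions such as SUM, MIN and MAX skip NULL values.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_budgets (
    title TEXT,
    release_year INTEGER,
    budget INTEGER
);
INSERT INTO sample_budgets VALUES
    ('Film A', 2009, 100),
    ('Film B', 2010, 200),
    ('Film C', 2012, 300),
    ('Film D', 2014, NULL);   -- budget unknown

-- WHERE filters the rows first, then SUM adds up what is left.
-- Film A is excluded by the filter and the NULL budget is skipped,
-- so the total is 200 + 300 = 500.
SELECT SUM(budget)
FROM sample_budgets
WHERE release_year >= 2010;

-- MIN and MAX over the same filtered rows: 200 and 300.
SELECT MIN(budget), MAX(budget)
FROM sample_budgets
WHERE release_year >= 2010;
```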
A note on arithmetic
In addition to using aggregate functions, you can perform basic arithmetic with
symbols like +, -, *, and /.
So, for example, this gives a result of 12:
SELECT (4 * 3);
However, the following gives a result of 1:
SELECT (4 / 3);
SQL assumes that if you divide an integer by an integer, you want to get
an integer back. So be careful when dividing!
If you want more precision when dividing, you can add decimal places to your
numbers. For example,
SELECT (4.0 / 3.0) AS result;
gives you the result you would expect: 1.333.
This means that the following will erroneously result in 400.0:
SELECT 45 / 10 * 100.0;
This is because 45 / 10 evaluates to an integer (4), and not a decimal number
like we would expect.
So when you're dividing make sure at least one of your numbers has a decimal
place:
SELECT 45 * 100.0 / 10;
The above now gives the correct answer of 450.0 since the numerator (45 * 100.0)
of the division is now a decimal!
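Another way to avoid integer division is to cast one operand explicitly instead of writing a decimal literal. This sketch assumes the standard CAST syntax and PostgreSQL's NUMERIC type, neither of which is covered in the text above.

```sql
-- Integer divided by integer: the fractional part is discarded.
SELECT 45 / 10 AS int_result;                       -- 4

-- Casting one operand to NUMERIC makes the division decimal.
SELECT CAST(45 AS NUMERIC) / 10 AS decimal_result;  -- 4.5 (shown with trailing zeros)

-- The same idea applied to the calculation above: 4.5 * 100.
SELECT CAST(45 AS NUMERIC) / 10 * 100 AS scaled;    -- 450 (shown with trailing zeros)
```

Casting is mostly useful when both operands are integer columns, where there is no literal to add a decimal place to.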
It's AS simple AS aliasing
You may have noticed in the first exercise of this chapter that the column name
of your result was just the name of the function you used. For example,
SELECT MAX(budget)
FROM films;
gives you a result with one column, named max. But what if you use two
functions like this?
SELECT MAX(budget), MAX(duration)
FROM films;
Well, then you'd have two columns named max, which isn't very useful!
To avoid situations like this, SQL allows you to do something called aliasing.
Aliasing simply means you assign a temporary name to something. To alias,
you use the AS keyword, which you've already seen earlier in this course.
For example, in the above example we could use aliases to make the result
clearer:
SELECT MAX(budget) AS max_budget,
MAX(duration) AS max_duration
FROM films;
2. Get the title and duration in hours for all films. The duration is in minutes, so you'll need to divide by
60.0 to get the duration in hours. Alias the duration in hours as duration_hours.
3. Get the percentage of people who are no longer alive. Alias the result as percentage_dead. Remember
to use 100.0 and not 100.
4. Get the number of years between the newest film and oldest film. Alias the result as difference.
5. Get the number of decades the films table covers. Alias the result as number_of_decades. The top half of
your fraction should be enclosed in parentheses.
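As a rough sketch of how arithmetic and aliasing combine in exercises like these, the queries below run against a hypothetical table, sample_durations, with made-up rows. They show one possible shape for the duration, difference and decades exercises, not the only valid answers.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_durations (
    title TEXT,
    duration INTEGER,       -- minutes
    release_year INTEGER
);
INSERT INTO sample_durations VALUES
    ('Film A', 90, 1990),
    ('Film B', 150, 2010);

-- Dividing by 60.0 (not 60) keeps the fractional hours: 1.5 and 2.5.
SELECT title, duration / 60.0 AS duration_hours
FROM sample_durations;

-- Years between the newest and oldest film: 2010 - 1990 = 20.
SELECT MAX(release_year) - MIN(release_year) AS difference
FROM sample_durations;

-- The parentheses make the subtraction happen before the division,
-- and 10.0 avoids integer division: 20 / 10.0 = 2.
SELECT (MAX(release_year) - MIN(release_year)) / 10.0 AS number_of_decades
FROM sample_durations;
```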
In this chapter you'll learn how to sort and group your results to gain further
insight. Let's go!
ORDER BY
In SQL, the ORDER BY keyword is used to sort results in ascending or descending order according to
the values of one or more columns.
By default ORDER BY will sort in ascending order. If you want to sort the results in descending order,
you can use the DESC keyword. For example,
SELECT title
FROM films
ORDER BY release_year DESC;
gives you the titles of films sorted by release year, from newest to oldest.
SELECT name
FROM people
ORDER BY name;
Get the title of films released in 2000 or 2012, in the order they were released.
SELECT title
FROM films
WHERE release_year IN (2000, 2012)
ORDER BY release_year;
Get all details for all films except those released in 2015 and order them by duration.
SELECT name
FROM people
ORDER BY name DESC;
SELECT birthdate, name
FROM people
ORDER BY birthdate, name;
sorts on birth dates first (oldest to newest) and then sorts on the names in alphabetical
order. The order of columns is important!
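To illustrate why the order of the columns matters, this sketch sorts the same rows two ways. The table sample_birthdates and its rows are made up for illustration.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_birthdates (name TEXT, birthdate DATE);
INSERT INTO sample_birthdates VALUES
    ('Zoe', '1990-01-01'),
    ('Adam', '1990-01-01'),
    ('Bob', '1985-05-05');

-- Birthdate first, then name to break ties: Bob, Adam, Zoe.
SELECT birthdate, name
FROM sample_birthdates
ORDER BY birthdate, name;

-- Name first, then birthdate: Adam, Bob, Zoe.
SELECT birthdate, name
FROM sample_birthdates
ORDER BY name, birthdate;
```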
GROUP BY
Now you know how to sort results! Often you'll need to aggregate results. For example, you might
want to count the number of male and female employees in your company. Here, what you want is
to group all the males together and count them, and group all the females together and count them.
In SQL, GROUP BY allows you to group a result by one or more columns. For example, counting
employees grouped by sex might give a result like this:
sex     count
male    15
female  19
Note that you can combine GROUP BY with ORDER BY to group your results, calculate something about
them, and then order your results. For example, sorting the counts above in descending order gives:
sex     count
female  19
male    15
Here female comes first because there are more females at our company than males. Note also that ORDER
BY always goes after GROUP BY.
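The grouping and ordering described above can be sketched like this. The table sample_employees and its rows are hypothetical, and the counts are smaller than in the result tables above.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_employees (name TEXT, sex TEXT);
INSERT INTO sample_employees VALUES
    ('Ann', 'female'),
    ('Bea', 'female'),
    ('Cat', 'female'),
    ('Dan', 'male'),
    ('Eli', 'male');

-- One row per distinct value of sex, with the number of rows in each group.
SELECT sex, COUNT(*)
FROM sample_employees
GROUP BY sex;

-- Same grouping, sorted by the count from largest to smallest:
-- female (3) comes before male (2). ORDER BY goes after GROUP BY.
SELECT sex, COUNT(*)
FROM sample_employees
GROUP BY sex
ORDER BY COUNT(*) DESC;
```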
Get the release year and average duration of all films, grouped by release year.
Now practice your new skills by combining GROUP BY and ORDER BY with some more aggregate
functions!
Make sure to always put the ORDER BY clause at the end of your query. You can't sort values that
you haven't calculated yet!
Get the release year and lowest gross earnings per release year.
Get the release year, country, and highest budget spent making a film for each year, for each
country. Sort your results by release year and country.
SELECT release_year, MAX(budget), country
FROM films
GROUP BY release_year, country
ORDER BY release_year, country;
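In SQL, aggregate functions can't be used in WHERE clauses. For example, the following query is
invalid:
SELECT release_year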
FROM films
GROUP BY release_year
WHERE COUNT(title) > 10;
This means that if you want to filter based on the result of an aggregate function, you need another
way! That's where the HAVING clause comes in. For example,
SELECT release_year
FROM films
GROUP BY release_year
HAVING COUNT(title) > 10;
shows only those years in which more than 10 films were released.
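WHERE and HAVING can also appear in the same query: WHERE filters individual rows before grouping, and HAVING filters the groups afterwards. The sketch below uses a hypothetical table, sample_releases, and a threshold of 1 instead of 10 so that a handful of made-up rows is enough.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_releases (
    title TEXT,
    release_year INTEGER,
    budget INTEGER
);
INSERT INTO sample_releases VALUES
    ('Film A', 2000, 100),
    ('Film B', 2000, 200),
    ('Film C', 2000, 10),
    ('Film D', 2001, 300),
    ('Film E', 2002, 20),
    ('Film F', 2002, 400);

-- Years with more than one film: 2000 (3 films) and 2002 (2 films).
SELECT release_year
FROM sample_releases
GROUP BY release_year
HAVING COUNT(title) > 1;

-- WHERE removes low-budget films first (Film C and Film E), so only
-- 2000 still has more than one film left when HAVING is applied.
SELECT release_year
FROM sample_releases
WHERE budget > 50
GROUP BY release_year
HAVING COUNT(title) > 1;
```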
In how many different years were more than 200 movies released?
count
236
203
209
225
221
214
252
238
220
224
226
227
260
The result has 13 rows, so the answer is 13.
LIMIT
Remember, if you only want to return a certain number of results, you can use the LIMIT keyword to
limit the number of rows returned.
Get the country, average budget, and average gross take of countries that have made more than 10 films.
Order the result by country name, and limit the number of results displayed to 5. You should alias the
averages as avg_budget and avg_gross respectively.
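One way the clauses in this exercise fit together is sketched below. The table sample_countries and its rows are hypothetical, and the thresholds are scaled down (more than 1 film, LIMIT 2) so that a few made-up rows are enough to see the effect.

```sql
-- Hypothetical sample data (not from the course database).
CREATE TEMPORARY TABLE sample_countries (
    title TEXT,
    country TEXT,
    budget INTEGER,
    gross INTEGER
);
INSERT INTO sample_countries VALUES
    ('Film A', 'USA', 100, 300),
    ('Film B', 'USA', 200, 500),
    ('Film C', 'France', 50, 80),
    ('Film D', 'France', 70, 120),
    ('Film E', 'Japan', 90, 100),
    ('Film F', 'Japan', 110, 300),
    ('Film G', 'Chile', 10, 20);

-- Clause order: GROUP BY, then HAVING, then ORDER BY, then LIMIT.
-- Chile is dropped by HAVING (only one film); France, Japan and USA
-- remain, and LIMIT 2 keeps the first two by name: France and Japan.
SELECT country,
       AVG(budget) AS avg_budget,
       AVG(gross) AS avg_gross
FROM sample_countries
GROUP BY country
HAVING COUNT(title) > 1
ORDER BY country
LIMIT 2;
```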
JOIN concept
There's one more concept we're going to introduce. You may have noticed that all your results so
far have been from just one table, e.g. films or people.
In the real world, however, you will often want to query multiple tables. For example, what if you
want to see the IMDB score for a particular movie?
In this case, you'd want to get the ID of the movie from the films table and then use it to get IMDB
information from the reviews table. In SQL, this concept is known as a join. A basic join can, for
example, get the IMDB score for the film To Kill a Mockingbird, returning a result with the columns
title and imdb_score.
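A basic join of this kind might look like the sketch below. The tables sample_films and sample_reviews, the column names id and film_id, and the scores are all assumptions made for illustration. The real films and reviews tables may be laid out differently.

```sql
-- Hypothetical sample data (not from the course database).
-- Column names id and film_id are assumed; scores are placeholders.
CREATE TEMPORARY TABLE sample_films (id INTEGER, title TEXT);
CREATE TEMPORARY TABLE sample_reviews (film_id INTEGER, imdb_score NUMERIC(3, 1));

INSERT INTO sample_films VALUES
    (1, 'To Kill a Mockingbird'),
    (2, 'Metropolis');
INSERT INTO sample_reviews VALUES
    (1, 8.0),
    (2, 7.5);

-- Match each film to its review through the film's ID, then keep
-- only the row for the film we are interested in.
SELECT title, imdb_score
FROM sample_films
JOIN sample_reviews
ON sample_films.id = sample_reviews.film_id
WHERE title = 'To Kill a Mockingbird';
```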