SQL Project: Population

A guide to the SQL analysis project on population data

Ben Brumm
www.databasestar.com

This guide includes:

● an explanation of what this project is

● links to the file to download

● questions to answer

● SQL queries to answer each question and the results

This will help you follow along with the YouTube video and perform your own data analysis.

Let’s get right into it.

Table of Contents

The Project
    Download the Sample Data
    Importing Data
Questions
Queries and Results
    Question 1
    Question 2
    Question 3
    Question 4
    Question 5
    Question 6
    Question 7
    Question 8
Conclusion


The Project
In this project, we use SQL to analyse a data set and answer a list of questions.

The data set contains worldwide population data. It is available from the Our World in Data website
here: https://ourworldindata.org/population-growth

Download the Sample Data

To download the sample data:

Step 1: Visit the Population Growth page here: https://ourworldindata.org/population-growth

Step 2: Scroll down to the bottom of the chart in the "Explore data on Population Growth" section.

Step 3: Click the Download button.

Step 4: Select Full CSV.

Step 5: Save the CSV file to somewhere on your computer.

You will now have the sample data in a CSV file. The file contains about 18,000 rows, which include:

● population data for over 200 countries

● one record for every year from 1950 to 2021

● records for the worldwide population, different continents, and categories of countries.

Importing Data

The process to import data is different in each SQL editor and database, but the overall process is
the same:

1. Create a database
2. Start the data import wizard or process
3. Select the CSV file
4. Adjust any options as needed
5. Proceed with the import
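
Once the import has finished, a quick sanity check can confirm that the data arrived as expected. The sketch below assumes the table is named population_and_demography_csv and has the country_name and population_year columns used by the queries later in this guide. Your import tool may have chosen different names, so adjust as needed.

```sql
-- Sanity check after importing the CSV (a sketch, not part of the original project).
-- Assumes the table and column names used by the queries in this guide.
SELECT
    COUNT(*)                     AS total_rows,      -- the guide expects about 18,000
    COUNT(DISTINCT country_name) AS distinct_names,  -- countries plus world, continents, categories
    MIN(population_year)         AS first_year,      -- the guide expects 1950
    MAX(population_year)         AS last_year        -- the guide expects 2021
FROM population_and_demography_csv;
```

If the row count or the year range is far from what the guide describes, it is worth re-checking the import options (delimiter, header row, and column data types) before moving on.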

I've created a few videos on importing CSV files on my YouTube channel, with more to come.

Here are the videos I have at the moment:

MySQL Workbench: https://www.youtube.com/watch?v=sfRwJH04QJc

Oracle SQL Developer: https://www.youtube.com/watch?v=GcB4_0Iz2Zw

DBeaver and Postgres (part of the video for this project): https://youtu.be/MfKRSf5x49E


Questions
As part of the data analysis for this project, we started with a list of questions that we wanted to find
the answer to using SQL.

The questions are below.

These are just the questions. If you'd like the answers, which are the SQL queries and the results, refer
to the next section.

1. What is the population of people aged 90+ in each country?

2. Which countries have the highest population growth in the last year (number of people, and by
percentage)?

3. Which single country has the highest population decline in the last year?

4. Which age group has the highest population out of all countries in the last year?

5. What are the top 10 countries with the highest population growth in the last 10 years (based
on the population number not percentage)?

6. Which country has the highest percentage growth since the first year (1950)?

7. Which country has the highest population at age 1 as a percentage of their overall population?

8. What is the population of each continent in each year, and how much has it changed each year?


Queries and Results


This section contains the questions, the SQL queries to find the answers, and the results from the data
set.

The queries have been written for PostgreSQL. Most of the queries should work on other database
vendors, but some may need adjustments for vendor-specific features.

Question 1

Question:

What is the population of people aged 90+ in each country?

Query:

SELECT
    p.country_name,
    p.population_year,
    p.population_90_to_99 + p.population_100_above AS pop_90_above
FROM population_and_demography_csv p
WHERE p.population_year = 2021
    AND p.record_type = 'Country'
ORDER BY p.country_name ASC;

Result:

country_name         population_year  pop_90_above
Afghanistan          2021             5,546
Albania              2021             11,158
Algeria              2021             40,430
American Samoa       2021             23
Andorra              2021             612
Angola               2021             7,192
Anguilla             2021             25
Antigua and Barbuda  2021             323
Argentina            2021             258,194
Armenia              2021             10,819
…                    …                …


Question 2

Question:

Which countries have the highest population growth in the last year (number of people, and by
percentage)?

Query:

SELECT
    country_name,
    population_2020,
    population_2021,
    population_2021 - population_2020 AS pop_growth_num,
    ROUND(CAST((population_2021 - population_2020) AS decimal) /
        population_2020 * 100, 2) AS pop_growth_pct
FROM (
    SELECT
        p.country_name,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 2020
        ) AS population_2020,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 2021
        ) AS population_2021
    FROM population_and_demography_csv p
    WHERE p.record_type = 'Country'
        AND p.population_year = 2021
) sub
ORDER BY pop_growth_num DESC;


Result:

country_name                  population_2020  population_2021  pop_growth_num  pop_growth_pct
India                         1,396,387,100    1,407,563,900    11,176,800      0.8
Nigeria                       208,327,410      213,401,330      5,073,920       2.44
Pakistan                      227,196,740      231,402,110      4,205,370       1.85
Ethiopia                      117,190,920      120,283,020      3,092,100       2.64
Democratic Republic of Congo  92,853,170       95,894,120       3,040,950       3.28
Bangladesh                    167,420,940      169,356,240      1,935,300       1.16
Indonesia                     271,858,000      273,753,180      1,895,180       0.7
Tanzania                      61,704,520       63,588,332       1,883,812       3.05
Egypt                         107,465,130      109,262,184      1,797,054       1.67
Philippines                   112,190,984      113,880,340      1,689,356       1.51
…                             …                …                …               …
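
The Question 2 query looks up each year with its own correlated subquery. One possible alternative is to join the table to itself, matching each 2021 row to the same country's 2020 row. This is only a sketch. It assumes population_year is stored as a whole number, and the aliases cur and prev are names I've made up for illustration. NULLIF is added so a zero population in 2020 gives a NULL percentage instead of a division-by-zero error.

```sql
-- Sketch: year-over-year growth using a self-join instead of correlated subqueries.
-- Assumes population_year is an integer column; cur/prev are illustrative aliases.
SELECT
    cur.country_name,
    prev.population AS population_2020,
    cur.population  AS population_2021,
    cur.population - prev.population AS pop_growth_num,
    ROUND(
        CAST(cur.population - prev.population AS decimal)
        / NULLIF(prev.population, 0) * 100,
        2
    ) AS pop_growth_pct
FROM population_and_demography_csv cur
JOIN population_and_demography_csv prev
    ON prev.country_name = cur.country_name
    AND prev.population_year = cur.population_year - 1
WHERE cur.record_type = 'Country'
    AND cur.population_year = 2021
ORDER BY pop_growth_num DESC;
```

One difference to be aware of: the inner join leaves out any country that has no 2020 row, while the subquery version keeps it with a NULL value. To rank by percentage instead of by number of people, change the last line to ORDER BY pop_growth_pct DESC.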

Question 3

Question:

Which single country has the highest population decline in the last year?

Query:

SELECT
    country_name,
    population_2020,
    population_2021,
    population_2021 - population_2020 AS pop_growth_num,
    ROUND(CAST((population_2021 - population_2020) AS decimal) /
        population_2020 * 100, 2) AS pop_growth_pct
FROM (
    SELECT
        p.country_name,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 2020
        ) AS population_2020,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 2021
        ) AS population_2021
    FROM population_and_demography_csv p
    WHERE p.record_type = 'Country'
        AND p.population_year = 2021
) sub
ORDER BY pop_growth_num ASC
LIMIT 1;

Result:

country_name  population_2020  population_2021  pop_growth_num  pop_growth_pct
Japan         125,244,760      124,612,530      -632,230        -0.5

Question 4

Question:

Which age group has the highest population out of all countries in the last year?

Query:

SELECT
    unnest(array[
        'population_1_to_9',
        'population_10_to_19',
        'population_20_to_29',
        'population_30_to_39',
        'population_40_to_49',
        'population_50_to_59',
        'population_60_to_69',
        'population_70_to_79',
        'population_80_to_89',
        'population_90_to_99'
    ]) AS age_group,
    unnest(array[
        population_1_to_4 + population_5_to_9,
        population_10_to_14 + population_15_to_19,
        population_20_to_29,
        population_30_to_39,
        population_40_to_49,
        population_50_to_59,
        population_60_to_69,
        population_70_to_79,
        population_80_to_89,
        population_90_to_99
    ]) AS population
FROM population_and_demography_csv p
WHERE p.country_name = 'World'
    AND p.population_year = 2021
ORDER BY population DESC;

Result:

age_group population
population_10_to_19 1,283,495,100
population_1_to_9 1,223,679,900
population_20_to_29 1,194,528,500
population_30_to_39 1,165,207,300
population_40_to_49 976,407,200
population_50_to_59 851,356,900
population_60_to_69 598,067,140
population_70_to_79 330,491,170
population_80_to_89 131,835,590
population_90_to_99 22,223,974
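
The Question 4 query relies on two unnest calls staying in step with each other. Another way to turn the age columns into rows in PostgreSQL is a LATERAL join to a VALUES list, which keeps each label next to its expression. This is a sketch of that alternative, not the approach used in the video. The alias v is my own choice, and the column names are the ones from the query above.

```sql
-- Sketch: unpivot the age-group columns with CROSS JOIN LATERAL (VALUES ...).
-- Each row of the VALUES list pairs a label with the expression it describes.
SELECT
    v.age_group,
    v.population
FROM population_and_demography_csv p
CROSS JOIN LATERAL (
    VALUES
        ('population_1_to_9',   p.population_1_to_4 + p.population_5_to_9),
        ('population_10_to_19', p.population_10_to_14 + p.population_15_to_19),
        ('population_20_to_29', p.population_20_to_29),
        ('population_30_to_39', p.population_30_to_39),
        ('population_40_to_49', p.population_40_to_49),
        ('population_50_to_59', p.population_50_to_59),
        ('population_60_to_69', p.population_60_to_69),
        ('population_70_to_79', p.population_70_to_79),
        ('population_80_to_89', p.population_80_to_89),
        ('population_90_to_99', p.population_90_to_99)
) AS v(age_group, population)
WHERE p.country_name = 'World'
    AND p.population_year = 2021
ORDER BY v.population DESC;
```

Because the label and the value sit on the same line, it is harder to get them out of order when adding or removing an age group. Adding LIMIT 1 at the end would return only the largest age group.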


Question 5

Question:

What are the top 10 countries with the highest population growth in the last 10 years (based on the
population number, not percentage)?

Query:

SELECT
    country_name,
    population_2011,
    population_2021,
    population_2021 - population_2011 AS pop_growth_num
FROM (
    SELECT
        p.country_name,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 2011
        ) AS population_2011,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 2021
        ) AS population_2021
    FROM population_and_demography_csv p
    WHERE p.record_type = 'Country'
        AND p.population_year = 2021
) sub
ORDER BY pop_growth_num DESC
LIMIT 10;


Result:

country_name                  population_2011  population_2021  pop_growth_num
India                         1,257,621,200    1,407,563,900    149,942,700
China                         1,357,095,400    1,425,893,500    68,798,100
Nigeria                       165,463,740      213,401,330      47,937,590
Pakistan                      198,602,740      231,402,110      32,799,370
Ethiopia                      91,817,940       120,283,020      28,465,080
Democratic Republic of Congo  68,654,270       95,894,120       27,239,850
Indonesia                     247,099,700      273,753,180      26,653,480
United States                 313,876,600      336,997,630      23,121,030
Egypt                         89,200,056       109,262,184      20,062,128
Bangladesh                    150,211,000      169,356,240      19,145,240

Question 6

Question:

Which country has the highest percentage growth since the first year (1950)?

Query:

CREATE VIEW population_by_year AS
SELECT
    country_name,
    population_1950,
    population_2011,
    population_2020,
    population_2021
FROM (
    SELECT
        p.country_name,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 1950
        ) AS population_1950,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 2011
        ) AS population_2011,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 2020
        ) AS population_2020,
        (
            SELECT p1.population
            FROM population_and_demography_csv p1
            WHERE p1.country_name = p.country_name
                AND p1.population_year = 2021
        ) AS population_2021
    FROM population_and_demography_csv p
    WHERE p.record_type = 'Country'
        AND p.population_year = 2021
) sub;

SELECT
    p.country_name,
    p.population_1950,
    p.population_2021,
    ROUND(CAST((p.population_2021 - p.population_1950) AS decimal) /
        p.population_1950 * 100, 2) AS pop_growth_pct
FROM population_by_year p
ORDER BY pop_growth_pct DESC;


Result:

country_name               population_1950  population_2021  pop_growth_pct
United Arab Emirates       74,613           9,365,149        12,451.63
Qatar                      24,310           2,688,239        10,958.16
Western Sahara             13,003           565,590          4,249.69
Sint Maarten (Dutch part)  1,480            44,061           2,877.09
Kuwait                     153,754          4,250,111        2,664.23
Jordan                     438,397          11,148,288       2,442.97
Djibouti                   62,351           1,105,562        1,673.13
Mayotte                    20,568           316,022          1,436.47
Andorra                    6,028            79,057           1,211.50
French Guiana              23,420           297,462          1,170.12
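
The view in Question 6 uses one correlated subquery per year. If you only need 1950 and 2021, a possible alternative is conditional aggregation with PostgreSQL's FILTER clause, which reads the table once and needs no view. This is a sketch under a few assumptions: each country has at most one row per year, and record_type is 'Country' on both the 1950 and 2021 rows. The CTE name pivoted is my own.

```sql
-- Sketch: pivot two years into columns with FILTER, then compute percentage growth.
-- Assumes one row per country per year; "pivoted" is an illustrative name.
WITH pivoted AS (
    SELECT
        country_name,
        MAX(population) FILTER (WHERE population_year = 1950) AS population_1950,
        MAX(population) FILTER (WHERE population_year = 2021) AS population_2021
    FROM population_and_demography_csv
    WHERE record_type = 'Country'
        AND population_year IN (1950, 2021)
    GROUP BY country_name
)
SELECT
    country_name,
    population_1950,
    population_2021,
    ROUND(
        CAST(population_2021 - population_1950 AS decimal)
        / NULLIF(population_1950, 0) * 100,
        2
    ) AS pop_growth_pct
FROM pivoted
ORDER BY pop_growth_pct DESC NULLS LAST;
```

NULLS LAST is there because PostgreSQL sorts NULL values first in a descending sort, so a country with missing 1950 data would otherwise appear at the top of the list.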

Question 7

Question:

Which country has the highest population at age 1 as a percentage of their overall population?

Query:

SELECT
    p.country_name,
    p.population,
    p.population_at_1,
    ROUND(CAST(p.population_at_1 AS decimal) / p.population * 100, 2) AS pop_ratio
FROM population_and_demography_csv p
WHERE p.record_type = 'Country'
    AND p.population_year = 2021
ORDER BY pop_ratio DESC;

Result:

country_name                  population  population_at_1  pop_ratio
Niger                         25,252,722  1,037,010        4.11
Somalia                       17,065,588  661,946          3.88
Chad                          17,179,744  667,304          3.88
Democratic Republic of Congo  95,894,120  3,656,823        3.81
Mali                          21,904,990  825,292          3.77
Central African Republic      5,457,165   203,644          3.73
Angola                        34,503,776  1,228,941        3.56
Uganda                        45,853,780  1,580,685        3.45
Mayotte                       316,022     10,736           3.4
Tanzania                      63,588,332  2,157,051        3.39

Question 8

Question:

What is the population of each continent in each year, and how much has it changed each year?

Query:

SELECT
    p.country_name,
    p.population_year,
    p.population,
    p.population - LAG(p.population, 1) OVER(
        PARTITION BY p.country_name
        ORDER BY p.population_year ASC)
        AS population_change
FROM population_and_demography_csv p
WHERE p.record_type = 'Continent'
ORDER BY p.country_name ASC, p.population_year ASC;

Result:

country_name  population_year  population   population_change
Africa (UN)   1950             227,549,260  (null)
Africa (UN)   1951             232,484,000  227,549,260
Africa (UN)   1952             237,586,060  232,484,000
Africa (UN)   1953             242,837,440  237,586,060
Africa (UN)   1954             248,244,770  242,837,440
Africa (UN)   1955             253,847,730  248,244,770
Africa (UN)   1956             259,631,400  253,847,730
Africa (UN)   1957             265,515,230  259,631,400
Africa (UN)   1958             271,429,570  265,515,230
Africa (UN)   1959             277,648,200  271,429,570

Note: the population_change values shown here are the previous year's population (the value returned by
LAG), not the difference. The query above subtracts that value from the current population, so the 1951
row, for example, works out to 232,484,000 - 227,549,260 = 4,934,740.
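
If you also want the yearly change as a percentage, the same LAG approach can be extended. The sketch below is one way to do it, not something covered in the video. It uses a named window (w, a name I've chosen) so the PARTITION BY and ORDER BY are written only once, and NULLIF to avoid dividing by zero.

```sql
-- Sketch: yearly change per continent as both a number and a percentage.
-- The named window "w" is an illustrative choice to avoid repeating the OVER clause.
SELECT
    p.country_name,
    p.population_year,
    p.population,
    p.population - LAG(p.population) OVER w AS population_change,
    ROUND(
        CAST(p.population - LAG(p.population) OVER w AS decimal)
        / NULLIF(LAG(p.population) OVER w, 0) * 100,
        2
    ) AS population_change_pct
FROM population_and_demography_csv p
WHERE p.record_type = 'Continent'
WINDOW w AS (PARTITION BY p.country_name ORDER BY p.population_year ASC)
ORDER BY p.country_name ASC, p.population_year ASC;
```

The first year for each continent has no previous row, so both change columns are NULL for 1950.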

Conclusion
Hopefully, this guide has been useful to you. If you have any questions, let me know at
ben@databasestar.com.

Thanks,

Ben Brumm

www.DatabaseStar.com
