SQL for Data Analysis.pdf
Introduction to SQL:
• What is SQL?
SQL (Structured Query Language) is a programming
language used to manage and manipulate relational databases. It
allows users to store, retrieve, update and delete data efficiently.
SQL is widely used in data analytics, business intelligence, and data
science.
• Why is SQL Crucial for Data Analytics?
SQL is essential for data analytics because:
◦ It allows efficient data retrieval from large databases.
◦ It enables data transformation and cleaning for analysis.
◦ It supports aggregations and calculations for generating insights.
◦ It helps in joining multiple datasets to uncover relationships.
◦ It integrates with BI tools like Power BI & Tableau for reporting.
How SQL Helps in Data Analytics
➢ Data Extraction – Query large datasets efficiently.
➢ Data Cleaning – Remove duplicates, filter values, and
standardize data.
➢ Data Aggregation – Summarize data with SUM(),
AVG(), etc.
➢ Data Relationships – Use joins to merge multiple tables.
➢ Trend Analysis – Apply window functions to track
changes over time.
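The aggregation and trend-analysis points above can be sketched as follows. This is only an illustration: the sales table and its columns (region, saledate, amount) are hypothetical and do not appear elsewhere in this material, and the syntax assumes SQL Server 2012 or later.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE sales (
    saleid   INT IDENTITY(1,1) PRIMARY KEY,
    region   VARCHAR(30)   NOT NULL,
    saledate DATE          NOT NULL,
    amount   DECIMAL(10,2) NOT NULL
);

-- Data aggregation: one summary row per region.
SELECT region,
       SUM(amount) AS total_amount,
       AVG(amount) AS average_amount
FROM sales
GROUP BY region;

-- Trend analysis: window functions keep every row and add
-- a running total plus the change from the previous sale.
SELECT region,
       saledate,
       amount,
       SUM(amount) OVER (PARTITION BY region
                         ORDER BY saledate
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
       amount - LAG(amount) OVER (PARTITION BY region
                                  ORDER BY saledate) AS change_from_previous
FROM sales;
```

GROUP BY collapses the rows of each region into a single summary row, while the OVER clause keeps every row visible, which is why window functions suit tracking change over time.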
Database Concepts
• Tables: Collections of related data stored in rows and columns.
• Rows (Records): Individual data entries in a table.
• Columns (Fields): Specific attributes of the data.
Role of SQL in Querying Databases
SQL is used to interact with databases in three main ways:
1. Retrieving Data:
➢ SQL helps fetch relevant data using SELECT queries.
• Basic SQL Syntax
SELECT column_name(s) FROM table_name WHERE condition;
➢ SELECT – Choose columns to retrieve.
➢ FROM – Specify the table.
➢ WHERE – Filter records based on conditions.
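As a concrete instance of this syntax, the query below reads from the customer table that is created in step 3 further down. It is a minimal sketch that assumes the table exists and contains rows; the filter value 'Malaysia' is taken from that table's default for country.

```sql
-- SELECT: the columns to return.
-- FROM:   the table to read.
-- WHERE:  keep only rows whose country is 'Malaysia'.
SELECT firstname, lastname
FROM customer
WHERE country = 'Malaysia';
```

Leaving out the WHERE clause would return every row in the table.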
1. Create database
Code: create database sample2
2. Use the database
Code: use sample2
3. Create table
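Code: create table customer
(
    customerid int identity(1,1) primary key,                     -- auto-numbered: starts at 1, increases by 1
    customernumber int not null unique check (customernumber>0),  -- required, no duplicates, must be positive
    lastname varchar(30) not null,
    firstname varchar(30) not null,
    areacode int default 71000,                                   -- used when no area code is supplied
    address varchar(50),
    country varchar(50) default 'Malaysia'                        -- used when no country is supplied
)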
8. Delete a column
Code: alter table customer
drop column phonenumber
9. Delete records from a table (without a WHERE clause, every row is deleted)
Code: delete
from customer
where country='Thailand'
10. Delete table
Code: drop table customer
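Because steps 9 and 10 are destructive, it can help to rehearse a delete before making it permanent. The following is a minimal sketch in T-SQL (the dialect suggested by identity(1,1) in step 3), assuming the customer table from step 3 still exists. The sample rows are invented for illustration only.

```sql
-- Invented sample rows. The second INSERT omits areacode and country,
-- so the defaults from step 3 (71000 and 'Malaysia') are applied.
INSERT INTO customer (customernumber, lastname, firstname, country)
VALUES (1001, 'Tan', 'Mei', 'Thailand');

INSERT INTO customer (customernumber, lastname, firstname)
VALUES (1002, 'Lee', 'Adam');

-- Preview exactly which rows the DELETE would remove.
SELECT customerid, customernumber, lastname, firstname, country
FROM customer
WHERE country = 'Thailand';

-- Run the DELETE inside a transaction so it can be undone.
BEGIN TRANSACTION;

DELETE FROM customer
WHERE country = 'Thailand';

-- Number of rows affected by the DELETE just above.
SELECT @@ROWCOUNT AS rows_deleted;

-- Undo the delete; use COMMIT TRANSACTION instead to keep it.
ROLLBACK TRANSACTION;
```

Previewing with a SELECT that uses the same WHERE clause is a cheap check that the condition matches only the intended rows. Note that drop table removes the table definition along with all of its rows.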
Data Manipulation
JOINS: Combining Data from Multiple Tables
JOIN is used to combine rows from two or more tables based on a
related column. This helps in fetching meaningful insights from
multiple datasets.
Types of SQL JOINs
JOIN Type          Description
INNER JOIN         Returns only the rows that have a match in both tables.
LEFT JOIN          Returns all rows from the left table and the matching rows from the right table.
RIGHT JOIN         Returns all rows from the right table and the matching rows from the left table.
FULL OUTER JOIN    Returns all rows from both tables, matched where possible; columns from the side with no match are NULL.
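The join types can be compared side by side with a small sketch. It assumes the customer table from step 3 exists; the orders table and its columns are hypothetical and are introduced here only to have a second table to join against.

```sql
-- Hypothetical second table, related to customer through customerid.
CREATE TABLE orders (
    orderid    INT IDENTITY(1,1) PRIMARY KEY,
    customerid INT NULL,               -- may be NULL or refer to no existing customer
    total      DECIMAL(10,2) NOT NULL
);

-- INNER JOIN: only customers that have at least one order.
SELECT c.customerid, c.lastname, o.orderid, o.total
FROM customer AS c
INNER JOIN orders AS o
    ON o.customerid = c.customerid;

-- LEFT JOIN: every customer; order columns are NULL when there is no order.
SELECT c.customerid, c.lastname, o.orderid, o.total
FROM customer AS c
LEFT JOIN orders AS o
    ON o.customerid = c.customerid;

-- RIGHT JOIN: every order; customer columns are NULL when no customer matches.
SELECT c.customerid, c.lastname, o.orderid, o.total
FROM customer AS c
RIGHT JOIN orders AS o
    ON o.customerid = c.customerid;

-- FULL OUTER JOIN: every customer and every order, matched where possible.
SELECT c.customerid, c.lastname, o.orderid, o.total
FROM customer AS c
FULL OUTER JOIN orders AS o
    ON o.customerid = c.customerid;
```

The four queries differ only in the join keyword, so running them against the same data shows directly which unmatched rows each type keeps.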