SQL
CONTENTS
Data Definition
Basic Query Structure
Set Operations
Aggregate Functions
Null Values
Nested Subqueries
Complex Queries
Views
Joined Relations
HISTORY
IBM Sequel language developed as part of System R project at the IBM San
Jose Research Laboratory
Renamed Structured Query Language (SQL)
ANSI and ISO standard SQL:
SQL-86
SQL-89
SQL-92
SQL:1999 (language name became Y2K compliant!)
SQL:2003
Commercial systems offer most, if not all, SQL-92 features, plus varying
feature sets from later standards and special proprietary features.
Not all examples here may work on your particular system.
DATA DEFINITION LANGUAGE
Allows the specification of:
The schema for each relation, including attribute types.
Integrity constraints
CREATE TABLE CONSTRUCT
An SQL relation is defined using the create table command:
create table r (A1 D1, A2 D2, ..., An Dn,
(integrity-constraint1),
...,
(integrity-constraintk))
r is the name of the relation
each Ai is an attribute name in the schema of relation r
Di is the data type of attribute Ai
Example:
create table branch
(branch_name char(15),
branch_city char(30),
assets integer)
DOMAIN TYPES IN SQL
char(n). Fixed length character string, with user-specified length n.
varchar(n). Variable length character strings, with user-specified
maximum length n.
int. Integer (a finite subset of the integers that is machine-dependent).
smallint. Small integer (a machine-dependent subset of the integer domain
type).
numeric(p,d). Fixed point number, with user-specified precision of p
digits, with d digits to the right of the decimal point.
real, double precision. Floating point and double-precision floating point
numbers, with machine-dependent precision.
float(n). Floating point number, with user-specified precision of at least n
digits.
INTEGRITY CONSTRAINTS ON TABLES
not null
primary key (A1, ..., An )
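The two constraints above can be tried out with a small sketch. It assumes SQLite through Python's built-in sqlite3 module as a stand-in database; the sample rows and the helper name rejected are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    create table branch
        (branch_name char(15) not null,
         branch_city char(30),
         assets integer,
         primary key (branch_name))
""")
conn.execute("insert into branch values ('Perryridge', 'Horseneck', 1700000)")

def rejected(sql):
    """Hypothetical helper: True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A second tuple with the same primary key value is refused.
duplicate_key_rejected = rejected(
    "insert into branch values ('Perryridge', 'Brooklyn', 10)")
# A null in a not null attribute is refused as well.
null_name_rejected = rejected(
    "insert into branch values (null, 'Brooklyn', 10)")

row_count = conn.execute("select count(*) from branch").fetchone()[0]
print(duplicate_key_rejected, null_name_rejected, row_count)
```

Other systems report constraint violations through their own error types, so the exception class here is specific to sqlite3.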
DROP AND ALTER TABLE CONSTRUCTS
The drop table command deletes all information about the dropped relation
from the database.
The alter table command is used to add attributes to an existing relation:
alter table r add A D
where A is the name of the attribute to be added to relation r and D is the
domain of A.
All tuples in the relation are assigned null as the value for the new attribute.
The alter table command can also be used to drop attributes of a relation:
alter table r drop A
where A is the name of an attribute of relation r
Dropping of attributes not supported by many databases
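As a rough illustration of alter table and drop table, the sketch below assumes SQLite through Python's sqlite3 module; the branch row is invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table branch (branch_name char(15), branch_city char(30))")
conn.execute("insert into branch values ('Perryridge', 'Horseneck')")

# alter table r add A D: add attribute assets with domain integer.
conn.execute("alter table branch add assets integer")

# The existing tuple carries null (None in Python) for the new attribute.
rows = conn.execute("select branch_name, assets from branch").fetchall()
print(rows)

# drop table removes the relation itself, not just its tuples.
conn.execute("drop table branch")
remaining = conn.execute(
    "select count(*) from sqlite_master where name = 'branch'").fetchone()[0]
print(remaining)
```

The sqlite_master lookup is SQLite's own catalog table; other systems expose their catalogs differently.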
BASIC QUERY STRUCTURE
A typical SQL query has the form:
select A1, A2, ..., An
from r1, r2, ..., rm
where P
Ai represents an attribute
ri represents a relation
P is a predicate.
This query is equivalent to the relational algebra expression
$\Pi_{A_1, A_2, \ldots, A_n}(\sigma_P(r_1 \times r_2 \times \cdots \times r_m))$
THE SELECT CLAUSE
The select clause lists the attributes desired in the result of a
query
corresponds to the projection operation of the relational algebra
Example: find the names of all branches in the loan relation:
select branch_name
from loan
In the relational algebra, the query would be:
$\Pi_{\text{branch\_name}}(\text{loan})$
NOTE: SQL names are case insensitive (i.e., you may use
upper- or lower-case letters).
E.g. Branch_Name ≡ BRANCH_NAME ≡ branch_name
Some people use upper case wherever we use bold font.
THE SELECT CLAUSE (CONTD..)
SQL allows duplicates in relations as well as in query results.
To force the elimination of duplicates, insert the keyword distinct
after select.
Find the names of all branches in the loan relation, and remove
duplicates
select distinct branch_name from loan
The keyword all specifies that duplicates not be removed.
select all branch_name from loan
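The difference between distinct and all is easy to see on a few rows. This is a minimal sketch assuming SQLite through Python's sqlite3 module; the loan tuples are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table loan (loan_number char(10), branch_name char(15), amount integer)")
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L-11", "Round Hill", 900),
    ("L-14", "Downtown", 1500),
    ("L-15", "Perryridge", 1500),
    ("L-16", "Perryridge", 1300),
    ("L-17", "Downtown", 1000),
])

# all (the default) keeps one output row per loan tuple.
with_duplicates = conn.execute("select all branch_name from loan").fetchall()
# distinct collapses repeated branch names into one row each.
without_duplicates = conn.execute(
    "select distinct branch_name from loan").fetchall()

print(len(with_duplicates), len(without_duplicates))
```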
THE SELECT CLAUSE (CONTD..)
An asterisk in the select clause denotes “all attributes”
select * from loan
The select clause can contain arithmetic expressions
THE WHERE CLAUSE
The where clause specifies conditions that the result must satisfy
Corresponds to the selection predicate of the relational algebra.
To find all loan numbers for loans made at the Perryridge branch
with loan amounts greater than $1200:
select loan_number from loan
where branch_name = 'Perryridge' and amount > 1200
Comparison results can be combined using the logical connectives
and, or, and not.
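The where clause and its connectives can be exercised as follows. This is a sketch assuming SQLite through Python's sqlite3 module; the loan tuples and the second query are illustrative additions, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table loan (loan_number char(10), branch_name char(15), amount integer)")
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L-11", "Round Hill", 900),
    ("L-14", "Downtown", 1500),
    ("L-15", "Perryridge", 1500),
    ("L-16", "Perryridge", 1300),
    ("L-23", "Perryridge", 1000),
])

# Both conditions must hold: Perryridge loans above 1200.
perryridge_large = sorted(r[0] for r in conn.execute("""
    select loan_number from loan
    where branch_name = 'Perryridge' and amount > 1200
"""))

# not and or: loans that are not at Perryridge, or are worth at most 1000.
others = sorted(r[0] for r in conn.execute("""
    select loan_number from loan
    where not (branch_name = 'Perryridge') or amount <= 1000
"""))

print(perryridge_large, others)
```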
THE FROM CLAUSE
The from clause lists the relations involved in the query
Corresponds to the Cartesian product operation of the relational algebra.
Find the Cartesian product borrower × loan
select * from borrower, loan
Find the name, loan number and loan amount of all customers
having a loan at the Perryridge branch.
select customer_name, borrower.loan_number, amount
from borrower, loan
where borrower.loan_number = loan.loan_number and
branch_name = 'Perryridge'
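To see how the from clause builds a Cartesian product that the where clause then filters, consider the sketch below. It assumes SQLite through Python's sqlite3 module, and the borrower and loan tuples are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table borrower (customer_name char(20), loan_number char(10))")
conn.execute(
    "create table loan (loan_number char(10), branch_name char(15), amount integer)")
conn.executemany("insert into borrower values (?, ?)", [
    ("Adams", "L-16"), ("Hayes", "L-15"), ("Jones", "L-17"),
])
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L-15", "Perryridge", 1500),
    ("L-16", "Perryridge", 1300),
    ("L-17", "Downtown", 1000),
])

# With no where clause, every borrower tuple pairs with every loan tuple.
product = conn.execute("select * from borrower, loan").fetchall()

# The where clause keeps only matching pairs at the Perryridge branch.
perryridge = sorted(conn.execute("""
    select customer_name, borrower.loan_number, amount
    from borrower, loan
    where borrower.loan_number = loan.loan_number and
          branch_name = 'Perryridge'
"""))

print(len(product), perryridge)
```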
THE RENAME OPERATION
SQL allows renaming relations and attributes using the
as clause:
old-name as new-name
E.g. Find the name, loan number and loan amount of all
customers; rename the column name loan_number as
loan_id.
select customer_name, borrower.loan_number as
loan_id, amount
from borrower, loan
where borrower.loan_number = loan.loan_number
TUPLE VARIABLES
Tuple variables are defined in the from clause via the
use of the as clause.
Find the customer names and their loan numbers and
amount for all customers having a loan at some branch.
select customer_name, T.loan_number, S.amount
from borrower as T, loan as S
where T.loan_number = S.loan_number
TUPLE VARIABLES (CONTD..)
Find the names of all branches that have greater assets
than some branch located in Brooklyn
select distinct T.branch_name
from branch as T, branch as S
where T.assets > S.assets and S.branch_city =
'Brooklyn'
Keyword as is optional and may be omitted
borrower as T ≡ borrower T
Some databases, such as Oracle, require as to be omitted
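The self-join above, with and without the keyword as, can be checked with a short sketch. It assumes SQLite through Python's sqlite3 module; the branch tuples are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table branch (branch_name char(15), branch_city char(30), assets integer)")
conn.executemany("insert into branch values (?, ?, ?)", [
    ("Brighton", "Brooklyn", 7100000),
    ("Downtown", "Brooklyn", 9000000),
    ("Perryridge", "Horseneck", 1700000),
    ("Round Hill", "Horseneck", 8000000),
])

# T and S are two tuple variables ranging over the same relation.
with_as = sorted(r[0] for r in conn.execute("""
    select distinct T.branch_name
    from branch as T, branch as S
    where T.assets > S.assets and S.branch_city = 'Brooklyn'
"""))

# The same query with the keyword as omitted.
without_as = sorted(r[0] for r in conn.execute("""
    select distinct T.branch_name
    from branch T, branch S
    where T.assets > S.assets and S.branch_city = 'Brooklyn'
"""))

print(with_as, without_as)
```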
STRING OPERATIONS
SQL includes a string-matching operator for comparisons on character
strings. The operator “like” uses patterns that are described using two
special characters:
percent (%). The % character matches any substring.
underscore (_). The _ character matches any single character.
Find the names of all customers whose street includes the substring “Main”.
select customer_name
from customer
where customer_street like '%Main%'
Match the name “Main%”
like 'Main\%' escape '\'
SQL supports a variety of string operations such as
concatenation (using “||”)
converting from upper to lower case (and vice versa)
finding string length, extracting substrings, etc.
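The pattern and string operations listed above might look like this in practice. The sketch assumes SQLite through Python's sqlite3 module; the customer tuples are invented, and the function names upper, lower, length and substr are SQLite's spellings, which other systems may not share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table customer (customer_name char(20), customer_street char(30))")
conn.executemany("insert into customer values (?, ?)", [
    ("Jones", "Main"),
    ("Smith", "North Main St"),
    ("Hayes", "Park"),
    ("Curry", "Main%"),
])

# % matches any substring, so this finds every street containing "Main".
contains_main = sorted(r[0] for r in conn.execute(
    "select customer_name from customer where customer_street like '%Main%'"))

# With an escape character, \% stands for a literal percent sign.
literal_percent = [r[0] for r in conn.execute(
    r"select customer_name from customer "
    r"where customer_street like 'Main\%' escape '\'")]

# Concatenation with ||, plus case conversion, length and substring.
label = conn.execute(
    "select customer_name || ' @ ' || customer_street "
    "from customer where customer_name = 'Jones'").fetchone()[0]
pieces = conn.execute(
    "select upper(customer_street), lower(customer_street), "
    "length(customer_street), substr(customer_street, 1, 3) "
    "from customer where customer_name = 'Smith'").fetchone()

print(contains_main, literal_percent, label, pieces)
```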
ORDERING THE DISPLAY OF TUPLES
List in alphabetic order the names of all customers
having a loan in Perryridge branch
select distinct customer_name
from borrower, loan
where borrower.loan_number = loan.loan_number
and
branch_name = 'Perryridge'
order by customer_name
We may specify desc for descending order or asc for
ascending order, for each attribute; ascending order is the
default.
Example: order by customer_name desc
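Ascending and descending order can be compared with a small sketch, assuming SQLite through Python's sqlite3 module and invented borrower and loan tuples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table borrower (customer_name char(20), loan_number char(10))")
conn.execute("create table loan (loan_number char(10), branch_name char(15))")
conn.executemany("insert into borrower values (?, ?)", [
    ("Hayes", "L-15"), ("Adams", "L-16"), ("Jones", "L-17"), ("Adams", "L-15"),
])
conn.executemany("insert into loan values (?, ?)", [
    ("L-15", "Perryridge"), ("L-16", "Perryridge"), ("L-17", "Downtown"),
])

query = """
    select distinct customer_name
    from borrower, loan
    where borrower.loan_number = loan.loan_number and
          branch_name = 'Perryridge'
    order by customer_name {direction}
"""

# Ascending order is the default; asc only makes it explicit.
ascending = [r[0] for r in conn.execute(query.format(direction="asc"))]
descending = [r[0] for r in conn.execute(query.format(direction="desc"))]

print(ascending, descending)
```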
DUPLICATES
In relations with duplicates, SQL can define how many
copies of tuples appear in the result.
Multiset versions of some of the relational algebra
operators – given multiset relations r1 and r2:
1. $\sigma_\theta(r_1)$: If there are $c_1$ copies of tuple $t_1$ in $r_1$, and $t_1$ satisfies
selection $\sigma_\theta$, then there are $c_1$ copies of $t_1$ in $\sigma_\theta(r_1)$.
2. $\Pi_A(r_1)$: For each copy of tuple $t_1$ in $r_1$, there is a copy of tuple
$\Pi_A(t_1)$ in $\Pi_A(r_1)$, where $\Pi_A(t_1)$ denotes the projection of the
single tuple $t_1$.
3. $r_1 \times r_2$: If there are $c_1$ copies of tuple $t_1$ in $r_1$ and $c_2$ copies of
tuple $t_2$ in $r_2$, there are $c_1 \times c_2$ copies of the tuple $t_1 . t_2$ in $r_1 \times r_2$.
DUPLICATES (CONTD..)
Example: Suppose multiset relations r1 (A, B) and r2 (C)
are as follows:
r1 = {(1, a), (2, a)}   r2 = {(2), (3), (3)}
Then $\Pi_B(r_1)$ would be {(a), (a)}, while $\Pi_B(r_1) \times r_2$ would be
{(a,2), (a,2), (a,3), (a,3), (a,3), (a,3)}
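The multiset example above can be reproduced directly, since SQL keeps duplicates unless told otherwise. This sketch assumes SQLite through Python's sqlite3 module; the table and column names mirror r1(A, B) and r2(C).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table r1 (A integer, B char(1))")
conn.execute("create table r2 (C integer)")
conn.executemany("insert into r1 values (?, ?)", [(1, "a"), (2, "a")])
conn.executemany("insert into r2 values (?)", [(2,), (3,), (3,)])

# Projection on B keeps both copies of (a).
projection = conn.execute("select B from r1").fetchall()

# Projection followed by Cartesian product: 2 copies times 3 tuples.
product = sorted(conn.execute("select B, C from r1, r2"))

print(projection, product)
```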
SQL duplicate semantics:
AGGREGATE FUNCTIONS (CONTD..)
Find the average account balance at the Perryridge
branch.
select avg (balance)
from account
where branch_name = 'Perryridge'
Find the number of tuples in the customer relation.
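Both aggregate requests above can be sketched as follows, assuming SQLite through Python's sqlite3 module. The account and customer tuples are invented, and the count(*) query is one common way to count tuples rather than a quotation from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (account_number char(10), branch_name char(15), balance integer)")
conn.execute("create table customer (customer_name char(20))")
conn.executemany("insert into account values (?, ?, ?)", [
    ("A-101", "Downtown", 500),
    ("A-102", "Perryridge", 400),
    ("A-201", "Perryridge", 900),
    ("A-215", "Mianus", 700),
])
conn.executemany("insert into customer values (?)", [
    ("Hayes",), ("Jones",), ("Smith",),
])

# Average balance over the Perryridge accounts only.
average = conn.execute("""
    select avg(balance)
    from account
    where branch_name = 'Perryridge'
""").fetchone()[0]

# count(*) counts tuples, duplicates included.
customers = conn.execute("select count(*) from customer").fetchone()[0]

print(average, customers)
```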
AGGREGATE FUNCTIONS – HAVING CLAUSE
Find the names of all branches where the average
account balance is more than $1,200.
select branch_name, avg (balance)
from account
group by branch_name
having avg (balance) > 1200
Note: Predicates in the having clause are applied after the
formation of groups, whereas predicates in the where
clause are applied before forming groups.
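The contrast between having and where shows up clearly on a small table. The sketch below assumes SQLite through Python's sqlite3 module; the account tuples and the second query are illustrative additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (account_number char(10), branch_name char(15), balance integer)")
conn.executemany("insert into account values (?, ?, ?)", [
    ("A-1", "Perryridge", 1000),
    ("A-2", "Perryridge", 2000),
    ("A-3", "Downtown", 500),
    ("A-4", "Downtown", 1500),
    ("A-5", "Brighton", 3000),
])

# having filters whole groups by their average.
by_having = sorted(conn.execute("""
    select branch_name, avg(balance)
    from account
    group by branch_name
    having avg(balance) > 1200
"""))

# where filters individual tuples first, so the groups differ.
by_where = sorted(conn.execute("""
    select branch_name, avg(balance)
    from account
    where balance > 1200
    group by branch_name
"""))

print(by_having, by_where)
```

Downtown fails the having test (average 1000) but survives the where version, because only its 1500 account reaches the grouping step.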
NESTED SUBQUERIES
SQL provides a mechanism for the nesting of subqueries.
A subquery is a select-from-where expression that is
nested within another query.
A common use of subqueries is to perform tests for set
membership, set comparisons, and set cardinality.
“IN” CONSTRUCT
Find all customers who have both an account and a loan
at the bank.
select distinct customer_name
from borrower
where customer_name in (select customer_name
from depositor )
Find all customers who have a loan at the bank but do
not have an account at the bank.
select distinct customer_name
from borrower
where customer_name not in (select customer_name
from depositor )
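Set membership with in and not in can be tried on two small relations. This is a minimal sketch assuming SQLite through Python's sqlite3 module, with invented customer names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table borrower (customer_name char(20), loan_number char(10))")
conn.execute("create table depositor (customer_name char(20), account_number char(10))")
conn.executemany("insert into borrower values (?, ?)", [
    ("Adams", "L-16"), ("Hayes", "L-15"), ("Jones", "L-17"),
])
conn.executemany("insert into depositor values (?, ?)", [
    ("Hayes", "A-102"), ("Jones", "A-101"), ("Lindsay", "A-222"),
])

# Customers with both a loan and an account.
both = sorted(r[0] for r in conn.execute("""
    select distinct customer_name
    from borrower
    where customer_name in (select customer_name from depositor)
"""))

# Customers with a loan but no account.
loan_only = sorted(r[0] for r in conn.execute("""
    select distinct customer_name
    from borrower
    where customer_name not in (select customer_name from depositor)
"""))

print(both, loan_only)
```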
EXAMPLE QUERY
Find all customers who have both an account and a loan at the
Perryridge branch.
select distinct customer_name
from borrower, loan
where borrower.loan_number = loan.loan_number and
branch_name = 'Perryridge' and
(branch_name, customer_name ) in
(select branch_name, customer_name
from depositor, account
where depositor.account_number =
account.account_number )
Note: The above query can be written in a much simpler manner. The
formulation above is simply to illustrate SQL features.
“SOME” CONSTRUCT
Find all branches that have greater assets than some branch
located in Brooklyn.
select distinct T.branch_name
from branch as T, branch as S
where T.assets > S.assets and
S.branch_city = 'Brooklyn'
Same query using > some clause
select branch_name
from branch
where assets > some
(select assets
from branch
where branch_city = 'Brooklyn')
“ALL” CONSTRUCT
Find the names of all branches that have greater assets
than all branches located in Brooklyn.
select branch_name
from branch
where assets > all
(select assets
from branch
where branch_city = 'Brooklyn')
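SQLite does not accept the > some and > all comparison syntax, so the sketch below expresses the same two conditions with exists and not exists instead. It assumes Python's sqlite3 module, invented branch tuples, and no null values in assets (nulls would make the rewrites differ from the originals).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table branch (branch_name char(15), branch_city char(30), assets integer)")
conn.executemany("insert into branch values (?, ?, ?)", [
    ("Brighton", "Brooklyn", 7100000),
    ("Downtown", "Brooklyn", 9000000),
    ("Perryridge", "Horseneck", 1700000),
    ("Round Hill", "Horseneck", 9500000),
])

# assets > some (...): at least one Brooklyn branch has smaller assets.
greater_than_some = sorted(r[0] for r in conn.execute("""
    select T.branch_name
    from branch as T
    where exists (select 1 from branch as S
                  where S.branch_city = 'Brooklyn' and T.assets > S.assets)
"""))

# assets > all (...): no Brooklyn branch has assets at least as large.
greater_than_all = sorted(r[0] for r in conn.execute("""
    select T.branch_name
    from branch as T
    where not exists (select 1 from branch as S
                      where S.branch_city = 'Brooklyn' and T.assets <= S.assets)
"""))

print(greater_than_some, greater_than_all)
```

On a system that supports the some and all keywords, the queries as written on the slides should give the same answers for this data.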
“EXISTS” CONSTRUCT
Find all customers who have an account at all branches located in
Brooklyn.
select distinct S.customer_name
from depositor as S
where not exists (
(select branch_name
from branch
where branch_city = 'Brooklyn')
except
(select R.branch_name
from depositor as T, account as R
where T.account_number = R.account_number and
S.customer_name = T.customer_name ))
Note that $X - Y = \emptyset \iff X \subseteq Y$
Note: Cannot write this query using = all and its variants
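The not exists / except formulation can be run on a small instance. The sketch assumes SQLite through Python's sqlite3 module and invented tuples; the parentheses around the two operands of except are dropped because SQLite does not accept them there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table branch (branch_name char(15), branch_city char(30))")
conn.execute("create table account (account_number char(10), branch_name char(15))")
conn.execute("create table depositor (customer_name char(20), account_number char(10))")
conn.executemany("insert into branch values (?, ?)", [
    ("Brighton", "Brooklyn"), ("Downtown", "Brooklyn"), ("Perryridge", "Horseneck"),
])
conn.executemany("insert into account values (?, ?)", [
    ("A-1", "Brighton"), ("A-2", "Downtown"), ("A-3", "Brighton"), ("A-4", "Perryridge"),
])
conn.executemany("insert into depositor values (?, ?)", [
    ("Hayes", "A-1"), ("Hayes", "A-2"), ("Jones", "A-3"), ("Jones", "A-4"),
])

# X = Brooklyn branches, Y = branches where customer S has an account.
# S qualifies when X except Y is empty, i.e. X is contained in Y.
everywhere_in_brooklyn = [r[0] for r in conn.execute("""
    select distinct S.customer_name
    from depositor as S
    where not exists (
        select branch_name
        from branch
        where branch_city = 'Brooklyn'
        except
        select R.branch_name
        from depositor as T, account as R
        where T.account_number = R.account_number and
              S.customer_name = T.customer_name)
""")]

print(everywhere_in_brooklyn)
```

Jones is excluded because the Downtown branch remains in X except Y for that customer.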
ABSENCE OF DUPLICATE TUPLES
The unique construct tests whether a subquery has any
duplicate tuples in its result.
Find all customers who have at most one account at the
Perryridge branch.
select T.customer_name
from depositor as T
where unique (
select R.customer_name
from account, depositor as R
where T.customer_name = R.customer_name and
R.account_number = account.account_number
and
account.branch_name = 'Perryridge')
EXAMPLE QUERY
Find all customers who have at least two accounts at the
Perryridge branch.
select distinct T.customer_name
from depositor as T
where not unique (
select R.customer_name
from account, depositor as R
where T.customer_name = R.customer_name and
R.account_number = account.account_number and
account.branch_name = 'Perryridge')
Variable from outer level is known as a correlation variable
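Few systems implement the unique predicate, and SQLite is not one of them, so the sketch below substitutes a correlated count(*) for it. It assumes Python's sqlite3 module and invented tuples. Because the subquery returns the same customer_name for every row, it has duplicates exactly when its row count is two or more.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number char(10), branch_name char(15))")
conn.execute("create table depositor (customer_name char(20), account_number char(10))")
conn.executemany("insert into account values (?, ?)", [
    ("A-1", "Perryridge"), ("A-2", "Perryridge"),
    ("A-3", "Perryridge"), ("A-4", "Downtown"),
])
conn.executemany("insert into depositor values (?, ?)", [
    ("Hayes", "A-1"), ("Hayes", "A-2"), ("Jones", "A-3"), ("Smith", "A-4"),
])

# Correlated subquery: T is the correlation variable from the outer level.
query = """
    select distinct T.customer_name
    from depositor as T
    where (select count(*)
           from account, depositor as R
           where T.customer_name = R.customer_name and
                 R.account_number = account.account_number and
                 account.branch_name = 'Perryridge') {test}
"""

at_most_one = sorted(r[0] for r in conn.execute(query.format(test="<= 1")))
at_least_two = sorted(r[0] for r in conn.execute(query.format(test=">= 2")))

print(at_most_one, at_least_two)
```

Smith appears in the first result because an empty subquery result has no duplicates.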
MODIFICATION OF THE DATABASE – DELETION
Delete all account tuples at the Perryridge branch
delete from account
where branch_name = 'Perryridge'
Delete all accounts at every branch located in the city
‘Needham’.
delete from account
where branch_name in (select branch_name
from branch
where branch_city = 'Needham')
EXAMPLE QUERY
Delete the record of all accounts with balances below the
average at the bank.
delete from account
where balance < (select avg (balance )
from account )
Problem: as we delete tuples from account, the average balance
changes
Solution used in SQL:
1. First, compute avg balance and find all tuples to delete
2. Next, delete all tuples found above (without recomputing avg or
retesting the tuples)
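The two-step behaviour described above can be observed on three accounts. This is a sketch assuming SQLite through Python's sqlite3 module, with balances chosen for illustration so that recomputing the average would change the outcome.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (account_number char(10), branch_name char(15), balance integer)")
conn.executemany("insert into account values (?, ?, ?)", [
    ("A-1", "Perryridge", 100),
    ("A-2", "Perryridge", 500),
    ("A-3", "Downtown", 600),
])

# The average is 400, so only the account with balance 100 qualifies.
conn.execute("""
    delete from account
    where balance < (select avg(balance) from account)
""")

remaining = sorted(r[0] for r in conn.execute("select balance from account"))
print(remaining)
```

After the deletion the average of the remaining balances is 550, yet the 500 account stays, because the tuples are not retested against a recomputed average.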
MODIFICATION OF THE DATABASE – INSERTION
Add a new tuple to account
insert into account
values ('A-9732', 'Perryridge', 1200)
or equivalently
insert into account (branch_name, balance,
account_number)
values ('Perryridge', 1200, 'A-9732')
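That the two insert forms are equivalent can be checked by running both and comparing the stored tuples. The sketch assumes SQLite through Python's sqlite3 module and a three-attribute account schema in the order used by the first form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (account_number char(10), branch_name char(15), balance integer)")

# Values listed in the order of the schema.
conn.execute("insert into account values ('A-9732', 'Perryridge', 1200)")

# Values listed in the order of the named attributes.
conn.execute("""
    insert into account (branch_name, balance, account_number)
    values ('Perryridge', 1200, 'A-9732')
""")

rows = conn.execute(
    "select account_number, branch_name, balance from account").fetchall()
print(rows)
```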
Add a new tuple to account with balance set to null
Increase all accounts with balances of $10,000 or less by 5%:
update account
set balance = balance * 1.05
where balance <= 10000
The order is important
Can be done better using the case statement (next slide)
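Why the order matters can be seen by running two update statements both ways. The sketch assumes SQLite through Python's sqlite3 module; the 6% statement is reconstructed from the task description (balances over $10,000 receive 6%), and the helper run and the two balances are invented for illustration.

```python
import sqlite3

six_percent = "update account set balance = balance * 1.06 where balance > 10000"
five_percent = "update account set balance = balance * 1.05 where balance <= 10000"

def run(statements):
    """Hypothetical helper: apply the updates in order to a fresh table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table account (account_number char(10), balance real)")
    conn.executemany("insert into account values (?, ?)", [
        ("A-1", 9600.0), ("A-2", 20000.0),
    ])
    for statement in statements:
        conn.execute(statement)
    return dict(conn.execute("select account_number, balance from account"))

# 6% first: each account is raised exactly once.
right_order = run([six_percent, five_percent])
# 5% first: A-1 crosses 10000 and is then raised a second time.
wrong_order = run([five_percent, six_percent])

print(right_order, wrong_order)
```

In the wrong order, account A-1 ends up with roughly 10684.80 instead of 10080.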
CASE STATEMENT FOR CONDITIONAL
UPDATES
Same query as before: Increase all accounts with
balances over $10,000 by 6%, all other accounts receive
5%.
update account
set balance = case
when balance <= 10000 then balance * 1.05
else balance * 1.06
end
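The case form can be run as a single statement, so each balance is tested once against its original value. This is a sketch assuming SQLite through Python's sqlite3 module and three invented balances, including one exactly at the $10,000 boundary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number char(10), balance real)")
conn.executemany("insert into account values (?, ?)", [
    ("A-1", 9600.0), ("A-2", 10000.0), ("A-3", 20000.0),
])

# One pass over the relation: the branch taken depends on the old balance.
conn.execute("""
    update account
    set balance = case
                      when balance <= 10000 then balance * 1.05
                      else balance * 1.06
                  end
""")

balances = dict(conn.execute("select account_number, balance from account"))
print(balances)
```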