Chap_5
◼ SQL has long been the standard for commercial relational DBMSs
◼ A joint effort by:
◼ ANSI (the American National Standards Institute) and
◼ ISO (the International Organization for Standardization)
◼ SQL is a comprehensive database language which contains
statements for data definition, query, and update.
◼ Hence, SQL is both a DDL and a DML
◼ It also has facilities for specifying security and authorization, for defining
integrity constraints, and for specifying transaction controls.
◼ It also has rules for embedding SQL statements into a general-purpose
programming language such as C or Pascal.
Data Definition Language (DDL)
❑ The SQL DDL allows the specification of not only a set of relations but
also information about each relation, including:
The schema for each relation.
The domain of values associated with each attribute.
Integrity constraints
Also other information such as:
The set of indices (which provide fast access to data items) to be
maintained for each relation.
Security and authorization information for each relation.
The physical storage structure of each relation on disk.
Create Table Construct
An SQL relation is defined using the create table command:
create table r (A1 D1, A2 D2, ..., An Dn,
(integrity-constraint1),
...,
(integrity-constraintk))
r is the name of the relation
each Ai is an attribute name in the schema of relation r
Di is the data type of values in the domain of attribute Ai
Example:
create table branch
(branch-name char(15) not null,
branch-city varchar(30),
assets integer)
Integrity Constraints in Create Table
not null
primary key (A1, ..., An)
foreign key (Am, ..., An) references r
check (P), where P is a predicate
Example: Declare branch-name as the primary key for branch
and ensure that the values of assets are non-negative.
create table branch
(branch-name char(15) not null,
branch-city char(30),
assets integer,
primary key (branch-name),
check (assets >= 0))
◼ From SQL-92 onwards, a primary key declaration on an attribute automatically
ensures not null.
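The constraints above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module rather than any particular commercial DBMS, replaces the hyphens in the slide's attribute names with underscores (an unquoted hyphen is not valid in an SQL identifier), and uses made-up sample rows. The helper try_insert is hypothetical.

```python
import sqlite3

# In-memory SQLite database; underscores replace the hyphens used on the slides.
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table branch (
        branch_name char(15) not null,
        branch_city char(30),
        assets      integer,
        primary key (branch_name),
        check (assets >= 0)
    )
""")
conn.execute("insert into branch values ('Perryridge', 'Horseneck', 1700000)")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("insert into branch values (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

negative_ok = try_insert(("Downtown", "Brooklyn", -5))      # violates the check
duplicate_ok = try_insert(("Perryridge", "Brooklyn", 100))  # violates the primary key
valid_ok = try_insert(("Downtown", "Brooklyn", 9000000))    # satisfies both
print(negative_ok, duplicate_ok, valid_ok)
```

Only the last row is accepted; the other two are rejected by the check and primary key constraints respectively.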
Domain Types in SQL
char(n). Fixed length character string, with user-specified length n.
varchar(n). Variable length character strings, with user-specified maximum length n.
int. Integer (a finite subset of the integers that is machine-dependent).
smallint. Small integer (a machine-dependent subset of the integer domain type).
numeric(p,d). Fixed point number, with user-specified precision of p digits, with d digits
to the right of the decimal point. E.g., numeric(3,1) can store 44.5 exactly, but not 444.5 or 0.32.
real, double precision. Floating point and double-precision floating point numbers, with
machine-dependent precision.
float(n). Floating point number, with user-specified precision of at least n digits.
Null values are allowed in all the domain types. Declaring an attribute to be not null
prohibits null values for that attribute.
create domain construct in SQL-92 creates user-defined domain types
create domain person-name char(20) not null
Date/Time Types in SQL (Cont.)
date. Dates, containing a (4-digit) year, month, and day
E.g. date '2016-7-27'
time. Time of day, in hours, minutes, and seconds.
E.g. time '09:00:30' time '09:00:30.75'
timestamp. Date plus time of day
E.g. timestamp '2016-7-27 09:00:30.75'
interval. Period of time
E.g. interval '1' day
Subtracting a date/time/timestamp value from another gives an interval value
Interval values can be added to date/time/timestamp values
Can extract values of individual fields from date/time/timestamp
E.g. extract (year from r.starttime)
Can cast string types to date/time/timestamp
E.g. cast (<string-valued-expression> as date)
SQL Data Definition for Part of the Bank Database
branch= (branch-name, branch-city, assets)
depositor=(customer-name, account-number)
borrower=(customer-name, loan-number)
And a Few More Relation Definitions
create table student (
ID varchar(5),
name varchar(20) not null,
dept_name varchar(20),
tot_cred numeric(3,0),
primary key (ID),
foreign key (dept_name) references department);
Data Manipulation Language (DML)
Basic SQL Structure
The basic structure of an SQL expression consists of three clauses:
select, from, and where
A typical SQL query has the form:
select A1, A2, …, An < attribute list >
from r1, r2, …, rm < table list >
where P < condition >
➢ Each Ai represents an attribute
➢ Each ri represents a relation
➢ P is a predicate
➢ Equivalent relational algebra expression:
Π_{A1, A2, …, An} (σ_P (r1 × r2 × … × rm))
• The result of an SQL query is a relation (without a name)
Banking Enterprise Schema
The Select Clause
The select clause is used to list the attributes desired in the result of a
query.
The from clause lists the relations to be scanned in the evaluation of the
expression.
Example: Find the names of all branches in the loan relation
select branch-name
from loan
The result is a relation consisting of a single attribute with the heading
branch-name
An asterisk (*) in the select clause denotes “all attributes”
select *
from loan
The result is a relation consisting of all attributes of loan.
The Select clause
SQL allows duplicates in relations as well as in query results
To force the elimination of duplicates, insert the keyword distinct after
select.
Example: Find the names of all branches in the loan relation, and
remove duplicates
select distinct branch-name
from loan
The keyword all specifies that duplicates are not removed
select all branch-name
from loan
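A minimal sketch of the difference between all and distinct, assuming Python's sqlite3 module, underscores in place of the hyphenated attribute names, and invented sample loans:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, branch_name text, amount integer)")
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L-11", "Round Hill", 900),
    ("L-14", "Downtown", 1500),
    ("L-17", "Downtown", 1000),   # second loan at the same branch
])

# "all" keeps one output row per loan; "distinct" keeps one row per branch name.
all_rows = conn.execute("select all branch_name from loan").fetchall()
distinct_rows = conn.execute("select distinct branch_name from loan").fetchall()
print(len(all_rows), len(distinct_rows))
```

With this data the first query returns three rows and the second returns two.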
The Select clause
select loan-number
from loan
where amount between 90000 and 100000
The Where clause
The from Clause
The from clause lists the relations involved in the query
Corresponds to the Cartesian product operation of the relational algebra.
Find the Cartesian product instructor × teaches
select *
from instructor, teaches
This generates every possible instructor-teaches pair, with all attributes from both
relations.
For common attributes (e.g., ID), the attributes in the resulting table are
renamed using the relation name (e.g., instructor.ID)
The Cartesian product is not very useful directly, but it is useful when combined with a
where-clause condition (the selection operation in relational algebra).
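The following sketch contrasts the bare Cartesian product with the same product restricted by a where clause. It assumes Python's sqlite3 module; the attribute course_id and the sample rows are illustrative assumptions, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text, name text)")
conn.execute("create table teaches (ID text, course_id text)")
conn.executemany("insert into instructor values (?, ?)",
                 [("10101", "Srinivasan"), ("12121", "Wu")])
conn.executemany("insert into teaches values (?, ?)",
                 [("10101", "CS-101"), ("10101", "CS-315"), ("12121", "FIN-201")])

# Cartesian product: every instructor row paired with every teaches row.
product = conn.execute("select * from instructor, teaches").fetchall()

# Adding a where clause keeps only the pairs that agree on ID.
matched = conn.execute("""
    select name, course_id
    from instructor, teaches
    where instructor.ID = teaches.ID
    order by name, course_id
""").fetchall()

print(len(product))  # 2 instructors x 3 teaches rows
print(matched)
```

The product has 2 × 3 = 6 rows, of which only the three with matching IDs survive the where clause.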
Cartesian Product
[Figure: the instructor and teaches relations]
Example involving multiple tables
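For all customers who have a loan from the bank, find their names, loan
numbers, and loan amounts.
select customer-name, borrower.loan-number, amount
from borrower, loan
where borrower.loan-number = loan.loan-number
Find the names, loan numbers, and loan amounts of all customers having a
loan at the Katy branch.
select customer-name, borrower.loan-number, amount
from borrower, loan
where borrower.loan-number = loan.loan-number and branch-name = 'Katy'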
Tuple variables are defined in the from clause via the use of the as
clause.
Find the customer names, loan numbers, and loan amounts for all customers
having a loan at some branch.
select customer-name, T.loan-number, S.amount
from borrower as T, loan as S
where T.loan-number = S.loan-number
Find the names of all branches that have greater assets than some
branch located in Brooklyn.
select distinct T.branch-name
from branch as T, branch as S
where T.assets > S.assets and S.branch-city = 'Brooklyn'
String Operations
SQL specifies strings by enclosing them in single quotes, for example:
'Houston'
A single quote character that is part of a string is specified by two
single quote characters; e.g., the string "It's right" is written 'It''s right'
The most commonly used operation on strings is pattern matching with the
operator like.
Patterns are described using two special characters:
percent (%). The % character matches any substring.
underscore (_). The _ character matches any single character.
String Operations
Find the names of all customers whose street includes the substring
“Main”.
select customer-name
from customer
where customer-street like '%Main%'
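A short sketch of both pattern characters, assuming Python's sqlite3 module, underscores in attribute names, and invented customer rows. Note that SQLite's like is case-insensitive for ASCII letters by default, which other systems may not share; the sample data is chosen so that this does not affect the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table customer (customer_name text, customer_street text)")
conn.executemany("insert into customer values (?, ?)", [
    ("Jones", "Main"),
    ("Smith", "North Main Street"),
    ("Hayes", "Park"),
    ("Adams", "Spring"),
])

# % matches any substring, so '%Main%' finds streets containing "Main" anywhere.
contains_main = [n for (n,) in conn.execute(
    "select customer_name from customer "
    "where customer_street like '%Main%' order by customer_name")]

# Each _ matches exactly one character, so '____' finds four-character streets.
four_chars = [n for (n,) in conn.execute(
    "select customer_name from customer "
    "where customer_street like '____' order by customer_name")]

print(contains_main, four_chars)
```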
The order by clause causes the tuples in the result of a query to appear
in sorted order.
Example: List in alphabetic order the name, loan number, and loan
amount of all customers having a loan at the Katy branch.
select customer-name, borrower.loan-number, amount
from borrower, loan
where borrower.loan-number = loan.loan-number and branch-name = 'Katy'
order by customer-name
We may specify desc for descending order or asc for ascending order, for
each attribute; ascending order is the default.
E.g. order by customer-name desc
Set Operations
The set operations union, intersect, and except operate on relations and
correspond to the relational algebra operations ∪, ∩, and −.
Find all customers who have a loan, an account, or both:
(select customer-name from depositor)
union
(select customer-name from borrower)
Find all customers who have both a loan and an account.
(select customer-name from depositor)
intersect
(select customer-name from borrower)
Find all customers who have an account but no loan.
(select customer-name from depositor)
except
(select customer-name from borrower)
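The three queries above can be exercised with the sketch below. It assumes Python's sqlite3 module and invented sample rows; the helper combine is hypothetical. SQLite does not accept parentheses around the operands of a compound select, so they are omitted here, although the parenthesized form on the slide is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table depositor (customer_name text, account_number text)")
conn.execute("create table borrower (customer_name text, loan_number text)")
conn.executemany("insert into depositor values (?, ?)",
                 [("Hayes", "A-102"), ("Johnson", "A-101"), ("Jones", "A-217")])
conn.executemany("insert into borrower values (?, ?)",
                 [("Jones", "L-17"), ("Smith", "L-23"), ("Williams", "L-15")])

def combine(op):
    """Apply a set operator to the customer names of depositor and borrower."""
    sql = (f"select customer_name from depositor {op} "
           "select customer_name from borrower order by customer_name")
    return [name for (name,) in conn.execute(sql)]

loan_or_account = combine("union")       # a loan, an account, or both
loan_and_account = combine("intersect")  # both a loan and an account
account_no_loan = combine("except")      # an account but no loan
print(loan_or_account, loan_and_account, account_no_loan)
```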
Aggregate Functions
Aggregate Functions (Cont.)
Aggregate Functions – Group By
Group by clause helps to group a relation based on a set of attributes of
that relation.
Find the number of depositors for each branch.
Aggregate Functions – Having Clause
HAVING clause: This clause is used together with the GROUP BY clause. Its predicate is applied to
each group of results, or to the entire result treated as a single group when no GROUP BY is given.
It is similar to the WHERE clause, but it filters groups rather than individual tuples, so its
predicate may use aggregate functions.
Find the names and average balance of all branches where the average account balance is
more than $1,200.
Note: predicates in the having clause are applied after the formation of groups whereas
predicates in the where clause are applied before forming groups
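The two queries described for group by and having might be written as in the sketch below. This is one plausible formulation, not necessarily the one shown on the original slides. It assumes Python's sqlite3 module, an account relation with attributes (account_number, branch_name, balance), underscores in place of hyphens, and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number text, branch_name text, balance integer)")
conn.execute("create table depositor (customer_name text, account_number text)")
conn.executemany("insert into account values (?, ?, ?)", [
    ("A-101", "Downtown", 500), ("A-110", "Downtown", 2500),
    ("A-102", "Perryridge", 400), ("A-201", "Perryridge", 900),
    ("A-217", "Brighton", 1300),
])
conn.executemany("insert into depositor values (?, ?)", [
    ("Johnson", "A-101"), ("Johnson", "A-110"), ("Smith", "A-110"),
    ("Hayes", "A-102"), ("Turner", "A-201"), ("Jones", "A-217"),
])

# Number of depositors per branch; distinct counts each customer once per branch.
depositors_per_branch = conn.execute("""
    select branch_name, count(distinct customer_name)
    from depositor, account
    where depositor.account_number = account.account_number
    group by branch_name
    order by branch_name
""").fetchall()

# Branches whose average balance exceeds 1200; having filters the groups.
rich_branches = conn.execute("""
    select branch_name, avg(balance)
    from account
    group by branch_name
    having avg(balance) > 1200
    order by branch_name
""").fetchall()

print(depositors_per_branch)
print(rich_branches)
```

Johnson holds two Downtown accounts but is counted once there, and Perryridge (average 650) is dropped by the having predicate.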
Null values
Nested Subqueries
Some queries require that existing values in the database be fetched
and then used in a comparison condition.
Such queries can be conveniently formulated by using nested queries.
SQL provides a mechanism for the nesting of subqueries.
A subquery is a select-from-where expression that is nested within
another query.
A common use of subqueries is to perform tests for set membership,
set comparisons, and set cardinality.
Example of Nested Query
Find all customers who have both an account and a loan at the bank.
select distinct customer-name
from borrower
where customer-name in (select customer-name
from depositor)
Find all customers who have a loan at the bank but do not have an
account at the bank
select distinct customer-name
from borrower
where customer-name not in ( select customer-name
from depositor)
Modification of the Database – Deletion
SQL provides statements to delete, insert, and update tuples in the database.
Example Query
Delete the records of all accounts with balances below the average at the
bank.
delete from account
where balance < (select avg (balance)
from account)
Problem: as we delete tuples from account, the average balance changes
Solution used in SQL:
1. First, compute avg balance and find all tuples to delete
2. Next, delete all tuples found above (without recomputing avg or retesting the
tuples)
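The two-step behaviour can be observed with a small sketch, assuming Python's sqlite3 module and three invented accounts. The balances are chosen so that recomputing the average after each deletion would remove a different set of tuples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number text, balance integer)")
conn.executemany("insert into account values (?, ?)",
                 [("A-1", 100), ("A-2", 500), ("A-3", 600)])

avg_before = conn.execute("select avg(balance) from account").fetchone()[0]

# The average (400) is computed once, before any tuple is removed.
conn.execute("""
    delete from account
    where balance < (select avg(balance) from account)
""")

remaining = [b for (b,) in conn.execute("select balance from account order by balance")]
print(avg_before, remaining)
```

Only the 100 balance is below the original average of 400. If the average were recomputed after that deletion it would rise to 550 and the 500 balance would be removed as well, which does not happen.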
Modification of the Database – Insertion
Data can be inserted either by specifying a tuple to be inserted or
by writing a query whose result is a set of tuples.
Insert an account A-9732 at the Perryridge branch that has a
balance of $1200.
insert into account
values ('A-9732', 'Perryridge', 1200)
The values are specified in the order in which the corresponding attributes
appear in the relation schema.
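Modification of the Database – Updates
Increase all accounts with balances over $10,000 by 6%; all other accounts
receive 5%.
Write two update statements:
update account
set balance = balance * 1.06
where balance > 10000
update account
set balance = balance * 1.05
where balance <= 10000
The order of the two statements is important: if the 5% update ran first, an account
just under $10,000 could cross the threshold and then receive the 6% raise as well.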
Same query as before: Increase all accounts with balances over $10,000
by 6%, all other accounts receive 5%.
update account
set balance = case
when balance <= 10000 then balance*1.05
else balance * 1.06
end
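The sketch below compares the two-statement approach in both orders with the single case statement. It assumes Python's sqlite3 module and two invented accounts; fresh_accounts, balances, and run are hypothetical helpers, and balances are rounded to two decimals to sidestep floating-point noise.

```python
import sqlite3

def fresh_accounts():
    """Create a new in-memory database holding two sample accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table account (account_number text, balance real)")
    conn.executemany("insert into account values (?, ?)",
                     [("A-1", 9800), ("A-2", 20000)])
    return conn

def balances(conn):
    rows = conn.execute("select balance from account order by account_number")
    return [round(b, 2) for (b,) in rows]

raise_high = "update account set balance = balance * 1.06 where balance > 10000"
raise_low = "update account set balance = balance * 1.05 where balance <= 10000"
raise_case = """update account set balance = case
                    when balance <= 10000 then balance * 1.05
                    else balance * 1.06 end"""

def run(statements):
    """Apply the statements in order to a fresh database and return the balances."""
    conn = fresh_accounts()
    for statement in statements:
        conn.execute(statement)
    return balances(conn)

high_first = run([raise_high, raise_low])  # 6% update, then 5% update
low_first = run([raise_low, raise_high])   # reversed order: A-1 is raised twice
with_case = run([raise_case])              # one pass, no ordering issue
print(high_first, low_first, with_case)
```

Running the 5% update first pushes the 9800 balance to 10290, so the 6% update then applies to it too. The case form evaluates each tuple once against its original balance.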
Joined Relations
Join operations take two relations as input and return as a result a new
relation.
These additional operations are typically used as subquery expressions in
the from clause
Join condition: defines which tuples in the two relations match, and what
attributes are present in the result of the join.
Join type: defines how tuples in each relation that do not match any
tuple in the other relation (based on the join condition) are treated.
Relation loan
loan-number   branch-name   amount
L-170         Downtown      3000
L-230         Redwood       4000
L-260         Perryridge    1700
Relation borrower
customer-name   loan-number
Jones           L-170
Smith           L-230
Hayes           L-155
Note: borrower information is missing for L-260 and loan
information is missing for L-155
Joined Relations – Inner Join Examples
select *
from loan inner join borrower on
loan.loan-number = borrower.loan-number
Result attributes: loan-number, branch-name, amount, customer-name, loan-number
select *
from loan natural inner join borrower
(A natural join takes no on clause; the common attribute loan-number appears only once.)
Result attributes: loan-number, branch-name, amount, customer-name
Find all customers who have either an account or a loan (but not both) at the bank.
select customer-name
from (depositor natural full outer join borrower)
where account-number is null or loan-number is null
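To see how the join type treats unmatched tuples, the sketch below loads the loan and borrower relations shown earlier and compares an inner join with a left outer join. It assumes Python's sqlite3 module and underscores in attribute names. A left outer join is used instead of a full outer join because older SQLite releases do not support full outer joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, branch_name text, amount integer)")
conn.execute("create table borrower (customer_name text, loan_number text)")
conn.executemany("insert into loan values (?, ?, ?)", [
    ("L-170", "Downtown", 3000), ("L-230", "Redwood", 4000), ("L-260", "Perryridge", 1700)])
conn.executemany("insert into borrower values (?, ?)", [
    ("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")])

# Inner join: only loans that have a matching borrower tuple.
inner = conn.execute("""
    select loan.loan_number, customer_name
    from loan inner join borrower on loan.loan_number = borrower.loan_number
    order by loan.loan_number
""").fetchall()

# Left outer join: every loan is kept; missing borrower attributes become null.
left_outer = conn.execute("""
    select loan.loan_number, customer_name
    from loan left outer join borrower on loan.loan_number = borrower.loan_number
    order by loan.loan_number
""").fetchall()

print(inner)
print(left_outer)
```

Loan L-260 has no borrower, so it is absent from the inner join and appears with a null (None in Python) customer name in the left outer join.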
Embedded SQL
The SQL standard defines embeddings of SQL in a variety of
programming languages such as Pascal, PL/I, Fortran, C, C++,
and Java.
A language in which SQL queries are embedded is referred to as a host
language, and the SQL structures permitted in the host language
comprise embedded SQL.
The EXEC SQL statement is used to identify an embedded SQL request to the
preprocessor:
EXEC SQL <embedded SQL statement > END-EXEC
Note: this varies by language. E.g. the Java embedding uses
#sql { <SQL statement> };
All query processing is performed by the database system.
Embedded SQL (Cont...)
The open statement causes the query to be evaluated
EXEC SQL open c END-EXEC
The fetch statement causes the values of one tuple in the query result to
be placed in host-language variables.
EXEC SQL fetch c into :cn, :cc END-EXEC
Repeated calls to fetch get successive tuples in the query result.
A variable called SQLSTATE in the SQL communication area gets set to
'02000' to indicate that no more data is available
The close statement causes the database system to delete the temporary
relation that holds the result of the query.
EXEC SQL close c END-EXEC
Note: above details vary with language. E.g. the Java embedding defines Java
iterators to step through result tuples.
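The open / fetch / close cycle has a close analogue in call-level database APIs. The sketch below uses Python's sqlite3 cursor as a stand-in; it is not embedded SQL and involves no preprocessor. The customer relation and its rows are invented, and the names cn and cc simply mirror the host variables :cn and :cc above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table customer (customer_name text, customer_city text)")
conn.executemany("insert into customer values (?, ?)",
                 [("Hayes", "Harrison"), ("Jones", "Harrison"), ("Smith", "Rye")])

c = conn.cursor()
# Counterpart of "open c": the query is evaluated.
c.execute("select customer_name, customer_city from customer order by customer_name")

fetched = []
while True:
    row = c.fetchone()   # counterpart of "fetch c into :cn, :cc"
    if row is None:      # no more data, analogous to SQLSTATE '02000'
        break
    cn, cc = row
    fetched.append((cn, cc))

c.close()                # counterpart of "close c"
print(fetched)
```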
Dynamic SQL
Allows programs to construct and submit SQL queries at run time.
Example of the use of dynamic SQL from within a C program.
char *sqlprog = "update account "
                "set balance = balance * 1.05 "
                "where account-number = ?";
The dynamic SQL program contains a ?, which is a placeholder for a value
that is provided when the SQL program is executed.
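The same placeholder idea can be sketched with Python's sqlite3 module, which also uses ? for values bound at execution time. This is an analogue of the C example rather than a translation of it; the account rows are invented and account_number uses an underscore in place of the hyphen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number text primary key, balance real)")
conn.executemany("insert into account values (?, ?)",
                 [("A-101", 1000), ("A-102", 2000)])

# The statement text is built at run time; ? marks a value supplied on execution.
sqlprog = ("update account "
           "set balance = balance * 1.05 "
           "where account_number = ?")

# The account number is bound to the placeholder when the statement runs.
conn.execute(sqlprog, ("A-101",))

balances = dict(conn.execute("select account_number, balance from account"))
print(balances)
```

Binding the value through the placeholder, rather than pasting it into the statement string, also keeps user-supplied input from being interpreted as SQL.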
Query Processing
Query Processing (Cont.)
Alternative ways of evaluating a given query
Equivalent expressions
Different algorithms for each operation
Cost difference between a good and a bad way of evaluating a query can
be enormous
Need to estimate the cost of operations
Depends critically on statistical information about relations which the database must
maintain
Need to estimate statistics for intermediate results to compute cost of complex
expressions
Application Architectures
Application programs generally access databases through one of:
Language extensions to allow embedded SQL
An application program interface (e.g., ODBC or JDBC), which allows SQL queries
to be sent to a database
Applications can be built using one of three architectures:
Application Architectures
Two Tier Architecture
Two-tier Model
E.g. Java code runs at the client site and uses JDBC to communicate
with the backend database server for data manipulation.
Benefits:
Flexible, easy to design and implement
Problems:
Security: passwords are available at the client site, and all database operations
are possible
More code shipped to client
Not appropriate across organizations, or in large ones like
universities
Three Tier Model
[Figure: three-tier architecture; surviving labels: CGI program, network]
Examples
Schema:
person(driver-id, name, address)
car(license, model, year)
accident(report-number, date, location)
owns(driver-id, license)
participated(driver-id, license, report-number, damage-amnt)
Examples
Answer:
a. Find the total number of people who owned cars that were involved in accidents in
2016.
Note: this is not the same as the total number of accidents in 2016. We must count
people with several accidents only once.
Examples
c. Add a new accident to the database; assume any values for required attributes.
We assume the driver was "Jones," although it could be someone else. Also, we assume
"Jones" owns one Toyota. We assume the values 'Berkeley' for location, '2016-09-01' for
date, 4007 for report-number, and 3000 for damage-amnt.
First we must find the license of the given car. Then the participated and accident
relations must be updated in order to both record the accident and tie it to the given car.
insert into accident
values (4007, '2016-09-01', 'Berkeley')
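Parts (a) and (c) can be worked through end to end with the sketch below. It is one possible answer, not necessarily the one on the original slides. It assumes Python's sqlite3 module, underscores in place of hyphens, dates stored as ISO-format text, and invented sample people, cars, and accidents. For (a) it counts the distinct drivers recorded in participated for accidents dated in 2016.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table person (driver_id text primary key, name text, address text);
    create table car (license text primary key, model text, year integer);
    create table accident (report_number integer primary key, date text, location text);
    create table owns (driver_id text, license text);
    create table participated (driver_id text, license text,
                               report_number integer, damage_amnt integer);
    insert into person values ('D1', 'Jones', 'Main St'), ('D2', 'Smith', 'Oak St');
    insert into car values ('ABC123', 'Toyota', 2014), ('XYZ789', 'Mazda', 2012);
    insert into owns values ('D1', 'ABC123'), ('D2', 'XYZ789');
    insert into accident values (4001, '2016-03-05', 'Oakland'),
                                (4002, '2016-06-11', 'Albany'),
                                (3990, '2015-12-30', 'Oakland');
    insert into participated values ('D2', 'XYZ789', 4001, 500),
                                    ('D2', 'XYZ789', 4002, 800),
                                    ('D1', 'ABC123', 3990, 200);
""")

# (a) distinct counts a person with several 2016 accidents only once.
count_2016 = """
    select count(distinct pa.driver_id)
    from accident a, participated pa
    where a.report_number = pa.report_number
      and a.date between '2016-01-01' and '2016-12-31'
"""
count_before = conn.execute(count_2016).fetchone()[0]

# (c) record the accident, then tie it to Jones's Toyota via owns and car.
conn.execute("insert into accident values (4007, '2016-09-01', 'Berkeley')")
conn.execute("""
    insert into participated
    select o.driver_id, c.license, 4007, 3000
    from person p, owns o, car c
    where p.name = 'Jones' and p.driver_id = o.driver_id
      and o.license = c.license and c.model = 'Toyota'
""")

count_after = conn.execute(count_2016).fetchone()[0]
new_row = conn.execute(
    "select * from participated where report_number = 4007").fetchall()
print(count_before, count_after, new_row)
```

Smith has two accidents in 2016 but is counted once, so the count starts at 1 and becomes 2 after Jones's accident is added.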