
Chap_5


CSCI 5333: DBMS

Chap 3: Basic Structured Query Language (SQL)
Lecture Contents:
• Data Definition Language (DDL)
• Create new tables
• Applying constraints on table
• Data types
• Alter or drop tables
• Data Manipulation Language (DML)
• Basic SQL Structure
• Rename, String, Ordering Operations
• Aggregate Functions
• Nested Sub Queries
• Modification of the Database
• Joined Relations
SQL Background

◼ SQL has long been the standard for commercial relational DBMSs
◼ A joint effort by:
◼ ANSI (the American National Standards Institute) and
◼ ISO (the International Organization for Standardization)
◼ SQL is a comprehensive database language which contains
statements for data definition, query, and update.
◼ Hence, SQL is both DDL and DML
◼ It also has facility for specifying security and authorization, for defining
integrity constraints, and for specifying transaction controls.
◼ It also has rules for embedding SQL statements into a general-purpose
programming language such as C or Pascal.
Data Definition Language (DDL)
❑ The SQL DDL allows the specification of not only a set of relations but
also information about each relation, including:
 The schema for each relation.
 The domain of values associated with each attribute.
 Integrity constraints
 Also other information such as:
 The set of indices (provides fast access to data items) to be
maintained for each relations.
 Security and authorization information for each relation.
 The physical storage structure of each relation on disk.

Create Table Construct
 An SQL relation is defined using the create table command:
create table r (A1 D1, A2 D2, ..., An Dn,
(integrity-constraint1),
...,
(integrity-constraintk))
 r is the name of the relation
 each Ai is an attribute name in the schema of relation r
 Di is the data type of values in the domain of attribute Ai

 Example:
create table branch
(branch-name char(15) not null,
branch-city varchar(30),
assets integer)
Integrity Constraints in Create Table
 not null
 primary key (A1, ..., An)
 foreign key (Am, ..., An) references r
 check (P), where P is a predicate
Example: Declare branch-name as the primary key for branch
and ensure that the values of assets are non-negative.
create table branch
(branch-name char(15) not null,
branch-city char(30),
assets integer,
primary key (branch-name),
check (assets >= 0))
◼ A primary key declaration on an attribute automatically ensures not null in SQL-92
onwards.
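As a minimal sketch of how these constraints behave, the snippet below runs the branch definition in Python's built-in sqlite3 module. SQLite is only a stand-in DBMS here (an assumption; these notes do not prescribe a system). Attribute names use underscores because SQLite, like most SQL systems, does not accept unquoted hyphens in identifiers. The city names and asset values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table branch (
        branch_name char(15) not null,
        branch_city char(30),
        assets      integer,
        primary key (branch_name),
        check (assets >= 0)
    )
""")
conn.execute("insert into branch values ('Katy', 'Houston', 500000)")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("insert into branch values (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Hypothetical rows: the first repeats the key, the second has negative assets.
duplicate_key_ok = try_insert(("Katy", "Dallas", 100))
negative_assets_ok = try_insert(("Downtown", "Austin", -1))
```

Both inserts are rejected, so branch still holds only the original Katy tuple.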
Domain Types in SQL
 char(n). Fixed length character string, with user-specified length n.
 varchar(n). Variable length character strings, with user-specified maximum length n.
 int. Integer (a finite subset of the integers that is machine-dependent).
 smallint. Small integer (a machine-dependent subset of the integer domain type).
 numeric(p,d). Fixed point number, with user-specified precision of p digits, with d digits
to the right of decimal point. numeric(3,1) → 44.5
 real, double precision. Floating point and double-precision floating point numbers, with
machine-dependent precision.
 float(n). Floating point number, with user-specified precision of at least n digits.
 Null values are allowed in all the domain types. Declaring an attribute to be not null
prohibits null values for that attribute.
 create domain construct in SQL-92 creates user-defined domain types
create domain person-name char(20) not null

Date/Time Types in SQL (Cont.)
 date. Dates, containing a (4-digit) year, month, and day
 E.g. date '2016-07-27'
 time. Time of day, in hours, minutes, and seconds.
 E.g. time '09:00:30' time '09:00:30.75'
 timestamp. Date plus time of day
 E.g. timestamp '2016-07-27 09:00:30.75'
 interval. Period of time
 E.g. interval '1' day
 Subtracting a date/time/timestamp value from another gives an interval value
 Interval values can be added to date/time/timestamp values
 Can extract values of individual fields from date/time/timestamp
 E.g. extract (year from r.starttime)
 Can cast string types to date/time/timestamp
 E.g. cast (<string-valued-expression> as date)
SQL Data Definition for Part of the Bank Database
branch= (branch-name, branch-city, assets)

customer= (customer-name, customer-street, cust-city)

account=(account-number, branch-name, balance)

loan= (loan-number, branch-name, amount)

depositor=(customer-name, account-number)

borrower=(customer-name, loan-number)

And a Few More Relation Definitions
 create table student (
ID varchar(5),
name varchar(20) not null,
dept_name varchar(20),
tot_cred numeric(3,0),
primary key (ID),
foreign key (dept_name) references department);

 create table takes (
ID varchar(5),
course_id varchar(8),
sec_id varchar(8),
semester varchar(6),
year numeric(4,0),
grade varchar(2),
primary key (ID, course_id, sec_id, semester, year),
foreign key (ID) references student,
foreign key (course_id, sec_id, semester, year) references section);
Note: sec_id can be dropped from the primary key above; doing so ensures that a student
cannot be registered for two sections of the same course in the same semester.
Drop and Alter Table Constructs
 The drop table command deletes all information about the dropped
relation from the database. Ex: drop table student
 The alter table command is used to add attributes to an existing relation.
All tuples in the relation are assigned null as the value for the new
attribute. The form of the alter table command is
alter table r add A D
where A is the name of the attribute to be added to relation r and D is
the domain of A.
 The alter table command can also be used to drop attributes of a relation
alter table r drop A
where A is the name of an attribute of relation r
 Dropping of attributes is not supported by many databases
Data Manipulation Language (DML)
Basic SQL Structure
 The basic structure of an SQL expression consists of three clauses:
select, from, and where
 A typical SQL query has the form:
select A1, A2, …, An <attribute list>
from r1, r2, …, rm <table list>
where P <condition>
➢ Each Ai represents an attribute
➢ Each ri represents a relation
➢ P is a predicate
➢ Equivalent relational algebra expression:
Π_{A1, A2, …, An} (σ_P (r1 × r2 × … × rm))
• The result of an SQL query is a relation without a name
Banking Enterprise Schema

The following relation schemas will be used for queries:

Account = (account-number, branch-name, balance)


Branch = (branch-name, branch-city, assets)
Customer = (customer-name, customer-street, cust-city)
Loan = (loan-number, branch-name, amount)
Borrower = (customer-name, loan-number)
Depositor = (customer-name, account-number)

The Select Clause
 The select clause is used to list the attributes desired in the result of a
query.
 The from clause lists the relations to be scanned in the evaluation of the
expression.
 Example: Find the name of all branches in the loan relation
select branch-name
from loan
 The result is a relation consisting of a single attribute with the heading
branch-name
 An asterisk (*) in the select clause denotes “all attributes”
select *
from loan
 The result is a relation consisting of all attributes of loan.
The Select clause
 SQL allows duplicates in relations as well as in query results
 To force the elimination of duplicates, insert the keyword distinct after
select.
 Example: Find the names of all branches in the loan relation, and
remove duplicates
select distinct branch-name
from loan
 The keyword all specifies that duplicates are not removed
select all branch-name
from loan
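
The difference between all and distinct can be checked with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS, underscores instead of hyphens in attribute names, and three made-up loan tuples, two of them at the same branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, branch_name text, amount integer)")
conn.executemany(
    "insert into loan values (?, ?, ?)",
    [("L-170", "Downtown", 3000),
     ("L-230", "Redwood", 4000),
     ("L-260", "Downtown", 1700)],   # second loan at Downtown (hypothetical)
)

# all (the default) keeps one output row per loan tuple.
all_rows = conn.execute("select all branch_name from loan").fetchall()

# distinct collapses repeated branch names into one row each.
distinct_rows = conn.execute(
    "select distinct branch_name from loan order by branch_name"
).fetchall()
```

Here all_rows has three entries while distinct_rows has two, one per branch.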

The Select clause

 The select clause can contain arithmetic expressions involving the
operations +, -, *, and / operating on constants or attributes of tuples.
 Example:
select loan-number, branch-name, amount*100
from loan
 Result of the query is a relation which is the same as the loan relation,
except that the attribute amount is multiplied by 100.
 NOTE: SQL names are case insensitive (i.e., you may use upper- or
lower-case letters.)
 E.g., Name ≡ NAME ≡ name
 Some people use upper case wherever we use bold font.
The Where clause

 The where clause consists of a predicate (condition) involving
attributes of the relations that appear in the from clause.
 Example: Find the loan numbers of those loans with amounts between
$90,000 and $100,000 (that is, ≥ $90,000 and ≤ $100,000)

select loan-number
from loan
where amount between 90000 and 100000

The Where clause

 The where clause allows tuple comparison

select name, course_id
from instructor, teaches
where (instructor.ID, dept_name) = (teaches.ID, 'Biology');
The from Clause
 The from clause lists the relations involved in the query
 Corresponds to the Cartesian product operation of the relational algebra.
 Find the Cartesian product instructor × teaches
select *
from instructor, teaches
 generates every possible instructor – teaches pair, with all attributes from both
relations.
 For common attributes (e.g., ID), the attributes in the resulting table are
renamed using the relation name (e.g., instructor.ID)
 The Cartesian product is not very useful directly, but it is useful combined with a where-clause
condition (selection operation in relational algebra).
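
A short sketch makes the two steps concrete: the bare from clause pairs every instructor tuple with every teaches tuple, and the where clause then keeps only the matching pairs. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the two instructors and three course assignments are illustrative sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text, name text, dept_name text)")
conn.execute("create table teaches (ID text, course_id text)")
conn.executemany("insert into instructor values (?, ?, ?)",
                 [("10101", "Srinivasan", "Comp. Sci."),
                  ("76766", "Crick", "Biology")])
conn.executemany("insert into teaches values (?, ?)",
                 [("10101", "CS-101"), ("76766", "BIO-101"), ("76766", "BIO-301")])

# Cartesian product: 2 instructor tuples x 3 teaches tuples.
product = conn.execute("select * from instructor, teaches").fetchall()

# Adding a where-clause condition keeps only pairs that agree on ID.
matched = conn.execute("""
    select name, course_id
    from instructor, teaches
    where instructor.ID = teaches.ID
    order by course_id
""").fetchall()
```

The product has six rows; the filtered query returns the three instructor and course pairs that actually belong together.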

Cartesian Product
[Figure: instructor and teaches relations]

Example involving multiple tables

 For all customers who have a loan from the bank, find their names, loan
numbers, and loan amount.

select customer-name, borrower.loan-number, amount
from borrower, loan
where borrower.loan-number = loan.loan-number

 Find the name, loan number, and loan amount of all customers having a
loan at the Katy branch

select customer-name, borrower.loan-number, amount
from borrower, loan
where borrower.loan-number = loan.loan-number and branch-name = 'Katy'
The Rename Operation
 SQL allows renaming relations and attributes using the as clause:
old-name as new-name
 The as clause can appear in both select and from clauses
 Consider the following example:
select customer-name, borrower.loan-number, amount
from borrower, loan
where borrower.loan-number = loan.loan-number
 The result of the query is a relation with the following attributes:
customer-name, loan-number, amount
 Now if we want the attribute name loan-number to be replaced with loan-id, we can
rewrite the query as:
select customer-name, borrower.loan-number as loan-id, amount
from borrower, loan
where borrower.loan-number = loan.loan-number
The Rename Operation – Declaring Tuple Variables

 Tuple variables are defined in the from clause via the use of the as
clause.
 Find the customer names, loan numbers, and loan amounts for all customers
having a loan at any branch.
select customer-name, T.loan-number, S.amount
from borrower as T, loan as S
where T.loan-number = S.loan-number

 Find the names of all branches that have greater assets than some
branch located in Brooklyn.
select distinct T.branch-name
from branch as T, branch as S
where T.assets > S.assets and S.branch-city = 'Brooklyn'
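
The Brooklyn query compares the branch relation with itself, which is only possible because T and S name two separate copies. The sketch below runs it on made-up branch data, assuming Python's built-in sqlite3 module as a stand-in DBMS and underscores instead of hyphens in attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table branch (branch_name text, branch_city text, assets integer)")
conn.executemany("insert into branch values (?, ?, ?)", [
    ("Brighton",   "Brooklyn",  7100000),   # sample values, not from the notes
    ("Downtown",   "Brooklyn",  9000000),
    ("Perryridge", "Horseneck", 1700000),
    ("Redwood",    "Palo Alto", 2100000),
    ("Round Hill", "Horseneck", 8000000),
])

# T ranges over all branches, S over the Brooklyn branches; a branch qualifies
# if it beats at least one Brooklyn branch.
richer = conn.execute("""
    select distinct T.branch_name
    from branch as T, branch as S
    where T.assets > S.assets and S.branch_city = 'Brooklyn'
    order by T.branch_name
""").fetchall()
```

With this data the smallest Brooklyn branch has assets of 7,100,000, so only Downtown and Round Hill qualify.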
String Operations
 SQL specifies strings by enclosing them in single quotes, for example:
'Houston'
 A single quote character that is part of a string can be specified by two
single quote characters; e.g., "It's right" can be specified by 'It''s right'
 The most commonly used operation on strings is pattern matching using the
operator like
 Patterns are described using two special characters:
 percent (%). The % character matches any substring.
 underscore (_). The _ character matches any single character.

String Operations
 Find the names of all customers whose street includes the substring
“Main”.
select customer-name
from customer
where customer-street like '%Main%'

 SQL allows an escape character (e.g., \) for handling special pattern characters
(that is, % and _) in a pattern. The escape character is used immediately
before a special pattern character, and is declared with the escape keyword.

 Match the name "Main_"
like 'Main\_' escape '\' matches the string "Main_" exactly
like 'Main\_%' escape '\' matches all strings beginning with "Main_"
like 'ab\\cd%' escape '\' matches all strings beginning with "ab\cd"
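
The effect of escaping the underscore can be seen in a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the customer names and street values are hypothetical, chosen so that each pattern selects a different subset. Python raw strings (r"...") keep the backslash intact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table customer (customer_name text, customer_street text)")
conn.executemany("insert into customer values (?, ?)", [
    ("Adams", "Main"), ("Brooks", "Main_"), ("Curry", "MainX"),
    ("Glenn", "Main_Street"), ("Hayes", "North Main"),
])

def streets(pattern_clause):
    """Return the set of streets selected by the given like clause."""
    sql = ("select customer_street from customer where customer_street "
           + pattern_clause)
    return {row[0] for row in conn.execute(sql)}

unescaped = streets(r"like 'Main_'")               # _ is a one-character wildcard
exact = streets(r"like 'Main\_' escape '\'")       # _ is a literal underscore
prefix = streets(r"like 'Main\_%' escape '\'")     # literal _, then any suffix
```

Unescaped, the pattern also accepts "MainX"; escaped, it accepts only "Main_", and adding % extends the match to "Main_Street".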
Ordering the Display of Tuples

 The order by clause causes the tuples in the result of a query to appear
in sorted order.
 Example: List in alphabetic order the name, loan number, and loan
amount of all customers having a loan at the Katy branch.
select customer-name, borrower.loan-number, amount
from borrower, loan
where borrower.loan-number = loan.loan-number and branch-name like 'Katy'
order by customer-name
 We may specify desc for descending order or asc for ascending order, for
each attribute; ascending order is the default.
 E.g. order by customer-name desc

Set Operations
 The set operations union, intersect, and except operate on relations and
correspond to the relational algebra operations ∪, ∩, and −
 Find all customers who have a loan, an account, or both:
(select customer-name from depositor)
union
(select customer-name from borrower)
 Find all customers who have both a loan and an account.
(select customer-name from depositor)
intersect
(select customer-name from borrower)
 Find all customers who have an account but no loan.
(select customer-name from depositor)
except
(select customer-name from borrower)
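
The three set queries above can be tried on sample data with the sketch below. It assumes Python's built-in sqlite3 module as a stand-in DBMS and made-up customer names. SQLite does not accept parentheses around the operands of a set operation, so they are omitted here; the queries are otherwise the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table depositor (customer_name text, account_number text)")
conn.execute("create table borrower (customer_name text, loan_number text)")
conn.executemany("insert into depositor values (?, ?)",
                 [("Hayes", "A-102"), ("Johnson", "A-101"), ("Jones", "A-217")])
conn.executemany("insert into borrower values (?, ?)",
                 [("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")])

def names(set_op):
    """Run depositor <set_op> borrower and return the customer names as a set."""
    sql = (f"select customer_name from depositor {set_op} "
           "select customer_name from borrower")
    return {row[0] for row in conn.execute(sql)}

loan_or_account = names("union")
loan_and_account = names("intersect")
account_only = names("except")
```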
Aggregate Functions

 These functions operate on the multiset of values of a column of a
relation, and return a value.
 That is, these functions take a collection of values as input and return
a single value.
 SQL offers five built-in aggregate functions:
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values

Aggregate Functions (Cont.)

 Find the average account balance at the Katy branch.
select avg (balance) as avg_balance
from account
where branch-name like 'Katy'
 Find the number of tuples in the customer relation.
select count (*)
from customer
 Find the number of depositors in the bank.
select count (distinct customer-name)
from depositor

Aggregate Functions – Group By
 Group by clause helps to group a relation based on a set of attributes of
that relation.
 Find the number of depositors for each branch.

select branch-name, count (distinct customer-name)
from depositor, account
where depositor.account-number = account.account-number
group by branch-name

Note: Attributes in the select clause outside of aggregate functions must
appear in the group by list

Aggregate Functions – Having Clause
 HAVING clause: This clause is used in association with the GROUP BY clause. It is applied to
each group of results, or to the entire result as a single group when there is no GROUP BY clause.
 It is similar to the WHERE clause, but it filters groups after aggregation rather than
individual tuples, so its predicate may use aggregate functions.
 Find the names and average balances of all branches where the average account balance is
more than $1,200.

select branch-name, avg (balance) as avg_balance
from account
group by branch-name
having avg (balance) > 1200

Note: predicates in the having clause are applied after the formation of groups, whereas
predicates in the where clause are applied before forming groups
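
To see why the placement of the condition matters, the sketch below runs the same threshold once in having and once in where. It assumes Python's built-in sqlite3 module as a stand-in DBMS, underscores instead of hyphens in attribute names, and five hypothetical accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number text, branch_name text, balance integer)")
conn.executemany("insert into account values (?, ?, ?)", [
    ("A-1", "Katy", 1000), ("A-2", "Katy", 2000),
    ("A-3", "Downtown", 500), ("A-4", "Downtown", 700),
    ("A-5", "Redwood", 1300),
])

# having: groups are formed from all accounts, then groups are filtered.
having_rows = conn.execute("""
    select branch_name, avg(balance) as avg_balance
    from account
    group by branch_name
    having avg(balance) > 1200
    order by branch_name
""").fetchall()

# where: accounts are filtered first, then the survivors are grouped.
where_rows = conn.execute("""
    select branch_name, avg(balance) as avg_balance
    from account
    where balance > 1200
    group by branch_name
    order by branch_name
""").fetchall()
```

Katy appears in both results but with different averages: 1500 over both of its accounts with having, 2000 over the single surviving account with where.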

Null values

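 SQL allows the use of null values to indicate absence of information
about the value of an attribute
 The predicate is null tests for a null value. Example: find all loan numbers
in the loan relation with a null amount.
select loan-number
from loan
where amount is null
 The predicate is not null tests for the absence of a null value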
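
A common mistake is to test for null with =, which never evaluates to true. The sketch below contrasts the two forms, assuming Python's built-in sqlite3 module as a stand-in DBMS and two hypothetical loans, one with an unknown amount (Python's None maps to SQL null).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_number text, branch_name text, amount integer)")
conn.executemany("insert into loan values (?, ?, ?)",
                 [("L-11", "Katy", 900), ("L-12", "Katy", None)])

def loan_numbers(condition):
    """Return the loan numbers satisfying the given where-clause condition."""
    sql = "select loan_number from loan where " + condition
    return [row[0] for row in conn.execute(sql)]

is_null = loan_numbers("amount is null")
is_not_null = loan_numbers("amount is not null")
equals_null = loan_numbers("amount = null")   # comparison with null is unknown
```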

Nested Sub Queries
 Some queries require that existing values in the database be fetched
and then used in a comparison condition.
 Such queries can be conveniently formulated by using nested queries.
 SQL provides a mechanism for the nesting of subqueries.
 A subquery is a select-from-where expression that is nested within
another query.
 A common use of subqueries is to perform tests for set membership,
set comparisons, and set cardinality.

Example of Nested Query

 Find all customers who have both an account and a loan at the bank.
select distinct customer-name
from borrower
where customer-name in (select customer-name
from depositor)

 Find all customers who have a loan at the bank but do not have an
account at the bank
select distinct customer-name
from borrower
where customer-name not in ( select customer-name
from depositor)
Modification of the Database – Deletion
 SQL has the option to remove, insert, and change records of the database.

 Delete all account records at the Katy branch
delete from account
where branch-name like 'Katy'
 Delete all accounts at every branch located in the city of Houston.
delete from account
where branch-name in (select branch-name
from branch
where branch-city like 'Houston')

Example Query
 Delete the record of all accounts with balances below the average at the
bank.
delete from account
where balance < (select avg (balance)
from account)
 Problem: as we delete tuples from account, the average balance changes
 Solution used in SQL:
1. First, compute avg balance and find all tuples to delete
2. Next, delete all tuples found above (without recomputing avg or retesting the
tuples)
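
The two-step rule above can be observed with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS and three hypothetical balances chosen so that recomputing the average after each deletion would give a different answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number text, balance integer)")
conn.executemany("insert into account values (?, ?)",
                 [("A-1", 100), ("A-2", 500), ("A-3", 900)])

# The average is 500 when the statement starts, so only A-1 is below it.
# If the average were recomputed after A-1 is removed it would be 700,
# and A-2 would be deleted as well.
conn.execute("""
    delete from account
    where balance < (select avg(balance) from account)
""")

remaining = [row[0] for row in conn.execute(
    "select balance from account order by balance")]
```

Only the 100 balance is removed; 500 and 900 remain.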

Modification of the Database – Insertion
 Data can be inserted either by specifying a tuple to be inserted or
by writing a query whose result is a set of tuples.
 Insert an account A-9732 at the Perryridge branch that has a
balance of $1200.
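insert into account
values ('A-9732', 'Perryridge', 1200)
 The values are specified in the order in which the attributes appear in the
relation schema. A statement that lists the values in a different order, without
naming the attributes, does not insert the intended tuple:
insert into account
values ('Perryridge', 1200, 'A-9732')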
Modification of the Database – Insertion
 For the benefit of users, SQL allows the attributes to be specified as
part of the insert statement.
insert into account (branch-name, balance, account-number)
values ('Perryridge', 1200, 'A-9732')
or equivalently,
insert into account (account-number, branch-name, balance)
values ('A-9732', 'Perryridge', 1200)

 It is possible to assign null values to attributes in an insert statement.
 Example: Add a new tuple to account with balance set to null
insert into account
values ('A-777', 'Perryridge', null)
Modification of the Database – Insertion
 It’s also possible to insert data on the basis of the result of a query.
 Present a new $200 savings account as a gift to all loan customers of the
Perryridge branch. (Consider the loan number as the account number for the
savings account.)

insert into account
(select loan-number, branch-name, 200
from loan
where branch-name like 'Perryridge')

Modification of the Database – Updates
 Increase all accounts with balances over $10,000 by 6%; all other accounts
receive 5%.
 Write two update statements:

update account
set balance = balance * 1.06
where balance > 10000

update account
set balance = balance * 1.05
where balance <= 10000

 The order is important: if the 5% update ran first, an account just under $10,000
could rise above $10,000 and then also receive the 6% increase.
 Can be done better using the case statement
Case Statement for Conditional Updates

 Same query as before: Increase all accounts with balances over $10,000
by 6%, all other accounts receive 5%.

update account
set balance = case
when balance <= 10000 then balance*1.05
else balance * 1.06
end
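
The following sketch compares the case form with two separate updates run in the wrong order. It assumes Python's built-in sqlite3 module as a stand-in DBMS and two hypothetical accounts, one of which sits just below the $10,000 boundary.

```python
import sqlite3

def fresh_accounts():
    """Create an in-memory account table with two hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table account (account_number text primary key, balance real)")
    conn.executemany("insert into account values (?, ?)",
                     [("A-1", 9600.0), ("A-2", 20000.0)])
    return conn

def balances(conn):
    """Return {account_number: balance rounded to cents}."""
    rows = conn.execute("select account_number, balance from account")
    return {acct: round(bal, 2) for acct, bal in rows}

# Single pass with case: each account is updated exactly once.
c1 = fresh_accounts()
c1.execute("""
    update account
    set balance = case
                    when balance <= 10000 then balance * 1.05
                    else balance * 1.06
                  end
""")
with_case = balances(c1)

# Two statements, 5% first: A-1 crosses 10000 and is then raised again by 6%.
c2 = fresh_accounts()
c2.execute("update account set balance = balance * 1.05 where balance <= 10000")
c2.execute("update account set balance = balance * 1.06 where balance > 10000")
wrong_order = balances(c2)
```

With case, A-1 ends at 10,080; with the two updates in the wrong order it ends at about 10,684.80 because it receives both increases.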

Joined Relations
 Join operations take two relations as input and return as a result a new
relation.
 These additional operations are typically used as subquery expressions in
the from clause
 Join condition: defines which tuples in the two relations match, and what
attributes are present in the result of the join.
 Join type: defines how tuples in each relation that do not match any
tuple in the other relation (based on the join condition) are treated.

Join types: inner join, left outer join, right outer join, full outer join
Join conditions: natural, on <predicate>, using (A1, A2, ..., An)
Joined Relations – Datasets for Examples

 Relation loan
loan-number branch-name amount
L-170 Downtown 3000
L-230 Redwood 4000
L-260 Perryridge 1700
• Relation borrower
customer-name loan-number

Jones L-170
Smith L-230
Hayes L-155
• Note: borrower information is missing for L-260 and loan
information is missing for L-155
Joined Relations – Inner Join Examples
Select *
From loan inner join borrower on
loan.loan-number = borrower.loan-number
loan-number branch-name amount customer-name loan-number
L-170 Downtown 3000 Jones L-170
L-230 Redwood 4000 Smith L-230

Select *
From loan natural inner join borrower
loan-number branch-name amount customer-name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
Joined Relations – Outer Join Examples
◼ loan left outer join borrower on
loan.loan-number = borrower.loan-number
loan-number branch-name amount customer-name loan-number
L-170 Downtown 3000 Jones L-170
L-230 Redwood 4000 Smith L-230
L-260 Perryridge 1700 null null

◼ loan natural right outer join borrower
loan-number branch-name amount customer-name
L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-155 null null Hayes
Joined Relations – Outer Join Examples
 loan full outer join borrower using (loan-number)

loan-number branch-name amount customer-name


L-170 Downtown 3000 Jones
L-230 Redwood 4000 Smith
L-260 Perryridge 1700 null
L-155 null null Hayes

 Find all customers who have either an account or a loan (but not both) at the bank.

select customer-name
from (depositor natural full outer join borrower)
where account-number is null or loan-number is null
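
The outer join results on the loan and borrower datasets can be reproduced with the sketch below, assuming Python's built-in sqlite3 module as a stand-in DBMS and underscores instead of hyphens in attribute names. Because older SQLite builds lack full outer join, the full join is emulated here as a left outer join plus the unmatched borrower tuples; that emulation is a choice made for this example, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table loan (loan_number text, branch_name text, amount integer);
    create table borrower (customer_name text, loan_number text);
    insert into loan values ('L-170', 'Downtown', 3000),
                            ('L-230', 'Redwood', 4000),
                            ('L-260', 'Perryridge', 1700);
    insert into borrower values ('Jones', 'L-170'),
                                ('Smith', 'L-230'),
                                ('Hayes', 'L-155');
""")

# Left outer join: every loan is kept; a missing borrower shows up as null.
left_join = conn.execute("""
    select loan_number, branch_name, amount, customer_name
    from loan left outer join borrower using (loan_number)
    order by loan_number
""").fetchall()

# Emulated full outer join: add borrower tuples that match no loan.
full_join = conn.execute("""
    select loan_number, branch_name, amount, customer_name
    from loan left outer join borrower using (loan_number)
    union all
    select loan_number, null, null, customer_name
    from borrower
    where loan_number not in (select loan_number from loan)
    order by 1
""").fetchall()
```

SQL null values come back as Python None, so L-260 has no customer name and L-155 has no branch name or amount.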

Embedded SQL
 The SQL standard defines embeddings of SQL in a variety of
programming languages such as Pascal, PL/I, Fortran, C, C++, Python,
and Java.
 A language to which SQL queries are embedded is referred to as a host
language, and the SQL structures permitted in the host language
comprise embedded SQL.
 The EXEC SQL statement is used to identify an embedded SQL request to the
preprocessor
EXEC SQL <embedded SQL statement> END-EXEC
Note: this varies by language. E.g. the Java embedding uses
#sql { <SQL statement> };
 All query processing is performed by the database system.

Embedded SQL (Cont...)
 The open statement causes the query to be evaluated
EXEC SQL open c END-EXEC
 The fetch statement causes the values of one tuple in the query result to
be placed on host language variables.
EXEC SQL fetch c into :cn, :cc END-EXEC
Repeated calls to fetch get successive tuples in the query result.
 A variable called SQLSTATE in the SQL communication area gets set to
‘02000’ to indicate no more data is available
 The close statement causes the database system to delete the temporary
relation that holds the result of the query.
EXEC SQL close c END-EXEC
Note: above details vary with language. E.g. the Java embedding defines Java
iterators to step through result tuples.

Dynamic SQL
 Allows programs to construct and submit SQL queries at run time.
 Example of the use of dynamic SQL from within a C program.
char *sqlprog = "update account "
                "set balance = balance * 1.05 "
                "where account-number = ?";
EXEC SQL prepare dynprog from :sqlprog;
char account[10] = "A-101";
EXEC SQL execute dynprog using :account;

 The dynamic SQL program contains a ?, which is a place holder for a value
that is provided when the SQL program is executed.
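
The same placeholder idea can be tried without an embedded-SQL preprocessor. The sketch below is an analogue rather than embedded SQL proper: it assumes Python's built-in sqlite3 module, which also uses ? as a placeholder, and two hypothetical accounts. The names sqlprog and account mirror the C fragment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_number text primary key, balance real)")
conn.executemany("insert into account values (?, ?)",
                 [("A-101", 1000.0), ("A-102", 2000.0)])

# The statement text is an ordinary string built at run time;
# the ? marks a value that is supplied only when the statement is executed.
sqlprog = "update account set balance = balance * 1.05 where account_number = ?"
account = "A-101"
conn.execute(sqlprog, (account,))

updated = dict(conn.execute("select account_number, balance from account"))
```

Supplying the value separately, instead of pasting it into the string, also keeps user input from being interpreted as SQL.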

Query Processing

1. Parsing and translation
2. Optimization
3. Evaluation

Query Processing (Cont.)
 Alternative ways of evaluating a given query
 Equivalent expressions
 Different algorithms for each operation
 Cost difference between a good and a bad way of evaluating a query can
be enormous
 Need to estimate the cost of operations
 Depends critically on statistical information about relations which the database must
maintain
 Need to estimate statistics for intermediate results to compute cost of complex
expressions

Application Architectures
 Application programs generally access databases through one of
 Language extensions to allow embedded SQL
 An application program interface (e.g., ODBC/JDBC) which allows SQL queries
to be sent to a database
 Applications can be built using one of three architectures:
 Two tier model
 Application program running at user site directly uses JDBC/ODBC to
communicate with the database
 Three tier model
 Users/programs running at user sites communicate with an application
server. The application server in turn communicates with the database
 N-tier model

Application Architectures – Two Tier Architecture
[Figure: two-tier architecture]

Two-tier Model
 E.g. Java code runs at the client site and uses JDBC to communicate
with the backend database server for data manipulation.
 Benefits:
 Flexible, easy to design and implement
 Problems:
 Security: passwords are available at the client site, and all database operations
are possible
 More code shipped to client
 Not appropriate across organizations, or in large ones like
universities

Three Tier Model
[Figure: clients connect over the network, using HTTP or an application-specific protocol, to an application/HTTP server running CGI programs or servlets, which uses JDBC to reach the database server]

Three-tier Model (Cont.)
 E.g. Web client + Java Servlet using JDBC to talk with database
server
 Client sends request over http or application-specific protocol
 Application or Web server receives request
 Request handled by CGI program or servlets
 Feedback is sent by the CGI program or servlets to the client.
That is, HTML is pushed onto the client’s web browser.
 Security handled by application at server
 Better security
 Simple client:
 Less code
 Runs faster

Examples

Schema:
person(driver-id, name, address)
car(license, model, year)
accident(report-number, date, location)
owns(driver-id, license)
participated(driver-id, license, report-number, damage-amnt)

Examples
Answer:
a. Find the total number of people who owned cars that were involved in accidents in
2016.

select count (distinct name)
from accident, participated, person
where accident.report-number = participated.report-number
and participated.driver-id = person.driver-id
and date between date '2016-01-01' and date '2016-12-31'

 Note: this is not the same as the total number of accidents in 2016. We must count
people with several accidents only once.

Examples
c. Add a new accident to the database; assume any values for required attributes.
We assume the driver was “Jones,” although it could be someone else. Also, we assume
“Jones” owns one Toyota. We assume values “Berkeley” for location, '2016-09-01' for
date, 4007 for report-number, and 3000 for damage amount.
First we must find the license of the given car. Then the participated and accident
relations must be updated in order to both record the accident and tie it to the given car.
insert into accident
values (4007, '2016-09-01', 'Berkeley')

insert into participated
select o.driver-id, c.license, 4007, 3000
from person p, owns o, car c
where p.name = 'Jones' and p.driver-id = o.driver-id and
o.license = c.license and c.model = 'Toyota'
Examples
d. Delete the Mazda belonging to “John Smith”.
 Since model is not a key of the car relation, we can either assume that
only one of John Smith’s cars is a Mazda, or delete all of John Smith’s
Mazdas (the query is the same).

delete from car
where model = 'Mazda' and license in
(select license
from person p, owns o
where p.name = 'John Smith' and p.driver-id = o.driver-id)
