Lecture 2 - SQL - template

DBMS NOTES

Uploaded by

myotherface16
Copyright
© All Rights Reserved

Lecture 2 – Querying in SQL

COMP 519: Advanced Databases

Topics:
Overview of Querying in SQL
Joins
Set Operations
Renaming
Predicates
Ordering Tuples
Aggregate Functions
Null Values
Nested subqueries
Scalar subqueries
Join Expressions
Instance Modification
Tuple Deletion
Tuple Insertion
Tuple Modification

Reading:
Chapter 3

Querying in SQL

Typical SQL queries have the form:

select A1, A2, ..., An
from r1, r2, ..., rm
where P

In relational algebra, we would express this query as:

Π A1, A2, ..., An (σ P (r1 × r2 × ... × rm))

Example: find the name and id of everyone who has taken COMP 519.

Structure of SQL expressions:

• the select clause: lists the attributes that will appear in the result relation. (select is
like project in relational algebra.)
o select * is a shorthand for selecting all attributes of relations appearing in the
from clause.
o select customer.* is a shorthand for selecting all attributes of the customer
relation (assuming that customer appears in the from clause).
• the from clause: lists the relations whose cartesian product is needed in the query
• the where clause: like select in relational algebra, although it has slightly different
syntax. An omitted where clause is equivalent to where true.

The result of an SQL query is a relation.

Review of how SQL queries are executed (in theory):

1. Generate a Cartesian product of the relations listed in the from clause.

2. Apply the predicates specified in the where clause on the result of Step 1.

3. For each tuple in the result of Step 2, output the attributes (or results of expressions)
specified in the select clause.

In practice, it is not executed this way.


Q: Why?

Joins

SQL supports natural joins with the syntax:

select A1, A2, …, An
from r1 natural join r2 natural join … natural join rm
where P;

Recall Natural Join: Operates on two relations and produces a relation as a result, considering
only pairs of tuples with the same values on those attributes that appear in the schemas of
both relations.

Note: More generally the from clause can be of the form:

from E1, E2, …, En

where each Ei can be a single relation or an expression involving natural joins.

Example: Given the schema of a University Database from the textbook, list the names of
instructors along with the titles of courses that they teach.

Is this correct?

select name, title
from instructor natural join teaches natural join course;
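
One way to explore the question is to run the query on a tiny instance. The sketch below uses Python's sqlite3 module only as a convenient way to execute SQL; the three-attribute tables and their rows are made-up simplifications of the textbook schema. It compares the three-way natural join with a version that joins course on course_id only.

```python
import sqlite3

# Hypothetical, simplified tables; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (ID integer primary key, name text, dept_name text);
create table teaches    (ID integer, course_id text);
create table course     (course_id text primary key, title text, dept_name text);
insert into instructor values (1, 'Kim', 'math');
insert into teaches values (1, 'M101'), (1, 'C200');
insert into course values ('M101', 'Calculus', 'math'),
                          ('C200', 'Databases', 'cmpsc');
""")

# The natural join with course matches on course_id AND dept_name,
# because both attribute names appear on each side of the join.
three_way = conn.execute(
    "select name, title "
    "from instructor natural join teaches natural join course "
    "order by title").fetchall()

# Joining course on course_id only keeps courses outside the
# instructor's own department.
restricted = conn.execute(
    "select name, title "
    "from instructor natural join teaches join course using (course_id) "
    "order by title").fetchall()

print(three_way)
print(restricted)
```

On this instance the first query omits the course that Kim teaches outside the math department, which suggests why the shared dept_name attribute matters.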

Duplicates and Set Operations

Because removing duplicates is expensive, SQL does not remove duplicates by default.

student
ID name dept_name tot_cred
1 Alice math 15
2 Chuck cmpsc 9
3 Bob math 45

Then the result of:

select dept_name
from student

Would be:

dept_name
math
cmpsc
math

To force SQL to remove duplicates, use the keyword distinct, e.g.:

select distinct dept_name
from student

Set Operations

SQL includes the set operations union (∪), intersect (∩), and except (−). These operations can
be applied to the results of queries of the basic form.

Example: find the IDs, names, and departments of students and instructors.

select ID, name, dept_name
from student
union
select ID, name, dept_name
from instructor

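Notes: The set operations eliminate duplicates by default. To retain duplicates, use union all,
intersect all, or except all.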

Example - Retain duplicates in the previous query:

select ID, name, dept_name
from student
union all
select ID, name, dept_name
from instructor

Renaming

SQL has a notion of tuple variables that range over relations.

Tuple variables can be used to rename relations in much the same way that the rename
operation is used in relational algebra.

The keyword as is used to associate a tuple variable with a relation.

Example: find all instructors in the same department as Alice.

The keyword as can also be used to rename attributes in the result of a query.

Example: suppose we want to compute the Cartesian product of student IDs and instructor IDs and
have unique attribute names in the result.
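
A possible sketch of both uses of as, run through Python's sqlite3 module. The table layouts and rows are assumptions made for illustration, and T and S are arbitrary tuple-variable names.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (ID integer primary key, name text, dept_name text);
create table student (ID integer primary key, name text, dept_name text,
                      tot_cred integer);
insert into instructor values (10, 'Alice', 'math'), (11, 'Dana', 'math'),
                              (12, 'Eve', 'cmpsc');
insert into student values (1, 'Alice', 'math', 15), (2, 'Chuck', 'cmpsc', 9);
""")

# Tuple variables T and S both range over instructor (a self-join).
same_dept = conn.execute("""
    select T.name
    from instructor as T, instructor as S
    where S.name = 'Alice' and T.dept_name = S.dept_name
    order by T.name""").fetchall()

# Renaming attributes so the two ID columns get distinct names.
cur = conn.execute("""
    select student.ID as student_id, instructor.ID as instructor_id
    from student, instructor""")
column_names = [d[0] for d in cur.description]
pairs = cur.fetchall()

print(same_dept)
print(column_names, len(pairs))
```

Note that this version of the first query also returns Alice herself; adding T.name <> 'Alice' to the where clause would exclude her.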

SQL Predicates

SQL uses different syntax for predicates (in the where clause) than that used in selection
predicates in relational algebra

SQL Predicates              Relational Algebra Predicates

P and Q                     P ∧ Q

P or Q                      P ∨ Q

not P                       ¬P

x between a and b           a ≤ x ≤ b

x not between a and b       x < a ∨ x > b

SQL arithmetic operators: + - * /

Relational operators: < <= > >= = <> (inequality).

String literals are delimited by single quotes (')

Pattern Matching in SQL: Use the keyword like with the wildcard characters % (any string) and _ (any
single character).

Example: Find the names of all students whose name has a q in it.

If you need % or _ to be treated as a regular character and not a metacharacter, use the
escape keyword after the string literal to specify an escape character.

Example: find the names of all students whose name has a q as the second character.
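
The pattern-matching rules above might be exercised as follows. This is a sketch using Python's sqlite3 module; the names are invented, and the choice of ! as the escape character is arbitrary.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text);
insert into student values (1, 'Jacques'), (2, 'Aqua'), (3, 'Bob'), (4, 'Al_x');
""")

# % matches any string: names containing a q anywhere.
has_q = conn.execute(
    "select name from student where name like '%q%' order by name").fetchall()

# _ matches exactly one character: q as the second character.
second_q = conn.execute(
    "select name from student where name like '_q%' order by name").fetchall()

# escape makes the following _ a literal underscore.
has_underscore = conn.execute(
    "select name from student where name like '%!_%' escape '!'").fetchall()

print(has_q, second_q, has_underscore)
```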

Ordering Tuples

Use the order by clause at the end of a query statement to order the tuples in a relation, as
in:
...
order by A1 [asc | desc], A2 [asc | desc], ..., An [asc | desc]

The keywords asc and desc can be used to control whether ascending or descending order is
used, respectively. The default is ascending.

Example: List (in alphabetic order) all departments that offer courses. Do allow duplicates in
the result.

Note: Multiple arguments to order by can be used to specify how to break ties.

Example: For students with more than 120 credits, list the students in descending order by
tot_cred. If two students have the same tot_cred, list them in order by their name.
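
A minimal sketch of tie-breaking with a second order by argument, run through Python's sqlite3 module on invented rows.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text, tot_cred integer);
insert into student values (1, 'Zed', 130), (2, 'Amy', 130),
                           (3, 'Bob', 150), (4, 'Cal', 90);
""")

# Primary key: tot_cred descending. Ties are broken by name ascending.
ordered = conn.execute("""
    select name, tot_cred
    from student
    where tot_cred > 120
    order by tot_cred desc, name asc""").fetchall()

print(ordered)
```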

Join Expressions

SQL provides various kinds of joins including the natural join that can be used in the from
clause of a query.

Inner Join

An inner join is a join in which tuples from one or both of the argument relations may not
participate in the result.

select * from (r natural inner join s);

Note the order of attributes:

Outer Join

An outer join is a join that guarantees that all tuples from one or both of the argument relations
appear in the result of the join.

This is accomplished by:


1. computing the inner join
2. padding tuples that did not participate with nulls appropriately
3. adding the padded tuples to the result

Example: Find the names of all students who have never enrolled in a class.
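
One possible formulation uses a left outer join and keeps only the padded tuples. The sketch below runs it through Python's sqlite3 module; the takes layout (ID, course_id) and the rows are simplifying assumptions.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text, dept_name text,
                      tot_cred integer);
create table takes (ID integer, course_id text);
insert into student values (1, 'Alice', 'math', 15), (2, 'Chuck', 'cmpsc', 9),
                           (3, 'Bob', 'math', 45);
insert into takes values (1, 'C1'), (3, 'C1');
""")

# Students with no matching takes tuple are padded with nulls,
# so course_id is null exactly for the never-enrolled students.
never_enrolled = conn.execute("""
    select name
    from student natural left outer join takes
    where course_id is null""").fetchall()

print(never_enrolled)
```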

Join Conditions

Join conditions in SQL include:

• natural
Attributes with the same name must have the same value for tuples to be matched.

• on P
where P is a predicate;
only tuples for which the predicate is true will be matched.

• using (A1, A2, ..., An)
where A1, A2, ..., An are attributes common to the relations being joined;
these attributes must have the same value in both relations for the tuples to be matched.

Note:

The using Condition

The using condition provides a join similar to a natural join, except that joining occurs only on
the attributes listed.

Example: Rewrite the following query using a using condition instead of the natural join.

select *
from (customer natural inner join purchased2);

Example: Find the students and instructors whose IDs match, even if they have different names
or dept_names:

The schema of the result of this query is:

Q: What attribute is missing?

The on Condition

The on condition provides a theta join - only tuples that satisfy the specified condition are
joined.
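
To see how the result schemas of using and on differ, the sketch below lists the column names each join produces. It runs through Python's sqlite3 module; the table layouts and rows are assumptions, and the column order shown is the one SQLite reports.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text, dept_name text,
                      tot_cred integer);
create table instructor (ID integer primary key, name text, dept_name text,
                         salary integer);
insert into student values (1, 'Alice', 'math', 15);
insert into instructor values (1, 'Alice', 'cmpsc', 50000);
""")

# using (ID): the join attribute appears once in the result.
using_cols = [d[0] for d in conn.execute(
    "select * from student join instructor using (ID)").description]

# on: both ID columns are kept.
on_cols = [d[0] for d in conn.execute(
    "select * from student join instructor "
    "on student.ID = instructor.ID").description]

print(using_cols)
print(on_cols)
```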

Example: Find the natural join of student and takes using an on condition.

Note:

Example: Rewrite previous query using the Cartesian product of customer and purchased and
no join conditions.

Final Notes on Joins

Defaults:

When the join clause is used, the default join type is an inner join. Thus the following are
equivalent:

select * from customer join purchased using (custnum);

select * from customer inner join purchased using (custnum);

Similarly a natural join is equivalent to a natural inner join.

Q: Are the following equivalent?

select * from customer left outer join purchased
on customer.custnum = purchased.custnum;

select * from customer left outer join purchased on true
where customer.custnum = purchased.custnum;
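
One way to investigate is to run both forms on a small instance where one customer has no purchases. This sketch uses Python's sqlite3 module; the two-attribute tables and rows are invented, and on 1 = 1 stands in for on true so that the query also runs on older SQLite versions.

```python
import sqlite3

# Hypothetical sample data: Ben has made no purchases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table customer (custnum integer primary key, name text);
create table purchased (custnum integer, item text);
insert into customer values (1, 'Ann'), (2, 'Ben');
insert into purchased values (1, 'pen');
""")

# Condition in the on clause: unmatched customers are padded with nulls.
cond_in_on = conn.execute("""
    select customer.name, purchased.item
    from customer left outer join purchased
         on customer.custnum = purchased.custnum
    order by customer.name""").fetchall()

# Condition in the where clause: it is applied after the outer join,
# so it filters the joined tuples.
cond_in_where = conn.execute("""
    select customer.name, purchased.item
    from customer left outer join purchased on 1 = 1
    where customer.custnum = purchased.custnum
    order by customer.name""").fetchall()

print(cond_in_on)
print(cond_in_where)
```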

Aggregate Functions

Aggregate functions allow SQL queries to compute composite information about an entire
relation.

• avg

• min

• max

• sum

• count

Example: Find the average tot_cred of all students.

Notes:

Multiple Aggregates in Select Clause

For queries with no group by clause, no other attributes can be used with an aggregate in the
select clause. However, multiple aggregates can be used.

Example: Find the maximum and minimum tot_cred of students in the math department.
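
A short sketch of these aggregates on the three-row student instance shown earlier, run through Python's sqlite3 module.

```python
import sqlite3

# The student instance from the duplicates example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text, dept_name text,
                      tot_cred integer);
insert into student values (1, 'Alice', 'math', 15), (2, 'Chuck', 'cmpsc', 9),
                           (3, 'Bob', 'math', 45);
""")

# A single aggregate over the whole relation.
avg_cred = conn.execute("select avg(tot_cred) from student").fetchone()[0]

# Several aggregates in one select clause, with no other attributes.
max_min_math = conn.execute("""
    select max(tot_cred), min(tot_cred)
    from student
    where dept_name = 'math'""").fetchone()

print(avg_cred, max_min_math)
```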

Count and the distinct Keyword

With the count aggregate, you can use the distinct keyword before the attribute name to
get a count of unique values for an attribute.

Example: How many students are there?

Example: How many students have enrolled in at least one course?

Counting the Number of Tuples in a Relation

To simply count the number of tuples in a relation, use count(*).

Example: How many students are there?
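
The two counting forms might look like this. The sketch uses Python's sqlite3 module, with a hypothetical takes(ID, course_id) table and invented rows.

```python
import sqlite3

# Hypothetical sample data: Alice takes two courses, Chuck takes none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text);
create table takes (ID integer, course_id text);
insert into student values (1, 'Alice'), (2, 'Chuck'), (3, 'Bob');
insert into takes values (1, 'C1'), (1, 'C2'), (3, 'C1');
""")

# count(*) counts tuples in the relation.
n_students = conn.execute("select count(*) from student").fetchone()[0]

# count(distinct ID) counts each enrolled student once,
# no matter how many courses that student has taken.
n_enrolled = conn.execute(
    "select count(distinct ID) from takes").fetchone()[0]

print(n_students, n_enrolled)
```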

Aggregation with Grouping

The group by clause can be used with aggregate functions to compute information about
subsets of a relation.

Example: Find the number of students in each department.

Notes:

Example: Find the number of different courses taken by each student.

Example: Find the names of students and the number of courses (not necessarily unique) that
they have enrolled in.

If we’d prefer to have the names of customers rather than customer numbers:

Notes:

The having Clause

The having clause is similar to the where clause, except that conditions in a having clause
apply to groups created by a group by clause, rather than individual tuples.

The having clause can contain aggregates and the attributes listed in the group by clause, but
no other attributes.

Example: Find the number of classes enrolled in by each student in the math department for
students who have enrolled in at least 5 classes.
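
A sketch of group by and having together, run through Python's sqlite3 module. The takes(ID, course_id) layout and all rows are assumptions. Grouping by both ID and name is one way to satisfy the rule that non-aggregated select attributes must appear in the group by clause.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text, dept_name text);
create table takes (ID integer, course_id text);
insert into student values (1, 'Alice', 'math'), (2, 'Chuck', 'cmpsc'),
                           (3, 'Bob', 'math');
""")
# Alice takes 5 courses, Bob takes 2, Chuck takes 6.
conn.executemany(
    "insert into takes values (?, ?)",
    [(1, f"C{i}") for i in range(5)]
    + [(3, "C0"), (3, "C1")]
    + [(2, f"C{i}") for i in range(6)])

# One output tuple per department.
per_dept = conn.execute("""
    select dept_name, count(*)
    from student
    group by dept_name
    order by dept_name""").fetchall()

# where filters tuples before grouping; having filters the groups.
busy_math = conn.execute("""
    select name, count(*)
    from student natural join takes
    where dept_name = 'math'
    group by ID, name
    having count(*) >= 5""").fetchall()

print(per_dept)
print(busy_math)
```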

Notes:

Null Values

SQL uses a special value, called the null value, to mean that a particular value isn’t known or
doesn’t exist.

Null values complicate comparisons and computations.

Consider the query:

select *
from student
where tot_cred > 10.0;

What happens if a student’s tot_cred is null?

AND operation    True       False      Unknown

True             True       False      Unknown
False            False      False      False
Unknown          Unknown    False      Unknown

OR operation     True       False      Unknown

True             True       True       True
False            True       False      Unknown
Unknown          True       Unknown    Unknown

Note:

is null and is not null Keywords

The keywords is null and is not null can be used in the where clause to explicitly test for the
presence and absence, respectively, of null values.

The clauses is known and is unknown can be used to test whether the result of a comparison is
unknown. In Oracle, these are not supported.

Example: Find all students with an unknown or missing tot_cred:
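
The effect of a null tot_cred on comparisons can be observed directly. In this sketch (Python's sqlite3 module, invented rows) the student with a null tot_cred satisfies neither the comparison nor its negation, and is found only by is null.

```python
import sqlite3

# Hypothetical sample data: Dee's tot_cred is null.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text, tot_cred integer);
insert into student values (1, 'Alice', 15), (2, 'Chuck', 9), (3, 'Dee', null);
""")

# null > 10.0 evaluates to unknown, so Dee is not selected.
above = conn.execute(
    "select name from student where tot_cred > 10.0").fetchall()

# not unknown is still unknown, so Dee is not selected here either.
not_above = conn.execute(
    "select name from student where not (tot_cred > 10.0)").fetchall()

# is null tests for the null value explicitly.
missing = conn.execute(
    "select name from student where tot_cred is null").fetchall()

print(above, not_above, missing)
```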

coalesce function

Example: The person table has the schema (id, name, spouse), where spouse is null for
unmarried individuals. Return a relation with ids, names, and spouses, where spouse is “NA”
rather than null for unmarried individuals.

Note: sum() function returns null when no tuples are selected.

Example: Calculate the sum of all credits for the courses in the “ABCD” department. If there are
no courses yet defined for this department, the sum should be 0.
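
Both coalesce examples could be sketched as follows, using Python's sqlite3 module. The course(course_id, dept_name, credits) layout and all rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data; no course belongs to department 'ABCD'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table person (id integer primary key, name text, spouse text);
create table course (course_id text primary key, dept_name text,
                     credits integer);
insert into person values (1, 'Ann', 'Bo'), (2, 'Cy', null);
insert into course values ('M101', 'math', 4);
""")

# coalesce returns its first non-null argument.
spouses = conn.execute("""
    select id, name, coalesce(spouse, 'NA')
    from person
    order by id""").fetchall()

# sum over an empty set of tuples is null ...
raw_sum = conn.execute(
    "select sum(credits) from course where dept_name = 'ABCD'").fetchone()[0]

# ... so wrap it in coalesce to report 0 instead.
safe_sum = conn.execute(
    "select coalesce(sum(credits), 0) from course "
    "where dept_name = 'ABCD'").fetchone()[0]

print(spouses, raw_sum, safe_sum)
```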

Nested Subqueries

Tests for Set Membership

The in and not in connectives test the membership of a tuple in the result of an SQL query. This
allows SQL queries to be nested.

Example: Find the name and ID of every student who has taken course ID 1.

Question: How could we write the previous query without a nested subquery?
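
A sketch of the nested form next to one possible non-nested form, run through Python's sqlite3 module. The takes(ID, course_id) layout and rows are assumptions; distinct is added to the join version because a student could have taken the course more than once.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text);
create table takes (ID integer, course_id integer);
insert into student values (1, 'Alice'), (2, 'Chuck'), (3, 'Bob');
insert into takes values (1, 1), (1, 2), (3, 1), (2, 2);
""")

# Nested subquery with in.
nested = conn.execute("""
    select ID, name
    from student
    where ID in (select ID from takes where course_id = 1)
    order by ID""").fetchall()

# The same question asked with a join instead of a subquery.
joined = conn.execute("""
    select distinct student.ID, name
    from student, takes
    where student.ID = takes.ID and course_id = 1
    order by student.ID""").fetchall()

print(nested, joined)
```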

Constructing Tuples

Tuples can be constructed using (). That is, (v1, v2, . . . , vn) is a tuple of arity n containing values
v1, v2, . . . , vn.

Example: Express the previous query as a nested subquery with a tuple variable in the where
clause:

Set Comparison

Nested queries can be used with a comparison operator and keywords some or all to compare
an attribute against all values appearing in some other relation.

Example - Find all students having maximum tot_cred

Q: What would be returned by the query above if there are null values for tot_cred?

Example: Find all students who have tot_cred greater than the minimum tot_cred.

Test for Empty Relations

The keywords exists and not exists are used to test whether or not the result of a nested
query is an empty relation (a relation containing no tuples).

Example: Find the IDs of all courses that no one has taken.
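
A possible not exists formulation, run through Python's sqlite3 module on invented rows. The nested query is correlated: it refers to course.course_id from the outer query.

```python
import sqlite3

# Hypothetical sample data: only C1 has been taken.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table course (course_id text primary key, title text);
create table takes (ID integer, course_id text);
insert into course values ('C1', 'Databases'), ('C2', 'Logic'), ('C3', 'Graphs');
insert into takes values (1, 'C1'), (2, 'C1');
""")

# A course qualifies when its set of takes tuples is empty.
untaken = conn.execute("""
    select course_id
    from course
    where not exists (select *
                      from takes
                      where takes.course_id = course.course_id)
    order by course_id""").fetchall()

print(untaken)
```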

Note:

Subqueries in the From Clause

SQL permits subqueries to be used in the from clause of another query.

Example: Find the maximum number of courses taken by any student.


Example: Find the maximum number of courses taken by a student and the name of that
student.

The with Clause

The with clause of SQL:1999 (and Oracle) provides a temporary relation for use within one
query. This is useful for simplifying complex queries and in cases where the same subquery is
used multiple times.

with RELATION_NAME as (QUERY) …

Example: Use a with clause to simplify the previous query:
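
One way the with clause might be used here, run through Python's sqlite3 module. The relation name counts, its attribute n, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample data: Alice takes two courses, Bob takes one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text);
create table takes (ID integer, course_id text);
insert into student values (1, 'Alice'), (2, 'Chuck'), (3, 'Bob');
insert into takes values (1, 'C1'), (1, 'C2'), (3, 'C1');
""")

# counts is defined once and used twice: in the join and in the
# scalar subquery that finds the maximum.
top = conn.execute("""
    with counts(ID, n) as (
        select ID, count(*) from takes group by ID)
    select name, n
    from student join counts using (ID)
    where n = (select max(n) from counts)""").fetchall()

print(top)
```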


Scalar Subqueries

A scalar subquery is a subquery that returns a relation with one column and one row, which is
treated as a single value.

SQL:2003 (and Oracle) allow scalar subqueries to be used wherever a value of the same type
can be used.

Example: Find all students who have more than the average tot_cred:
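
A minimal sketch on the three-row student instance from earlier, run through Python's sqlite3 module. The subquery returns a single value (the average, 23), which is compared against each tot_cred.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text, dept_name text,
                      tot_cred integer);
insert into student values (1, 'Alice', 'math', 15), (2, 'Chuck', 'cmpsc', 9),
                           (3, 'Bob', 'math', 45);
""")

# The parenthesized subquery is used where a single number is expected.
above_avg = conn.execute("""
    select name
    from student
    where tot_cred > (select avg(tot_cred) from student)""").fetchall()

print(above_avg)
```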

Database Modification
Deletion

To delete tuple(s) from a relation, use the following syntax:

delete from r
where P;

which deletes all tuples from relation r that satisfy predicate P.

Example: Delete Elvis from the student relation.

Example: Delete all tuples from the student relation.


Example: Delete all students who have not taken a course in the math department.
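
The three deletions might be sketched as below, using Python's sqlite3 module. The table layouts and rows are assumptions, and the statements are run in a different order from the examples so that each one has something left to delete.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID integer primary key, name text);
create table takes (ID integer, course_id text);
create table course (course_id text primary key, title text, dept_name text);
insert into student values (1, 'Alice'), (2, 'Chuck'), (3, 'Bob'), (4, 'Elvis');
insert into takes values (1, 'M101'), (2, 'C200'), (4, 'M101');
insert into course values ('M101', 'Calculus', 'math'),
                          ('C200', 'Databases', 'cmpsc');
""")

def names():
    """Return the remaining student names in alphabetical order."""
    return [r[0] for r in conn.execute("select name from student order by name")]

# Delete a single student by name.
conn.execute("delete from student where name = 'Elvis'")
after_elvis = names()

# Delete students whose ID does not appear among math enrollments.
conn.execute("""
    delete from student
    where ID not in (select ID
                     from takes natural join course
                     where dept_name = 'math')""")
after_math = names()

# With no where clause, every tuple is deleted (the relation itself remains).
conn.execute("delete from student")
after_all = names()

print(after_elvis, after_math, after_all)
```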

Insertion

Inserting a Single Tuple

To insert values into a relation, use the following syntax:

insert into r(A1, A2, ..., An)
values (v1, v2, ..., vn);

where r is the target relation, the Ai's are the attributes, and the vi's are the values.

Notes: The attributes of relation r must be (A1,A2, . . . ,An), and each vi must be a value in the
domain of Ai.

The attributes and values need not be listed in the order of the attributes of r.

Example: Add a new student with ID of 1, name of Bob, department of math, and tot_cred of
0.

If the attribute values in the values clause are ordered in the same order as the attributes of the
relation schema, the attribute list in the insert clause can be omitted.
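
Both insert forms could be sketched as follows, using Python's sqlite3 module. The second student (Ann) is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table student (ID integer primary key, name text, dept_name text,
                          tot_cred integer)""")

# With an attribute list, the values may come in any order as long as
# they line up with the listed attributes.
conn.execute("""
    insert into student(name, ID, tot_cred, dept_name)
    values ('Bob', 1, 0, 'math')""")

# Without an attribute list, the values must follow the schema order.
conn.execute("insert into student values (2, 'Ann', 'cmpsc', 0)")

rows = conn.execute("select * from student order by ID").fetchall()
print(rows)
```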

Example:

Inserting Multiple Tuples

The values can be replaced with a select statement to insert the result of a query into a
relation.

Example: Insert all instructors into the student relation, keeping their ID’s, names, and
dept_names the same, and giving them 0 tot_cred.

PROBLEM!

Fix:

Updates

To modify one or more tuples in a relation, use the following syntax:

update r
set A1 = v1, A2 = v2, ...,An = vn
where P;

where …

Example: Increase the budget of all departments in Olmsted by 5%, and change their name to
CMPSC.

Updates and Nested Queries

The where clause of an update statement can contain a nested query, and the nested query
can reference the relation being updated.

Example: Increase the budget of all departments with less than the average budget by 5%.
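
A sketch of an update whose where clause contains a nested query over the same relation, run through Python's sqlite3 module. The department(dept_name, building, budget) layout and the budget figures are assumptions; with these rows the average budget is 300.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table department (dept_name text primary key, building text,
                         budget real);
insert into department values ('math', 'Olmsted', 100.0),
                              ('cmpsc', 'Olmsted', 200.0),
                              ('physics', 'Main', 600.0);
""")

# The nested query references the relation being updated.
conn.execute("""
    update department
    set budget = budget * 1.05
    where budget < (select avg(budget) from department)""")

budgets = dict(conn.execute("select dept_name, budget from department"))
print(budgets)
```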

SQL Merge Statement

Syntax

MERGE INTO <target_table> [AS t]
USING <table_source> [AS s]
ON <join_condition>
[WHEN MATCHED [AND condition]
THEN { merge_update | merge_delete | DO NOTHING } ]
[WHEN NOT MATCHED [AND condition]
THEN { merge_insert | DO NOTHING } ]

merge_insert:

INSERT [( column_name [, ...] )]
{ VALUES ( { expression | DEFAULT } [, ...] ) }

merge_update:

UPDATE SET { column_name = { expression | DEFAULT } } [, ...]

merge_delete:

DELETE

Example:

Suppose that there is a temporary table new_student with the schema (id, name, dept). Ensure that any
students in this table also appear in student, and any existing students whose name or dept has changed
have these changes appear in the student table.
