DBMS Lecture02
[Figure: 3-schema architecture diagram with labels Users, Process, Conceptual view (Conceptual Schema), Internal view (Internal Schema), DBMS, and DB.]
Benefits of 3-Schema Architecture
External Level: Each user can access the data but has their own view of the
data, independent of other users.
Logical data independence - conceptual schema changes do not affect
external views.
Conceptual Level: Single shared data representation for all applications and
users which is independent of physical data storage.
Users do not have to understand physical data representation details.
The DBA can change the storage structures without affecting users or
applications. Physical data independence - conceptual schema not affected
by physical changes such as adding indexes or distributing data.
Internal (Physical) Level: Provides standard facilities for interacting with
operating system for space allocation and file manipulation.
MS Access and the 3-Schema Architecture
External Level: Microsoft Access does not call them views, but you can store
queries and use the results in other queries (like a view).
External schema is the query (view) name and the attribute metadata.
Conceptual Level: All tables and field definitions are in the schema (accessible
from the Tables tab).
Note that conceptual schema is not the data but the metadata.
Physical Level:
Access represents all data in a single file whose layout it controls.
The system processes this raw data file by knowing locations and offsets
of relations and fields.
Question?
One of the first relational database systems, System R, developed at IBM led
to several important breakthroughs:
the first version of SQL
various commercial products such as Oracle and DB2
extensive research on concurrency control, transaction management,
and query processing and optimization
[Figure: relational terminology, with labels relation, attributes, and tuples.]
Question: Given the three definitions, select the ordering that contains their
related definitions.
1. relation
2. tuple
3. attribute
Question: A database table has 10 rows and 5 columns. Select one true statement.
A. The table's degree is 50.
B. The table's cardinality is 5.
C. The table's degree is 10.
D. The table's cardinality is 10.
Question?
The relational model may be visualized as tables and fields, but it is formally
defined in terms of sets and set operations.
A relation schema R with attributes A =<A1, A2, …, An> is denoted R (A1, A2, …,
An) where each Ai is an attribute name that ranges over a domain Di denoted
dom(Ai).
Example: Product (id, name, supplierId, categoryId, price)
R = Product (relation name)
Set A = {id, name, supplierId, categoryId, price}
dom(price) is the set of all possible positive currency values
dom(name) is the set of all possible strings that represent product names
Relation Schemas and Instances
A relation instance denoted r(R) over a relation schema R(A1, A2, …, An) is a set of
n-tuples <d1, d2, ..., dn> where each di is an element of dom(Ai) or is null.
The relation instance is the extension of the relation.
A value of null represents a missing or unknown value.
Cartesian Product
The Cartesian product, written D1 x D2, is a set operation that takes two sets D1 and D2 and
returns the set of all ordered pairs such that the first element is a member of D1 and the
second element is a member of D2. Example:
D1 = {1,2,3}
D2 = {A,B}
D1 x D2 = {(1,A), (2,A), (3,A), (1,B), (2,B), (3,B)}
Practice Questions:
1) Compute D2 x D1.
2) Compute D2 x D2.
3) If |D| denotes the number of elements in set D, how many elements are there in D1 x D2?
4) What is the cardinality of D1 x D2 x D1 x D1?
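The worked example above can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes the sets are small enough to enumerate and uses the standard-library `itertools.product`. The variable name `d1_x_d2` is chosen here for illustration.

```python
from itertools import product

D1 = {1, 2, 3}
D2 = {"A", "B"}

# product() yields every ordered pair (d1, d2) with d1 in D1 and d2 in D2.
# Wrapping the result in set() reflects that a Cartesian product is itself a set.
d1_x_d2 = set(product(D1, D2))

# Sort only for readable printing; sets have no inherent order.
print(sorted(d1_x_d2))
```

The same call accepts more than two sets, for example `product(D2, D1)` or `product(D1, D2, D1)`, so it can be used to check answers to the practice questions.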
Relation Instance
A relation instance r(R) can also be defined as a subset of the Cartesian product of
the domains of all attributes in the relation schema. That is,
r(R) ⊆ dom(A1) x dom(A2) x … x dom(An)
Example:
R = Person(id, firstName, lastName)
dom(id) = {1,2}, dom(firstName) = {Joe, Steve}, dom(lastName) = {Jones, Perry}
dom(id) x dom(firstName) x dom(lastName) =
{ (1,Joe,Jones), (1,Joe,Perry), (1,Steve,Jones), (1,Steve,Perry), (2,Joe,Jones),
(2,Joe,Perry), (2,Steve,Jones), (2,Steve,Perry)}
Assume our DB stores people Joe Jones and Steve Perry, then
r(R) = { (1,Joe, Jones), (2,Steve,Perry)}.
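The subset definition can be checked directly on the Person example. The sketch below assumes the tiny domains given above and ignores null values for simplicity. The names `all_tuples` and `is_instance` are illustrative only.

```python
from itertools import product

dom_id = {1, 2}
dom_first = {"Joe", "Steve"}
dom_last = {"Jones", "Perry"}

# Every tuple that could possibly appear in a Person relation instance.
all_tuples = set(product(dom_id, dom_first, dom_last))

# The instance currently stored in the database.
r = {(1, "Joe", "Jones"), (2, "Steve", "Perry")}

# r(R) is a valid instance exactly when it is a subset of the Cartesian product.
is_instance = r <= all_tuples
print(len(all_tuples), is_instance)
```

A tuple such as (3, Joe, Jones) would fail the check because 3 is not in dom(id).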
Properties of Relation
1. Each relation name is unique. (No two relations have the same name.)
2. Each cell of the relation (value of a domain) contains exactly one atomic (single)
value.
3. Each attribute of a relation has a distinct name.
4. The values of an attribute are all from the same domain.
5. Each tuple is distinct. There are no duplicate tuples. (This is because relations are
sets. In SQL, relations are bags.)
6. The order of attributes is not really important.
Note that this differs from a mathematical relation and from our definitions, which specify an
ordered tuple. The reason is that the attribute names represent the domain and can be
reordered.
7. The order of tuples has no significance.
Relational Keys
o A foreign key is a set of attributes in one relation referring to the primary key of another
relation.
True or false: It is possible to have more than one key for a table and
the keys may have different numbers of attributes.
A. true
B. false
Keys and Superkey Question
A. true
B. false
Example Relations
Employee-Project Database:
Employees have a unique number, name, title, and salary.
Projects have a unique number, name, and budget.
An employee may work on multiple projects and a project may have multiple
employees. An employee on a project has a particular responsibility and duration
on the project.
Relations:
Emp (eno, ename, title, salary)
Proj (pno, pname, budget)
WorksOn (eno, pno, resp, dur)
Domain constraint - Every value for an attribute must be an element of the attribute's
domain or be null.
Null represents a value that is currently unknown or not applicable.
null is not the same as zero or an empty string.
Entity integrity constraint - In a base relation, no attribute of a primary key can be null.
Referential integrity constraint - If a foreign key exists in a relation, then the foreign key value
must match a primary key value of a tuple in the referenced relation or be null.
Foreign Key Example
Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

WorksOn Relation
(WorksOn.eno is FK to Emp.eno; WorksOn.pno is FK to Proj.pno)
eno  pno  resp        dur
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000
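The referential integrity rule for this example can be sketched as a simple membership test. The helper `fk_violations` is hypothetical (it is not part of any DBMS), and the sketch assumes null is represented by Python's `None`. Only the key columns of the example relations are reproduced.

```python
# Primary key values taken from the Emp and Proj example relations.
emp_enos = {"E1", "E2", "E3", "E4", "E5", "E6", "E7", "E8"}
proj_pnos = {"P1", "P2", "P3", "P4", "P5"}

# (eno, pno) pairs taken from the WorksOn example relation.
works_on = [
    ("E1", "P1"), ("E2", "P1"), ("E2", "P2"), ("E3", "P3"),
    ("E3", "P4"), ("E4", "P2"), ("E5", "P2"), ("E6", "P4"),
    ("E7", "P3"), ("E7", "P5"), ("E8", "P3"),
]

def fk_violations(fk_values, referenced_keys):
    """Return the foreign key values that are neither null nor a referenced key."""
    return [v for v in fk_values if v is not None and v not in referenced_keys]

bad_eno = fk_violations([eno for eno, _ in works_on], emp_enos)
bad_pno = fk_violations([pno for _, pno in works_on], proj_pnos)
print(bad_eno, bad_pno)
```

Both lists are empty for the example data, so every WorksOn tuple refers to an existing employee and project. A real DBMS enforces this declaratively through foreign key definitions rather than through application code.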
Integrity Constraint Question
Question: What constraint says that a primary key field cannot be null?
A. domain constraint
Question: A primary key has three fields. Only one field is null. Is the
entity integrity constraint violated?
A. Yes
B. No
Referential Integrity Constraint Question
Question: A foreign key has a null value in the table that contains the
foreign key fields. Is the referential integrity constraint violated?
A. Yes
B. No
Integrity Question
Emp Relation
eno   ename      title  salary
E1    J. Doe     EE     AS
E2    null       SA     50000
E3    A. Lee     12     40000
E4    J. Miller  PR     20000
E5    B. Casey   SA     50000
null  L. Chu     EE     30000
E7    R. Davis   ME     null
E8    J. Jones   SA     50000

WorksOn Relation
eno   pno   resp        dur
E1    P0    null        12
E2    P1    Analyst     null
null  P2    Analyst     6
E3    P3    Consultant  10
E9    P4    Engineer    48
E4    P2    Programmer  18
E5    null  Manager     24
E6    P4    Manager     48
E7    P6    Engineer    36
E7    P4    Engineer    23
null  null  Manager     40

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   null         null

Question: How many violations of integrity constraints?
General Constraints
There are more general constraints that some DBMSs can enforce. These
constraints are often called enterprise constraints or semantic integrity constraints.
Examples:
An employee cannot work on more than 2 projects.
An employee cannot make more money than their manager.
An employee must be assigned to at least one project.
Ensuring the database follows these constraints is usually achieved using triggers.
Relational Algebra
A query language is used to update and retrieve data that is stored in a data model.
Just like algebra with numbers, relational algebra consists of operands (which are
relations) and a set of operators.
Every relational operator takes as input one or more relations and produces a relation as
output.
Relational Operators:
Selection σ
Projection Π
Cartesian product ×
Join ⋈
Union ∪
Difference -
Intersection ∩
The selection operation is a unary operation that takes in a relation as input and
returns a new relation as output that contains a subset of the tuples of the input
relation.
That is, the output relation has the same number of columns as the input relation,
but may have fewer rows.
To determine which tuples are in the output, the selection operation has a
specified condition, called a predicate, that tuples must satisfy to be in the
output.
The predicate is similar to a condition in an if statement.
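Selection can be sketched in Python as a filter over a set of tuples. This is an illustration only: the `WorksOn` named tuple and the `select` helper are names made up here, and the rows are the WorksOn tuples from the foreign key example. The predicate is an ordinary Boolean function, much like the condition of an if statement.

```python
from typing import NamedTuple

class WorksOn(NamedTuple):
    eno: str
    pno: str
    resp: str
    dur: int

works_on = {
    WorksOn("E1", "P1", "Manager", 12), WorksOn("E2", "P1", "Analyst", 24),
    WorksOn("E2", "P2", "Analyst", 6), WorksOn("E3", "P3", "Consultant", 10),
    WorksOn("E3", "P4", "Engineer", 48), WorksOn("E4", "P2", "Programmer", 18),
    WorksOn("E5", "P2", "Manager", 24), WorksOn("E6", "P4", "Manager", 48),
    WorksOn("E7", "P3", "Engineer", 36), WorksOn("E7", "P5", "Engineer", 23),
    WorksOn("E8", "P3", "Manager", 40),
}

def select(relation, predicate):
    """Keep only the tuples that satisfy the predicate; columns are unchanged."""
    return {t for t in relation if predicate(t)}

# All rows for project P4.
on_p4 = select(works_on, lambda t: t.pno == "P4")
print(sorted(on_p4))
```

The result has the same four attributes as the input but only the two tuples whose pno is P4.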
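Selection Operation – Formal Definition
σF(R) = {t | t∈R and F(t) is true}
where
R is a relation, t is a tuple variable
F is the predicate, a condition over the attributes of R.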
WorksOn Relation
eno  pno  resp        dur
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40

Write the relational algebra expression that:
1. Returns all rows with an employee working on project P2.
2. Returns all rows with an employee who is working as a manager on a project.
3. Returns all rows with an employee working as a manager for more than 40 months.
Show the resulting relation for each case.
Projection Operator
The projection operation is a unary operation that takes in a relation as input and
returns a new relation as output that contains a subset of the attributes of the
input relation and all non-duplicate tuples.
The output relation has the same number of tuples as the input relation unless
removing the attributes caused duplicates to be present.
Question: When are we guaranteed to never have duplicates when performing
a projection operation?
Besides the relation, the projection operation takes as input the names of the
attributes that are to be in the output relation.
Projection Operation Formal Definition
ΠA1,…,Am(R) = {t[A1,…,Am] | t∈R}
where
R is a relation, t is a tuple variable
{A1,…,Am} is a subset of the attributes of R over which the projection will
be performed.
Order of A1,…, Am is significant in the result.
Cardinality of ΠA1,…,Am(R) is not necessarily the same as R because
of duplicate removal.
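The definition above can be sketched in Python as a set comprehension. The helper `project` and the dictionary representation of tuples are assumptions made for illustration. The rows are the WorksOn tuples used elsewhere in these notes.

```python
ATTRS = ("eno", "pno", "resp", "dur")
rows = [
    ("E1", "P1", "Manager", 12), ("E2", "P1", "Analyst", 24),
    ("E2", "P2", "Analyst", 6), ("E3", "P3", "Consultant", 10),
    ("E3", "P4", "Engineer", 48), ("E4", "P2", "Programmer", 18),
    ("E5", "P2", "Manager", 24), ("E6", "P4", "Manager", 48),
    ("E7", "P3", "Engineer", 36), ("E7", "P5", "Engineer", 23),
    ("E8", "P3", "Manager", 40),
]
# Represent each tuple as a mapping from attribute name to value.
works_on = [dict(zip(ATTRS, row)) for row in rows]

def project(relation, attributes):
    """Return t[A1,...,Am] for every t in the relation, as a set of tuples.

    Building a set removes duplicates; values appear in the order
    in which the attributes are listed.
    """
    return {tuple(t[a] for a in attributes) for t in relation}

roles = project(works_on, ["resp"])
print(len(works_on), len(roles))
```

Projecting the 11 WorksOn tuples onto resp alone leaves 5 tuples, because several employees share the same responsibility and the duplicates are removed.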
Projection Example
WorksOn Relation
eno  pno  resp        dur
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40

Write the relational algebra expression that:
1. Returns only attributes resp and dur.
2. Returns only eno.
3. Returns only pno.
Show the resulting relation for each case.