
Lecture 4


▪ The relational model represents the database as a collection of
relations.

▪ When a relation is thought of as a table of values, each row in the
table represents a collection of related data values.

▪ A row represents a fact that typically corresponds to a real-world
entity or relationship.
▪ The table name and column names are used to help to interpret
the meaning of the values in each row.

▪ All values in a column are of the same data type.

▪ Each column has a domain of values.

▪ A domain D is a set of atomic and indivisible values.


▪ A common method of specifying a domain is to specify a data
type from which the data values forming the domain are drawn.
▪ Examples:

▪ Usa_phone_numbers. The set of valid ten-digit phone numbers.

▪ Social_security_numbers. The set of valid nine-digit Social Security
numbers.
▪ Employee_ages. Possible ages of employees in a company; each
must be an integer.
▪ The preceding are called logical definitions of domains.
▪ A data type or format is also specified for each domain.
▪ For example:
▪ The data type for the domain Usa_phone_numbers can be
declared as a character string of the form (ddd)ddd-dddd,
where each d is a numeric (decimal) digit and the first three
digits form a valid telephone area code.
▪ The data type for Employee_ages is an integer number
between 15 and 80.
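The format and range rules above can be checked mechanically. The following is a minimal Python sketch, not part of the lecture: the function names are hypothetical, the pattern checks only the (ddd)ddd-dddd shape (it does not verify that the area code is actually assigned), and "between 15 and 80" is assumed to be inclusive.

```python
import re

# Shape of the Usa_phone_numbers domain: (ddd)ddd-dddd, each d a decimal digit.
# Assumption: area-code validity is NOT checked here, only the format.
PHONE_PATTERN = re.compile(r"\([0-9]{3}\)[0-9]{3}-[0-9]{4}")


def is_usa_phone_number(value):
    """Return True if value is a string of the form (ddd)ddd-dddd."""
    return isinstance(value, str) and PHONE_PATTERN.fullmatch(value) is not None


def is_employee_age(value):
    """Return True if value is an integer from 15 to 80 (assumed inclusive)."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 15 <= value <= 80


print(is_usa_phone_number("(817)749-1253"))  # True
print(is_employee_age(25))                   # True
print(is_employee_age(14))                   # False
```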
▪ A relation schema R, denoted by R(A1, A2, … , An), is made
up of a relation name R and a list of attributes, A1, A2, … , An.
▪ Each attribute Ai is the name of a role played by some
domain D in the relation schema R.
▪ D is called the domain of Ai and is denoted by dom(Ai).

▪ The degree (or arity) of a relation is the number of attributes n of its
relation schema.
▪ A relation of degree seven, which stores information about university
students, would contain seven attributes describing each student as
follows:
STUDENT(Name, Ssn, Home_phone, Address, Office_phone, Age, Gpa)
▪ Using the data type of each attribute, the definition is sometimes
written as:
STUDENT(Name: string, Ssn: string, Home_phone: string, Address:
string, Office_phone: string, Age: integer, Gpa: real).
▪ A relation (or relation state) r(R) is a mathematical
relation of degree n on the domains dom(A1), dom(A2), …,
dom(An), which is a subset of the Cartesian product
(denoted by ×) of the domains that define R:
r(R) ⊆ (dom(A1) × dom(A2) × … × dom(An))
▪ The Cartesian product specifies all possible combinations
of values from the underlying domains.
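To make the subset relationship concrete, here is a small Python sketch using `itertools.product`. The two domains and their values are hypothetical and chosen only to keep the product small.

```python
from itertools import product

# Two small hypothetical domains, chosen only for illustration.
dom_name = {"Ann", "Bob"}
dom_age = {20, 21, 22}

# The Cartesian product holds every possible (name, age) combination.
all_combinations = set(product(dom_name, dom_age))

# A relation state is any subset of that product.
r = {("Ann", 20), ("Bob", 22)}

print(len(all_combinations))  # 6
print(r <= all_combinations)  # True
```

Because `r` is a Python set, it cannot hold the same tuple twice and has no inherent order, which matches the mathematical definition of a relation.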
▪ A relation is defined as a set of tuples. Mathematically,
elements of a set have no order among them; hence, tuples
in a relation do not have any particular order.

▪ When a relation is implemented as a file or displayed as a
table, a particular ordering may be specified on the
records of the file or the rows of the table.
▪ A tuple can be considered as a set of (<attribute>, <value>)
pairs, where each pair gives the value of the mapping from
an attribute Ai to a value vi from dom(Ai).
▪ The ordering of attributes is not important, because the
attribute name appears with its value.
▪ When the attribute name and value are included together
in a tuple, the result is known as self-describing data.
▪ For example, the tuple
t = <(Name, Dick Davidson), (Ssn, 422-11-2320), (Home_phone, NULL),
(Address, 3452 Elgin Road), (Office_phone, (817)749-1253), (Age, 25),
(Gpa, 3.53)>
▪ and the tuple
t = <(Address, 3452 Elgin Road), (Name, Dick Davidson),
(Ssn, 422-11-2320), (Age, 25), (Office_phone, (817)749-1253),
(Gpa, 3.53), (Home_phone, NULL)>
▪ are identical.
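The same comparison can be sketched in Python by modeling each tuple as a dictionary of attribute-value pairs. Representing NULL as `None` is an assumption made for this illustration.

```python
# Each tuple is a mapping from attribute name to value; NULL is modeled as None.
t1 = {
    "Name": "Dick Davidson",
    "Ssn": "422-11-2320",
    "Home_phone": None,
    "Address": "3452 Elgin Road",
    "Office_phone": "(817)749-1253",
    "Age": 25,
    "Gpa": 3.53,
}

# The same pairs listed in a different order.
t2 = {
    "Address": "3452 Elgin Road",
    "Name": "Dick Davidson",
    "Ssn": "422-11-2320",
    "Age": 25,
    "Office_phone": "(817)749-1253",
    "Gpa": 3.53,
    "Home_phone": None,
}

# Dictionary equality compares the pairs, not the order they were written in.
print(t1 == t2)  # True
```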
▪ An important concept is that of NULL values, which are
used to represent the values of attributes that may be
unknown or may not apply to a tuple.
▪ NULL values can have several meanings, such as
value unknown, value exists but is not available, or attribute
does not apply to this tuple (also known as value
undefined).
▪ The relation schema can be interpreted as a declaration or
a type of assertion.
▪ For example, the schema of the STUDENT relation asserts
that a student entity has a Name, Ssn, Home_phone,
Address, Office_phone, Age, and Gpa.
▪ Each tuple in the relation can then be interpreted as a fact
or a particular instance of the assertion.
▪ Notice that some relations may represent facts about entities,
whereas other relations may represent facts about
relationships.
▪ For example, a relation schema MAJORS (Student_ssn,
Department_code) asserts that students major in academic
disciplines.
▪ A tuple in this relation relates a student to his or her major
discipline. Hence, the relational model represents facts
about both entities and relationships uniformly as relations.
▪ The various restrictions on data that can be specified on a
relational database can generally be divided into three
main categories:
1. Inherent model-based constraints, or implicit constraints.
2. Schema-based constraints, or explicit constraints.
3. Application-based constraints, also called semantic constraints or business rules.
▪ Inherent model-based constraints (implicit constraints)
are inherent in the data model.
▪ For example, the constraint that a relation cannot have
duplicate tuples is an inherent constraint.
▪ Schema-based constraints (explicit constraints)
can be directly expressed in the schemas of the data model.
▪ They include:
▪ domain constraints
▪ key constraints
▪ Superkey, key, candidate key, primary key, prime and nonprime attribute.
▪ Constraints on NULLs
▪ Entity integrity constraints
▪ Referential integrity constraints
▪ Domain constraints
▪ Within each tuple, the value of each attribute A must be an atomic
value from the domain dom(A).

▪ key constraints
▪ Superkey:
▪ A superkey of a relation schema R = {A1, A2, … , An} is a set of
attributes S ⊆ R with the property that no two tuples t1 and t2 in any
legal relation state r of R will have t1[S] = t2[S].
▪ Every relation has at least one default superkey, the set of all its
attributes.
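The superkey condition quantifies over every legal relation state, so code can only test one given state: a state can show that an attribute set is not a superkey, but it cannot prove that it is one. The sketch below makes that check; the function name and the sample STUDENT data are hypothetical.

```python
def refutes_superkey(state, attrs):
    """Return True if two tuples in state agree on every attribute in attrs.

    A True result shows attrs is NOT a superkey. A False result only means
    this particular state does not contradict the superkey property.
    """
    seen = set()
    for t in state:
        projection = tuple(t[a] for a in attrs)  # t[S] in the notation above
        if projection in seen:
            return True
        seen.add(projection)
    return False


# Hypothetical sample state of a cut-down STUDENT relation.
students = [
    {"Name": "Ann", "Ssn": "111-11-1111", "Age": 20},
    {"Name": "Bob", "Ssn": "222-22-2222", "Age": 20},
    {"Name": "Ann", "Ssn": "333-33-3333", "Age": 22},
]

print(refutes_superkey(students, ["Ssn"]))   # False
print(refutes_superkey(students, ["Name"]))  # True: two tuples share Name = Ann
```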
▪ Key:
▪ A key K is a superkey with the additional property that removal
of any attribute from K will cause K not to be a superkey
anymore.
▪ A key has to be minimal
▪ A key with multiple attributes must require all its attributes
together to have the uniqueness property.
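Minimality can be illustrated against a sample state in the same spirit. This is a sketch under stated assumptions: the helper names and data are hypothetical, and the check looks at a single state, whereas the real definition applies to every legal state of the schema.

```python
def is_unique_on(state, attrs):
    """Return True if no two tuples in state agree on all of attrs."""
    projections = [tuple(t[a] for a in attrs) for t in state]
    return len(projections) == len(set(projections))


def looks_like_key(state, attrs):
    """Return True if attrs is unique in state and dropping any single
    attribute from attrs destroys that uniqueness (minimality)."""
    if not is_unique_on(state, attrs):
        return False
    return all(
        not is_unique_on(state, [a for a in attrs if a != removed])
        for removed in attrs
    )


# Hypothetical sample state of a cut-down STUDENT relation.
students = [
    {"Name": "Ann", "Ssn": "111-11-1111", "Age": 20},
    {"Name": "Bob", "Ssn": "222-22-2222", "Age": 20},
    {"Name": "Ann", "Ssn": "333-33-3333", "Age": 22},
]

print(looks_like_key(students, ["Ssn"]))          # True
print(looks_like_key(students, ["Ssn", "Name"]))  # False: Name is redundant
```

Testing single-attribute removals is enough, because any superset of a superkey is itself a superkey: if some smaller subset were unique, so would be the set obtained by removing just one attribute.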
▪ Candidate key:
▪ In general, a relation schema may have more than one key. In
this case, each of the keys is called a candidate key
▪ It is common to arbitrarily designate one of the candidate keys
as the primary key of the relation and the others are called
secondary keys.
▪ prime and nonprime attribute:
▪ An attribute of relation schema R is called a prime attribute of
R if it is a member of some candidate key of R.
▪ An attribute is called nonprime if it is not a member of any
candidate key.
▪ Constraints on NULLs
▪ A constraint on NULLs specifies whether NULL values are or are not
permitted for an attribute.
▪ For example, if every STUDENT tuple must have a valid, non-NULL value
for the Name attribute, then Name of STUDENT is constrained to be NOT NULL.

▪ Entity integrity constraints
▪ The entity integrity constraint states that no primary key value can be NULL.
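Both kinds of NULL restriction can be tried out with Python's built-in `sqlite3` module. This is an illustrative sketch, not the lecture's own example: the three-column STUDENT table and the `try_insert` helper are assumptions. SQLite accepts NULL in a non-integer PRIMARY KEY column unless NOT NULL is also declared, so the sketch spells it out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE STUDENT (
        Ssn        TEXT PRIMARY KEY NOT NULL,  -- entity integrity
        Name       TEXT NOT NULL,              -- constraint on NULLs
        Home_phone TEXT                        -- NULL is permitted here
    )
    """
)
conn.execute("INSERT INTO STUDENT VALUES ('422-11-2320', 'Dick Davidson', NULL)")


def try_insert(ssn, name):
    """Hypothetical helper: return True if the row is accepted, False if
    the DBMS rejects it with an integrity error."""
    try:
        conn.execute("INSERT INTO STUDENT (Ssn, Name) VALUES (?, ?)", (ssn, name))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this table, `try_insert(None, "Ann Lee")` and `try_insert("123-45-6789", None)` are both rejected, while a row with a non-NULL Ssn and Name is accepted.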
▪ Referential integrity constraints
▪ specified between two relations.
▪ states that a tuple in one relation that refers to another relation
must refer to an existing tuple in that relation.
▪ To define referential integrity, first we define the concept of a
foreign key.
▪ Foreign key:
▪ A set of attributes FK in relation schema R1 is a foreign key of R1
that references relation R2 if it satisfies the following rules:
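▪ The attributes in FK have the same domain(s) as the primary key
attributes PK of R2.
▪ A value of FK in a tuple t1 of the current state r1(R1) either occurs
as a value of PK for some tuple t2 in the current state r2(R2), so that
t1[FK] = t2[PK], or is NULL.
▪ R1 is called the referencing relation and R2 is the referenced relation.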
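A foreign key can be sketched with Python's built-in `sqlite3` module. MAJORS(Student_ssn, Department_code) is taken from the earlier example and plays the role of R1; the DEPARTMENT table, its Code column, and the `try_major` helper are hypothetical additions standing in for R2. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Referenced relation (R2); hypothetical for this sketch.
conn.execute("CREATE TABLE DEPARTMENT (Code TEXT PRIMARY KEY NOT NULL)")

# Referencing relation (R1); Department_code is the foreign key.
conn.execute(
    """
    CREATE TABLE MAJORS (
        Student_ssn     TEXT NOT NULL,
        Department_code TEXT REFERENCES DEPARTMENT (Code)
    )
    """
)
conn.execute("INSERT INTO DEPARTMENT VALUES ('CS')")


def try_major(ssn, code):
    """Hypothetical helper: return True if the MAJORS row is accepted."""
    try:
        conn.execute("INSERT INTO MAJORS VALUES (?, ?)", (ssn, code))
        return True
    except sqlite3.IntegrityError:
        return False
```

Here `try_major("422-11-2320", "CS")` succeeds because a matching DEPARTMENT tuple exists, `try_major("422-11-2320", "MATH")` is rejected, and a NULL Department_code is accepted because SQL does not check a NULL foreign key against the referenced table.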
▪ Application-based constraints (semantic constraints or business rules)
▪ cannot be directly expressed in the schemas of the data model
▪ and hence must be expressed and enforced by the application programs
or in some other way.
▪ Examples of such constraints are
▪ the salary of an employee should not exceed the salary of the employee’s
supervisor
▪ the maximum number of hours an employee can work on all projects per
week is 56
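Because such rules live outside the schema, an application typically checks them before writing to the database. A minimal sketch of the two rules above follows; the function names and the way the data is passed in are assumptions made for illustration.

```python
MAX_WEEKLY_HOURS = 56  # limit stated in the rule above


def salary_rule_ok(employee_salary, supervisor_salary):
    """An employee's salary should not exceed the supervisor's salary."""
    return employee_salary <= supervisor_salary


def hours_rule_ok(hours_per_project):
    """hours_per_project: one employee's weekly hours, one entry per project.
    The total across all projects must not exceed MAX_WEEKLY_HOURS."""
    return sum(hours_per_project) <= MAX_WEEKLY_HOURS


print(salary_rule_ok(30000, 40000))  # True
print(hours_rule_ok([20, 20, 16]))   # True: exactly 56 hours
print(hours_rule_ok([30, 30]))       # False: 60 hours
```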
▪ Another important category of constraints is data
dependencies, which include functional dependencies and
multivalued dependencies.
▪ They are used mainly for testing the “goodness” of the
design of a relational database and are utilized in a process
called normalization, which is discussed later.
▪ A relational database schema S
▪ is a set of relation schemas S = {R1, R2, …, Rm} and a set of
integrity constraints IC.
▪ A relational database state DB of S is a set of relation states DB
= {r1, r2, …, rm}
▪ such that each ri is a state of Ri
▪ and such that the ri relation states satisfy the integrity constraints
specified in IC.
▪ A database state that does not obey all the integrity
constraints is called an invalid state.
▪ A state that satisfies all the constraints in the defined set of
integrity constraints IC is called a valid state.
▪ There are three basic operations that can change the states of
relations in the database: Insert, Delete, and Update (or
Modify).
[Figure: COMPANY relational database schema with the referential integrity constraints]
