04 Adbms PDF
Management System
By
Bishnu Gautam
New Summit College
The Relational Model
A candidate key is an attribute (or set of attributes) that uniquely identifies a row. A candidate key must
possess the following properties:
• Unique identification - For every row, the value of the key must uniquely identify that row.
• Non-redundancy - No attribute in the key can be discarded without destroying the property of unique identification.
A primary key is the candidate key which is selected as the principal unique identifier. Every relation must
contain a primary key. The primary key is usually the key selected to identify a row when the database is
physically implemented. For example, a part number is selected instead of a part description.
A superkey is any set of attributes that uniquely identifies a row. A superkey differs from a candidate key in that
it does not require the non-redundancy property.
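The difference between a superkey and a candidate key can be checked mechanically against a sample of rows. The following is a minimal sketch, assuming a hypothetical PART table held as a list of dictionaries (the column names and values are invented for illustration). A sample can only refute a key, never prove one, since a key is a rule about every legal instance of the relation.

```python
from itertools import combinations

# Hypothetical sample rows; the table and its values are illustrative only.
PARTS = [
    {"part_no": 101, "description": "bolt", "colour": "grey"},
    {"part_no": 102, "description": "bolt", "colour": "black"},
    {"part_no": 103, "description": "nut", "colour": "grey"},
]


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and no attribute can be discarded."""
    if not is_superkey(rows, attrs):
        return False
    # Non-redundancy: dropping any single attribute must break uniqueness.
    return not any(
        is_superkey(rows, subset)
        for subset in combinations(attrs, len(attrs) - 1)
    )


# Unique, but description is redundant: a superkey, not a candidate key.
print(is_superkey(PARTS, ("part_no", "description")))
print(is_candidate_key(PARTS, ("part_no", "description")))
# part_no on its own is both unique and non-redundant.
print(is_candidate_key(PARTS, ("part_no",)))
```

Testing only the subsets with one attribute removed is enough, because any set that contains a superkey is itself a superkey.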
Keys…
A technical key (also called a surrogate or artificial key) is a key whose possible
values have no obvious meaning to the user or the data. Such keys are used instead of
semantic keys for any of the following reasons:
• When the value in a semantic key is likely to be changed by the user, or can have duplicates. For
example, on a PERSON table it is unwise to use PERSON_NAME as the key as it is possible to have
more than one person with the same name, or the name may change such as through marriage.
• When none of the existing attributes can be used to guarantee uniqueness. In this case adding an
attribute whose value is generated by the system, e.g. from a sequence of numbers, is the only way
to provide a unique value. Typical examples would be ORDER_ID and INVOICE_ID. The value '12345'
has no meaning to the user as it conveys nothing about the entity to which it relates.
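A system-generated key can be sketched in a few lines. The example below is only an illustration: `itertools.count` stands in for a database sequence, and the names `add_person`, `persons` and the sample person names are hypothetical.

```python
from itertools import count

# Stand-in for a database sequence; starting at 1 is an arbitrary choice.
_person_id_seq = count(start=1)

persons = {}  # surrogate key -> row


def add_person(name):
    """Store a person under a system-generated key and return that key."""
    person_id = next(_person_id_seq)
    persons[person_id] = {"person_name": name}
    return person_id


first = add_person("A. Sharma")
second = add_person("A. Sharma")             # same name, still a distinct row
persons[first]["person_name"] = "A. Gautam"  # the name changes, the key does not

print(first, second, persons)
```

Both situations from the PERSON_NAME example are covered: duplicate names do not collide, and a change of name leaves the identifier untouched.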
Keys…
Relation R1        Relation R2
A  B  C            B  D  E
1  5  3            4  7  4
2  4  5            6  2  3
8  3  5            5  7  8
9  3  3            7  2  3
1  6  5            3  2  2
5  4  3
2  7  5

The data values for attribute B in this context will be identical in R1 and R2.
The instances of R1 and R2 are projections of the instances of R(A,B,C,D,E) onto the attributes (A,B,C) and (B,D,E) respectively.
A projection will not eliminate data values: duplicate rows are removed, but this will not remove a data value from any attribute.
The join of relations R1 and R2 is possible because B is a common attribute.

Relation R1 x R2
A  B  C  D  E
1  5  3  7  8
2  4  5  7  4
8  3  5  2  2
9  3  3  2  2
1  6  5  2  3
5  4  3  7  4
2  7  5  2  3
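The behaviour of projection described here can be sketched with sets of tuples. The example is a minimal illustration that uses four rows of R(A,B,C,D,E) taken from the joined relation; the helper name `project` is an assumption made for the sketch.

```python
ATTRS = ("A", "B", "C", "D", "E")

# Four rows of R(A,B,C,D,E), taken from the joined relation in the example.
R = {
    (2, 4, 5, 7, 4),
    (5, 4, 3, 7, 4),
    (1, 5, 3, 7, 8),
    (8, 3, 5, 2, 2),
}


def project(rows, attrs, onto):
    """Project a set of tuples onto the named attributes.

    Building a set removes duplicate rows automatically.
    """
    positions = [attrs.index(name) for name in onto]
    return {tuple(row[i] for i in positions) for row in rows}


r1 = project(R, ATTRS, ("A", "B", "C"))
r2 = project(R, ATTRS, ("B", "D", "E"))

print(sorted(r1))
print(sorted(r2))  # (4, 7, 4) appears once although two rows of R produce it

# Every value of B that occurs in R still occurs in both projections.
b_values = project(R, ATTRS, ("B",))
assert b_values == project(r1, ("A", "B", "C"), ("B",))
assert b_values == project(r2, ("B", "D", "E"), ("B",))
```

The projection onto (B,D,E) has three rows rather than four, yet no value of B, D or E has disappeared.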
Relation
• The row (2 4 5 7 4) was formed by joining the row (2 4 5) from
relation R1 to the row (4 7 4) from relation R2.
• The two rows were joined since each contained the same value for
the common attribute B.
• The row (2 4 5) was not joined to the row (6 2 3) since the values of
the common attribute (4 and 6) are not the same.
• The relations joined in the preceding example shared exactly one
common attribute.
• However, relations may share multiple common attributes. All of
these common attributes must be used in creating a join.
• For example, the instances of relations R1 and R2 shown below
are joined using the common attributes B and C.
Before join R1 and R2

Relation R1        Relation R2
A  B  C            B  C  D
6  1  4            1  4  9
8  1  4            1  4  2
5  1  2            1  2  1
2  7  1            7  1  2
                   7  1  3

After join R1 and R2

Relation R1 x R2
A  B  C  D
6  1  4  9
6  1  4  2
8  1  4  9
8  1  4  2
5  1  2  1
2  7  1  2
2  7  1  3

The row (6 1 4 9) was formed by joining the row (6 1 4) from R1 to the row (1 4 9) from R2. The join was created since the common set of attributes (B and C) contained identical values (1 and 4).
The row (6 1 4) from R1 was not joined to the row (1 2 1) from R2 since the common attributes did not share identical values - (1 4) in R1 and (1 2) in R2.
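The join rule used in both examples (match on every common attribute) can be sketched as a small function. This is an illustration under stated assumptions, not a database implementation: the attribute labels A and D are assumed, since the text names only B and C, and the row (1 4 9) of R2 is inferred from the joined row (6 1 4 9) mentioned above.

```python
def natural_join(attrs1, rows1, attrs2, rows2):
    """Join two relations on every attribute name they share."""
    common = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    out_attrs = tuple(attrs1) + tuple(extra)
    result = set()
    for r1 in rows1:
        d1 = dict(zip(attrs1, r1))
        for r2 in rows2:
            d2 = dict(zip(attrs2, r2))
            # Rows are joined only if ALL common attributes agree.
            if all(d1[a] == d2[a] for a in common):
                result.add(r1 + tuple(d2[a] for a in extra))
    return out_attrs, result


# Two common attributes, B and C (labels A and D are assumed).
R1 = {(6, 1, 4), (8, 1, 4), (5, 1, 2), (2, 7, 1)}
R2 = {(1, 4, 9), (1, 4, 2), (1, 2, 1), (7, 1, 2), (7, 1, 3)}
attrs, joined = natural_join(("A", "B", "C"), R1, ("B", "C", "D"), R2)
print(attrs)
print(sorted(joined))

# One common attribute, B: (2 4 5) joins (4 7 4) but not (6 2 3).
_, single = natural_join(
    ("A", "B", "C"), {(2, 4, 5)},
    ("B", "D", "E"), {(4, 7, 4), (6, 2, 3)},
)
print(single)
```

The nested loop compares every pair of rows, which is fine for slide-sized examples; a real system would use an index or a hash on the common attributes.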
The join operation provides a method for reconstructing a relation that was decomposed into two
relations during the normalisation process.
The join of two rows, however, can create a new row that was not a member of the original relation.
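A short sketch shows how a join can produce rows that were never in the original relation. The relation (student, course, grade) and its decomposition are hypothetical: the first row echoes a sample row from these notes, and the second row is invented for the illustration.

```python
# Hypothetical instance of a relation (student, course, grade).
# The second row is invented for this illustration.
original = {
    ("Green", "Algebra", "A"),
    ("Brown", "Algebra", "B"),
}

# Decompose into (student, course) and (course, grade).
student_course = {(s, c) for s, c, g in original}
course_grade = {(c, g) for s, c, g in original}

# Join the two projections on the common attribute, course.
rejoined = {
    (s, c1, g)
    for s, c1 in student_course
    for c2, g in course_grade
    if c1 == c2
}

# Rows created by the join that were not members of the original relation.
spurious = rejoined - original
print(sorted(spurious))
```

Every original row is still present after the join. What has been lost is the knowledge of which grade belongs to which student, because the extra rows cannot be told apart from the genuine ones.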
Green Algebra A
Now suppose that a list of courses with their corresponding room
numbers is required.
R1 x R4
Algebra 400
The correct result is obtained since the sequence (R1 x R3) x R4 satisfies the lossless (gainless?) join
property.
• A relational database is in 4th normal form when the
lossless join property can be used to answer unanticipated
queries.
• However, the choice of joins must be evaluated carefully.
• Many different sequences of joins will recreate an instance
of a relation.
• Some sequences are more desirable since they result in the
creation of less invalid data during the join operation.
• Suppose that a relation is decomposed using functional
dependencies and multi-valued dependencies.
• Then at least one sequence of joins on the resulting
relations exists that recreates the original instance with no
invalid data created during any of the join operations.
For example, suppose that a list of grades by room number is desired.
R1 x R3
The required information is contained in relations R2 and R4, but these relations
cannot be joined directly.