Lecture1.1 Database Concepts
Lecture1.1 Database Concepts
CPSC 4270/5270
CPSC 4670 4
Textbooks
Wont follow any book, but you may want to
consult them if you need more details to
understand a topic
Sam Afyouni, Database Security and Auditing:
Protecting Data Integrity and Accessibility.
Thomson. ISBN: 0-619-21559-3, 2005.
The Complete Book, GarciaMolina, Ullman,
Widom
Ron Ben-Natan, Implementing Database
Security and Auditing, Elsevier digital press.
ISBN: 1-55558-334-2. 2005.
Database Basics
CPSC 4270/5270
Database
What is a database ?
9
Users, Database
A user of a database A database is a self-describing
system will collection of related records
Self-describing
Use a database
application to track The database itself
contains the definition of its
things structure
Use forms to enter, Metadata is data describing
read, delete and the structure of the
query data database data
Produce reports Tables within a relational
database are related to each
other
10
Database Management System (DBMS)
A big C program written by someone else that
allows us to manage efficiently a large
database and allows it to persist over long
periods of time
A database management system (DBMS)
serves as an intermediary between database
applications and the database
The DBMS manages and controls database
activities
The DBMS creates, processes and administers
the databases it controls
11
Database Applications
A database application is a set of one or more
computer programs that serves as an
intermediary between the user and the DBMS
A database application
Create and process forms
Process user queries
Create and process reports
Execute application logic
Control database applications
12
Desktop Database Systems
Desktop database systems typically:
Have one application
Have only a few tables
Are simple in design
Involve only one computer
Support one user at a time
13
Organizational Database Systems
Organizational database
systems typically:
Support several users
simultaneously
Include more than one
application
Involve multiple computers
Are complex in design
Have many tables
Have many databases
14
DBMS Products
Example Desktop DBMS Products
Microsoft Access
Example Organizational DBMS Products
MySQL
Oracles Oracle
Microsofts SQL Server
IBMs DB2
InterSystems Cach
http://www.intersystems.com/cache/
15
Functionality
1. Create/store large datasets
2. Search/query/update
3. Change the structure
4. Concurrent access to many user
5. Recover from crashes
6. Security (not here, but in other apps)
Possible Organizations
Files
Spreadsheets
DBMS
1. Create/store Large Datasets
Yes, but
Files
Yes
DBMS
2. Search/Query/Update
Simple query:
In what year was Rain man produced ?
Multi-table query:
Find all movies by Coppola
Complex query:
For each actor, count her/his movies
Updating
Insert a new movie; add an actor to a
movie; etc
2. Search/Query/Update
Simple queries
Files
Multi-table queries
Spreadsheets (maybe)
All
DBMS
Updates: generally OK
3. Change the Structure
Add Address to each Actor
Very hard
Files
Yes
Spreadsheets
Yes
DBMS
4. Concurrent Access
Multiple users access/update the data concurrently
connection
(ODBC, JDBC)
Database server
(someone elses
C program)
Data files Applications
DBMS = Collection of Tables
Directors: Movie_Directors:
id fName lName id mid
15901 Francis Ford Coppola 15901 130128
... ...
Movies:
mid Title Year
130128 The Godfather 1972
...
UPDATE Accounts
SET amount = amount + 100
WHERE number = 7199
COMMIT
Transactions
Transactions have the ACID properties:
A = atomicity
C = consistency
I = isolation
D = durability
4. Concurrent Access
Serializable execution of transactions
The I (=isolation) in ACID
36
Entity
An entity is something of importance to
a user that needs to be represented in
a database
An entity represents one theme or topic
37
Relation
A relation is a two-dimensional table
that has specific characteristics
The table dimensions, like a matrix,
consist of rows and columns
38
Characteristics of a Relation
Rows contain data about an entity
Columns contain data about attributes of the
entity
Cells of the table hold a single value
All entries in a column are of the same kind
Each column has a unique name
The order of the columns is unimportant
The order of the rows is unimportant
No two rows may be identical
39
A Sample Relation
40
A Nonrelation Example
Cells of the table hold multiple values
43
A Key and Uniqueness of Keys
A key is one (or more) columns of a relation
that is (are) used to identify a row
44
A Composite Key
A composite key is a key that contains two or
more attributes
For a key to be unique, often it must become a
composite key
For example,
To identify a family member, you need to know a
FamilyID, a FirstName, and a Suffix (e.g., Jr.)
The composite key is:
(FamilyID, FirstName, Suffix)
One needs to know the value of all three columns to
uniquely identify an individual
45
A Candidate and Primary Key
A candidate key is called candidate because it is
a candidate to become the primary key
A candidate key is a unique key
A primary key is a candidate key chosen to be the
main key for the relation
If you know the value of the primary key, you will
be able to uniquely identify a single row
46
Relationships Between Tables
A table may be related to other tables
For example
An Employee works in a Department
A Manager controls a Project
To preserve relationships, you may need to
create a foreign key
A foreign key is a primary key from one table
placed into another table
The key is called a foreign key in the table that
received the key
47
Foreign Key Example
Project Manager
ProjName MgrName
MgrID
Foreign Key
48
Foreign Key Example
Department Employee
DeptName DeptID
Foreign Key
Location EmpName
49
Referential Integrity
Referential integrity states that every
value of a foreign key must match a
value of an existing primary key
For example (see previous slide)
If EmpID = 4 in EMPLOYEE has a DeptID
= 7 (a foreign key), a Department with
DeptID = 7 must exist in DEPARTMENT
50
Referential Integrity
Another perspective
The value of the Foreign Key EmployeeID
in EQUIPMENT
must exist in
51
A Surrogate Key
A Surrogate Key is a unique, numeric
value that is added to a relation to
serve as the Primary Key
Surrogate Key values have no meaning
to users and are usually hidden in
forms, queries and reports
A Surrogate Key is often used in place
of a composite primary key
52
Surrogate Key Example
If the Family Member Primary Key is FamilyID, FirstName,
Suffix, it would be easier to append and use a surrogate
key of FamMemberID
FamilyID, FirstName and Suffix remain in the relation
Referential Integrity:
Use
(FamMemberID) in School must exist in
(FamMemberID) in FamilyMember
Instead of:
(FamilyID, FirstName, Suffix) in School must exist in
(FamilyID, FirstName, Suffix) in FamilyMember
53
Functional Dependency
A relationship between attributes in
which one attribute (or group of
attributes) determines the value of
another attribute in the same table
Illustration
The price of one cookie can determine the
price of a box of 12 cookies
(CookiePrice, Qty) BoxPrice
54
Determinants
The attribute (or attributes) that we use
as the starting point (the variable on
the left side of the equation) is called a
determinant
(CookiePrice, Qty) BoxPrice
Determinant
55
Candidate/Primary Keys and
Functional Dependency
By definition
A candidate key of a relation will functionally
determine all other attributes in the row
Likewise, by definition
A primary key of a relation will functionally
determine all other attributes in the row
56
Primary Key and Functional
Dependency Example
(EmpLastName,
(EmployeeID)
EmpPhone)
(ProjectID) (ProjectName,
StartDate)
57
Normalization
Normalization is a process of analyzing
a relation to ensure that it is well-
formed
More specifically, if a relation is
normalized (well-formed), rows can be
inserted, deleted, or modified without
creating update anomalies
58
Normalization Principles
Relational design principles for
normalized relations:
To be a well-formed relation, every
determinant must be a candidate key
Any relation that is not well formed should
be broken into two or more well-formed
relations.
59
Normalization Example
(StudentID) (StudentName,
DormName, DormCost)
However, if
(DormName) (DormCost)
Then DormCost should be placed into its own relation,
resulting in the relations:
(StudentID) (StudentName,
DormName)
(DormName) (DormCost)
60
Normalization Example
(AttorneyID, (ClientName,
ClientID) MeetingDate, Duration)
However, if
(ClientID) (ClientName)
Then ClientName should be placed into its own relation,
resulting in the relations:
(AttorneyID, (MeetingDate, Duration)
ClientID)
(ClientID) (ClientName)
61