Fundamentals of database system Module (1)
FACULTY OF TECHNOLOGY
Prepared by:
Nibretu Kebede
July 2022
DTU
UNIT ONE
Introduction to Database Systems
Unit description
This unit deals with database systems and file systems, the characteristics of the database approach, and the
application areas of databases. These contents are addressed mainly through brainstorming, peer and group
discussion, and gap lectures. Question and answer and group work are among the other methods used.
Objectives: At the end of this unit, students will be able to:
Define database system and file system L1 (K)
Compare database system and file system L5 (K)
Differentiate the characteristics of the database approach L3 (A)
Contents:
Definition of database system and file system
Characteristics of Database Approach
Actors on the Scene
Application of database
Method of Teaching: brainstorming, gap lecture, group discussion
Brainstorming: what is a database?
Database can be defined as:
➢ A shared collection of logically related data, designed to meet the information needs of multiple users in
an organization
➢ It usually refers to data organized and stored on a computer that can be searched and retrieved by a
computer program. This computer program is called Database management system (DBMS)
➢ A collection of information organized and presented to serve a specific purpose. (A telephone book is a
common database.) A computerized database is an updated, organized file of machine-readable
information that is rapidly searched and retrieved by computer.
➢ An organized collection of information in computerized format.
➢ A collection of related information about a subject, organized in a useful manner
that provides a base or foundation for procedures such as retrieving information, drawing conclusions,
and making decisions.
➢ A computerized representation of any organization's flow of information and storage of data.
➢ Duplication or redundancy of data (money and time cost and loss of data integrity)
➢ Data dependency on the application
➢ Update anomalies
✓ Modification anomalies: a problem experienced when one or more data values are modified in one
application program but not in others containing the same data set.
✓ Deletion anomalies: a problem encountered when a record set is deleted from one application but
remains untouched in other application programs.
✓ Insertion anomalies: a problem experienced whenever a new data item is to be recorded and the
recording is not made in all the applications. Also, when the same data item is inserted in different applications,
encoding errors could cause the new data item to be treated as a totally different
object.
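The anomalies above can be made concrete with a small sketch. It assumes two hypothetical application programs (billing and shipping) that each keep their own copy of the same customer data; the names and values are made up for illustration.

```python
# Two hypothetical application programs, each keeping its own copy of the
# same customer data (file names and fields are illustrative assumptions).
billing_file = {"C001": {"name": "Abebe", "phone": "0911-000000"}}
shipping_file = {"C001": {"name": "Abebe", "phone": "0911-000000"}}


def inconsistent_keys(file_a, file_b):
    """Return the keys whose records differ or exist in only one file."""
    all_keys = set(file_a) | set(file_b)
    return sorted(k for k in all_keys if file_a.get(k) != file_b.get(k))


# Modification anomaly: the phone number is changed in one program only.
billing_file["C001"]["phone"] = "0922-111111"
after_modification = inconsistent_keys(billing_file, shipping_file)

# Insertion anomaly: a new customer is recorded in one program only.
shipping_file["C002"] = {"name": "Sara", "phone": "0933-222222"}
after_insertion = inconsistent_keys(billing_file, shipping_file)
```

Each copy now tells a different story about the same customers, which is the loss of integrity that duplicated files invite.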
3. Database Approach
➢ Database is just a computerized record keeping system or a kind of electronic filing cabinet.
➢ Database is a repository for collection of computerized data files.
➢ Database is a shared collection of logically related data and description of data designed to meet the
information needs of an organization.
➢ Database is a collection of logically related data where these logically related data comprises entities,
attributes, relationships, and business rules of an organization's information.
➢ Since a database contains information about the data (metadata), it is called a self-descriptive collection
of integrated records.
➢ The purpose of a database is to store information and to allow users to retrieve and update that information
on demand.
➢ A database is designed once and used simultaneously by many users.
➢ A database keeps the data in a central location.
Compare database system and file system
The early database systems, which appeared in the late 1960s, evolved from file systems. However, file
systems do not generally guarantee that data cannot be lost if it is not backed up, and they do not support efficient
access to data items whose location in a particular file is not known. Further, file systems do not directly
support item. Their support for a schema for the data is limited to the creation of directory structures for files.
Finally, file systems do not satisfy.
3. Casual Users
➢ Users who access the database occasionally.
➢ Need different information from the database each time.
➢ Use sophisticated database queries to satisfy their needs.
➢ Casual users are most often middle- to high-level managers.
Application of Database
The application and format of a database vary from organization to organization and from person to person.
Databases can be applied in different areas, some of which are:
➢ Banks
➢ Airlines
➢ Insurance
➢ Metrology
➢ Geographical analysis, etc.
Presentation:
Discuss the database approach in detail (file-oriented vs. database-oriented approach).
Read about the database approach and present it in class.
Database Management System (DBMS)
What is DBMS?
➢ A Database Management System (DBMS) is a software package that provides EFFICIENT,
CONVENIENT, and SAFE MULTI-USER storage of and access to MASSIVE amounts of PERSISTENT
data (data that outlives the programs that operate on it). A DBMS also provides a systematic method for creating,
updating, storing, and retrieving data in a database, and it provides services for controlling data access,
enforcing data integrity, managing concurrency, and recovery. With this in mind, a full-scale
DBMS should provide at least the following services to the user.
➢ A database management system (DBMS) is a collection of programs that enables users to create and
maintain a database. The DBMS is hence a general-purpose software system that facilitates the processes
of defining, constructing, and manipulating databases for various applications. Defining a database
involves specifying the data types, structures, and constraints for the data to be stored in the database.
Constructing the database is the process of storing the data itself on some storage medium that is
controlled by the DBMS. Manipulating a database includes such functions as querying the database to
retrieve specific data, updating the database to reflect changes in the mini-world, and generating reports
from the data.
➢ Less storage area: Theoretically, all occurrences of data items need be stored only once, thereby
eliminating the storage of redundant data. System developers and database designers often use data
normalization to minimize data redundancy.
➢ Data duplication is reduced: Because data is integrated rather than held separately in different locations, the
chance of data duplication is much reduced and the data is kept in an up-to-date form.
➢ Data is easy to understand: Data is managed according to the needs of the user and presented in an easy
format, so users have little difficulty working with it through the database management system.
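The three DBMS activities described earlier in this section (defining, constructing, and manipulating a database) can be sketched with Python's built-in `sqlite3` module. The `student` table, its columns, and the sample rows are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Defining: specify data types, structure, and constraints.
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " year INTEGER CHECK (year BETWEEN 1 AND 5))"
)

# Constructing: store the data itself under the control of the DBMS.
conn.executemany(
    "INSERT INTO student (id, name, year) VALUES (?, ?, ?)",
    [(1, "Hana", 1), (2, "Dawit", 2)],
)

# Manipulating: update to reflect a change in the mini-world, then query.
conn.execute("UPDATE student SET year = 2 WHERE id = 1")
second_years = conn.execute(
    "SELECT name FROM student WHERE year = 2 ORDER BY id"
).fetchall()
```

Because the `CHECK` constraint is part of the definition, the DBMS itself rejects a row with an out-of-range year; the application does not have to repeat that rule.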
Components of Database Management system:
➢ Data: is the unprocessed fact.
➢ DBMS: a collection of software (tools) used to manage the database and its users.
➢ Hardware: consists of the secondary storage disks on which the database resides.
➢ People: the people in the organization who are responsible for, or play a role in,
designing, implementing, managing, administering, and using the resources in the database.
➢ Procedure: the rules and regulations on how to use and design the database.
Questions
1. Which one of the following is true about a database?
A. A collection of related records in folders and subfolders.
B. Organized collection of information in computerized format.
C. Disorganized collection of information in computerized format.
D. Organized collection of information in bookshelf format.
2. What are the approaches of database?
3. What is data?
UNIT TWO
Database System Concepts and Architecture
Unit description:
In this unit, Data Models, Schema and Instances, DBMS Architecture and Data Independence, Database
Language and Interface, the Database System Environment, and Classification of DBMS are the contents to be
covered. To deliver these contents, brainstorming, interactive lecture, peer teaching, group discussion,
presentation, and class work methods are used.
Objectives: At the end of this unit students will be able to:
Describe Data Models, Schema and Instances, Database Language and Interface
Describe the Database System Environment
Describe DBMS
Contents:
Data Models, Schema and Instances
DBMS Architecture and Data Independence
Database Language and Interface
The Database System Environment
Classification of DBMS
Method of Teaching: brainstorming, gap lecture, group discussion
Brainstorming: what do the terms data model, schema, and instance mean?
Gap lecture:
Database Model, Database schema and Database instance:
A database model describes, in an abstract way, how data is represented in an information system or a database
management system.
As you can see, the second definition comes after the first one, since it covers the physical implementation of
the data – we could say that a database model is the physical model of a conceptual data model.
There are four main types of database model:
➢ The hierarchical database model.
➢ The network database model.
➢ The relational database model.
➢ The object-oriented database model.
Assignment
Write the difference between the above four types of database models.
Database schema
Database schema: the overall description of the database, including the constraints
that should hold on the database.
The three levels of schema, according to their level of abstraction, are:
➢ External schema: at the external level to describe the various user views. Usually uses the same data
model as the conceptual level.
➢ Conceptual schema: at the conceptual level to describe the structure and constraints for the whole
database for a community of users. Uses a conceptual or an implementation data model.
➢ Internal schema: at the internal level to describe physical storage structures and access paths. Typically
uses a physical data model
Database Instances
Instance: the collection of data in the database at a particular point in time (snapshot).
➢ Also called the state, snapshot, or extension of the database.
➢ Refers to the actual data in the database at a specific point in time.
➢ The state of the database changes any time we add, delete, or update an item.
➢ Since an instance is the actual data in the database at some point in time, it changes rapidly.
Group discussion:
Discuss the main differences between the database schemas.
DBMS Architecture and Data Independence
Three important characteristics of the database approach are:
(1) Insulation of programs and data (program-data and program-operation independence)
(2) Support of multiple user views and
(3) Use of a catalog to store the database description (schema).
In this section we specify an architecture for database systems, called the three-schema architecture, which was
proposed to help achieve and visualize these characteristics.
The Three-Schema Architecture
The goal of the three-schema architecture is to separate the user applications and the physical database. In this
architecture, schemas can be defined at the following three levels:
1. The internal level has an internal schema, which describes the physical storage structure of the database.
The internal schema uses a physical data model and describes the complete details of data storage and access
paths for the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole database for a
community of users. The conceptual schema hides the details of physical storage structures and concentrates
on describing entities, data types, relationships, user operations, and constraints. A high-level data model or
an implementation data model can be used at this level.
3. The external or view level includes a number of external schemas or user views. Each external schema
describes the part of the database that a particular user group is interested in and hides the rest of the database
from that user group. A high-level data model or an implementation data model can be used at this level.
Data Independence
The three-schema architecture can be used to explain the concept of data independence, which can be defined
as the capacity to change the schema at one level of a database system without having to change the schema
at the next higher level. We can define two types of data independence:
1. Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs. We may change the conceptual schema to expand the database (by
adding a record type or data item), or to reduce the database (by removing a record type or data item). In the
latter case, external schemas that refer only to the remaining data should not be affected.
2. Physical data independence is the capacity to change the internal schema without having to change the
conceptual (or external) schemas. Changes to the internal schema may be needed because some physical files
had to be reorganized—for example, by creating additional access structures—to improve the performance of
retrieval or update. If the same data as before remains in the database, we should not have to change the
conceptual schema.
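Both kinds of data independence can be observed in a small sketch using Python's built-in `sqlite3` module. Mapping a base table to the conceptual level, a view to the external level, and an index to the internal level is a loose analogy assumed here for illustration; the `employee` table and the `employee_directory` view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the base table describes the whole database.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Hana", 9000.0), (2, "Dawit", 7500.0)],
)

# External level: a view exposes only what one user group needs.
conn.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employee")


def directory():
    """The query an application would run against its external schema."""
    return conn.execute(
        "SELECT id, name FROM employee_directory ORDER BY id"
    ).fetchall()


before = directory()

# Physical data independence: add an access structure (internal level).
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")
after_index = directory()

# Logical data independence: expand the conceptual schema with a data item.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
after_new_column = directory()
```

The application query inside `directory()` is never edited, yet it keeps returning the same rows after both changes.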
Database Languages
This section explains how data gets into a database system and how information gets to the
users. More precisely, the following questions will be answered:
A. How does an application interact with a database management system?
B. How does a user look at a database system?
C. How can a user query a database system and view the results in his/her application?
Data Definition Language (DDL)
To describe data and data structures, a suitable description tool, a data definition language (DDL), is needed.
With its help a data schema can be defined and also changed later.
Typical DDL operations (with their respective keywords in the structured query language SQL):
➢ Creation of tables and definition of attributes (CREATE TABLE ...)
➢ Change of tables by adding or deleting attributes (ALTER TABLE …)
➢ Deletion of whole table including content (!) (DROP TABLE …) etc
account authorization, changing a schema, and reorganizing the storage structures of a database.
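The typical DDL operations listed above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch; the `course` table and its columns are assumed for illustration, and other DBMS products may differ in the details of `ALTER TABLE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def column_names(table):
    # PRAGMA table_info returns one row per column; index 1 is the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# CREATE TABLE: create a table and define its attributes.
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")
created = column_names("course")

# ALTER TABLE: change the table by adding an attribute.
conn.execute("ALTER TABLE course ADD COLUMN credit_hours INTEGER")
altered = column_names("course")

# DROP TABLE: delete the whole table, including its content.
conn.execute("DROP TABLE course")
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'course'"
).fetchall()
```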
The Database System Environment
Reading Assignment
Read and prepare short notes about Database System Environment and Classification of Database
Management Systems.
Questions
1. What is a collection of data in the database at a particular time called?
a. Database model
b. Database instance
c. Database schema
d. Database architecture
2. What is database schema?
3. Who is the person responsible for identifying the appropriate structure of the database?
4. List types of DBMS Interfaces and discuss each of them.
UNIT THREE
Database Modelling
Unit description
In this unit the E/R model, design principles, network and hierarchical models, data modeling using entity
relationships, database design using high-level data models, entity types and sets, attributes and keys,
database abstraction, and relationships will be discussed. To deliver these contents, brainstorming, gap
lecture, group discussion, and question and answer methods will be used. Assessment will take
place in the form of question and answer, group work, individual assignment, lab assignment, and test.
Objectives: At the end of this unit, students will be able to:
Define database Modelling
Define Entity, Attributes, Keys, Relationships (components of ERD)
Differentiate the types of entities
Differentiate E/R diagram naming conventions and design issues
Construct ERD
Contents:
Introduction to Database Modelling
E/R Model
Entities, attributes, keys, and relationships (components of ERD)
Types of entities
E/R diagram naming conventions and design issues
Constructing an ERD
Method of Teaching: brainstorming, gap lecture, group discussion
Introduction:
Brainstorming: what is database modelling?
Mini lecture:
Data Model: a set of concepts to describe the structure of a database, and certain constraints that the database
should obey.
It is a description of the way that data is stored in a database. Data model helps to understand the relationship
between entities and to create the most effective structure to hold data.
It is a collection of tools or concepts for describing
➢ Data
➢ Data relationships
➢ Data semantics
➢ Data constraints
The main purpose of Data Model is to represent the data in an understandable way.
Categories of data models:
1. Hierarchical Model
➢ The simplest data model
➢ Record type is referred to as node or segment
➢ The top node is the root node
➢ Nodes are arranged in a hierarchical structure as a sort of upside-down tree
➢ A parent node can have more than one child node
➢ A child node can only have one parent node
➢ The relationship between parent and child is one-to-many
➢ Relation is established by creating physical link between stored records (each is stored with a predefined
access path to other records)
➢ To add new record type or relationship, the database must be redefined and then stored in a new form.
2. Network Model
➢ Allows record types to have more than one parent unlike hierarchical model
➢ A network data model sees records as set members
➢ Each set has an owner and one or more members
➢ Does not directly allow many-to-many relationships between entities
➢ Like hierarchical model network model is a collection of physically linked records.
➢ Allow member records to have more than one owner
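The structural difference between the two models can be sketched with plain Python dictionaries. The record types (`Department`, `Employee`, and so on) and set names are assumptions chosen for illustration; real hierarchical and network systems store physical links between records rather than dictionaries.

```python
# Hierarchical model: each child record type has exactly one parent.
hierarchy_parent = {
    "Department": None,          # root node
    "Employee": "Department",
    "Project": "Department",
    "Dependent": "Employee",
}


def path_to_root(node):
    """Follow the single predefined access path from a node up to the root."""
    path = [node]
    while hierarchy_parent[path[-1]] is not None:
        path.append(hierarchy_parent[path[-1]])
    return path


# Network model: records are members of named sets, each set has one owner,
# and a member record type may take part in several sets (several owners).
network_sets = {
    "WORKS_IN": {"owner": "Department", "member": "Employee"},
    "ASSIGNED_TO": {"owner": "Project", "member": "Employee"},
}


def owners_of(member):
    """List every owner record type a member record type is linked to."""
    return sorted(
        s["owner"] for s in network_sets.values() if s["member"] == member
    )
```

In the hierarchy every node has one way up, while in the network `Employee` is reachable from both `Department` and `Project`.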
Reading assignment
Database Design Using High level Data Models.
Components of Entity-Relationship (ER) Model
1. Entities
2. Attributes
3. Relationships
4. Relational constraints
1. Entities
The basic object that the ER model represents is an entity, which is a "thing" in the real world with an
independent existence.
➢ Entity Types and Entity sets
An entity type defines a collection (or set) of entities that have the same attributes. A few individual entities
of each type are also illustrated, along with the values of their attributes. The collection of all entities of a
particular entity type in the database at any point in time is called an entity set; the entity set is usually referred
to using the same name as the entity type
2. Attribute
Each entity has attributes—the particular properties that describe it. For example, an employee entity may
be described by the employee’s name, age, address, salary, and job.
Types of Attributes
Several types of attributes occur in the ER model: simple versus composite; single-valued versus multi-
valued; and stored versus derived. We first define these attribute types and illustrate their use via examples.
We then introduce the concept of a null value for an attribute.
➢ Composite versus Simple (Atomic) Attributes
Composite attributes can be divided into smaller subparts, which represent more basic attributes with
independent meanings. For example, the Address attribute of the employee entity can be sub-divided into
City, Region, and Zip. Attributes that are not divisible are called simple or atomic attributes. The value of a
composite attribute is the concatenation of the values of its constituent simple attributes.
Composite attributes are useful to model situations in which a user sometimes refers to the composite attribute
as a unit but at other times refers specifically to its components. If the composite attribute is referenced only
as a whole, there is no need to subdivide it into component attributes. For example, if there is no need to refer
to the individual components of an address (Zip, Street, and so on), then the whole address is designated as a
simple attribute.
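The composite versus simple distinction can be sketched with Python dataclasses. The class and field names follow the Address example above; the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Composite attribute made of three simple (atomic) attributes."""
    city: str
    region: str
    zip_code: str

    def as_whole(self) -> str:
        # The composite value is the concatenation of its component values.
        return f"{self.city}, {self.region}, {self.zip_code}"


@dataclass
class Employee:
    name: str          # simple attribute
    address: Address   # composite attribute


emp = Employee("Hana", Address("Debre Tabor", "Amhara", "1000"))
whole = emp.address.as_whole()   # referenced as a unit
city_only = emp.address.city     # referenced by component
```

If the application never needed `city_only`, storing the address as one plain string (a simple attribute) would be enough.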
KEYS: A key is an attribute or set of attributes in a relation that uniquely identifies each tuple
in the relation.
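The uniqueness requirement in this definition can be checked on sample data with a short sketch. The `is_key` helper and the `students` tuples are hypothetical. Note that a key is a property of the relation in all its valid states, so a check on one instance can only rule a candidate out; it cannot prove that an attribute set is a key.

```python
def is_key(tuples, attributes):
    """True if the attribute set uniquely identifies every given tuple."""
    seen = set()
    for t in tuples:
        value = tuple(t[a] for a in attributes)
        if value in seen:
            return False  # two tuples share the same value combination
        seen.add(value)
    return True


students = [
    {"id": 1, "name": "Hana", "dept": "CS"},
    {"id": 2, "name": "Dawit", "dept": "CS"},
    {"id": 3, "name": "Hana", "dept": "IT"},
]

id_is_key = is_key(students, ["id"])
name_is_key = is_key(students, ["name"])
name_dept_is_key = is_key(students, ["name", "dept"])
```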
Types of keys
➢ Super keys
➢ Candidate keys
➢ Primary key
➢ Composite primary key
➢ Alternate key
➢ Foreign key
Reading assignment
Read and prepare a short note about each type of key listed above.
3. Relationships
The relationship between entities which exist must be taken into account when processing information. In
any business processing one object may be associated with another object due to some event. Such kind of
association is what we call a RELATIONSHIP between entity objects
➢ One external event or process may affect several related entities.
➢ Related entities require setting of LINKS from one part of the database to another.
➢ A relationship should be named by a word or phrase which explains its function
➢ Role names are different from the names of entities forming the relationship: one entity may take on many
roles, and the same role may be played by different entities
➢ For each RELATIONSHIP, one can talk about the Number of Entities and the Number of Tuples
participating in the association. These two concepts are called DEGREE and CARDINALITY of a
relationship respectively.
Degree of Relationship
➢ An important point about a relationship is how many entities participate in it. The number of entities
participating in a relationship is called the DEGREE of the relationship.
➢ Among the Degrees of relationship, the following are the basic:
• UNARY/RECURSIVE RELATIONSHIP: Tuples/records of a single entity are related to each other.
• BINARY RELATIONSHIP: Tuples/records of two entities are associated in a relationship.
• TERNARY RELATIONSHIP: Tuples/records of three different entities are associated.
• And a generalized one, the N-ARY RELATIONSHIP: Tuples from an arbitrary number of entity sets
participate in a relationship.
Cardinality of Relationship
Another important concept about relationship is the number of instances/ tuples that can be associated with a
single instance from one entity in a single relationship. The number of instances participating or associated
with a single instance from an entity in a relationship is called the CARDINALITY of the relationship. The
major cardinalities of a relationship are:
➢ ONE-TO-ONE: one tuple is associated with only one other tuple.
➢ ONE-TO-MANY: one tuple can be associated with many other tuples, but not the reverse.
➢ MANY-TO-MANY: one tuple is associated with many other tuples, and from the other side, under a
different role name, one tuple is also associated with many tuples.
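The cardinalities above can be read off a set of relationship instances. The sketch below classifies a binary relationship given as (left, right) pairs; the function name and the sample pairs are assumptions, and `MANY-TO-ONE` is simply ONE-TO-MANY seen from the other side. It reports only what one snapshot shows, whereas the intended cardinality is a design decision about all possible states.

```python
from collections import Counter


def cardinality(pairs):
    """Classify a binary relationship given as (left, right) instance pairs."""
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    # Does some left instance relate to many right instances, and vice versa?
    left_many = any(n > 1 for n in left_counts.values())
    right_many = any(n > 1 for n in right_counts.values())
    if left_many and right_many:
        return "MANY-TO-MANY"
    if left_many:
        return "ONE-TO-MANY"
    if right_many:
        return "MANY-TO-ONE"
    return "ONE-TO-ONE"


manages = [("emp1", "dept1"), ("emp2", "dept2")]
employs = [("dept1", "emp1"), ("dept1", "emp2")]
enrolls = [("stu1", "crs1"), ("stu1", "crs2"), ("stu2", "crs1")]
```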
Assessments:
1. Discuss the role of a high-level data model in the database design process.
2. Define the following terms: entity, attribute, attribute value, relationship instance, composite attribute,
multi-valued attribute, derived attribute, complex attribute, key attribute, value set (domain).
3. What is an entity type? What is an entity set? Explain the differences among an entity, an entity type, and
an entity set.
4. Discuss E/R diagram naming conventions and design issues.
UNIT FOUR
Record Storage and Primary File Organization
Unit description
In this unit, Operations on Files, Files of Unordered Records (Heap Files), Files of Ordered Records (Sorted
Files), Hashing Techniques, Index Structure for Files, Single-level Ordered Index, and Multilevel Ordered Index
on B-trees and B+ trees are the contents to be covered. To deliver these contents, brainstorming, presentation,
group discussion, and demonstration methods will be used. Assessment will take place in the form
of question and answer, group work, individual assignment, lab assignment, and test.
Objectives: At the end of this unit, students will be able to:
Define file, record, and file operation
Compare single-level ordered index and multilevel ordered index
Differentiate Files of Unordered Records (Heap Files) and Files of Ordered Records (Sorted Files)
Differentiate Hashing Techniques
Understand Index Structure for Files
Contents:
Operations on Files
Files of Unordered Records (Heap Files)
Files of Ordered Records (Sorted Files)
Hashing Techniques
Index Structure for Files
➢ Single level ordered index and multilevel ordered index
➢ Dynamic Multilevel indexes using B-Trees and B+ Trees
➢ Indexes on Multiple Keys
Method of Teaching: brainstorming, gap lecture, group discussion, group presentation.
Brainstorming: what is a file?
File Organization
A file is organized logically as a sequence of records. These records are mapped onto disk blocks. Files are
provided as a basic construct in operating systems.
Organization of Records in Files
An instance of a relation is a set of records. Given a set of records, the next question is how to organize them
in a file. Several of the possible ways of organizing records in files are:
➢ Heap file organization. Any record can be placed anywhere in the file where there is space for the
record. There is no ordering of records. Typically, there is a single file for each relation.
➢ Sequential file organization. Records are stored in sequential order, according to the value of a “search
key” of each record.
➢ Hashing file organization. A hash function is computed on some attribute of each record. The result of
the hash function specifies in which block of the file the record should be placed. Generally, a separate
file is used to store the records of each relation.
➢ Clustering file organization. Records of several different relations are stored in the same file; further,
related records of the different relations are stored on the same block, so that one I/O operation fetches
related records from all the relations. For example, records of two relations can be considered to be
related if they would match in a join of the two relations.
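The first three organizations differ mainly in where a new record is placed. The sketch below models a file as a Python list and a hashed file as a list of blocks; `BLOCKS`, the helper names, and the modulo hash function are assumptions for illustration. Clustering is left out because it needs more than one relation.

```python
import bisect

BLOCKS = 4  # number of blocks in the hypothetical hashed file


def heap_insert(file_records, record):
    # Heap: place the record wherever there is space (here: the end).
    file_records.append(record)


def sequential_insert(file_records, record, key):
    # Sequential: keep records sorted by the value of the search key.
    keys = [r[key] for r in file_records]
    file_records.insert(bisect.bisect_right(keys, record[key]), record)


def hash_insert(blocks, record, key):
    # Hashing: a hash function on an attribute picks the block.
    blocks[record[key] % BLOCKS].append(record)


records = [{"id": 7}, {"id": 2}, {"id": 9}]
heap, seq, hashed = [], [], [[] for _ in range(BLOCKS)]
for r in records:
    heap_insert(heap, r)
    sequential_insert(seq, r, "id")
    hash_insert(hashed, r, "id")
```

The heap file keeps arrival order, the sequential file ends up sorted by `id`, and the hashed file spreads the records over blocks according to `id % BLOCKS`.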
There are two basic kinds of indices:
➢ Ordered indices. Based on a sorted ordering of the values.
➢ Hash indices. Based on a uniform distribution of values across a range of buckets. The bucket to which a
value is assigned is determined by a function, called a hash function.
We shall consider several techniques for both ordered indexing and hashing. No one technique is the best.
Rather, each technique is best suited to particular database applications. Each technique must be evaluated on
the basis of these factors:
➢ Access types: The types of access that are supported efficiently. Access types can include finding records
with a specified attribute value and finding records whose attribute values fall in a specified range.
➢ Access time: The time it takes to find a particular data item, or set of items, using the technique in question.
➢ Insertion time: The time it takes to insert a new data item. This value includes the time it takes to find
the correct place to insert the new data item, as well as the time it takes to update the index structure.
➢ Deletion time: The time it takes to delete a data item. This value includes the time it takes to find the item
to be deleted, as well as the time it takes to update the index structure.
➢ Space overhead: The additional space occupied by an index structure. Provided that the amount of
additional space is moderate, it is usually useful to sacrifice the space to achieve improved performance.
Ordered Indices
To gain fast random access to records in a file, we can use an index structure. Each index structure is associated
with a particular search key. Just like the index of a book or a library catalog, an ordered index stores the
values of the search keys in sorted order, and associates with each search key the records that contain it.
The records in the indexed file may themselves be stored in some sorted order, just as books in a library are
stored according to some attribute.
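A minimal sketch of an ordered index follows. It assumes an in-memory list stands in for the data file and that record positions stand in for disk addresses; `build_ordered_index` and `lookup` are hypothetical helper names.

```python
import bisect
from collections import defaultdict

# The data file: list position plays the role of a record address.
# The records are not sorted by the search key "dept".
data_file = [
    {"id": 3, "dept": "IT"},
    {"id": 1, "dept": "CS"},
    {"id": 2, "dept": "IT"},
]


def build_ordered_index(records, search_key):
    """Return (sorted key values, key value -> list of record positions)."""
    postings = defaultdict(list)
    for position, record in enumerate(records):
        postings[record[search_key]].append(position)
    return sorted(postings), postings


def lookup(index, value):
    """Find the positions of all records holding the given search-key value."""
    keys, postings = index
    i = bisect.bisect_left(keys, value)  # binary search on the sorted keys
    if i < len(keys) and keys[i] == value:
        return postings[value]
    return []


dept_index = build_ordered_index(data_file, "dept")
it_positions = lookup(dept_index, "IT")
```

Like a book index, the sorted key list is searched first, and only then are the matching records fetched from the file.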
File operations:
1. Open: Prepares the file for reading or writing. Allocates appropriate buffers (typically at least two) to hold
file blocks from disk, and retrieves the file header. Sets the file pointer to the beginning of the file.
2. Reset: Sets the file pointer of an open file to the beginning of the file.
3. Find (or Locate): Searches for the first record that satisfies a search condition. Transfers the block
containing that record into a main memory buffer (if it is not already there). The file pointer points to the
record in the buffer and it becomes the current record. Sometimes, different verbs are used to indicate
whether the located record is to be retrieved or updated.
4. Read (or Get): Copies the current record from the buffer to a program variable in the user program. This
command may also advance the current record pointer to the next record in the file, which may necessitate
reading the next file block from disk.
5. Find Next: Searches for the next record in the file that satisfies the search condition. Transfers the block
containing that record into a main memory buffer (if it is not already there). The record is located in the
buffer and becomes the current record.
6. Delete: Deletes the current record and (eventually) updates the file on disk to reflect the deletion.
7. Modify: Modifies some field values for the current record and (eventually) updates the file on disk to
reflect the modification.
8. Insert: Inserts a new record in the file by locating the block where the record is to be inserted, transferring
that block into a main memory buffer (if it is not already there), writing the record into the buffer, and
(eventually) writing the buffer to disk to reflect the insertion.
9. Close: Completes the file access by releasing the buffers and performing any other needed cleanup
operations.
Multilevel Indices
Indices with two or more levels are called multilevel indices. Searching for records with a multilevel
index requires significantly fewer I/O operations than does searching for records by binary search. Each
level of index could correspond to a unit of physical storage. Thus, we may have indices at the track,
cylinder, and disk levels.
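The saving in I/O operations can be estimated with a rough calculation. The sketch below assumes one block read per binary-search step on a single-level index, and one block read per level of a multilevel index in which every index block points to `fan_out` blocks of the level below; the figures used are illustrative, not measurements.

```python
def binary_search_accesses(index_blocks):
    """Worst-case block reads for binary search over a single-level index."""
    accesses = 0
    remaining = index_blocks
    while remaining > 0:
        accesses += 1
        remaining //= 2  # each read halves the blocks still to be searched
    return accesses


def multilevel_accesses(index_blocks, fan_out):
    """Levels (one block read each) when an index block covers fan_out blocks."""
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2")
    levels = 1
    blocks_at_level = index_blocks
    while blocks_at_level > 1:
        blocks_at_level = -(-blocks_at_level // fan_out)  # ceiling division
        levels += 1
    return levels


single_level = binary_search_accesses(10_000)
multi_level = multilevel_accesses(10_000, fan_out=100)
```

Under these assumptions, an index of 10,000 blocks needs up to 14 block reads with binary search but only 3 with a multilevel index of fan-out 100.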
Presentation:
Read more on this chapter and present in the class
Questions
1. How does heap file organization place records?
2. What are the possible ways of organizing records in files?
UNIT FIVE
Relational algebra operation and Relational calculus
Unit description
In this unit, relational algebra operations and relational calculus are the contents to be covered. To deliver these
contents, brainstorming, gap lecture, group discussion, and demonstration methods will be used. Assessment
will take place in the form of question and answer and group assignment.
Objectives: At the end of this unit students will be able to:
Explain relational algebra operations
List types of relational algebra
Understand relational calculus
Contents:
relational algebra operations
types of relational algebra
Relational calculus
Method of Teaching: brainstorming, gap lecture, group discussion, and presentation
Brainstorming: what is a relation?
The basic set of operations for the relational model is known as the relational algebra. These operations enable
a user to specify basic retrieval requests.
The result of the retrieval is a new relation, which may have been formed from one or more relations. The
algebra operations thus produce new relations, which can be further manipulated using operations of the
same algebra.
A sequence of relational algebra operations forms a relational algebra expression, whose result will also be
a relation that represents the result of a database query (or retrieval request).
✓ Relational algebra is a theoretical language with operations that work on one or more relations to define
another relation without changing the original relation.
✓ The output from one operation can become the input to another operation (nesting is possible)
✓ There are different basic operations that could be applied on relations on a database based on the
requirement.
• Selection
• Join
Operators - Write
❖ INSERT - provides a list of attribute values for a new tuple in a relation. This operator is the same as in SQL.
❖ DELETE - provides a condition on the attributes of a relation to determine which tuple(s) to remove from
the relation. This operator is the same as in SQL.
❖ MODIFY - changes the values of one or more attributes in one or more tuples of a relation, as identified
by a condition operating on the attributes of the relation. This is equivalent to SQL UPDATE.
Operators - Retrieval
There are two groups of operations:
• Operations based on mathematical set theory: UNION, INTERSECTION, DIFFERENCE, and CARTESIAN
PRODUCT.
• Special database operations: SELECT, PROJECT, and JOIN.
The SELECT Operation: The SELECT operation is used to select a subset of the tuples from a relation
that satisfy a selection condition. One can consider the SELECT operation to be a filter that keeps only
those tuples that satisfy a qualifying condition.
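The filter view of SELECT can be sketched in a few lines of Python. This is only an illustration: the relation `employees`, its attribute names and the helper `select` are hypothetical and do not come from the module.

```python
# Hypothetical relation: a list of tuples, each tuple represented as a dict.
employees = [
    {"emp_id": 1, "name": "Ola", "dept": "IT", "salary": 5000},
    {"emp_id": 2, "name": "Tove", "dept": "HR", "salary": 4200},
    {"emp_id": 3, "name": "Kari", "dept": "IT", "salary": 3500},
]


def select(relation, condition):
    """Relational SELECT: keep only the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]


# Selection condition: dept = 'IT' AND salary > 4000
it_staff = select(employees, lambda t: t["dept"] == "IT" and t["salary"] > 4000)
print(it_staff)

# The original relation is not changed by the operation.
print(len(employees))
```

The result is itself a relation with the same attributes as the input, so it can be passed to another operation.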
Set Operations
Consider two relations R and S.
UNION of R and S: The union of two relations is a relation that includes all the tuples that are either in
R or in S or in both R and S. Duplicate tuples are eliminated.
INTERSECTION of R and S: The intersection of R and S is a relation that includes all tuples that are
both in R and S.
DIFFERENCE of R and S: The difference of R and S is the relation that contains all the tuples that are
in R but that are not in S.
SET Operations - requirements
For set operations to function correctly the relations R and S must be union compatible. Two relations are
union compatible if:
• They have the same number of attributes.
• The domain of each attribute in column order is the same in both R and S.
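The set operations above map directly onto Python's built-in sets. The following is a small sketch, assuming relations are modelled as sets of plain tuples; the sample data and the helper `union_compatible` are hypothetical, and the helper uses Python value types as a stand-in for attribute domains.

```python
# Hypothetical relations R and S with attributes (id, name).
R = {(1, "Ola"), (2, "Tove"), (3, "Kari")}
S = {(3, "Kari"), (4, "Stian")}


def union_compatible(r, s):
    """Same number of attributes and the same value type in each column position."""
    shapes = {tuple(type(v) for v in t) for t in r | s}
    return len(shapes) <= 1


assert union_compatible(R, S)

union = R | S          # tuples in R or in S or in both; duplicates are eliminated
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))

# A relation with a different number of attributes is not union compatible with R.
T = {("Ola",), ("Tove",)}
print(union_compatible(R, T))
```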
Natural Join
The JOIN almost always involves an equality test, and such a join is described as an equi-join. An equi-join
results in two attributes in the resulting relation having exactly the same value. A "natural join" will remove
the duplicate attribute(s).
In most systems a natural join will require that the attributes have the same name to identify the attribute(s)
to be used in the join. This may require a renaming mechanism.
If you do use natural joins make sure that the relations do not have two attributes with the same name by
accident.
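A natural join can be sketched as follows. This is an illustrative sketch, not the algorithm of any particular DBMS; the relations `employees` and `departments` and the helper `natural_join` are hypothetical.

```python
# Hypothetical relations; tuples are dicts keyed by attribute name.
employees = [
    {"emp_id": 1, "name": "Ola", "dept_id": 10},
    {"emp_id": 2, "name": "Tove", "dept_id": 20},
    {"emp_id": 3, "name": "Kari", "dept_id": 30},
]
departments = [
    {"dept_id": 10, "dept_name": "Sales"},
    {"dept_id": 20, "dept_name": "Finance"},
]


def natural_join(r, s):
    """Pair tuples that agree on every attribute name the two relations share."""
    result = []
    for t in r:
        for u in s:
            common = set(t) & set(u)
            if all(t[a] == u[a] for a in common):
                # Merging the dicts keeps a single copy of each shared attribute.
                result.append({**t, **u})
    return result


joined = natural_join(employees, departments)
for row in joined:
    print(row)
```

Because the join attributes are found by name, an accidental name clash between the two relations would silently become part of the join condition, which is the risk mentioned above.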
OUTER JOINs
Notice that data can be lost when applying a join to two relations: tuples that have no matching tuple in the
other relation do not appear in the result. In some cases this lost data might hold useful information. An
outer join retains the information that would have been lost from the tables, replacing missing data with nulls.
There are three forms of the outer join, depending on which data is to be kept.
❖ LEFT OUTER JOIN - keep data from the left-hand table
❖ RIGHT OUTER JOIN - keep data from the right-hand table
❖ FULL OUTER JOIN - keep data from both tables
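The difference between an ordinary join and an outer join can be seen with a small experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the tables `employee` and `department` and their contents are hypothetical. Only the LEFT OUTER JOIN is shown, because older SQLite versions do not support the other forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER, name TEXT, dept_id INTEGER);
CREATE TABLE department (dept_id INTEGER, dept_name TEXT);
INSERT INTO employee VALUES (1, 'Ola', 10);
INSERT INTO employee VALUES (2, 'Tove', 20);
INSERT INTO employee VALUES (3, 'Kari', NULL);
INSERT INTO department VALUES (10, 'Sales');
INSERT INTO department VALUES (30, 'Finance');
""")

# Ordinary (inner) join: employees without a matching department are lost.
inner = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee e JOIN department d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Left outer join: every employee is kept, missing department data becomes NULL.
left = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee e LEFT OUTER JOIN department d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(inner)
print(left)
```

In Python the SQL NULL values appear as `None`.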
Relational Calculus
Reading Assignment
Read about Relational Calculus
UNIT SIX
Database Design
Unit description
In this unit Introduction to database design, Functional Dependency and Normalization will be discussed. To
deliver these contents brain storming and presentation, group discussion, demonstration methods will be used.
Assessment will take place in the form of question and answer, group work, individual
assignment, lab assignment, and test.
Objectives: At the end of this unit students will be able to:
Describe database design
Describe Functional Dependency
Understand about Normalization
Contents:
Introduction to database design
Functional Dependency
Normalization
Forms of Normalization
Method of Teaching: brain storming, gap lecture, group discussion
Brainstorming: what is database design?
Database design is the process of producing the different kinds of specification for the data to be stored in
the database. It is one of the middle phases of information systems development when the system uses a
database approach. In the design phase we describe how the data should be perceived at different levels and,
finally, how it is going to be stored in a computer system.
Database Development Life Cycle
As it is one component in most information system development tasks, there are several steps in designing a
database system. Here more emphasis is given to the design phases of the system development life cycle.
Developing an information system with a database application consists of several tasks, which include:
❖ Planning of Information systems Design
❖ Requirements Analysis,
❖ Design (Conceptual, Logical and Physical Design)
❖ Implementation
❖ Testing and deployment
❖ Operation and Support
[Figure: database design levels (conceptual design, physical design) and diagram notation for strong entity,
weak entity, attributes, derived attribute, key and foreign key (FK)]
Database Normalization
Database Normalization is a series of steps followed to obtain a database design that allows for consistent
storage and efficient access of data in a relational database. These steps reduce data redundancy and the risk
of data becoming inconsistent.
Normalization is the process of identifying the logical associations between data items and designing a database
that represents those associations without suffering from the update anomalies, which are:
➢ Insertion Anomalies
➢ Deletion Anomalies
➢ Modification Anomalies
Normalization may reduce system performance, since data must be cross-referenced from many tables. Thus
denormalization is sometimes used to improve performance, at the cost of reduced consistency guarantees.
A normalization step is considered good if it is a lossless decomposition, that is, if the original table can be
rebuilt exactly by joining the decomposed tables.
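What "lossless" means can be checked on a small example: decompose a table, join the parts again, and compare the result with the original. The sketch below is an illustration under assumptions; the relation `enrolment` with attributes (student_id, student_name, course) is hypothetical.

```python
# Hypothetical relation with attributes (student_id, student_name, course).
enrolment = {
    (1, "Ola", "Databases"),
    (1, "Ola", "Networks"),
    (2, "Tove", "Databases"),
}

# Decomposition 1: split on student_id, which determines student_name.
students = {(sid, name) for sid, name, _ in enrolment}
takes = {(sid, course) for sid, _, course in enrolment}
rejoined = {
    (sid, name, course)
    for sid, name in students
    for sid2, course in takes
    if sid == sid2
}
print(rejoined == enrolment)  # the join rebuilds the original exactly: lossless

# Decomposition 2: split so that only "course" is shared between the parts.
by_id = {(sid, course) for sid, _, course in enrolment}
by_name = {(name, course) for _, name, course in enrolment}
lossy = {
    (sid, name, course)
    for sid, course in by_id
    for name, course2 in by_name
    if course == course2
}
# The join now contains spurious tuples that were never in the original.
print(sorted(lossy - enrolment))
```

The second decomposition is not lossless: joining on `course` pairs every student identifier with every student name that appears for the same course.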
All the normalization rules aim to remove the update anomalies that may otherwise occur during data
manipulation after implementation. The problems that can occur in an insufficiently normalized table are
called update anomalies, and they include:
1. Insertion anomalies
An "insertion anomaly" is a failure to place information about a new database entry into all the places in the
database where information about that new entry needs to be stored.
In a properly normalized database, information about a new entry needs to be inserted into only one place in
the database; in an inadequately normalized database, information about a new entry may need to be inserted
into more than one place and, human fallibility being what it is, some of the needed additional insertions may
be missed.
2. Deletion anomalies
A "deletion anomaly" is a failure to remove information about an existing database entry when it is time to
remove that entry. In a properly normalized database, information about an old, to-be-gotten-rid-of entry
needs to be deleted from only one place in the database; in an inadequately normalized database, information
about that old entry may need to be deleted from more than one place, and, human fallibility being what it is,
some of the needed additional deletions may be missed.
3. Modification anomalies
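A modification of a database involves changing the value of some attribute in a table. In a properly normalized
database, each fact is stored in only one place, so whatever the user modifies takes effect everywhere it is
used; in an inadequately normalized database, the same value may need to be changed in more than one place,
and if some of those changes are missed the data becomes inconsistent.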
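A modification anomaly is easy to reproduce in a table that stores the same fact in several rows. The following sketch uses Python's built-in sqlite3 module; the table `enrolment` and its data are hypothetical, and the careless UPDATE is deliberately restricted to one row to imitate a missed change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Insufficiently normalized: the lecturer of a course is repeated for every student.
conn.execute("CREATE TABLE enrolment (student TEXT, course TEXT, lecturer TEXT)")
conn.executemany(
    "INSERT INTO enrolment VALUES (?, ?, ?)",
    [
        ("Ola", "Databases", "Lecturer A"),
        ("Tove", "Databases", "Lecturer A"),
        ("Kari", "Networks", "Lecturer B"),
    ],
)

# The lecturer of Databases changes, but only one of the two rows is updated.
conn.execute(
    "UPDATE enrolment SET lecturer = 'Lecturer C' "
    "WHERE course = 'Databases' AND student = 'Ola'"
)

lecturers = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT lecturer FROM enrolment "
        "WHERE course = 'Databases' ORDER BY lecturer"
    )
]
# The table now gives two different answers to "who teaches Databases?".
print(lecturers)
```

If the course and its lecturer were stored once in a separate table, a single UPDATE of a single row would be enough and this inconsistency could not arise.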
Functional Dependency (FD)
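Before moving to the definition and application of normalization, it is important to have an understanding of
"functional dependency." An attribute B is functionally dependent on an attribute A (written A → B) if each
value of A is associated with exactly one value of B.
Three Types of Functional Dependency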
1. Partial Dependency
If an attribute which is not a member of the primary key is dependent on some part of the primary key (if we
have composite primary key) then that attribute is partially functionally dependent on the primary key.
2. Full Dependency
If an attribute which is not a member of the primary key is not dependent on some part of the primary key but
the whole key (if we have composite primary key) then that attribute is fully functionally dependent on the
primary key.
3. Transitive Dependency
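In mathematics and logic, a transitive relationship is a relationship of the following form: "If A implies B, and
if also B implies C, then A implies C." In a relation, if A determines B and B determines C, then C is
transitively dependent on A through B.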
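Whether a functional dependency is violated by a given set of rows can be tested mechanically. The sketch below is a minimal illustration; the helper `holds` and the sample relation `grades` with composite primary key (student_id, course_id) are hypothetical. Note that sample data can only show that a dependency is violated; that a dependency always holds is a statement about the meaning of the data, not about one sample.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs is not violated in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same left-hand side value must always give the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical relation with composite primary key (student_id, course_id).
grades = [
    {"student_id": 1, "course_id": "DB", "student_name": "Ola", "grade": "A"},
    {"student_id": 1, "course_id": "NW", "student_name": "Ola", "grade": "B"},
    {"student_id": 2, "course_id": "DB", "student_name": "Tove", "grade": "A"},
]

# Partial dependency: student_name depends on only part of the key.
partial = holds(grades, ["student_id"], ["student_name"])
# Full dependency: grade depends on the whole key ...
full = holds(grades, ["student_id", "course_id"], ["grade"])
# ... and not on student_id alone.
part_only = holds(grades, ["student_id"], ["grade"])

print(partial, full, part_only)
```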
Steps (Forms) of Normalization:
We have various levels or steps in normalization called Normal Forms. The level of complexity, strength of
the rule and decomposition increases as we move from one lower level Normal Form to the higher.
Questions
1. Which one of the following is not required by 1NF?
A. There are no duplicated rows in the table.
B. Each cell is single-valued
C. Entries in a column are of the same kind
D. non-key attributes are dependent on the entire primary key
2. What is database normalization?
3. The failure to remove information about an existing database entry when it is time to remove that entry is
called ----------------
A. insertion anomalies
B. deletion anomalies
C. modification anomalies
D. dependency anomalies
UNIT SEVEN
STRUCTURED QUERY LANGUAGE (SQL)
WHAT IS SQL?
SQL stands for Structured Query Language. It was developed in the 1970s at IBM as a way to provide
computer users with a standardized method for selecting data from various database formats. The intent was
to build a language that was not based on any existing programming language, but could be used within any
programming language as a way to update and query information in databases.
SQL statements are just that: statements. Each statement can perform operations on one or more database
objects (tables, columns, indexes, and so on). Most SQL statements return results in the form of a set of data
records, commonly referred to as a result set. SQL is not a particularly friendly language. Many programs that
use SQL statements hide these statements behind point-and-click dialogs, query-by-example grids, and other
user-friendly interfaces. Make no mistake, however: if the data you are accessing is stored in a relational
database, you are using SQL statements, whether you know it or not.
SQL is a powerful manipulation language used by Visual Basic and the Microsoft Access Jet database engine
as the primary method for accessing the data in your databases.
Use of SQL
• SQL can execute queries against a database
• SQL can retrieve data from a database
• SQL can insert records in a database
• SQL can update records in a database
• SQL can delete records from a database
• SQL can create new databases
• SQL can create new tables in a database
• SQL can create views in a database
• SQL can set permissions on tables, procedures, and views
SQL Data Type
In SQL there are three main groups of data types: text, number, and date/time types. The type names and limits
listed below follow MySQL.
Text types:
Data type - Description
CHAR(size) - Holds a fixed length string (can contain letters, numbers, and special characters).
VARCHAR(size) - Holds a variable length string (can contain letters, numbers, and special characters).
The maximum size is specified in parentheses. Can store up to 255 characters. Note: If
you specify a value greater than 255, it will be converted to a TEXT type.
BLOB - For BLOBs (Binary Large OBjects). Holds up to 65,535 bytes of data.
MEDIUMBLOB - For BLOBs (Binary Large OBjects). Holds up to 16,777,215 bytes of data.
LONGBLOB - For BLOBs (Binary Large OBjects). Holds up to 4,294,967,295 bytes of data.
SET - A string object that can hold zero or more values chosen from a predefined list of values.
Number types:
Data type - Description
FLOAT(size,d) - A small number with a floating decimal point. The maximum number of digits may be
specified in the size parameter. The maximum number of digits to the right of the decimal
point is specified in the d parameter.
DOUBLE(size,d) - A large number with a floating decimal point. The maximum number of digits may be
specified in the size parameter. The maximum number of digits to the right of the decimal
point is specified in the d parameter.
DECIMAL(size,d) - A DOUBLE stored as a string, allowing for a fixed decimal point. The maximum number of
digits may be specified in the size parameter. The maximum number of digits to the right of
the decimal point is specified in the d parameter.
Date types:
Data type - Description
TIMESTAMP() - A timestamp. TIMESTAMP values are stored as the number of seconds since the
Unix epoch ('1970-01-01 00:00:00' UTC). Format: YYYY-MM-DD HH:MM:SS
Note: The supported range is from '1970-01-01 00:00:01' UTC to '2038-01-19
03:14:07' UTC
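The upper end of the TIMESTAMP range can be reproduced with a short calculation. The sketch below assumes the limit comes from storing the number of seconds in a signed 32-bit integer, whose largest value is 2**31 - 1; it uses only Python's standard datetime module.

```python
from datetime import datetime, timezone

# Largest value of a signed 32-bit integer, interpreted as seconds since the Unix epoch.
largest_seconds = 2**31 - 1

upper_bound = datetime.fromtimestamp(largest_seconds, tz=timezone.utc)
formatted = upper_bound.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)  # 2038-01-19 03:14:07
```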
The FOREIGN KEY constraint is used to prevent actions that would destroy links between tables.
The FOREIGN KEY constraint also prevents invalid data from being inserted into the foreign key column, because
the value has to be one of the values contained in the table it points to.
SQL CHECK Constraint
The CHECK constraint is used to limit the value range that can be placed in a column.
If you define a CHECK constraint on a single column it allows only certain values for this column.
If you define a CHECK constraint on a table it can limit the values in certain columns based on values in other
columns in the row.
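Both constraints can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the tables `Persons` and `Orders`, the `Age` column and the helper `try_sql` are hypothetical, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`. Other database systems may report the violations with different messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE Persons (
    P_Id     INTEGER PRIMARY KEY,
    LastName TEXT NOT NULL,
    Age      INTEGER CHECK (Age >= 18)
);
CREATE TABLE Orders (
    O_Id    INTEGER PRIMARY KEY,
    OrderNo INTEGER NOT NULL,
    P_Id    INTEGER REFERENCES Persons (P_Id)
);
INSERT INTO Persons VALUES (1, 'Hansen', 30);
""")


def try_sql(sql):
    """Run one statement and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


valid_order = try_sql("INSERT INTO Orders VALUES (1, 77895, 1)")
# P_Id 99 does not exist in Persons, so the foreign key rejects the row.
orphan_order = try_sql("INSERT INTO Orders VALUES (2, 44678, 99)")
# Age 15 violates the CHECK constraint on the column.
too_young = try_sql("INSERT INTO Persons VALUES (2, 'Svendson', 15)")
# Deleting a person who still has orders would destroy the link between the tables.
broken_link = try_sql("DELETE FROM Persons WHERE P_Id = 1")

for outcome in (valid_order, orphan_order, too_young, broken_link):
    print(outcome)
```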
SQL DROP TABLE and DROP DATABASE
3. The DROP TABLE Statement
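The DROP TABLE statement is used to delete a table.
DROP TABLE table_name
5.2 To delete a column in a table, use the following syntax (notice that some database systems don't allow
deleting a column):
ALTER TABLE table_name
DROP COLUMN column_name
5.3 To change the data type of a column in a table, use the following syntax (the exact keywords vary between
database systems):
ALTER TABLE table_name
ALTER COLUMN column_name datatype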
The second form specifies both the column names and the values to be inserted:
INSERT INTO table_name (column1, column2, column3,...)
VALUES (value1, value2, value3,...)
UPDATE table_name
SET column1=value, column2=value2,...
WHERE some_column=some_value
Note: Notice the WHERE clause in the UPDATE syntax. The WHERE clause specifies which record or
records should be updated. If you omit the WHERE clause, all records will be updated!
SELECT column_name(s)
FROM table_name
WHERE column_name LIKE pattern
DELETE FROM table_name
WHERE some_column=some_value
Note: Notice the WHERE clause in the DELETE syntax. The WHERE clause specifies which record or
records should be deleted. If you omit the WHERE clause, all records will be deleted!
Delete All Rows
It is possible to delete all rows in a table without deleting the table. This means that the table structure,
attributes, and indexes will be intact:
DELETE FROM table_name
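The statements in this section can be tried end to end with Python's built-in sqlite3 module. The sketch below is an illustration only: it uses an in-memory database and a "Persons" table with hypothetical sample rows, and the new address used in the UPDATE is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons "
    "(P_Id INTEGER, LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)

# INSERT, second form: column names and values are both listed.
conn.executemany(
    "INSERT INTO Persons (P_Id, LastName, FirstName, Address, City) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)

# UPDATE with a WHERE clause changes only the matching record.
updated = conn.execute(
    "UPDATE Persons SET Address = ?, City = ? WHERE P_Id = ?",
    ("Nissestien 67", "Stavanger", 2),
).rowcount

# LIKE with the % wildcard: last names ending in "sen".
sen_names = [
    row[0]
    for row in conn.execute(
        "SELECT LastName FROM Persons WHERE LastName LIKE '%sen' ORDER BY P_Id"
    )
]

# DELETE with a WHERE clause removes only the matching record.
conn.execute("DELETE FROM Persons WHERE P_Id = 3")
after_delete = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]

# DELETE without WHERE removes every row, but the table itself still exists.
conn.execute("DELETE FROM Persons")
after_delete_all = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]

print(updated, sen_names, after_delete, after_delete_all)
```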
Group discussion:
Discuss SQL statements by giving different examples.
Questions
1. What are the two groups of relational algebra retrieval operations?
2. Which one of the following is the operator allowed in the WHERE clause to express "not equal"?
A. <>
B. !=
C. ==
D. !==
P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger
Suppose you want to add a column named "DateOfBirth" to the above table, called "Persons".
Here is the SQL statement you are going to use:
ALTER TABLE Persons
ADD DateOfBirth date
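Adding the "DateOfBirth" column to the "Persons" table can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: it uses an in-memory SQLite database, and SQLite's form of the statement (ALTER TABLE ... ADD COLUMN) may differ slightly from other database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons "
    "(P_Id INTEGER, LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)

# Add the new column; the existing rows are kept.
conn.execute("ALTER TABLE Persons ADD COLUMN DateOfBirth DATE")

# PRAGMA table_info lists the columns of a table; index 1 of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Persons)")]
birth_dates = [
    row[0] for row in conn.execute("SELECT DateOfBirth FROM Persons ORDER BY P_Id")
]

print(columns)
print(birth_dates)
```

The new column is empty (NULL, shown as `None` in Python) for all existing rows until values are supplied with UPDATE.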
Possible answer
1. Operations based on mathematical set theory and special relational database operations
2. A