
DBMS Unit-2


DBMS UNIT-2

Relational Algebra
Relational algebra is a procedural query language. It gives a step by step process
to obtain the result of the query. It uses operators to perform queries.
Types of Relational operation/ algebra
1. Select Operation: The select operation selects tuples that satisfy a given
predicate. It is denoted by sigma (σ).
Notation: σ p(r)
Where:
σ denotes the selection operator
r is the relation
p is the selection predicate, a propositional logic formula that may combine
conditions with the connectives AND, OR, and NOT and compare values with
relational operators such as =, ≠, ≥, <, >, ≤.

2. Project Operation: This operation shows the list of those attributes that we
wish to appear in the result. Rest of the attributes are eliminated from the
table. It is denoted by ∏.
Notation: ∏ A1, A2, ..., An (r), where A1, A2, ..., An are attribute names of
relation r.
Example: Consider Table 1. Suppose we want columns B and C
from relation R.
∏ B, C (R) will show only columns B and C.
3. Union Operation: Suppose there are two relations R and S. The
union operation contains all the tuples that are in R, in S, or in both.
It eliminates duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
Condition: R and S must have the same number of attributes, with matching
attribute domains.
Example, for the relations FRENCH and GERMAN:
π(Student_Name)FRENCH ∪ π(Student_Name)GERMAN

4. Set Intersection: Suppose there are two relations R and S. The set
intersection operation contains all tuples that are in both R and S. It is
denoted by ∩.
Notation: R ∩ S
π(Student_Name)FRENCH ∩ π(Student_Name)GERMAN
5. Set Difference: Suppose there are two relations R and S.
The set difference operation contains all tuples that are
in R but not in S. It is denoted by minus (-).
Notation: R - S
π(Student_Name)FRENCH - π(Student_Name)GERMAN
6. Cartesian product: The Cartesian product combines each row of one table
with each row of the other table. It is also known as a cross product. It is
denoted by X.
Notation: A X B
7. Rename (ρ): Rename is a unary operation used for renaming attributes of a
relation. ρ(a/b)R renames the attribute 'b' of relation R to 'a'.
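The seven operations above can be sketched in a few lines of Python. This is only an illustrative model, not how a DBMS implements them: relations are plain lists of dicts, and the FRENCH, GERMAN, and COURSE rows (including the Roll_Number and Course attributes) are hypothetical sample data.

```python
# Relations are modelled as lists of dicts (attribute name -> value).
# All sample rows below are hypothetical and only illustrate the operators.

def _dedupe(rows):
    """Remove duplicate tuples while keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

def select(relation, predicate):
    """Select: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Project: keep only the listed attributes and drop duplicate tuples."""
    return _dedupe([{a: t[a] for a in attributes} for t in relation])

def union(r, s):
    """Union: tuples in r, in s, or in both (duplicates removed)."""
    return _dedupe(r + s)

def intersection(r, s):
    """Intersection: tuples that appear in both r and s."""
    return _dedupe([t for t in r if t in s])

def difference(r, s):
    """Difference: tuples that appear in r but not in s."""
    return _dedupe([t for t in r if t not in s])

def cartesian_product(r, s):
    """Cartesian product; assumes r and s share no attribute names."""
    return [{**tr, **ts} for tr in r for ts in s]

def rename(relation, old, new):
    """Rename attribute `old` to `new` in every tuple."""
    return [{(new if k == old else k): v for k, v in t.items()} for t in relation]

FRENCH = [{"Student_Name": "Ram", "Roll_Number": 1},
          {"Student_Name": "Mohan", "Roll_Number": 2}]
GERMAN = [{"Student_Name": "Mohan", "Roll_Number": 2},
          {"Student_Name": "Geeta", "Roll_Number": 3}]
COURSE = [{"Course": "DBMS"}, {"Course": "OS"}]

french_names = project(FRENCH, ["Student_Name"])
german_names = project(GERMAN, ["Student_Name"])

print(union(french_names, german_names))
print(intersection(french_names, german_names))
print(difference(french_names, german_names))
print(len(cartesian_product(FRENCH, COURSE)))
```

Union, intersection, and difference are applied to the projected name lists, mirroring the π(Student_Name)FRENCH and π(Student_Name)GERMAN expressions used in the text.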

Difference between Tuple Relational Calculus (TRC) and Domain
Relational Calculus (DRC)

1. Definition
   TRC: Tuple Relational Calculus is used to select tuples from a relation.
   DRC: Domain Relational Calculus is used to specify the desired values or conditions for attributes within a relation.
2. Representation of variables
   TRC: The variables represent tuples from specified relations.
   DRC: The variables represent values drawn from a specified domain.
3. Tuple/Domain
   TRC: A tuple is equivalent to a row data type.
   DRC: A domain is equivalent to a column data type.
4. Filtering
   TRC: Filtering is based on the tuples of relations.
   DRC: Filtering is based on the domains of attributes.
5. Return value
   TRC: It returns the tuples that meet the condition.
   DRC: It returns the required attribute values that meet the condition.
6. Membership condition
   TRC: The query cannot be expressed using a membership condition.
   DRC: The query can be expressed using a membership condition.
7. Query language
   TRC: QUEL (Query Language) is a query language related to it.
   DRC: QBE (Query-By-Example) is a query language related to it.
8. Similarity
   TRC: It reflects traditional pre-relational file structures.
   DRC: It is more similar to logic as a modeling language.
9. Syntax
   TRC: {T | P (T)} or {T | Condition (T)}
   DRC: { a1, a2, a3, …, an | P (a1, a2, a3, …, an)}
10. Example
   TRC: {T | EMPLOYEE (T) AND T.DEPT_ID = 10}
   DRC: { | < EMPLOYEE > DEPT_ID = 10 }
11. Focus
   TRC: Focuses on selecting tuples from a relation.
   DRC: Focuses on selecting values from a relation.
12. Variables
   TRC: Uses tuple variables (e.g., t).
   DRC: Uses scalar variables (e.g., a1, a2, …, an).
13. Expressiveness
   TRC: Equivalent in expressive power to DRC when both are restricted to safe expressions.
   DRC: Equivalent in expressive power to TRC when both are restricted to safe expressions.
14. Ease of use
   TRC: Easier to use for simple queries.
   DRC: More difficult to use for simple queries.
15. Use case
   TRC: Useful for selecting tuples that satisfy a certain condition or for retrieving a subset of a relation.
   DRC: Useful for selecting specific values or for constructing more complex queries that involve multiple relations.
16. Example
   TRC: {T | EMPLOYEE (T) AND T.DEPT_ID = 10}
        This selects all the tuples of employees who work for Department 10.
   DRC: { | < EMPLOYEE > DEPT_ID = 10 }
        This selects the EMP_ID and EMP_NAME of employees who work for department 10.
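As a rough analogy, the two example queries map onto Python set comprehensions: the TRC form ranges over whole tuples, while the DRC form ranges over individual attribute values. The sketch below assumes an EMPLOYEE relation with exactly the attributes EMP_ID, EMP_NAME, and DEPT_ID; the rows are hypothetical.

```python
from collections import namedtuple

# Assumed schema and hypothetical rows, for illustration only.
Employee = namedtuple("Employee", ["EMP_ID", "EMP_NAME", "DEPT_ID"])

EMPLOYEE = {
    Employee(1, "Asha", 10),
    Employee(2, "Ravi", 20),
    Employee(3, "Meena", 10),
}

# TRC style: {T | EMPLOYEE(T) AND T.DEPT_ID = 10}
# The variable t stands for a whole tuple of the relation.
trc_result = {t for t in EMPLOYEE if t.DEPT_ID == 10}

# DRC style: the variables emp_id, emp_name, dept_id range over attribute
# values, and the result lists only the values that are asked for.
drc_result = {
    (emp_id, emp_name)
    for (emp_id, emp_name, dept_id) in EMPLOYEE
    if dept_id == 10
}

print(sorted(trc_result))
print(sorted(drc_result))
```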

SQL Commands - SQL commands are divided into four subgroups, DDL, DML,
DCL, and TCL.
DDL - DDL stands for Data Definition Language. It deals with database schemas
and descriptions of how the data should reside in the database.

1. CREATE - creates a database and its objects (tables, indexes, views, stored
procedures, functions, and triggers).


2. ALTER - alters the structure of an existing database object.

For example, after an ALTER statement adds a "department" column of type
VARCHAR(50) to a table, the column can store a string of up to 50 characters.

3. DROP - deletes objects from the database.

For example, dropping the "employees" table deletes the entire table from the
database. After the command executes, the table, along with all its data and
structure, is permanently removed.

4. TRUNCATE - removes all records from a table and releases the space
allocated for those records.

After a TRUNCATE command is executed on the "employees" table, all records in
the table are removed. A subsequent SELECT on the table returns an empty
result set, since all records have been deleted by the TRUNCATE command.

5. COMMENT - adds comments to the data dictionary.

Executing a COMMENT command on the "employees" table stores the provided
comment with that table in the database's data dictionary.

6. RENAME - renames an object.

For example, a RENAME command can rename the "employees" table to "staff".
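A minimal sketch of the DDL commands, run from Python against an in-memory SQLite database. SQLite is an assumption made so the example is self-contained: it supports CREATE, ALTER TABLE ... ADD COLUMN, ALTER TABLE ... RENAME TO, and DROP, but it has no TRUNCATE or COMMENT statement, so those two are not shown. The column names other than "department" are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE: define a new table (emp_id and name are hypothetical columns).
cur.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name VARCHAR(50))")

# ALTER: add a "department" column of type VARCHAR(50).
cur.execute("ALTER TABLE employees ADD COLUMN department VARCHAR(50)")
columns = [row[1] for row in cur.execute("PRAGMA table_info(employees)")]

# RENAME: in SQLite, renaming a table is a form of ALTER TABLE.
cur.execute("ALTER TABLE employees RENAME TO staff")
tables_after_rename = [
    row[0] for row in cur.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# DROP: remove the table together with its data and structure.
cur.execute("DROP TABLE staff")
tables_after_drop = cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

conn.close()
print(columns, tables_after_rename, tables_after_drop)
```

Other database systems use different syntax for some of these steps, so treat the statements as SQLite-specific.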
DML - DML stands for Data Manipulation Language. It deals with data
manipulation and includes the most common SQL statements, such as SELECT,
INSERT, UPDATE, and DELETE. It is used to store, modify, retrieve, and delete
data in a database.
1. SELECT - retrieves data from a database
2. INSERT - inserts data into a table
3. UPDATE - updates existing data within a table
4. DELETE - deletes records from a table: the rows that match a WHERE
condition, or all rows if no condition is given
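The four DML statements can be tried out with Python's built-in sqlite3 module. This is a small sketch using an in-memory database; the table layout and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, department TEXT)")

# INSERT: add rows (hypothetical sample data).
cur.executemany(
    "INSERT INTO employees (emp_id, name, department) VALUES (?, ?, ?)",
    [(1, "Asha", "Sales"), (2, "Ravi", "HR"), (3, "Meena", "Sales")],
)

# UPDATE: change existing data in the rows matched by the WHERE clause.
cur.execute("UPDATE employees SET department = ? WHERE emp_id = ?", ("Finance", 2))

# DELETE: remove only the rows matched by the WHERE clause.
cur.execute("DELETE FROM employees WHERE emp_id = ?", (3,))

# SELECT: retrieve the remaining data.
rows = cur.execute(
    "SELECT emp_id, name, department FROM employees ORDER BY emp_id"
).fetchall()

conn.commit()
conn.close()
print(rows)
```

Omitting the WHERE clause from the DELETE statement would remove every row in the table.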

DCL – DCL stands for Data Control Language which includes commands such as
GRANT (related to rights, permissions) and other controls of the database system.
1. GRANT - gives users access privileges to the database.

For example, granting the SELECT privilege on the "employees" table to the
user "user1" means that "user1" can query and retrieve data from the
"employees" table but has no permission to perform other operations such as
INSERT, UPDATE, or DELETE on the table.

2. REVOKE - withdraws access privileges that were given to users with the GRANT
command.

For example, revoking the SELECT privilege previously granted to "user1" on
the "employees" table means that "user1" no longer has permission to query and
retrieve data from the "employees" table.

TCL – TCL stands for Transaction Control Language which deals with a transaction
within a database.
1. COMMIT - commits a Transaction.

If you have made some modifications to the "employees" table, such as inserting,
updating, or deleting records, you can use the COMMIT command to permanently
save those changes in the database. The COMMIT command will finalize the
transaction and make the changes permanent.
2. ROLLBACK - rolls back a transaction, for example when an error occurs.

If you have started a transaction and made some modifications to the "employees"
table, such as inserting, updating, or deleting records, you can use the ROLLBACK
command to undo those changes and revert the table to its previous state.
3. SAVEPOINT - marks a point within a transaction to which you can later roll
back.

If you have started a transaction and made some modifications to the "employees"
table, you can use the SAVEPOINT command to mark a specific point within the
transaction. You can later roll back to that savepoint if needed: changes made
after the savepoint are undone, while changes made before it are kept. For
example, a savepoint named "my_savepoint" marks the point to which you can
later roll back.
4. SET TRANSACTION - specify characteristics of the transaction.
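The effect of COMMIT, ROLLBACK, and SAVEPOINT can be observed with an in-memory SQLite database. This is a sketch under stated assumptions: SQLite is used only because it ships with Python, the table rows are hypothetical, and SET TRANSACTION is left out because SQLite does not provide it.

```python
import sqlite3

# isolation_level=None stops the sqlite3 module from opening transactions
# implicitly, so BEGIN / COMMIT / ROLLBACK are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT)")

cur.execute("BEGIN")
cur.execute("INSERT INTO employees VALUES (1, 'Asha')")   # made before the savepoint
cur.execute("SAVEPOINT my_savepoint")
cur.execute("INSERT INTO employees VALUES (2, 'Ravi')")   # made after the savepoint
cur.execute("ROLLBACK TO SAVEPOINT my_savepoint")         # undoes only the second insert
cur.execute("COMMIT")                                     # makes the first insert permanent
after_commit = cur.execute("SELECT name FROM employees ORDER BY emp_id").fetchall()

cur.execute("BEGIN")
cur.execute("DELETE FROM employees")
cur.execute("ROLLBACK")                                   # the delete is undone
after_rollback = cur.execute("SELECT name FROM employees ORDER BY emp_id").fetchall()

conn.close()
print(after_commit, after_rollback)
```

Rolling back to the savepoint keeps the row inserted before it and discards the row inserted after it.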

Domain - A domain refers to the set of possible values that a column or attribute
can have in a table. It defines the data type, constraints, and allowable values for
a particular attribute. Eg - in a table for storing employee information, the "age"
column may have a domain that restricts its values to positive integers.
Data dependency refers to the relationship between different data elements or
attributes in a database. It describes how changes or updates to one piece of data
can affect or depend on other data elements within the database.
Normalization - Normalization is the process of organizing the data in the
database. Normalization is used to minimize the redundancy from a relation or set
of relations. It eliminates undesirable characteristics like Insertion, Update, and
Deletion Anomalies.
Types of Normal Forms: Normalization works through a series of stages called
normal forms. The normal forms apply to individual relations. A relation is said
to be in a particular normal form if it satisfies the constraints of that form.
I. First Normal Form (1NF)
• A relation is in 1NF if every attribute contains only atomic values.
• An attribute of a table cannot hold multiple values. It must hold
only a single value.
• Example: Relation EMPLOYEE is not in 1NF because of the multi-valued attribute
EMP_PHONE.

The decomposition of the EMPLOYEE table into 1NF has been shown below:
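One way to picture the conversion to 1NF is to flatten each multi-valued EMP_PHONE cell into one row per phone number. The sketch below does this in Python; the EMP_ID and EMP_NAME attributes and all the rows are hypothetical, since only EMP_PHONE is named in the text.

```python
# Hypothetical unnormalized rows: EMP_PHONE holds several values in one cell.
employee = [
    {"EMP_ID": 1, "EMP_NAME": "Asha", "EMP_PHONE": ["555-0101", "555-0102"]},
    {"EMP_ID": 2, "EMP_NAME": "Ravi", "EMP_PHONE": ["555-0199"]},
]

def to_1nf(rows, multi_valued_attr):
    """Return one row per value of the multi-valued attribute."""
    flat_rows = []
    for row in rows:
        for value in row[multi_valued_attr]:
            flat = dict(row)                 # copy the single-valued attributes
            flat[multi_valued_attr] = value  # replace the list by one atomic value
            flat_rows.append(flat)
    return flat_rows

employee_1nf = to_1nf(employee, "EMP_PHONE")
for row in employee_1nf:
    print(row)
```

After the conversion every cell holds a single value, at the cost of repeating EMP_ID and EMP_NAME for employees with more than one phone number.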

II. Second Normal Form (2NF)
• For 2NF, the relation must be in 1NF.
• In the second normal form, all non-key attributes are fully functionally
dependent on the primary key.
• No partial dependency is allowed.
Example: Let's assume a school stores the data of teachers and the subjects
they teach. In a school, a teacher can teach more than one subject.

In the given table, the non-prime attribute TEACHER_AGE is dependent on
TEACHER_ID, which is a proper subset of a candidate key. That is why it violates
the rule for 2NF. To convert the given table into 2NF, we decompose it into two
tables:


Third Normal Form (3NF)
• For 3NF, the relation must be in 2NF.
• A relation is in 3NF if it does not contain any transitive dependency of a
non-prime attribute on a candidate key.

Candidate key: {EMP_ID}
Non-prime attributes: In the given table, all attributes except EMP_ID are
non-prime.

EMP_STATE and EMP_CITY are dependent on EMP_ZIP, and EMP_ZIP is dependent on
EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are therefore
transitively dependent on the super key (EMP_ID). This violates the rule of
third normal form.

Boyce Codd normal form (BCNF)


BCNF is the advanced version of 3NF. It is stricter than 3NF.
A relation R is in BCNF if, for every functional dependency X->Y that holds
over R, either
X is a candidate key or a super key,
or
X->Y is a trivial functional dependency (i.e., Y is a subset of X).


In the above table, the functional dependencies are as follows:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate key: {EMP_ID, EMP_DEPT}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.
To convert the given table into BCNF, we decompose it into three tables:
Functional dependencies:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
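The BCNF argument can be checked mechanically with the standard attribute-closure computation: a set X is a super key when its closure contains every attribute of the relation. The sketch below assumes the original table has exactly the five attributes named in the two functional dependencies; the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, relation, fds):
    """attrs is a super key if its closure covers the whole relation."""
    return closure(attrs, fds) >= relation

def bcnf_violations(relation, fds):
    """Non-trivial dependencies whose left-hand side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not is_superkey(lhs, relation, fds)
    ]

# Assumed attribute set of the original table.
R = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
FDS = [
    (frozenset({"EMP_ID"}), frozenset({"EMP_COUNTRY"})),
    (frozenset({"EMP_DEPT"}), frozenset({"DEPT_TYPE", "EMP_DEPT_NO"})),
]

print(closure({"EMP_ID"}, FDS))
print(is_superkey({"EMP_ID", "EMP_DEPT"}, R, FDS))
print(len(bcnf_violations(R, FDS)))

# After decomposition, each dependency sits in a table keyed by its left side.
print(bcnf_violations({"EMP_ID", "EMP_COUNTRY"}, [FDS[0]]))
print(bcnf_violations({"EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}, [FDS[1]]))
```

Both dependencies violate BCNF in the original table because neither EMP_ID nor EMP_DEPT alone determines all five attributes, while the decomposed tables report no violations.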


Fourth normal form (4NF)
A relation is in 4NF if it is in Boyce Codd normal form and has no multi-valued
dependency.
For a dependency A → B, if multiple values of B exist for a single value of A,
then the dependency is a multi-valued dependency.

The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent
entities. Hence, there is no relationship between COURSE and HOBBY. So, to
bring the above table into 4NF, we can decompose it into two tables:

Fifth normal form (5NF)
A relation is in 5NF if it is in 4NF, does not contain any join dependency, and
can be joined back from its parts without loss (the join is lossless).
5NF is satisfied when the tables have been broken into as many tables as
possible in order to avoid redundancy.
5NF is also known as Project-join normal form (PJ/NF).
In the above table, John takes both Computer and Math classes for Semester 1,
but he doesn't take the Math class for Semester 2. In this case, the combination
of all these fields is required to identify valid data.

Difference between Generalization and Specialization?
1. Generalization: Generalization works in a bottom-up approach.
   Specialization: Specialization works in a top-down approach.
2. Generalization: In Generalization, the size of the schema gets reduced.
   Specialization: In Specialization, the size of the schema gets increased.
3. Generalization: Generalization is normally applied to a group of entities.
   Specialization: We can apply Specialization to a single entity.
4. Generalization: Generalization can be defined as a process of creating groupings from various entity sets.
   Specialization: Specialization can be defined as a process of creating subgroupings within an entity set.
5. Generalization: The Generalization process takes the union of two or more lower-level entity sets to produce a higher-level entity set.
   Specialization: Specialization is the reverse of Generalization. It is a process of taking a subset of a higher-level entity set to form a lower-level entity set.
6. Generalization: The Generalization process starts with a number of entity sets and creates a high-level entity with the help of some common features.
   Specialization: The Specialization process starts from a single entity set and creates different entity sets by using some different features.
7. Generalization: In Generalization, the differences and similarities between lower entities are ignored to form a higher entity.
   Specialization: In Specialization, a higher entity is split to form lower entities.
8. Generalization: There is no inheritance in Generalization.
   Specialization: There is inheritance in Specialization.


Decomposition in DBMS
The term decomposition refers to the process in which we break down a table in
a database into various elements or parts. If a relation is not decomposed
properly, it may lead to problems like loss of information.
Decomposition is used to eliminate some of the problems of bad design like
anomalies, inconsistencies, and redundancy.
Types of Decomposition - Decomposition is of two major types in DBMS:
1. Lossless 2. Lossy
Lossless Decomposition
• If the information is not lost from the relation that is decomposed, then
the decomposition will be lossless.
• The lossless decomposition guarantees that the join of relations will result
in the same relation as it was decomposed.
• A decomposition is said to be lossless if the natural join of all the
decomposed relations gives the original relation.


When these two relations are joined on the common column "EMP_ID", then the
resultant relation will look like:

Employee ⋈ Department

Lossy Decomposition
• Lossy decomposition is a process in database design where a relation is
decomposed into multiple smaller relations in a way that results in the loss
of information.
• Lossy decomposition does not guarantee that the original relation can be
reconstructed from the decomposed relations.
• In lossy decomposition, certain attributes or data from the original relation
are discarded or altered during the decomposition process.

In this example, the original "Sales" table has been decomposed into two
tables: "Sales by Date and Product" and "Sales by Product and Region". The
first table shows sales amounts by date and product, while the second table
shows sales amounts by product and region. However, in the process of
decomposition, we have lost information about the specific regions where each
sale occurred.
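A decomposition can be tested for losslessness by projecting the relation onto each part, taking the natural join of the projections, and comparing the result with the original. The sketch below is a simplified model with hypothetical rows and attribute names (the sales amounts are omitted); it shows a lossless split on EMP_ID and a lossy split of a sales relation, where the join produces extra tuples that were never in the original.

```python
def relation(*rows):
    """Build a relation: a set of tuples, each a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}

def project(rel, attrs):
    """Keep only the listed attributes; the result is a set, so duplicates vanish."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

def natural_join(r, s):
    """Combine tuples of r and s that agree on all attributes they share."""
    joined = set()
    for tr in r:
        for ts in s:
            dr, ds = dict(tr), dict(ts)
            if all(dr[a] == ds[a] for a in dr.keys() & ds.keys()):
                joined.add(frozenset({**dr, **ds}.items()))
    return joined

def is_lossless(rel, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original."""
    return natural_join(project(rel, attrs1), project(rel, attrs2)) == rel

# Hypothetical sample data.
EMPLOYEE = relation(
    {"EMP_ID": 1, "EMP_NAME": "Asha", "DEPT_NAME": "Sales"},
    {"EMP_ID": 2, "EMP_NAME": "Ravi", "DEPT_NAME": "HR"},
)
SALES = relation(
    {"Date": "2024-01-01", "Product": "Pen", "Region": "North"},
    {"Date": "2024-01-02", "Product": "Pen", "Region": "South"},
)

lossless = is_lossless(EMPLOYEE, {"EMP_ID", "EMP_NAME"}, {"EMP_ID", "DEPT_NAME"})
lossy_join = natural_join(
    project(SALES, {"Date", "Product"}), project(SALES, {"Product", "Region"})
)
print(lossless, len(SALES), len(lossy_join))
```

In the lossy case the join pairs every date with every region for the same product, so it is no longer possible to tell in which region each sale occurred.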

Characteristics of decomposition
1. Dependency Preserving
• It is an important constraint of the database.
• In dependency preservation, every dependency must be satisfied by at least
one decomposed table.
• If a relation R is decomposed into relations R1 and R2, then the dependencies
of R must either be a part of R1 or R2 or be derivable from the
combination of the functional dependencies of R1 and R2.
• For example, suppose there is a relation R (A, B, C, D) with functional
dependency set (A->BC). The relation R is decomposed into R1(ABC) and
R2(AD), which is dependency preserving because the FD A->BC is a part of
relation R1(ABC).
2. Minimization of Redundancy: Decomposition should aim to minimize
redundancy in the resulting relations. Redundancy occurs when the same
information is stored multiple times in different parts of the database. By
reducing redundancy, we can improve data consistency, storage efficiency, and
overall performance.
3. Maintainability and usability: The smaller tables resulting from
decomposition should be easy to maintain and use. This means that they
should have clear and meaningful names, and they should be logically
organized.
4. Consistency and accuracy: The smaller tables should be consistent and
accurate. This means that the data in each table should be internally consistent
and should accurately represent the original data.
5. Preservation of Functional Dependencies: Decomposition should preserve
the functional dependencies present in the original relation. A functional
dependency states the relationship between attributes in a relation. When
decomposing a relation, it is essential to ensure that the functional
dependencies hold in the resulting smaller relations as well.
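Dependency preservation can be tested with a standard textbook procedure: for each dependency X->Y, repeatedly extend X with whatever each decomposed relation lets us derive, and check whether Y is reached. The sketch below applies it to the R(A, B, C, D) example with A->BC, plus a hypothetical counter-example that is not from the text.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_preserved(fd, decomposition, fds):
    """Check whether fd can be enforced using only the decomposed relations."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # Attributes derivable inside ri from what is known so far.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result

# Example from the text: R(A, B, C, D), A -> BC, split into R1(ABC) and R2(AD).
FDS = [(frozenset("A"), frozenset("BC"))]
R1, R2 = set("ABC"), set("AD")
print(is_preserved(FDS[0], [R1, R2], FDS))

# Hypothetical counter-example: R(A, B, C) with A -> B and B -> C,
# split into (A, B) and (A, C). B -> C cannot be checked in either part.
FDS2 = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(is_preserved(FDS2[1], [set("AB"), set("AC")], FDS2))
```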
Multivalued Dependency
Multivalued dependency occurs when two attributes in a table are independent of
each other but, both depend on a third attribute.
A multivalued dependency consists of at least two attributes that are dependent
on a third attribute that's why it always requires at least three attributes.
Suppose a person named Geeks is working on 2 projects Microsoft and Oracle
and has 2 hobbies namely Reading and Music. This can be expressed in a tabular
format in the following way.

Conditions for MVD:
An attribute a multi-determines another attribute b (written a →→ b) if, in any
legal relation r(R), for all pairs of tuples t1 and t2 in r such that
t1[a] = t2[a]
there exist tuples t3 and t4 in r that satisfy the following three conditions,
where c = R - (a ∪ b) is the set of remaining attributes of the relational table.
1. Condition-1 for MVD –
t1[a] = t2[a] = t3[a] = t4[a]
2. Condition-2 for MVD –
t1[b] = t3[b]
And
t2[b] = t4[b]
3. Condition-3 for MVD –
t1[c] = t4[c]
And
t2[c] = t3[c]
If all three conditions hold, the multivalued dependency (MVD) exists.
To check the MVD in given table, we apply the conditions stated above and we
check it with the values in the given table.

Condition-1 for MVD –
t1[a] = t2[a] = t3[a] = t4[a]
Finding from table,
t1[a] = t2[a] = t3[a] = t4[a] = Geeks
So, condition 1 is Satisfied.
Condition-2 for MVD –
t1[b] = t3[b] = MS
And
t2[b] = t4[b] = Oracle
So, condition 2 is Satisfied.
Condition-3 for MVD –
t1[c] = t4[c] = Reading
And
t2[c] = t3[c] = Music
So, condition 3 is Satisfied.
All conditions are satisfied, therefore,
a →→ b
According to the table, we get
name →→ project
And for
a →→ c
we get
name →→ hobby
Hence, we know that MVD exists in the above table, and it can be stated as
name →→ project
name →→ hobby
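The MVD conditions can be checked by brute force: for every pair of tuples that agree on a, the tuple that takes its b value from the first and its remaining attributes from the second must also be in the table. The sketch below assumes the table holds the four combinations of the projects (MS, Oracle) and hobbies (Reading, Music) for Geeks, with attributes name, project, and hobby.

```python
def mvd_holds(rows, a, b):
    """Check the multivalued dependency a ->-> b on a list of dict rows."""
    rest = [x for x in rows[0] if x not in a and x not in b]

    def values(row, cols):
        return tuple(row[c] for c in cols)

    existing = {values(r, list(a) + list(b) + rest) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if values(t1, a) == values(t2, a):
                # t3 agrees with t1 on a and b, and with t2 on the rest.
                t3 = values(t1, a) + values(t1, b) + values(t2, rest)
                if t3 not in existing:
                    return False
    return True

# Assumed table contents, based on the values named in the text.
table = [
    {"name": "Geeks", "project": "MS", "hobby": "Reading"},
    {"name": "Geeks", "project": "Oracle", "hobby": "Music"},
    {"name": "Geeks", "project": "MS", "hobby": "Music"},
    {"name": "Geeks", "project": "Oracle", "hobby": "Reading"},
]

print(mvd_holds(table, ["name"], ["project"]))
print(mvd_holds(table, ["name"], ["hobby"]))
print(mvd_holds(table[:3], ["name"], ["project"]))
```

Because the loop visits every ordered pair of tuples, the symmetric tuple t4 is covered when the roles of t1 and t2 are swapped. Removing one of the four rows breaks the dependency, which is why the last check fails.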

Difference Between OODBMS and ORDBMS in tabular form?

1. Stands for
   OODBMS: It stands for Object Oriented Database Management System.
   ORDBMS: It stands for Object Relational Database Management System.
2. Data Model
   OODBMS: It follows the object-oriented data model.
   ORDBMS: It follows the object-relational data model.
3. Structure
   OODBMS: It has complex objects as persistent entities (e.g., objects, classes).
   ORDBMS: It has relational and object-oriented entities.
4. Schema
   OODBMS: It is schema-less or semi-schema-less.
   ORDBMS: It is schema-based.
5. Language
   OODBMS: It supports object-oriented programming languages (e.g., Java, C++).
   ORDBMS: It supports SQL and object-oriented programming languages.
6. Data Relationships
   OODBMS: Supports complex and dynamic relationships (e.g., inheritance, association).
   ORDBMS: Supports relationships between objects and tables using standard SQL join operations.
7. Extensibility
   OODBMS: It provides flexibility for adding new classes and methods.
   ORDBMS: It extends the relational model with object-oriented features and datatypes.
8. Performance
   OODBMS: It is efficient for complex object manipulations and traversal.
   ORDBMS: It is efficient for tabular data and relational operations.
9. Scalability
   OODBMS: It is well-suited for handling large-scale complex data and relationships.
   ORDBMS: It is well-suited for traditional relational data and relationships.
10. Data Persistence
   OODBMS: It supports persistence of object states and behaviors.
   ORDBMS: It supports persistence of tabular data and associated relations.
11. Query Language
   OODBMS: It uses object query languages (e.g., OQL).
   ORDBMS: It uses SQL with extended object-relational features (e.g., user-defined types).
12. Constraints
   OODBMS: Every object-oriented system has a different set of constraints that it can accommodate.
   ORDBMS: Keys, entity integrity, and referential integrity are constraints of an object-relational database.
13. Query processing efficiency
   OODBMS: The efficiency of query processing is low.
   ORDBMS: Processing of queries is quite effective.
14. Examples
   OODBMS: db4o, ObjectStore, Versant
   ORDBMS: PostgreSQL, Oracle, Microsoft SQL Server
