A **Database Management System (DBMS)** is software that provides an interface for interacting with databases. Its primary role is to store, retrieve, and manage data efficiently. The DBMS serves as an intermediary between the user and the database, allowing users to access and manipulate data using queries without needing to understand the underlying physical structure of the database.
### Key Components of a DBMS:
1. **Database Engine**: Responsible for data storage, retrieval, and updating.
A **Database Management System (DBMS)** is software that provides an interface for interacting with databases. Its primary role is to store, retrieve, and manage data efficiently. The DBMS serves as an intermediary between the user and the database, allowing users to access and manipulate data using queries without needing to understand the underlying physical structure of the database.
### Key Components of a DBMS:
1. **Database Engine**: Responsible for data storage, retrieval, and updating.
A **Database Management System (DBMS)** is software that provides an interface for interacting with databases. Its primary role is to store, retrieve, and manage data efficiently. The DBMS serves as an intermediary between the user and the database, allowing users to access and manipulate data using queries without needing to understand the underlying physical structure of the database.
### Key Components of a DBMS:
1. **Database Engine**: Responsible for data storage, retrieval, and updating.
A **Database Management System (DBMS)** is software that provides an interface for interacting with databases. Its primary role is to store, retrieve, and manage data efficiently. The DBMS serves as an intermediary between the user and the database, allowing users to access and manipulate data using queries without needing to understand the underlying physical structure of the database.
### Key Components of a DBMS:
1. **Database Engine**: Responsible for data storage, retrieval, and updating.
Download as PPTX, PDF, TXT or read online from Scribd
Download as pptx, pdf, or txt
You are on page 1of 82
Module 3
First Normal Form
• Defi ned to disallow multivalued attributes, composite attributes, a n d their combinations It states that the domain of a n attribute must include only atomic (simple, indivisible) values a n d that the value of any attribute in a tuple must b e a single value from the domain of that attribute 1NF disallows relations within relations or relations as attribute values within tuples The only attribute values permitted by 1NF are single • atomic (or indivisible) values. • Consider the DEPARTMENT relation schema shown in Figure below
Primary key is Dnum ber
We assume that e a c h department c a n h a v e a number of locations
The DEPARTMENT s ch em a a n d a sample relation state are
shown in Figure below
As w e c a n see, this is no t in 1NF b e c a use Dlo c a tio ns is no t
an atomi c attribute, as illustrated by the first tuple in Figure There are two ways w e c a n look at the Dlocations attribute:
• The domain of Dlocations contains atomic values, but
some tuples c a n have a set of these values. In this case, Dlocations is not functionally dependent on the primary key Dnumber
• The domain of Dlocations contains sets of values a n d
h e n c e is nonatomic. In this case, Dnumber→Dlocations because e a c h set is considered a single member of the attribute domain In either case, the DEPARTMENT relation is not in 1NF There are three main techniques to ach i ev e first normal form for such a
relation:
1. Remove the attribute Dlocations that violates 1NF a n d p l a c e it in a
separate relation DEPT_LOCATIONS along with the primary key
Dnumber of DEPARTMENT. The primary key of this relation is the
combination {Dnumber, Dlocation}. A distinct tuple in
DEPT_LOCATIONS exists for each location of a department. This
decomposes the non-1NF relation into two 1NF relations.
2. Expand the key so that there will b e a separate tuple
in the original DEPARTMENT relation for e a c h location of a
DEPARTMENT. In this case, the primary key b e c o m e s the combination
{Dnumber, Dlocation}. This solution has the disadvantage of
introducing redundancy in the relation
3. If a maximum number of values is known for the attribute—for example, if it is known that at most three locations c a n exist for a department— r e p l a c e the Dlocations attribute b y three atomic attributes: Dlocation1, Dlocation2, a n d Dlocation3. This solution has the disad van ta ge of introducing NULL values if most departments h a v e fewer than three locations. Querying on this attribute b e c o m es m o re d iffic ult; • Of the three solutions, the first is generally considered best b e c a u s e it does not suffer from redundancy a n d it is completely general, having no limit p l a c e d on a maximum number of values Second Normal Form Second normal form (2NF) is b a s e d on the c o n c e p t of full functional d e p e n d e n c y A functional d e p e n d e n c y X → Y is a full functional dependency if removal of a n y attribute A from X means that the d e p e n d e n c y does not hold a n y more; that is, for an y attribute A ε X, (X – {A}) does not functionally determine Y A functional d e p e n d e n c y X→Y is a partial dependency if some attribute A ε X c a n b e removed from X a n d the d e p e n d e n c y still holds; that is, for some A ε X, (X – {A}) → Y Inthe a b ove figure , {Ssn, Pnum ber} → Hours is a full dependency (neither Ssn → Hours nor Pnumber→Hours holds) {Ssn, Pnumber}→Ename is partial because Ssn→Ename holds
•Definition. A relation schema R is in 2NF if every nonprime
attribute A in R is fully functionally dependent on the primary key of R The test for 2NF involves testing for functional dependencies
whose left-hand side attributes are part of the primary key
If the primary key contains a single attribute, the test n e e d not b e
applied at all The EM P_PRO J rela tio n is in 1NF b ut is no t in 2NF.
The nonprime attribute Ename violates 2NF bec a use of FD2, as d o
the nonprime attributes Pname a n d Plocation be c a use of FD3
The func tio nal d e p enden c ies FD2 and FD3 m a ke Enam e , Pname, and Plocation partially dependent on the primary key
• {Ssn, Pnumber} of EMP_PROJ, thus violating the 2NF test.
If a relation schema is not in 2NF, it c a n b e second normalized. Relation R can be decomposed into 2NF Therefore, the functional dependencies FD1, FD2, a n d FD3 lead to the decomposition of EMP_PROJ into the three relation schemas EP1, • EP2, a n d EP3 shown in Figure below, e a c h of which is in 2NF. Third Normal Form Transitive functional dependency: A functional d e p e n d e n c y X→Y in a relation schema R is a transitive dependency if there exists a set of attribute Z that are neither a primary nor a subset of any key of R(candidate key) a n d both X Z a n d Z Y holds Example:
🠶 SSN DMGRSSN is a transitive FD sinc e
SSN DNUMBER a n d DNUMBER DMGRSSN hold Dnumber is neither a key itself nor a subset of the key of EMP_DEPT 🠶 SSN ENAME is non-transitive sinc e there is no set o f a ttrib utes X where SSN X a n d X ENAME Definition: A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key.
The relation sch em a EMP_DEPT is in 2NF, since no partial
d ep en d en cies on a key exist. However, EMP_DEPT is not in 3NF b e c a u s e of the transitive d e p e n d e n c y of Dmgr_ssn (and also Dname) on Ssn via Dnumber
We c a n normalize EMP_DEPT by decomposing it into the two
3NF relation schemas ED1 a n d ED2 General Definition of Second and Third Normal Form • Takes into a c c o u n t all c a n d i d a t e keys of a relation into account Definition of 2NF: A relation s c h e m a R is in second normal form (2NF) if every nonprime attribute A in R is not partially d e p e n d e n t on any key of R •Definition: A relation schema R is in 2NF if every nonprime attribute A in R is fully functionally dependent on the primary key of R • Consid er the relatio n sc hem a LOTS whic h desc rib es parcels of land for sale in various counties of a state • Suppose there are two candidate keys: Property_id# and {country_name, Lot#}, that is Lot# are unique within each county but property_id# are unique across countries for the entire state Based on the two candidate keys Property_id# a n d {County_name, Lot#}, the functional dependencies FD1 a n d FD2 hold 🠶 FD1: Prop erty_id → { C ounty_na me,Lot#,Area,Pric e ,Ta x_rate } 🠶 FD2:{County_name,Lot#} →{Property_id, Area,Price,Tax_rate} 🠶 FD3: County_name → Tax_rate 🠶 FD4: Area → Price We choose Property_id# as the primary key, but no special consideration will b e given to this key over the other candidate key FD3 says that the tax rate is fixed for a given county (does not vary lot by lot within the same county) FD4 says tha t the p ric e of a lot is d e termine d by its a rea regardless of whic h c ounty it is in. The LOTS relation schema violates the general definition of 2NF b e c a use Ta x_rate is pa rtia lly d e p enden t on the c a ndid a te key {County_name, Lot#}, due to FD3 To normalize LOTS into 2NF, w e d e c o m p o s e it into the two relations LOTS1 a n d LOTS2
We construct LOTS1 by removing the attribute Tax_rate that violates
2NF from LOTS a n d placing it with County_name (the left- h an d side of FD3 that causes the partial dependency) into another relation LOTS2. Both LOTS1 a n d LOTS2 are in 2NF. Definition of 3NF: A relation s c h e m a R is in third normal form (3NF) if, whenever a nontrivial functional d e p e n d e n c y X→A holds in R, either (a) X is a superkey of R, or (b) A is a prime attribute of R A c c ording to this d e finition, LOTS2 is in 3NF FD4 in LOTS1 violates 3NF b e c a u s e Area is not a superkey a n d Price is not a prime attribute in LOTS1 To normalize LOTS1 into 3NF, w e d e c o m p o s e it into the relation schemas LOTS1A a n d LOTS1B We construct LOTS1A by removing the attribute Price that violates 3NF from LOTS1 a n d placing it with Area (the lefthand side of FD4 that causes the transitive d e p e n d e n c y ) into another relation LOTS1B.
Both LOTS1A a n d LOTS1B are in 3NF
Boyce-Codd Normal Form Boyce-Codd normal form (BCNF) was proposed as a simpler form of 3NF, but it was found to b e stricter than 3NF Every relation in BCNF is also in 3NF; however, a relation in 3NF is not n e c essarily in BC NF Suppose that w e have thousands of lots in the relation but the lots are from only two counties: DeKalb a n d Fulton Suppose also that lot sizes in DeKalb County are only 0.5, 0.6, 0.7, 0.8, 0.9, a n d 1.0 acres, whereas lot sizes in Fulton County are restricted to 1.1, 1.2, ..., 1.9, a n d 2.0 acres In such a situation we would have the additional functional d e p e n d e n c y FD5: Area→County_name 🠶 If we a d d this to the other dependencies, the relation schema LOTS1A still is in 3NF because County_name is a