
Functional Dependency and Normalization


Unit 9: FUNCTIONAL DEPENDENCY AND NORMALIZATION

Functional Dependency
Definition: A functional dependency occurs when one attribute in a relation uniquely determines another attribute. This can be written A -> B which would be the same as stating "B is functionally dependent upon A." Examples: In a table listing employee characteristics including Social Security Number (SSN) and name, it can be said that name is functionally dependent upon SSN (or SSN -> name) because an employee's name can be uniquely determined from their SSN. However, the reverse statement (name -> SSN) is not true because more than one employee can have the same name but different SSNs.
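The definition can be tested mechanically against sample data. The following is a minimal sketch, not a standard library routine: the function name fd_holds and the employee rows are made up for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the given rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val on first sight, otherwise returns the stored one
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample data: two different employees share the name "Alice".
employees = [
    {"ssn": "111-11-1111", "name": "Alice"},
    {"ssn": "222-22-2222", "name": "Bob"},
    {"ssn": "333-33-3333", "name": "Alice"},
]

ssn_determines_name = fd_holds(employees, ["ssn"], ["name"])  # True
name_determines_ssn = fd_holds(employees, ["name"], ["ssn"])  # False
```

Note that sample rows can only refute a functional dependency. Whether SSN -> name truly holds is a statement about every legal state of the relation, so it has to come from the meaning of the data, not from one instance.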

Functional Dependency and Normalization: Chandan Deo

Functional dependencies also arise in relationships. Let C be the primary key of an entity and D be the primary key of another entity, and let the two entities have a relationship. If the relationship is one-to-one, we must have C -> D and D -> C. If the relationship is many-to-one, we would have C -> D but not D -> C. For many-to-many relationships, no functional dependencies hold. For example, if C is student number and D is subject number, there is no functional dependency between them. If, however, we were storing marks and grades in the database as well, we would have

(student_number, subject_number) -> marks

and we might have

marks -> grades

The second functional dependency above assumes that the grades depend only on the marks. This may sometimes not be true, since the instructor may decide to take other considerations into account in assigning grades, for example the class average mark.

In the student database that we discussed earlier, we have the following functional dependencies:
sno -> sname
sno -> address
cno -> cname
cno -> instructor
instructor -> office


Functional dependencies allow us to express constraints that we cannot express with superkeys. Consider the schema
Loan-info-schema = (loan-number, branch-name, customer-name, amount)
which is a simplification of the Lending-schema that we saw earlier. The set of functional dependencies that we expect to hold on this relation schema is
loan-number -> amount
loan-number -> branch-name
We would not, however, expect the functional dependency
loan-number -> customer-name
to hold, since in general a given loan can be made to more than one customer.

Properties of functional dependencies


Given that X, Y, Z, and W are sets of attributes in a relation R, one can derive several properties of functional dependencies. Among the most important are Armstrong's axioms, which are used in database normalization:
Subset Property (Axiom of Reflexivity): If Y is a subset of X, then X -> Y
Augmentation (Axiom of Augmentation): If X -> Y, then XZ -> YZ
Transitivity (Axiom of Transitivity): If X -> Y and Y -> Z, then X -> Z

From these rules, we can derive these secondary rules:
Union: If X -> Y and X -> Z, then X -> YZ
Decomposition: If X -> YZ, then X -> Y and X -> Z
Pseudotransitivity: If X -> Y and WY -> Z, then WX -> Z
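One practical use of these inference rules is computing the closure of an attribute set, that is, every attribute that the set determines. The sketch below is one common way to do this. The function name closure is chosen here for illustration, and the dependencies are the ones listed earlier for the student database.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already determine all of lhs, we also determine rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Functional dependencies of the student database.
fds = [
    ({"sno"}, {"sname"}),
    ({"sno"}, {"address"}),
    ({"cno"}, {"cname"}),
    ({"cno"}, {"instructor"}),
    ({"instructor"}, {"office"}),
]

cno_closure = closure({"cno"}, fds)
# cno_closure == {"cno", "cname", "instructor", "office"}
```

Because office appears in the closure of cno, the dependency cno -> office follows from cno -> instructor and instructor -> office by transitivity.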

Anomalies in Database:
Database anomalies are problems that occur in relations because of redundancy, that is, the duplication of data. These anomalies affect the process of inserting, deleting and modifying data in the relations, and some important data may be lost when a relation that contains anomalies is updated. It is important to remove these anomalies so that the relations can be processed without problems. There are 3 types of anomalies.
Insert Anomaly: A fact cannot be recorded without also supplying an unrelated fact that is stored in the same row.
Delete Anomaly: Deleting some information also removes valuable related information at the same time.
Update Anomaly: A single change to the data requires scanning all records and making the same change multiple times.
Example of a database anomaly: Suppose you have a hospital database and, due to poor normalization, all patients and doctors are in the same table. Doctors and patients are separate entities, yet when you delete a doctor's record the patient's record is also deleted, and vice versa.
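The hospital example can be made concrete with a few rows. This is only an illustrative sketch: the column names and values are invented and do not come from any table in these notes.

```python
# Hypothetical single-table design: every row mixes patient and doctor facts.
visits = [
    {"patient": "P1", "patient_name": "Asha", "doctor": "D1", "doctor_phone": "555-0101"},
    {"patient": "P2", "patient_name": "Ravi", "doctor": "D1", "doctor_phone": "555-0101"},
    {"patient": "P3", "patient_name": "Mina", "doctor": "D2", "doctor_phone": "555-0102"},
]

# Update anomaly: the phone number of D1 is stored once per patient,
# so changing it means touching every one of these rows.
rows_to_update = sum(1 for r in visits if r["doctor"] == "D1")

# Delete anomaly: P3 is the only patient of D2, so deleting P3
# also deletes the only record that D2 exists.
after_delete = [r for r in visits if r["patient"] != "P3"]
doctors_left = {r["doctor"] for r in after_delete}

# Insert anomaly: a new doctor with no patients yet cannot be stored
# without inventing placeholder patient values for the same row.
```

Here rows_to_update is 2 and doctors_left contains only D1, which shows the update and delete anomalies respectively.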


Properties of Normalized Relations:


Ideal relations after normalization should have the following properties, so that the problems mentioned above do not occur for relations in the (ideal) normalized form:
1. No data value should be duplicated in different rows unnecessarily.
2. A value must be specified (and required) for every attribute in a row.
3. Each relation should be self-contained. In other words, if a row from a relation is deleted, important information should not be accidentally lost.
4. When a new record is added to a relation, other relations in the database should not be affected.
5. A value of an attribute in a tuple may be changed independent of other tuples in the relation and other relations.
The idea of normalizing relations to higher and higher normal forms is to attain the goal of having a set of ideal relations meeting the above criteria.

Normalization:
Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored. Reason for normalization: to prevent possible corruption of DB stemming from update anomalies (insertion, deletion, modification).
It is a formal method that identifies relations based on their primary key and the functional dependencies among their attributes (constraints between attributes).
Functional dependency: Describes the relationship between attributes in a relation. If A and B are attributes of a relation R, B is functionally dependent on A (denoted A -> B) if each value of A in R is associated with exactly one value of B in R.

Determinant: attribute or set of attributes on the left hand side of the arrow.

Types of Normalizations:
First normal form (1NF): An entity type is in 1NF when it contains no repeating groups of data.
Second normal form (2NF): An entity type is in 2NF when it is in 1NF and when all of its non-key attributes are fully dependent on its primary key.
Third normal form (3NF): An entity type is in 3NF when it is in 2NF and when all of its attributes are directly dependent on the primary key.

FIRST NORMAL FORM


The relation shown in the table below is said to be in First Normal Form, abbreviated as 1NF. This form is also called a flat file. There are no composite attributes, and every attribute is single-valued and describes one property. Converting a relation to 1NF is the first essential step in normalization. There are successively higher normal forms known as 2NF, 3NF, BCNF, 4NF and 5NF. Each form is an improvement over the earlier form. In other words, 2NF is an improvement on 1NF, 3NF is an improvement on 2NF, and so on. The relations in a higher normal form are a subset of those in the lower normal forms, as shown in figure 4. The higher normalization steps are based on three important concepts:


1. Dependence among attributes in a relation
2. Identification of an attribute or a set of attributes as the key of a relation
3. Multivalued dependency between attributes

Table in 1st Normal form

SECOND NORMAL FORM


We will now define a relation in the Second Normal Form (2NF). A relation is said to be in 2NF if it is in 1NF and its non-key attributes are functionally dependent on the key attribute(s). Further, if the key has more than one attribute, then no non-key attribute should be functionally dependent upon only a part of the key. Consider, for example, the relation given in table 1. This relation is in 1NF. The key is (Order no., Item code). The dependency diagram for the attributes of this relation is shown in figure 5. The non-key attribute Price/Unit is functionally dependent on Item code, which is part of the relation key. Also, the non-key attribute Order date is functionally dependent on Order no., which is a part of the relation key.

Thus the relation is not in 2NF. It can be transformed to 2NF by splitting it into three relations as shown in table 3. In table 3 the relation Orders has Order no. as the key. The relation Order details has the composite key Order no. and Item code. In both relations the non-key attributes are functionally dependent on the whole key. Observe that by transforming to 2NF relations the repetition of Order date (table 1) has been removed. Further, if an order for an item is cancelled, the price of the item is not lost. For example, if Order no. 1886 for Item code 4629 is cancelled in table 1, then the fourth row will be removed and the price of the item is lost. In table 3 only the fourth row of table 3(b) is omitted. The item price is not lost as it is available in table 3(c). The data of the order is also not lost as it is in table 3(a).

These relations in 2NF meet all the "ideal" conditions specified. Observe that the three relations obtained are self-contained. There is no duplication of data within a relation.
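The 2NF test for the order relation can be sketched as a search for partial dependencies. This is a simplified check under stated assumptions: it looks only at the dependencies that are listed explicitly, it assumes a single candidate key, and the attribute qty is a hypothetical stand-in for whatever depends on the whole key in table 1. The function name partial_dependencies is also invented for illustration.

```python
def partial_dependencies(key, fds):
    """Return listed FDs where a proper subset of the key determines a non-key attribute."""
    key = set(key)
    found = []
    for lhs, rhs in fds:
        non_key = set(rhs) - key
        # "<" on sets means proper subset: only part of the key is used.
        if set(lhs) < key and non_key:
            found.append((tuple(sorted(lhs)), tuple(sorted(non_key))))
    return found


# Dependencies of the original order relation (qty is an assumed attribute).
order_fds = [
    ({"order_no", "item_code"}, {"qty"}),
    ({"item_code"}, {"price_per_unit"}),
    ({"order_no"}, {"order_date"}),
]
before = partial_dependencies({"order_no", "item_code"}, order_fds)

# The three relations after the split: name -> (key, dependencies kept).
decomposed = {
    "orders": ({"order_no"}, [({"order_no"}, {"order_date"})]),
    "order_details": ({"order_no", "item_code"},
                      [({"order_no", "item_code"}, {"qty"})]),
    "prices": ({"item_code"}, [({"item_code"}, {"price_per_unit"})]),
}
after = {name: partial_dependencies(key, fds)
         for name, (key, fds) in decomposed.items()}
```

In this sketch, before lists the two partial dependencies described in the text (Item code determining Price/Unit, and Order no. determining Order date), while every entry of after is empty.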

THIRD NORMAL FORM


Normalization to Third Normal Form is needed when not all attributes in a relation are functionally dependent only on the key attribute. If two non-key attributes are functionally dependent, then there will be unnecessary duplication of data. Consider the relation given in table 4. Here, Roll no. is the key and all other attributes are functionally dependent on it.

Thus it is in 2NF. If it is known that in the college all first year students are accommodated in Ganga hostel, all second year students in Kaveri, all third year students in Krishna, and all fourth year students in Godavari, then the non-key attribute Hostel name is dependent on the non-key attribute Year. This dependency is shown in figure 6.


Observe that given the year of a student, his hostel is known and vice versa. The dependency of hostel on year leads to duplication of data, as is evident from table 4. If it is decided to ask all first year students to move to Kaveri hostel and all second year students to Ganga hostel, this change must be made in many places in table 4. Also, when a student's year of study changes, his change of hostel must also be noted in table 4. This is undesirable. A relation is said to be in 3NF if it is in 2NF and no non-key attribute is functionally dependent on any other non-key attribute. Table 4 is thus not in 3NF. To transform it to 3NF, we should introduce another relation which includes the functionally related non-key attributes. This is shown in table 5.
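The effect of this decomposition can be sketched with a few rows. The student names and roll numbers below are hypothetical, since table 4 is not reproduced here. Only the year-to-hostel assignment (Ganga, Kaveri, Krishna) is taken from the text.

```python
# Hypothetical rows in the shape of table 4 (not in 3NF: year -> hostel).
students = [
    {"roll_no": 1, "name": "Anil", "year": 1, "hostel": "Ganga"},
    {"roll_no": 2, "name": "Bina", "year": 1, "hostel": "Ganga"},
    {"roll_no": 3, "name": "Chitra", "year": 2, "hostel": "Kaveri"},
    {"roll_no": 4, "name": "Dev", "year": 3, "hostel": "Krishna"},
]

# 3NF decomposition: one relation keyed by roll_no, one keyed by year.
student = [{k: r[k] for k in ("roll_no", "name", "year")} for r in students]
year_hostel = {r["year"]: r["hostel"] for r in students}

# Swapping the first and second year hostels is now one change per year,
# no matter how many students are enrolled.
year_hostel[1], year_hostel[2] = year_hostel[2], year_hostel[1]


def hostel_of(roll_no):
    """Look up a student's hostel through the year, as a join would."""
    year = next(r["year"] for r in student if r["roll_no"] == roll_no)
    return year_hostel[year]
```

After the swap, hostel_of(1) gives Kaveri and hostel_of(3) gives Ganga, although no row of the student relation was modified.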

Let us consider another example of a relation. The relation Employee is given below and its dependency diagram is shown in figure 7.
Employee (Employee code, Employee name, Dept., Salary, Project no., Termination date of project)
As can be seen from the figure, the termination date of a project is dependent on the Project no. Thus this relation is not in 3NF. The 3NF relations are:
Employee (Employee code, Employee name, Salary, Project no.)
Project (Project no., Termination date)

BOYCE-CODD NORMAL FORM (BCNF)


Assume that a relation has more than one possible key. Assume further that the composite keys have a common attribute. If an attribute of one composite key is dependent on an attribute of the other composite key, a normalization called BCNF is needed. Consider, as an example, the relation Professor:
Professor (Professor code, Dept., Head of Dept., Percent time)
It is assumed that
1. A professor can work in more than one department.
2. The percentage of the time he spends in each department is given.
3. Each department has only one Head of Department.
The relationship diagram for the above relation is given in figure 8. Table 6 gives the relation attributes. The two possible composite keys are Professor code and Dept., or Professor code and Head of Dept. Observe that Dept. as well as Head of Dept. are not non-key attributes. They are a part of a composite key.


The relation given in table 6 is in 3NF. Observe, however, that the names of Dept. and Head of Dept. are duplicated. Further, if Professor P2 resigns, rows 3 and 4 are deleted, and we lose the information that Rao is the Head of Department of Chemistry. The normalization of the relation is done by creating a new relation for Dept. and Head of Dept. and deleting Head of Dept. from the Professor relation. The normalized relations are shown in table 7.

The dependency diagrams for these new relations are shown in figure 9. The dependency diagram gives the important clue to this normalization step, as is clear from figures 8 and 9.
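A relation is in BCNF when the determinant of every nontrivial functional dependency is a superkey, and this can be checked with an attribute closure. The sketch below rests on one assumption beyond the text: a head leads exactly one department (head -> dept), which is what makes (Professor code, Head of Dept.) a possible key. The function names are chosen for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return nontrivial FDs whose determinant is not a superkey of relation."""
    relation = set(relation)
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) >= relation:
            continue  # determinant is a superkey
        violations.append((sorted(lhs), sorted(rhs)))
    return violations


professor = {"prof", "dept", "head", "percent_time"}
professor_fds = [
    ({"prof", "dept"}, {"percent_time"}),
    ({"dept"}, {"head"}),
    ({"head"}, {"dept"}),  # assumption: each head leads one department
]
before = bcnf_violations(professor, professor_fds)

# After the split into Professor(prof, dept, percent_time) and Department(dept, head).
after_professor = bcnf_violations(
    {"prof", "dept", "percent_time"},
    [({"prof", "dept"}, {"percent_time"})],
)
after_department = bcnf_violations(
    {"dept", "head"},
    [({"dept"}, {"head"}), ({"head"}, {"dept"})],
)
```

In this sketch, before reports the two dependencies between Dept. and Head of Dept., while both relations obtained from the split report no violations.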


FOURTH AND FIFTH NORMAL FORM


When attributes in a relation have multivalued dependency, further normalization to 4NF and 5NF is required. We will illustrate this with an example. Consider a vendor supplying many items to many projects in an organization. The following are the assumptions:
1. A vendor is capable of supplying many items.
2. A project uses many items.
3. A vendor supplies to many projects.
4. An item may be supplied by many vendors.

Table 8 gives a relation for this problem and figure 10 the dependency diagram(s).

The relation given in table 8 has a number of problems. For example:
