
DATABASE MANAGEMENT SYSTEMS

Data Modeling Using the Entity-Relationship Model


UNIT:2 Chapter:3

Outline of Database Design:

The main phases of database design are depicted in Figure

• Requirements Collection and Analysis: purpose is to produce a description of the
users' requirements.
• Conceptual Design: purpose is to produce a conceptual schema for the
database, including detailed descriptions of entity types, relationship types, and
constraints. All these are expressed in terms provided by the data model being used.
• Implementation: purpose is to transform the conceptual schema (which is at a
high/abstract level) into a (lower-level) representational/implementation model
supported by whatever DBMS is to be used.
• Physical Design: purpose is to decide upon the internal storage structures, access
paths (indexes), etc., that will be used in realizing the representational model
produced in the previous phase.

Entity-Relationship (ER) Model

Our focus now is on the second phase, conceptual design, for which the Entity-Relationship
(ER) Model is a popular high-level conceptual data model.

In the ER model, the main concepts are entity, attribute, and relationship.


Entities and Attributes:


Entity:
An entity is a “thing” or “object” in the real world that is distinguishable from all other
objects.
For example, each person in an enterprise is an entity.
An entity has a set of properties, and the values for some set of properties may uniquely
identify an entity.
For instance, a person may have a person-id property whose value uniquely identifies that
person.
An entity may be concrete, such as a person or a book, or it may be abstract, such as a loan,
or a holiday, or a concept.
The ER model refers to a specific table row as an entity instance or entity occurrence.
An entity in ER model is represented by a rectangle containing the entity’s
name. The entity name, a noun, is usually written in capital letters.
Entity Types:
Entities are of two types:
Strong Entity: An entity that does not depend on any other entity for its existence is
known as a strong entity. A strong entity is represented by a rectangle.

EX:
EMPLOYEE

Weak Entity: An entity that depends on another entity for its existence is known as a
weak entity. A weak entity is represented by a double rectangle.

EX:
DEPENDENT

NOTE: A strong entity has key attributes of its own, whereas a weak entity has no key
of its own; it is identified with the help of the entity on which it depends.


Entity Set:
An entity set is a set of entities of the same type that share the same properties, or
attributes.
Ex: The set of all persons who are customers at a given bank.
Attributes:
Attributes are characteristics or properties of entities.
For example, the EMPLOYEES entity includes the attributes SSN, NAME, and LOT.
In the original Chen notation, attributes are represented by ovals and are connected
to the entity rectangle with a line. Each oval contains the name of the attribute it
represents.

Required Attribute:
A required attribute is an attribute that must have a value, i.e., it cannot be left empty. In
Crow's Foot notation, such attributes are indicated in boldface.
Optional Attribute:
An Optional attribute is an attribute that does not require a value. Therefore, it can be left
empty.
Domains:
Attributes have a domain. A domain is the set of possible values for a given attribute.
For example, the domain for the (numeric) attribute grade point average (GPA) is written
(0, 4) because the lowest possible GPA value is 0, and the highest possible value is 4. The
domain for the attribute SEX consists of only two possibilities, M or F.
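
The two domains above can be written as simple membership tests. This is only a minimal sketch in Python; the names `DOMAINS` and `in_domain` are hypothetical, and the GPA bounds are treated as inclusive because the text states that 0 and 4 are themselves possible values.

```python
# Hypothetical domain definitions for the two attributes in the example above.
DOMAINS = {
    "GPA": lambda v: isinstance(v, (int, float)) and 0 <= v <= 4,
    "SEX": lambda v: v in {"M", "F"},
}


def in_domain(attribute, value):
    """Return True if value belongs to the domain declared for attribute."""
    return DOMAINS[attribute](value)


print(in_domain("GPA", 3.2))  # True
print(in_domain("GPA", 4.5))  # False
print(in_domain("SEX", "X"))  # False
```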

Types of Attributes:

1. Composite attribute versus atomic attribute/simple attribute:

A composite attribute is one that is composed of smaller parts.

Example 1: A Birth_Date attribute can be viewed as being composed of (sub-) attributes
for month, day, and year.
Example 2: An Address attribute can be viewed as being composed of (sub-) attributes for
street address, city, state, and zip code. A street address can itself be viewed as being
composed of a number, street name, and apartment number. As this suggests,
composition can extend to a depth of two (as here) or more.

To describe the structure of a composite attribute, one can draw a tree, as shown in the
figure below.


When we are limited to using text, it is customary to write the attribute's name followed by a
parenthesized list of its sub-attributes. For the two examples above, we
would write
BirthDate(Month,Day,Year)
Address(StreetAddr(StrNum, StrName, AptNum), City, State, Zip)
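
The nesting in the Address notation can be mirrored with nested record types. The following is a sketch under assumptions, not part of the ER model itself: the class and field names are hypothetical Python renderings of the sub-attributes above, and the address values are made up. The helper `leaves` walks the tree and lists the atomic attributes found at its leaves.

```python
from dataclasses import dataclass, is_dataclass


@dataclass
class StreetAddr:
    str_num: str
    str_name: str
    apt_num: str


@dataclass
class Address:
    street_addr: StreetAddr  # itself composite, so the tree has depth two
    city: str
    state: str
    zip_code: str


def leaves(value, prefix=""):
    """List the atomic (leaf) attribute names of a composite value."""
    names = []
    for field_name, field_value in vars(value).items():
        path = prefix + field_name
        if is_dataclass(field_value):
            names.extend(leaves(field_value, path + "."))
        else:
            names.append(path)
    return names


addr = Address(StreetAddr("12", "Main St", "4B"), "Mysore", "KA", "570001")
print(leaves(addr))
```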

Atomic attribute:
An atomic attribute is indivisible or indecomposable. For example, age, sex, and marital status
are classified as simple (atomic) attributes.

2. Single- vs. multi-valued attribute:


Single valued attribute:
A single-valued attribute holds exactly one value for a given entity.
Examples of single-valued attributes are age, regno, ssn, and part number.
Multi-valued Attributes:
Multi-valued attributes are attributes that can have many values.
For instance, a person may have several college degrees.
Another example: if a department is located in several places, then the attribute
dept_location of the department entity may take multiple values.
E.g.: 1. dept_location (Bangalore, Hyderabad, Mysore) 2. Color of car (grey, blue, black).

Chen’s representation of multi-valued attribute

To distinguish a multi-valued attribute from a single-valued one, it is customary to enclose the
former within curly braces (which makes sense, as such an attribute has a value that is a set,
and curly braces are traditionally used to denote sets). Using a PERSON entity type as an
example, we would depict its structure in text as

PERSON (SSN, Name, BirthDate(Month, Day, Year), { AcademicDegrees(School, Level, Year) },
{ Dependents }, ...)
Assume that each academic degree is described by a school, a level (e.g., B.S., Ph.D.), and a
year. Thus, AcademicDegrees is not only multi-valued but also composite. We refer to an
attribute that involves some combination of multi-valuedness and compositeness as a
complex attribute.
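
As an illustration only, the PERSON structure can be sketched in Python with a set standing in for each pair of curly braces. The class names, field names, and sample values below are assumptions made for the sketch; only the attribute structure comes from the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AcademicDegree:  # composite: School, Level, Year
    school: str
    level: str
    year: int


@dataclass
class Person:
    ssn: str
    name: str
    # Multi-valued attributes hold a set of values; the set may be empty.
    academic_degrees: set = field(default_factory=set)
    dependents: set = field(default_factory=set)


p = Person("123-45-6789", "Asha")
p.academic_degrees.add(AcademicDegree("State University", "B.S.", 2015))
p.academic_degrees.add(AcademicDegree("State University", "Ph.D.", 2021))
print(len(p.academic_degrees))  # 2
```

AcademicDegrees is complex in the sense defined above: each element of the set is itself a composite value.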

3. Stored vs. derived attribute:


Derived Attributes:
A Derived attribute is an attribute whose value is calculated or derived from other
attributes.
For example, if the customer entity set has an attribute date-of-birth, we can
calculate age from date-of-birth and the current date. Thus, age is a derived attribute.
In Chen notation, a derived attribute is indicated by a dashed line connecting the
attribute and the entity.
Derived attributes are sometimes referred to as computed attributes.
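
The age example can be made concrete with a short function. This is a minimal sketch; the function name `age_on` and the sample dates are hypothetical, and age is taken to mean completed years.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Derive age in completed years from date_of_birth as of today."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not yet occurred.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


print(age_on(date(1990, 6, 15), date(2024, 6, 14)))  # 33
print(age_on(date(1990, 6, 15), date(2024, 6, 15)))  # 34
```

Because the result depends on the current date, a stored copy of age would go stale, which is why age is usually derived rather than stored.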

Stored Attribute:
An attribute whose value cannot be derived from other attributes is known as a
stored attribute.
Example: The attributes accno and DOB cannot be derived from other attributes.
Advantages and Disadvantages of Storing Derived Attributes

Derived attribute is stored:
• Advantages:
o Saves CPU processing cycles.
o Saves data access time.
o Data value is readily available.
o Can be used to keep track of historical data.
• Disadvantages:
o Requires constant maintenance to ensure the derived value is current, especially if
any values used in the calculation change.

Derived attribute is not stored:
• Advantages:
o Saves storage space.
o Computation always yields the current value.
• Disadvantages:
o Uses CPU processing cycles.
o Increases data access time.
o Adds coding complexity to queries.

4. The Null value: In some cases, a particular entity might not have an applicable value
for a particular attribute. Or that value may be unknown. Or, in the case of a multi-valued
attribute, the appropriate value might be the empty set.

Example: The attribute DateOfDeath is not applicable to a living person, and its correct value
may be unknown for some persons who have died.

In such cases, we use a special attribute value (or, arguably, non-value) called null. There has
been some argument in the database literature about whether a different approach (such as having
distinct values for not applicable and unknown) would be superior.

5. Complex Attribute:
An attribute that is built by combining multi-valued and composite attributes is
known as a complex attribute.
• Example: A person can have more than one residence; each residence can have more
than one phone.
6. Key Attributes:
A key attribute is an attribute whose value identifies an entity, i.e., it serves as the
primary key or as part of it. A key attribute has a distinct value for each entity in an entity set.
• Example: Student ID is a key attribute of the STUDENT entity because no two students
have the same ID.
7. Non-Key Attributes:
These are attributes other than candidate key attributes in a table. For example, Firstname
is a non-key attribute because its value does not uniquely identify an entity.
8. Required Attribute: A required attribute is an attribute that must have a
data value. These attributes are required because they describe what is
important in the entity. For example, in a STUDENT entity, firstname and
lastname are required attributes. In the example above, the two boldfaced
attributes are those for which data is required.

Domains (Value Sets) of Attributes:


The domain of an attribute is the "universe of values" from which its value can be drawn. In
other words, an attribute's domain specifies its set of allowable values. The concept is
similar to that of a data type.

Example Database Application: COMPANY

Suppose that Requirements Collection and Analysis results in the following (informal)
description of the COMPANY miniworld:


The company is organized as a collection of departments.

➢ Each department
o has a unique name
o has a unique number
o is associated with a set of locations
o has a particular employee who acts as its manager (and who assumed that
position on some date)
o has a set of employees assigned to it
o controls a set of projects
➢ Each project
o has a unique name
o has a unique number
o has a single location
o has a set of employees who work on it
o is controlled by a single department
➢ Each employee
o has a name
o has an SSN that uniquely identifies her/him
o has an address
o has a salary
o has a sex
o has a birth date
o has a direct supervisor
o has a set of dependents
o is assigned to one department
o works some number of hours per week on each of a set of projects (which
need not all be controlled by the same department)
➢ Each dependent
o has a first name
o has a sex
o has a birth date
o is related to a particular employee in a particular way (e.g., child, spouse, pet)
o is uniquely identified by the combination of her/his first name and the employee
of which (s)he is a dependent.

Initial Conceptual Design of COMPANY database

Using the above structured description as a guide, we get the following preliminary design for
entity types and their attributes in the COMPANY database:

• DEPARTMENT (Name, Number, { Locations }, Manager, ManagerStartDate, { Employees },
{ Projects })
• PROJECT (Name, Number, Location, { Workers }, ControllingDept)
• EMPLOYEE (Name (FName, MInit, LName), SSN, Sex, Address, Salary, BirthDate,
Dept, Supervisor, { Dependents }, { WorksOn(Project, Hours) })
• DEPENDENT (Employee, FirstName, Sex, BirthDate, Relationship)

Note: The attribute WorksOn of EMPLOYEE (which records on which projects the
employee works) is not only multi-valued (because there may be several such projects) but also
composite, because we want to record, for each such project, the number of hours per week that
the employee works on it. Also, each candidate key has been indicated by underlining.

For similar reasons, the attributes Manager and ManagerStartDate of DEPARTMENT really
ought to be combined into a single composite attribute. Not doing so causes little or no
harm, however, because these are single-valued attributes. If they were multi-valued, on the
other hand, keeping them separate would pose some difficulties. Suppose, for example, that a
department could have two or more managers, and that some department had managers Mary and
Harry, whose start dates were 10-4-1999 and 1-13-2001, respectively. Then the values of the
Manager and ManagerStartDate attributes would be { Mary, Harry } and { 10-4-1999, 1-13-2001 }.
But from these two attribute values, there is no way to determine which manager started on
which date. On the other hand, by recording this data as a set of ordered pairs, in which
each pair identifies a manager and her/his starting date, this deficiency is eliminated.
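
The loss of information can be seen directly with Python sets, which, like the multi-valued attributes above, are unordered. This is a small sketch; the variable names are hypothetical, and the data are the Mary and Harry values from the paragraph.

```python
# Two separate multi-valued attributes: the pairing is lost.
managers = {"Mary", "Harry"}
manager_start_dates = {"10-4-1999", "1-13-2001"}

# One multi-valued composite attribute: a set of (manager, start date) pairs.
manager_assignments = {("Mary", "10-4-1999"), ("Harry", "1-13-2001")}

start_date_of = dict(manager_assignments)
print(start_date_of["Harry"])  # 1-13-2001

# The separate sets can be recovered from the pairs, but not the other way round.
assert {m for m, _ in manager_assignments} == managers
assert {d for _, d in manager_assignments} == manager_start_dates
```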


Relationship:
• A relationship is an association between entities.
• The entities that participate in a relationship are also known as participants, and each
relationship is identified by a name that describes the relationship.
• The relationship name is an active or passive verb.
• For example, a STUDENT takes a CLASS, a PROFESSOR teaches a CLASS, a
DEPARTMENT employs a PROFESSOR, and an AIRCRAFT is flown by a CREW.
Relationships between entities always operate in both directions. For example, to define
the relationship between the entities named CUSTOMER and INVOICE, we can specify
that:
• A CUSTOMER may generate many INVOICEs.
• Each INVOICE is generated by one CUSTOMER.
This relationship can be classified as 1:M.
The relationship between DIVISION and EMPLOYEE can be specified as:
• A DIVISION is managed by one EMPLOYEE.
• An EMPLOYEE may manage only one DIVISION.
This relationship is classified as 1:1.
Relationship set
• A relationship set is a set of relationships of the same type.
Recursive relationship set
• If the same entity set participates in a relationship set more than once, the
relationship set is called a recursive relationship set.
• E.g.: In a "manages" relationship among employees, the EMPLOYEE entity set
participates twice, once in the role of manager and once in the role of the employee
being managed.
Connectivity and Cardinality:
The term connectivity is used to describe the relationship classification (1:1, 1:M, or M:N).
Cardinality expresses the minimum and maximum number of entity occurrences associated
with one occurrence of the related entity.
In the ERD, cardinality is indicated by placing the appropriate numbers beside the
entities, using the format (x, y). The first value represents the minimum number of
associated entities, and the second value represents the maximum number of associated
entities.
In the Chen model shown in the figure, cardinalities represent the number of
occurrences in the related entity. For example, the cardinality (1,4) written next to the
PROFESSOR entity in the "PROFESSOR teaches CLASS" relationship indicates that the
PROFESSOR table's foreign key value occurs at least once and no more than four times
in the CLASS table. Similarly, the cardinality (1, 1) written next to the CLASS entity
indicates that each class is taught by one and only one professor.
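
A (minimum, maximum) cardinality can be checked against data by counting related occurrences. The sketch below assumes hypothetical professor identifiers and a made-up list in which each entry is the professor referenced by one CLASS occurrence, so the (1, 1) cardinality on the CLASS side holds by construction. The function name `violations` is also an assumption.

```python
from collections import Counter


def violations(parents, child_fks, minimum, maximum):
    """Return parents whose count of related children lies outside [minimum, maximum]."""
    counts = Counter(child_fks)
    return {p: counts[p] for p in parents if not minimum <= counts[p] <= maximum}


professors = ["P1", "P2", "P3"]
# Each entry is the professor referenced by one CLASS occurrence (hypothetical data).
class_professor = ["P1", "P1", "P2", "P2", "P2", "P2", "P2"]

print(violations(professors, class_professor, 1, 4))  # {'P2': 5, 'P3': 0}
```

Here P2 teaches five classes (above the maximum of 4) and P3 teaches none (below the minimum of 1), so both violate the (1, 4) cardinality.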


RELATIONSHIP PARTICIPATION
Participation in an entity relationship is either optional or mandatory.
Optional: Participation is optional if one entity occurrence does not require a
corresponding entity occurrence in a particular relationship.
For example, in the "COURSE generates CLASS" relationship, an entity occurrence (row) in the
COURSE table does not necessarily require the existence of a corresponding entity
occurrence in the CLASS table.
Therefore, the CLASS entity is considered to be optional to the COURSE entity.
In the Chen and Crow's Foot models, an optional relationship between entities is
shown by drawing a small circle (0) on the side of the optional entity, as shown in the
figure below. The existence of optionality indicates that the minimum cardinality is 0 for
the optional entity.

Mandatory: Relationship participation is mandatory if one entity occurrence
requires a corresponding entity occurrence in a particular relationship. If no
optionality symbol is depicted with the entity, that entity exists in a mandatory
relationship with the related entity. The existence of a mandatory relationship indicates
that the minimum cardinality is 1 for the mandatory entity.
In the "COURSE generates CLASS" relationship, it is easy to see that a CLASS cannot exist
without a COURSE. Therefore, the COURSE entity is mandatory in the relationship.
CLASS, in turn, is mandatory rather than optional when the semantics of the relationship
are given by the statement "Each COURSE generates one or more CLASSes." In ER
terms, each COURSE in the "generates" relationship must then have at least one CLASS.
Therefore, a CLASS must be created as the COURSE is created.

Relationship Degree:
A relationship's degree indicates the number of entities or participants
associated with a relationship.
A unary relationship exists when an association is maintained within a
single entity.
A binary relationship exists when two entities are associated.
A ternary relationship exists when three entities are associated.
Unary Relationship:
In the unary relationship shown in the figure below, an employee within
the EMPLOYEE entity is the manager for one or more employees
within that entity. In this case, the existence of the "manages"
relationship means that EMPLOYEE requires another EMPLOYEE
to be the manager; that is, EMPLOYEE has a relationship with
itself. Such a relationship is known as a recursive relationship.

Binary Relationship:
A binary relationship exists when two entities are associated in a relationship. To simplify the
conceptual design, whenever possible, most higher-order relationships are decomposed into
appropriate equivalent binary relationships. In the figure, "a PROFESSOR teaches one or more
CLASSes" represents a binary relationship.


Ternary and Higher-Degree Relationships:

A ternary relationship implies an association among three different entities. For example, the
relationships in the figure below are represented by the following business rules:
• A DOCTOR writes one or more PRESCRIPTIONs.
• A PATIENT may receive one or more PRESCRIPTIONs.
• A DRUG may appear in one or more PRESCRIPTIONs.

Recursive Relationships:
A recursive relationship is one in which a relationship can exist between occurrences of
the same entity set.
For example, a 1:M unary relationship can be expressed by "an EMPLOYEE may manage
many EMPLOYEEs, and each EMPLOYEE is managed by one EMPLOYEE."
Finally, the M:N unary relationship may be expressed by "a COURSE may be a
prerequisite to many other COURSEs, and each COURSE may have many other
COURSEs as prerequisites."
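
A 1:M recursive relationship can be sketched as a mapping from each employee to the employee who manages them. The names and data below are hypothetical. As a further assumption, the top-level employee has no manager (None), which relaxes the rule that each EMPLOYEE is managed by one EMPLOYEE so that the chain can end.

```python
# Hypothetical data: each EMPLOYEE maps to the EMPLOYEE who manages them.
managed_by = {"Ann": None, "Bob": "Ann", "Cy": "Ann", "Di": "Bob"}


def manages(manager):
    """Employees directly managed by the given employee (the 'many' side)."""
    return sorted(e for e, m in managed_by.items() if m == manager)


def chain_of_command(employee):
    """Follow the recursive relationship upward from employee."""
    chain = []
    while managed_by[employee] is not None:
        employee = managed_by[employee]
        chain.append(employee)
    return chain


print(manages("Ann"))          # ['Bob', 'Cy']
print(chain_of_command("Di"))  # ['Bob', 'Ann']
```

Both sides of the relationship refer to the same entity set, which is what makes it recursive.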

Types of Relationships:
A relationship describes an association among entities. Data models use three types of
relationships:
• one-to-many(1:M)
• many-to-many (M: N)
• one-to-one (1:1)


➢ One-to-many (1:M or 1...*) relationship:


A one-to-many relationship exists between a pair of tables when a single record in the first
table can be related to one or more records in the second table, but a single record in the
second table can be related to only one record in the first table.
Example:

A painter paints many different paintings, but each one of them is painted by only one
painter. Thus, the painter (the “one”) is related to the paintings (the “many”). Therefore,
database designers label the relationship “PAINTER paints PAINTING” as 1: M.

➢ Many-to-many (M: N or *..*) relationship


A pair of tables bears a many-to-many relationship when a single record in the first table can
be related to one or more records in the second table and a single record in the second table
can be related to one or more records in the first table.
Example:
An employee may learn many job skills, and each job skill may be learned by many
employees. Database designers label the relationship “EMPLOYEE learns SKILL” as M: N.

➢ One-to-one (1:1 or 1..1) relationship


A pair of tables bears a one-to-one relationship when a single record in the first table is
related to only one record in the second table, and a single record in the second table is
related to only one record in the first table.
Example:

A store is managed by a single employee. In turn, each store manager, who is an
employee, manages only a single store. Therefore, the relationship "EMPLOYEE manages
STORE" is labeled 1:1.
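
The three definitions above can be turned into a small classifier over observed (first table, second table) pairs. This is a sketch under assumptions: the function name `classify` and the sample data are hypothetical. Note that the true classification comes from the business rules, so a sample of data can only show the least restrictive label that the data require.

```python
from collections import defaultdict


def classify(pairs):
    """Classify a binary relationship, given as (a, b) pairs, as 1:1, 1:M, M:1 or M:N."""
    bs_per_a = defaultdict(set)
    as_per_b = defaultdict(set)
    for a, b in pairs:
        bs_per_a[a].add(b)
        as_per_b[b].add(a)
    a_many = any(len(bs) > 1 for bs in bs_per_a.values())    # some a relates to many b
    b_many = any(len(as_) > 1 for as_ in as_per_b.values())  # some b relates to many a
    if a_many and b_many:
        return "M:N"
    if a_many:
        return "1:M"
    if b_many:
        return "M:1"
    return "1:1"


paints = [("painter1", "paintingA"), ("painter1", "paintingB"), ("painter2", "paintingC")]
learns = [("emp1", "skill1"), ("emp1", "skill2"), ("emp2", "skill1")]
manages = [("emp1", "store1"), ("emp2", "store2")]

print(classify(paints))   # 1:M
print(classify(learns))   # M:N
print(classify(manages))  # 1:1
```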


Existence Dependency:
If an entity's existence depends on the existence of one or more other entities,
it is said to be Existence-dependent.
For Example, if an XYZ Corporation employee wants to claim one or more
dependents for tax, the relationship is "EMPLOYEE claims DEPENDENT". In this case,
the DEPENDENT entity is clearly existence-dependent on the EMPLOYEE entity,
because it is impossible for the dependent to exist apart from the EMPLOYEE in the
XYZ Company's database.

If an entity can exist apart from one or more related entities, it is said to be
existence-independent (strong entity).

Weak Relationship (Non-Identifying Relationship):

A weak relationship, also known as a non-identifying relationship, exists if
the PK of the related entity does not contain a PK component of the parent
entity, even when the related entity is existence-dependent on the parent.
For example, suppose that the COURSE and CLASS entities are defined
as

COURSE (CRS_CODE, DEPT-CODE, CRS-DESCRIPTION, CRS-CREDIT)


CLASS (CLASS_CODE, CRS_CODE, CLASS-SECTION, CLASS-TIME... etc.)

In this case, a weak relationship exists between COURSE and CLASS, because
CLASS_CODE is the CLASS entity's PK, while the CRS_CODE in CLASS is only an FK.
In this case, the CLASS PK did not inherit the PK component from the COURSE
entity. The figure below shows that the Crow's Foot model depicts a weak
relationship by placing a dashed relationship line between the related entities.
The Chen model does not make a distinction between weak and strong
relationships.

A Weak (Non-Identifying) Relationship Between Course And Class


Strong (Identifying) Relationships


A strong relationship, also known as an identifying relationship, exists when
the PK of the related entity contains a PK component of the parent entity.
For example, the definitions of the COURSE and CLASS entities
COURSE (CRS_CODE, DEPT-CODE, CRS-DESCRIPTION, CRS-CREDIT)
CLASS (CRS_CODE, CLASS-SECTION, CLASS_TIME, ROOM_CODE, PROF_NUM)
indicate that a strong relationship exists between COURSE and CLASS, because the
CLASS entity's composite PK is composed of CRS_CODE + CLASS-SECTION.

The figure below shows that the Crow's Foot model depicts the strong (identifying)
relationship with a solid line between the entities.
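
The difference between the two CLASS definitions shows up in the primary key alone. The following sketch uses Python's built-in sqlite3 module. The table names CLASS_WEAK and CLASS_STRONG are hypothetical labels for the two variants, the attribute lists are abbreviated, and underscores replace the hyphens in the attribute names because hyphens are not valid in unquoted SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE COURSE (CRS_CODE TEXT PRIMARY KEY, CRS_DESCRIPTION TEXT);

-- Weak (non-identifying): CLASS has its own PK; CRS_CODE is only an FK.
CREATE TABLE CLASS_WEAK (
    CLASS_CODE TEXT PRIMARY KEY,
    CRS_CODE TEXT NOT NULL REFERENCES COURSE (CRS_CODE),
    CLASS_SECTION TEXT
);

-- Strong (identifying): the PK of CLASS contains the PK of COURSE.
CREATE TABLE CLASS_STRONG (
    CRS_CODE TEXT NOT NULL REFERENCES COURSE (CRS_CODE),
    CLASS_SECTION TEXT NOT NULL,
    CLASS_TIME TEXT,
    PRIMARY KEY (CRS_CODE, CLASS_SECTION)
);
""")


def pk_columns(table):
    """Return the primary-key column names of table, in key order."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk > 0 marks key columns.
    return [name for _, name, _, _, _, pk in sorted(rows, key=lambda r: r[5]) if pk > 0]


print(pk_columns("CLASS_WEAK"))    # ['CLASS_CODE']
print(pk_columns("CLASS_STRONG"))  # ['CRS_CODE', 'CLASS_SECTION']
```

In the strong variant the parent key CRS_CODE appears inside the child's primary key; in the weak variant it does not.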

Weak Entities: A weak entity is one that meets two conditions:

• The entity is existence-dependent, i.e., it cannot exist without the entity with which it has a
relationship.
• The entity has a primary key that is partially or totally derived from the parent entity in the
relationship.
For example, a company insurance policy may insure an employee and
his/her dependents. For the purpose of describing an insurance policy, an
EMPLOYEE may or may not have a DEPENDENT, but the DEPENDENT must
be associated with an EMPLOYEE. Moreover, the DEPENDENT cannot exist
without the EMPLOYEE.
DEPENDENT is the weak entity in the relationship "EMPLOYEE has DEPENDENT".

The Chen model identifies the weak entity by using a double-walled entity rectangle.
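
Both conditions can be demonstrated with Python's built-in sqlite3 module. This is only a sketch: the column names and sample rows are hypothetical, and ON DELETE CASCADE is one possible way of enforcing existence dependency that the text itself does not prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY, NAME TEXT);
CREATE TABLE DEPENDENT (
    EMP_SSN TEXT NOT NULL REFERENCES EMPLOYEE (SSN) ON DELETE CASCADE,
    FIRST_NAME TEXT NOT NULL,
    PRIMARY KEY (EMP_SSN, FIRST_NAME)  -- PK partly derived from the parent
);
INSERT INTO EMPLOYEE VALUES ('111', 'Ravi');
INSERT INTO DEPENDENT VALUES ('111', 'Meena');
""")

# A DEPENDENT without an existing EMPLOYEE is rejected.
try:
    conn.execute("INSERT INTO DEPENDENT VALUES ('999', 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Removing the EMPLOYEE removes the DEPENDENT with it.
conn.execute("DELETE FROM EMPLOYEE WHERE SSN = '111'")
remaining = conn.execute("SELECT COUNT(*) FROM DEPENDENT").fetchone()[0]
print(orphan_rejected, remaining)  # True 0
```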


A strong (identifying) relationship indicates that the related entity is weak, because such
a relationship means that both conditions for the weak entity definition have been met:
the related entity is existence-dependent, and the PK of the related entity contains a
PK component of the parent entity.

Refining the ER Design:


We can now refine the database design in the figure below by changing the attributes that
represent relationships into relationship types. The cardinality ratio and participation
constraint of each relationship type are determined from the requirements listed in a sample
database application. If some cardinality ratio or dependency cannot be determined from the
requirements, the users must be questioned further to determine these structural constraints.
In our example, we specify the following relationship types:
➢ MANAGES, which is a 1:1(one-to-one) relationship type between EMPLOYEE and
DEPARTMENT. EMPLOYEE participation is partial. DEPARTMENT participation is not
clear from the requirements. We question the users, who say that a department must
have a manager at all times, which implies total participation. The attribute Start_date
is assigned to this relationship type.
➢ WORKS_FOR, a 1:N (one-to-many) relationship type between DEPARTMENT and
EMPLOYEE. Both participations are total.
➢ CONTROLS, a 1:N relationship type between DEPARTMENT and PROJECT. The
participation of PROJECT is total, whereas that of DEPARTMENT is determined to be
partial, after consultation with the users indicates that some departments may control
no projects.
➢ SUPERVISION, a 1:N relationship type between EMPLOYEE (in the supervisor role) and
EMPLOYEE (in the supervisee role). Both participations are determined to be partial,
after the users indicate that not every employee is a supervisor and not every
employee has a supervisor.
➢ WORKS_ON, determined to be an M:N (many-to-many) relationship type with
attribute Hours, after the users indicate that a project can have several employees
working on it. Both participations are determined to be total.
➢ DEPENDENTS_OF, a 1:N relationship type between EMPLOYEE and DEPENDENT, which
is also the identifying relationship for the weak entity type DEPENDENT. The
participation of EMPLOYEE is partial, whereas that of DEPENDENT is total.
After specifying the previous six relationship types, we remove from the entity types in
the figure all attributes that have been refined into relationships. These include Manager
and Manager_start_date from DEPARTMENT; Controlling_department from PROJECT;
Department, Supervisor, and Works_on from EMPLOYEE; and Employee from
DEPENDENT. It is important to have the least possible redundancy when we design the
conceptual schema of a database.
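
To see why this refinement reduces redundancy, consider WORKS_ON held once as a relationship set with its attribute Hours. The sketch below uses hypothetical SSNs, project numbers, and hours. The former WorksOn attribute of EMPLOYEE and the former Workers attribute of PROJECT both become views of the same data.

```python
# WORKS_ON as a relationship set: one entry per (employee SSN, project number)
# pair, carrying the relationship attribute Hours. All values are made up.
works_on = {
    ("111", 10): 20.0,
    ("111", 20): 20.0,
    ("222", 10): 35.0,
}


def projects_of(ssn):
    """Projects an employee works on (formerly the WorksOn attribute of EMPLOYEE)."""
    return sorted(p for (e, p) in works_on if e == ssn)


def workers_on(project):
    """Employees working on a project (formerly the Workers attribute of PROJECT)."""
    return sorted(e for (e, p) in works_on if p == project)


def weekly_hours(ssn):
    """Total hours per week across all projects of one employee."""
    return sum(h for (e, _), h in works_on.items() if e == ssn)


print(projects_of("111"))   # [10, 20]
print(workers_on(10))       # ['111', '222']
print(weekly_hours("111"))  # 40.0
```

Each fact is recorded once, so the two directions of the relationship cannot disagree with each other.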


ER-Diagram Notations:


Naming Conventions for ER Diagrams:


We choose to use singular names for entity types, rather than plural ones, because the
entity type name applies to each individual entity belonging to that entity type.
In ER diagrams, we will use the convention that entity type and relationship type names
are uppercase letters, attribute names have their initial letter capitalized, and role names
are lowercase letters.
Entities are represented by nouns, and relationships are represented by verbs.
Attribute names generally arise from additional nouns that describe the nouns corresponding
to entity types.

Design Choices for ER Conceptual Design:

Sometimes it is not clear whether a particular miniworld concept ought to be modeled as
an entity type, an attribute, or a relationship type. Here are some guidelines (given with
the understanding that schema design is an iterative process in which an initial design is
refined repeatedly until a satisfactory result is achieved):

➢ As happened in our development of the ER model for COMPANY, if an
attribute of entity type A serves as a reference to an entity of type B, it may be
wise to refine that attribute into a binary relationship involving entity types A and
B. It may well be that B has a corresponding attribute referring back to A, in which
case it, too, is refined into the aforementioned relationship. In our COMPANY
example, this was exemplified by the Projects and ControllingDept attributes of
DEPARTMENT and PROJECT, respectively.
➢ An attribute that exists in several entity types may be refined into its own entity
type. For example, suppose that in a UNIVERSITY database we have entity
types STUDENT, INSTRUCTOR, and COURSE, all of which have a Department
attribute. Then it may be wise to introduce a new entity type, DEPARTMENT, and
then to follow the preceding guideline by introducing a binary relationship
between DEPARTMENT and each of the three aforementioned entity types.
➢ An entity type that is involved in very few relationships (say, zero, one, or possibly
two) could be refined into an attribute (of each entity type to which it is related).
