
CSE221: Database Systems

Lecture 2: Data Modeling - Entity-Relationship (ER) and Extended ER Models
Professor Shaker El-Sappagh
Shaker.elsappagh@gu.edu.eg
Spring 2023
Chapter Outline
• Overview of Database Design Process
• Example Database Application (COMPANY)
• ER Model Concepts
• Entities and Attributes
• Entity Types, Value Sets, and Key Attributes
• Relationships and Relationship Types
• Weak Entity Types
• Roles and Attributes in Relationship Types
• ER Diagrams - Notation
• ER Diagram for COMPANY Schema
• Alternative Notations – UML class diagrams, others
• Relationships of Higher Degree
Overview of Database Design Process
• Two main activities for database application design:
• Database design
• Applications design
• Conceptual database design: designing the conceptual schema for a
database application, that is, defining its structures and constraints.
• Applications design focuses on the programs and interfaces that access the
database
• Generally considered part of software engineering
Overview of Database Design Process
[Figure: phases of database design, with the following annotations]
• User-defined operations (or transactions) that will be applied to the database,
including both retrievals and updates.
• Specified in as detailed and complete a form as possible.
• Entity types, relationships, and constraints.
• Data model mapping is often automated or semiautomated within the database
design tools.
• Internal storage structures, file organizations, indexes, access paths, and physical
design parameters for the database files are specified.
Methodologies for Conceptual Design
• Entity Relationship (ER) modeling: popular high-level conceptual data model.
The output is ER diagrams (ERD).
• Enhanced Entity Relationship (EER)
• Use of Design Tools in industry for designing and documenting large scale
designs
• The UML (Unified Modeling Language): popular in both database and
software design. It provides various types of diagrams: data flow diagrams,
sequence diagrams, use case diagram, activity diagram, class diagram, etc.
• Class Diagrams are popular in industry to document conceptual database
designs. Operations on objects are specified, in addition to specifying the
database schema structure.
Example COMPANY Database
• We need to create a database schema design based on the following
(simplified) requirements of the COMPANY Database:
• COMPANY database keeps track of a company’s employees, departments, and projects.
• The company is organized into DEPARTMENTs. Each department has a name, number and
an employee who manages the department. We keep track of the start date of the
department manager. A department may have several locations.
• Each department controls a number of PROJECTs. Each project has a unique name,
unique number and is located at a single location.
• The database will store each EMPLOYEE’s social security number, address, salary, sex, and
birthdate.
• Each employee works for one department but may work on several projects.
• The DB will keep track of the number of hours per week that an employee currently works on each
project.
• It is required to keep track of the direct supervisor of each employee.
• Each employee may have a number of DEPENDENTs.
• For each dependent, the DB keeps a record of name, sex, birthdate, and relationship to the employee.
ER Model Concepts
• Entities and Attributes
• Entity is a basic concept for the ER model. Entities are specific things or objects in the
mini-world that are represented in the database.
• An entity may be an object with a physical existence (for example, a particular person, car,
house, or employee) or it may be an object with a conceptual existence (for instance, a
company, a job, or a university course).
• For example the EMPLOYEE John Smith, the Research DEPARTMENT, the ProductX PROJECT
• Attributes are properties used to describe an entity.
• For example an EMPLOYEE entity may have the attributes Name, SSN, Address, Sex, BirthDate
• A specific entity will have a value for each of its attributes.
• For example, a specific employee entity may have Name='John Smith', SSN='123456789',
Address='731, Fondren, Houston, TX', Sex='M', BirthDate='09-JAN-55'
• Each attribute has a value set (or data type) associated with it – e.g. integer, string, date,
enumerated type, …
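The idea of an entity with one value per attribute can be sketched in Python. This is only an illustration: the class name `Employee`, the field names, and the choice of Python types as value sets are assumptions, not part of the ER model itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Employee:
    """One EMPLOYEE entity; each field is an attribute with a value set (type)."""
    name: str
    ssn: str
    address: str
    sex: str          # assumed enumerated value set, for example 'M' or 'F'
    birth_date: date


# A specific entity has a value for each of its attributes.
e1 = Employee(
    name="John Smith",
    ssn="123456789",
    address="731, Fondren, Houston, TX",
    sex="M",
    birth_date=date(1955, 1, 9),
)
```

Here the class plays the part of the entity description and `e1` is one specific entity.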
Examples: two entities and the values of their
attributes

The EMPLOYEE entity e1 has four attributes: Name, Address, Age, and Home_phone; their values are ‘John Smith,’ ‘2311
Kirby, Houston, Texas 77001’, ‘55’, and ‘713-749-2630’, respectively. The COMPANY entity c1 has three attributes: Name,
Headquarters, and President; their values are ‘Sunco Oil’, ‘Houston’, and ‘John Smith’, respectively.
Types of Attributes (1)
• Simple
• Each entity has a single atomic value for the attribute. For example, SSN or Sex.
• Composite
• The attribute may be composed of several components. For example:
• Address(Apt#, House#, Street, City, State, ZipCode, Country), or
• Name(FirstName, MiddleName, LastName).
• Composition may form a hierarchy where some components are themselves composite.
• Multi-valued
• An entity may have multiple values for that attribute. For example, Color of a CAR or
PreviousDegrees of a STUDENT.
• Denoted as {Color} or {PreviousDegrees}.
Types of Attributes (2)
• In general, composite and multi-valued attributes may be nested arbitrarily to any number of
levels, although this is rare. Such an attribute is called a complex attribute.
• For example, PreviousDegrees of a STUDENT is a composite multi-valued attribute
denoted by {PreviousDegrees (College, Year, Degree, Field)}
• Multiple PreviousDegrees values can exist
• Each has four subcomponent attributes:
• College, Year, Degree, Field
• We can represent arbitrary nesting by grouping components of a composite attribute
between parentheses ( ) and separating the components with commas, and by displaying
multivalued attributes between braces { }.
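A complex attribute such as {PreviousDegrees (College, Year, Degree, Field)} can be sketched as a list of composite values. The class names and the sample degrees below are hypothetical and serve only to show the nesting.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Degree:
    """Composite value with four components: (College, Year, Degree, Field)."""
    college: str
    year: int
    degree: str
    field_of_study: str


@dataclass
class Student:
    name: str
    # Multivalued composite attribute {PreviousDegrees(College, Year, Degree, Field)}
    previous_degrees: List[Degree] = field(default_factory=list)


s = Student(name="Hypothetical Student")
s.previous_degrees.append(Degree("College A", 2015, "BSc", "Computer Science"))
s.previous_degrees.append(Degree("College B", 2017, "MSc", "Data Science"))
```

The list corresponds to the braces { } and the `Degree` class to the parentheses ( ) in the notation above.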
Types of Attributes (3)
• Attribute can be stored or derived: In some cases, two (or more) attribute values are
related—for example, the Age and Birth_date attributes of a person. For a particular
person entity, the value of Age can be determined from the current (today’s) date and
the value of that person’s Birth_date.
• The Age attribute is hence called a derived attribute and is said to be derivable from the
Birth_date attribute, which is called a stored attribute.
• Some attribute values can be derived from related entities; for example, an attribute
Number_of_employees of a DEPARTMENT entity can be derived by counting the
number of employees related to (working for) that department.
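A minimal sketch of the stored versus derived distinction, assuming a hypothetical `Person` class: `birth_date` is stored, while the age is computed on demand from it and a given date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Person:
    name: str
    birth_date: date  # stored attribute

    def age(self, today: Optional[date] = None) -> int:
        """Derived attribute: computed from birth_date and the current date."""
        today = today or date.today()
        # Subtract one year if this year's birthday has not happened yet.
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


p = Person("John Smith", date(1955, 1, 9))
```

Because the age is never stored, it cannot become stale; the cost is recomputing it each time it is needed.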
Example of a composite attribute
Entity Types and Key Attributes (1)
• Entities with the same basic attributes are grouped or typed into an
entity type.
• For example, the entity types EMPLOYEE and PROJECT.
• An attribute of an entity type for which each entity must have a unique
value is called a key attribute of the entity type.
• For example, SSN of EMPLOYEE.
• An entity type usually has one or more attributes whose values
are distinct for each individual entity in the entity set. Such an
attribute is called a key attribute, and its values can be used to
identify each entity uniquely.
Entity Types and Key Attributes (2)
• A key attribute may be composite.
• VehicleTagNumber is a key of the CAR entity type with components
(Number, State).
• An entity type may have more than one key.
• The CAR entity type may have two keys:
• VehicleIdentificationNumber (popularly called VIN)
• VehicleTagNumber (Number, State), aka license plate number.
• Each key is underlined (Note: this is different from the relational schema,
where only one “primary key” is underlined). A key is defined by the
uniqueness constraint.
Entity Set
• Each entity type will have a collection of entities stored in the
database
• Called the entity set or sometimes entity collection
• Same name can be used to refer to both the entity type and the
entity set
• However, entity type and entity set may be given different names
• Entity set is the current state of the entities of that type that are
stored in the database.
• An entity type describes the schema or intension for a set of
entities that share the same structure. The collection of entities
of a particular entity type is grouped into an entity set, which is
also called the extension of the entity type.
Entity type and entity set
Value Sets (Domains) of Attributes
• Each simple attribute is associated with a value set
• E.g., Lastname has a value that is a character string of up to 15 characters, say
• Date has a value of the form MM-DD-YYYY, where each letter stands for a digit
• A value set specifies the set of values associated with an attribute
• NULL Values: used in two situations
• Not applicable: In some cases, a particular entity may not have an applicable value for
an attribute. For example, a College_degrees attribute applies only to people with
college degrees. For such situations, a special value called NULL is created.
• Unknown: NULL can also be used if we do not know the value of an attribute for a
particular entity, for example, if we do not know the home phone number of ‘John
Smith’. The unknown category has two cases: the first arises when it is known that
the attribute value exists but is missing; the second arises when it is not known
whether the attribute value exists.
Attributes and Value Sets
• Value sets are similar to data types in most programming languages – e.g.,
integer, character (n), real, bit
• Mathematically, an attribute A for an entity type E whose value set is V is
defined as a function
A : E -> P(V)
Where P(V) indicates a power set (which means all possible subsets) of V. The
above definition covers simple and multivalued attributes, as well as NULLs.
NULL value is represented by the empty set.
• We refer to the value of attribute A for entity e as A(e).
• For single-valued attributes, A(e) is restricted to being a singleton set for each
entity e in E, whereas there is no restriction on multivalued attributes.
• For a composite attribute A, the value set V is the power set of the Cartesian
product of P(V1), P(V2), . . . , P(Vn), where V1, V2, . . . , Vn are the value sets of
the simple component attributes that form A: V = P(P(V1) × P(V2) × . . . × P(Vn))
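The definition A : E -> P(V) can be illustrated by mapping each entity to a set of values. The entity labels (`car1`, ...) and the sample values are hypothetical; the point is only that multivalued, single-valued, and NULL cases all fit the same shape.

```python
from typing import Dict, FrozenSet

# Hypothetical CAR entities, identified here by plain labels.
color: Dict[str, FrozenSet[str]] = {
    "car1": frozenset({"red", "black"}),   # multivalued: any subset of V
    "car2": frozenset({"white"}),
    "car3": frozenset(),                   # NULL is represented by the empty set
}

vin: Dict[str, FrozenSet[str]] = {
    "car1": frozenset({"VIN-0001"}),       # single-valued: a singleton set
    "car2": frozenset({"VIN-0002"}),
    "car3": frozenset({"VIN-0003"}),
}


def is_single_valued(attribute: Dict[str, FrozenSet[str]]) -> bool:
    """Check that A(e) is a singleton set for every entity e."""
    return all(len(values) == 1 for values in attribute.values())
```

With this data, `is_single_valued(vin)` holds while `is_single_valued(color)` does not.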
Displaying an Entity type
• In ER diagrams, an entity type is displayed in a rectangular box
• Attributes are displayed in ovals
• Each attribute is connected to its entity type
• Components of a composite attribute are connected to the oval representing
the composite attribute
• Each key attribute is underlined
• Multivalued attributes displayed in double ovals
NOTATION for ER diagrams

Entity Type CAR with two keys and a corresponding Entity Set
Key attribute
• Single attribute key: For the PERSON entity type, a typical key attribute is SSN
(Social Security number).
• Multi-attributes key: Sometimes several attributes together form a key,
meaning that the combination of the attribute values must be distinct for
each entity. The proper way to represent this in the ER model that we
describe here is to define a composite attribute and designate it as a key
attribute of the entity type.
• Notice that such a composite key must be minimal; that is, all component
attributes must be included in the composite attribute to have the
uniqueness property.
• Uniqueness property must hold for every entity set of the entity type.
• The uniqueness constraint therefore prohibits any two entities from having the
same value for the key attribute at the same time. This holds for any entity set
of the entity type at any point in time.
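The uniqueness property can be checked against one entity set with a small helper. This is a sketch under assumptions: entities are plain dictionaries, and the CAR values are invented for illustration.

```python
from typing import Callable, Hashable, Iterable


def satisfies_key(entity_set: Iterable[dict], key: Callable[[dict], Hashable]) -> bool:
    """Return True if no two entities in the set share the same key value."""
    seen = set()
    for entity in entity_set:
        value = key(entity)
        if value in seen:
            return False
        seen.add(value)
    return True


# Hypothetical CAR entity set; the sample values are illustrative only.
cars = [
    {"vin": "VIN-0001", "tag": ("ABC 123", "TX"), "make": "Ford"},
    {"vin": "VIN-0002", "tag": ("ABC 123", "NY"), "make": "Ford"},
]

vin_ok = satisfies_key(cars, lambda c: c["vin"])
tag_ok = satisfies_key(cars, lambda c: c["tag"])     # composite key (Number, State)
make_ok = satisfies_key(cars, lambda c: c["make"])   # not a key: values repeat
```

A check like this can only refute a candidate key on the data at hand. Since the constraint must hold for every entity set of the type, declaring a key remains a design decision based on the mini-world.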
Initial Conceptual Design of Entity Types for the COMPANY
Database Schema
• First, define the entity types (which later map to relations, or tables) and their
attributes. Second, define the relationships among the entity types.
• Based on the requirements, we can identify four initial entity types in
the COMPANY database:
• DEPARTMENT
• PROJECT
• EMPLOYEE
• DEPENDENT
• Their initial conceptual design is shown on the following slide
• The initial attributes shown are derived from the requirements
description
Initial Design of Entity Types: EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT
Refining the initial design by introducing relationships
• The initial design is typically not complete. Why?
• There are several implicit relationships among the various entity types.
• When an attribute of one entity type refers to another entity type, some relationship
exists.
• For example, the attribute Manager of DEPARTMENT refers to an EMPLOYEE who
manages the department; the attribute Controlling_department of PROJECT refers to
the DEPARTMENT that controls the project; the attribute Supervisor of EMPLOYEE
refers to another EMPLOYEE (the one who supervises this employee); the attribute
Department of EMPLOYEE refers to the DEPARTMENT for which the employee works
• Some aspects in the requirements will be represented as relationships
• ER model has three main concepts:
• Entities (and their entity types and entity sets)
• Attributes (simple, composite, multivalued)
• Relationships (and their relationship types and relationship sets)
Relationships and Relationship Types (1)
• A relationship relates two or more distinct entities with a specific meaning.
• For example, EMPLOYEE John Smith works on the ProductX PROJECT, or EMPLOYEE Franklin
Wong manages the Research DEPARTMENT.
• Relationships of the same type are grouped or typed into a relationship type.
• For example, the WORKS_ON relationship type in which EMPLOYEEs and PROJECTs
participate, or the MANAGES relationship type in which EMPLOYEEs and DEPARTMENTs
participate.
• The degree of a relationship type is the number of participating entity types.
• Both MANAGES and WORKS_ON are binary relationships.
Relationship instances of the WORKS_FOR N:1 relationship between EMPLOYEE and DEPARTMENT

Relationship instances of the M:N WORKS_ON relationship between EMPLOYEE and PROJECT

Definition
Relationship type vs. relationship set (1)

• Relationship Type:
• Is the schema description of a relationship
• Identifies the relationship name and the participating entity types
• Also identifies certain relationship constraints
• Relationship Set:
• The current set of relationship instances represented in the database
• The current state of a relationship type
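The type versus set distinction can be sketched as follows: a tuple class describes what one instance looks like, and a Python set holds the current instances. The name `WorksFor` and the sample identifiers are assumptions made for the example.

```python
from typing import NamedTuple, Set


class WorksFor(NamedTuple):
    """One relationship instance: one entity from each participating type."""
    employee_ssn: str
    department_number: int


# Relationship set: the current state of the WORKS_FOR relationship type.
works_for: Set[WorksFor] = {
    WorksFor("111", 5),
    WorksFor("222", 5),
    WorksFor("333", 4),
}

# The set changes as instances are added or removed; the type does not.
works_for.add(WorksFor("444", 1))
```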
Relationship type vs. relationship set (2)

• Previous figures displayed the relationship sets


• Each instance in the set relates individual participating entities – one from each
participating entity type
• In ER diagrams, we represent the relationship type as follows:
• Diamond-shaped box is used to display a relationship type
• Connected to the participating entity types via straight lines
• Note that the relationship type is not shown with an arrow. The name should
typically be readable from left to right and top to bottom.
Refining the COMPANY database schema by introducing
relationships
• By examining the requirements, six relationship types are identified
• All are binary relationships (degree 2)
• Listed below with their participating entity types:
• WORKS_FOR (between EMPLOYEE, DEPARTMENT)
• MANAGES (also between EMPLOYEE, DEPARTMENT)
• CONTROLS (between DEPARTMENT, PROJECT)
• WORKS_ON (between EMPLOYEE, PROJECT)
• SUPERVISION (between EMPLOYEE (as subordinate), EMPLOYEE (as supervisor))
• DEPENDENTS_OF (between EMPLOYEE, DEPENDENT)
ER DIAGRAM – Relationship Types are:
WORKS_FOR, MANAGES, WORKS_ON, CONTROLS, SUPERVISION, DEPENDENTS_OF
Relationship Degree, Role Names, and Recursive Relationships
Degree of a Relationship Type:
• The degree of a relationship type is the number of participating entity types.
• WORKS_FOR relationship is of degree two.
• A relationship type of degree two is called binary, and one of degree three is called ternary.
• An example of a ternary relationship is SUPPLY, where each relationship instance ri
associates three entities—a supplier s, a part p, and a project j—whenever s supplies part p
to project j.
Relationship Degree, Role Names, and Recursive Relationships
Relationships as Attributes:
• It is sometimes possible to think of a binary relationship type in terms of attributes.
• Consider the WORKS_FOR relationship type. One can think of an attribute called
Department of the EMPLOYEE entity type, where the value of Department for each
EMPLOYEE entity is (a reference to) the DEPARTMENT entity for which that employee
works.
Role Names and Recursive Relationships:
• Each entity type that participates in a relationship type plays a particular role in the
relationship.
• For example, in the WORKS_FOR relationship type, EMPLOYEE plays the role of employee
or worker and DEPARTMENT plays the role of department or employer.
• In some cases, the same entity type participates more than once in a relationship type in
different roles. In such cases the role name becomes essential for distinguishing the
meaning of the role that each participating entity plays. Such relationship types are called
recursive relationships or self-referencing relationships.
Recursive Relationship Type
• A relationship type between the same participating entity type in distinct roles
• Also called a self-referencing relationship type.
• Example: the SUPERVISION relationship
• EMPLOYEE participates twice in two distinct roles:
• supervisor (or boss) role
• supervisee (or subordinate) role
• Each relationship instance relates two distinct EMPLOYEE entities:
• One employee in supervisor role
• One employee in supervisee role
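A recursive relationship can be sketched as pairs drawn from the same entity type, with the role names carried by the field names. The employee identifiers `e1` to `e4` are hypothetical.

```python
from typing import NamedTuple, Set


class Supervision(NamedTuple):
    supervisor: str   # EMPLOYEE in the supervisor role
    supervisee: str   # EMPLOYEE in the supervisee role


# Hypothetical employee identifiers e1..e4.
supervision: Set[Supervision] = {
    Supervision(supervisor="e1", supervisee="e2"),
    Supervision(supervisor="e1", supervisee="e3"),
    Supervision(supervisor="e2", supervisee="e4"),
}


def direct_reports(boss: str) -> Set[str]:
    """Employees that appear in the supervisee role under the given supervisor."""
    return {s.supervisee for s in supervision if s.supervisor == boss}
```

Without the role names, the pair (e1, e2) would not say who supervises whom, which is why roles are essential here.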
Displaying a recursive relationship
• In a recursive relationship type, both participations are by the same entity type in
different roles.
• For example, SUPERVISION relationships between EMPLOYEE (in
role of supervisor or boss) and (another) EMPLOYEE (in role of
subordinate or worker).
• In the following figure, the first role participation is labeled with 1 and the second
role participation is labeled with 2.
• In an ER diagram, role names must be displayed to distinguish the
participations.
A Recursive Relationship: SUPERVISION
The recursive relationship type is SUPERVISION (participation role names are shown).
Refining the ER Design for
the COMPANY Database
Refining the ER Design for the COMPANY Database (Cont’d)
• In the refined design, some attributes from the initial entity types are
refined into relationships:
• Manager of DEPARTMENT -> MANAGES
• Works_on of EMPLOYEE -> WORKS_ON
• Department of EMPLOYEE -> WORKS_FOR
• etc
• In general, more than one relationship type can exist between the same
participating entity types
• MANAGES and WORKS_FOR are distinct relationship types between EMPLOYEE
and DEPARTMENT
• Different meanings and different relationship instances.
Constraints on Relationships
• Constraints on Relationship Types
• (Also known as ratio constraints)
• Cardinality Ratio (specifies maximum participation)
• One-to-one (1:1)
• One-to-many (1:N) or Many-to-one (N:1)
• Many-to-many (M:N)
• Existence Dependency Constraint (specifies minimum participation) (also called
participation constraint)
• zero (optional participation, not existence-dependent)
• one or more (mandatory participation, existence-dependent)
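As a rough illustration of cardinality ratios, the function below inspects one binary relationship set and reports which ratio its current instances are consistent with. The function name and the sample pairs are assumptions; the ratio is written left:right.

```python
from collections import Counter
from typing import Iterable, Tuple


def cardinality_ratio(pairs: Iterable[Tuple[str, str]]) -> str:
    """Classify a binary relationship set given as (left, right) pairs.

    The result is written left:right, so 'N:1' means many left entities
    may relate to one right entity.
    """
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    left_max = max(left_counts.values(), default=0)    # instances per left entity
    right_max = max(right_counts.values(), default=0)  # instances per right entity
    if left_max <= 1 and right_max <= 1:
        return "1:1"
    if left_max <= 1:
        return "N:1"
    if right_max <= 1:
        return "1:N"
    return "M:N"


# Hypothetical (employee, department) and (employee, project) instances.
works_for = [("e1", "d1"), ("e2", "d1"), ("e3", "d2")]
works_on = [("e1", "p1"), ("e1", "p2"), ("e2", "p1")]
manages = [("e1", "d1"), ("e2", "d2")]
```

Note that this only describes the observed instances. The cardinality ratio itself is a constraint on the relationship type and comes from the mini-world requirements, not from the data.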

1:1 relationship
Many-to-one (N:1) Relationship
Many-to-many (M:N) Relationship
Weak Entity Types
• A weak entity type is an entity type that does not have a key attribute of its own and
that is identification-dependent on another entity type.
• In contrast, regular entity types (strong entity types) do have a key attribute.
• A weak entity must participate in an identifying relationship type with an owner or
identifying entity type
• Entities are identified by the combination of:
• A partial key of the weak entity type
• The particular entity they are related to in the identifying relationship type
• Example:
• A DEPENDENT entity is identified by the dependent’s first name, and the specific
EMPLOYEE with whom the dependent is related
• Name of DEPENDENT is the partial key
• DEPENDENT is a weak entity type
• EMPLOYEE is its identifying entity type via the identifying relationship type
DEPENDENTS_OF
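Identification of a weak entity can be sketched by keying each dependent on the pair (owner, partial key). The helper `add_dependent`, the dictionary layout, and the sample names are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple


class Dependent(NamedTuple):
    owner_ssn: str     # identifies the owner EMPLOYEE entity
    name: str          # partial key of DEPENDENT
    relationship: str


dependents: Dict[Tuple[str, str], Dependent] = {}


def add_dependent(dep: Dependent) -> None:
    """Store a dependent under the combined identifier (owner, partial key)."""
    identifier = (dep.owner_ssn, dep.name)
    if identifier in dependents:
        raise ValueError(f"duplicate dependent {identifier}")
    dependents[identifier] = dep


# The same first name is allowed under different employees.
add_dependent(Dependent("111", "Alice", "Daughter"))
add_dependent(Dependent("222", "Alice", "Spouse"))
```

The partial key alone ("Alice") does not identify a dependent; only the combination with the owner does.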
Attributes of Relationship types
• A relationship type can have attributes:
• For example, HoursPerWeek of WORKS_ON
• Its value for each relationship instance describes the number of hours per
week that an EMPLOYEE works on a PROJECT.
• A value of HoursPerWeek depends on a particular (employee, project)
combination
• Most relationship attributes are used with M:N relationships
• In M:N relationship types, some attributes may be determined by the
combination of participating entities in a relationship instance, not by any single
entity. Such attributes must be specified as relationship attributes.
• In 1:N relationship types, they can be transferred to the entity type on the N-side
of the relationship.
• In 1:1 relationship types, they can be migrated to either of the participating
entity types.
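For an M:N relationship such as WORKS_ON, the attribute value belongs to the pair of participating entities. A minimal sketch, with hypothetical identifiers and hour values:

```python
from typing import Dict, Tuple

# Hours is keyed by the (employee, project) combination, not by either alone.
# The identifiers and hour values below are hypothetical.
works_on_hours: Dict[Tuple[str, str], float] = {
    ("e1", "ProductX"): 32.5,
    ("e1", "ProductY"): 7.5,
    ("e2", "ProductX"): 20.0,
}


def weekly_hours(employee: str) -> float:
    """Total hours per week for one employee across all projects."""
    return sum(h for (emp, _), h in works_on_hours.items() if emp == employee)
```

Storing the hours on EMPLOYEE or on PROJECT alone would lose the information about which employee spends how long on which project.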
Example Attribute of a Relationship Type: Hours of WORKS_ON
Notation for Constraints on Relationships
• Cardinality ratio (of a binary relationship): 1:1, 1:N, N:1, or M:N
• Shown by placing appropriate numbers on the relationship edges.
• Participation constraint (on each participating entity type): This
constraint specifies the minimum number of relationship instances
that each entity can participate in and is sometimes called the
minimum cardinality constraint:
• Two types: total (called existence dependency) or partial.
• Total shown by double line, partial by single line.
• Example of total participation: if every employee must work for a
department, then an employee entity can exist only if it participates in at
least one WORKS_FOR relationship instance.
• Example of partial participation: some, but not necessarily all, employee
entities are related to a department entity via MANAGES.
Alternative (min, max) notation for relationship structural
constraints:
• Specified on each participation of an entity type E in a relationship type R
• Specifies that each entity e in E participates in at least min and at most max relationship
instances in R
• Default(no constraint): min=0, max=n (signifying no limit)
• Must have minmax, min0, max 1
• Derived from the knowledge of mini-world constraints
• Examples:
• A department has exactly one manager and an employee can manage at most one
department.
• Specify (0,1) for participation of EMPLOYEE in MANAGES
• Specify (1,1) for participation of DEPARTMENT in MANAGES
• An employee can work for exactly one department but a department can have any
number of employees.
• Specify (1,1) for participation of EMPLOYEE in WORKS_FOR
• Specify (0,n) for participation of DEPARTMENT in WORKS_FOR
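The (min, max) constraints can be checked against a relationship set by counting how many instances each entity participates in. The function name, its parameters, and the sample MANAGES data are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, List, Optional


def violates_min_max(
    entities: Iterable[str],
    participants: Iterable[str],
    minimum: int,
    maximum: Optional[int],
) -> List[str]:
    """Return entities whose participation count falls outside (min, max).

    `participants` lists one entry per relationship instance an entity takes
    part in; `maximum=None` stands for n (no upper limit).
    """
    counts = Counter(participants)
    bad = []
    for e in entities:
        n = counts[e]
        if n < minimum or (maximum is not None and n > maximum):
            bad.append(e)
    return bad


# Hypothetical MANAGES instances as (employee, department) pairs.
manages = [("e1", "d1"), ("e2", "d2")]
employees = ["e1", "e2", "e3"]
departments = ["d1", "d2", "d3"]

# EMPLOYEE participates with (0,1); DEPARTMENT participates with (1,1).
bad_employees = violates_min_max(employees, [e for e, _ in manages], 0, 1)
bad_departments = violates_min_max(departments, [d for _, d in manages], 1, 1)
```

In this sample, employee e3 is fine because (0,1) allows no participation, while department d3 violates (1,1) because it has no manager.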
The (min,max) notation for relationship
constraints

Read the (min, max) numbers next to the entity type, looking away from the entity type.
COMPANY ER Schema Diagram using (min, max) notation
Alternative diagrammatic notation
• ER diagrams are one popular way of displaying database schemas
• Many other notations exist in the literature and in various database
design and modeling tools
• UML class diagrams are representative of another way of displaying ER
concepts that is used in several commercial design tools
UML class diagrams

• Represent classes (similar to entity types) as large rounded boxes with three
sections:
• Top section includes entity type (class) name
• Second section includes attributes
• Third section includes class operations (operations are not in basic ER model)
• Relationships (called associations) represented as lines connecting the classes
• Other UML terminology also differs from ER terminology
• Used in database design and object-oriented software design
• UML has many other types of diagrams for software design
UML class diagram for COMPANY database schema

Other alternative diagrammatic notations
Relationships of Higher Degree
• Relationship types of degree 2 are called binary
• Relationship types of degree 3 are called ternary and of degree n are
called n-ary
• In general, an n-ary relationship is not equivalent to n binary
relationships
• Constraints are harder to specify for higher-degree relationships (n >
2) than for binary relationships
Discussion of n-ary relationships (n > 2)

• In general, 3 binary relationships can represent different information than a
single ternary relationship
• If needed, the binary and n-ary relationships can all be included in the schema
design
• In some cases, a ternary relationship can be represented as a weak entity if the
data model allows a weak entity type to have multiple identifying relationships
(and hence multiple owner entity types)
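Why three binary relationships are not equivalent to one ternary relationship can be seen with a small counterexample. The sketch below uses a SUPPLY-style set of (supplier, part, project) triples with hypothetical identifiers: two different ternary states produce exactly the same three binary sets.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def binary_projections(supply: Set[Triple]):
    """Project a ternary (supplier, part, project) set onto its three binary views."""
    sp = {(s, p) for s, p, _ in supply}   # supplier supplies part
    pj = {(p, j) for _, p, j in supply}   # project uses part
    sj = {(s, j) for s, _, j in supply}   # supplier supplies project
    return sp, pj, sj


# Two different hypothetical SUPPLY states...
supply_a = {
    ("s1", "p1", "j1"),
    ("s1", "p2", "j2"),
    ("s2", "p1", "j2"),
    ("s2", "p2", "j1"),
}
supply_b = supply_a | {("s1", "p1", "j2")}

# ...that yield exactly the same three binary relationship sets.
same_binaries = binary_projections(supply_a) == binary_projections(supply_b)
```

From the binary sets alone one cannot tell whether s1 supplies part p1 to project j2, so the ternary relationship carries information the binary ones do not.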
Example of a ternary
relationship
Discussion of n-ary relationships (n > 2)

• If a particular binary relationship can be derived from a higher-degree
relationship at all times, then it is redundant
• For example, the TAUGHT_DURING binary relationship in Figure 3.18
(see next slide) can be derived from the ternary relationship OFFERS
(based on the meaning of the relationships)

Another example of a ternary relationship

Displaying constraints on higher-degree relationships

• The (min, max) constraints can be displayed on the edges – however, they do not
fully describe the constraints
• Displaying a 1, M, or N indicates additional constraints
• An M or N indicates no constraint
• A 1 indicates that an entity can participate in at most one relationship instance that
has a particular combination of the other participating entities
• In general, both (min, max) and 1, M, or N are needed to describe fully the
constraints
• Overall, the constraint specification is difficult and possibly ambiguous when
we consider relationships of a degree higher than two.
Another Example: A UNIVERSITY Database
• To keep track of the enrollments in classes and student grades,
another database is to be designed.
• It keeps track of the COLLEGEs, DEPARTMENTs within each college,
the COURSEs offered by departments, and SECTIONs of courses,
INSTRUCTORs who teach the sections etc.
• These entity types and the relationships among these entity types are
shown on the next slide in Figure 3.20.
UNIVERSITY
database conceptual
schema

©2016 Ramez Elmasri and Shamkant B. Navathe


Extended Entity-Relationship (EER) Model (in
the next chapter)

• The entity relationship model in its original form did not support the
specialization and generalization abstractions
• Next chapter illustrates how the ER model can be extended with
• Type-subtype and set-subset relationships
• Specialization/Generalization Hierarchies
• Notation to display them in EER diagrams
Thank you
