Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
0% found this document useful (0 votes)
78 views62 pages

Data Modeling MIT2

Download as pdf or txt
Download as pdf or txt
Download as pdf or txt
You are on page 1/ 62

Data modeling

Prof. Amos DAVID


http://ui-n2.loria.fr

Course content

Why data modeling ? Entity-Relation model Relational model We will focus on


WHY WHAT

HOW
2

Why data modeling ?

Illustration with an example

From information problem statement to data specification

A case study

Problem statement

We want an information system on students

Questions

What is an information system ? What do we mean by students ? Why do we want the information system ?

The system should provide answers to what questions ?


4

What is an information system ?

Functional characteristics of an information system


Store information (creation) Retrieve information (access) Update information (modification, deletion)

Components of an information system

Users

End-user Information system manager/administrator

Information base (database) User interface


To implement the functional characteristics Between the end-user and the information base

General schema of an IRS : functional approach


User
Information problem

Information problem transformation into access expression


Common access methods * navigation * query

Information base

Matching

Store /Update

Results

Objects
6

What do we mean by students ?

Students viewed by who ?


Admission office ? Post graduate school ? Registrars office ? A university department ? Alumni association ? By a state government ? Foreign government ? By other structures ?

University structures

Government structures

Students considered over what period ?


From admission to graduation only ? Consider ex-students ?

Who are the end-users ?


This will determine how students are viewed This will determine the final use of the information to be accessed Examples of end-users in the university structure

The VC The registrar The Dean The HOD Any category of student (in-course, ex-student)
8

Why data modeling ?

Represent the real world

Focus on the use of the elements Represent only the necessary elements Represent the relationships between the selected elements DO NOT ignore or neglect necessary elements or relations
9

Why data modeling ?

For efficient computerization

Reduce data redundancy


Disk space problem Volume of data transfer Describe the computed elements user notice technical notice

Objects of documentation

For the programmer For the system designer / manager For the end-users

Guaranty data integrity

Valid information irrespective of context


10

Example on data integrity

Admission office

STUDENT (N, names, marital status, gender, age, degree, address)

The department

STUDENT (N, names, date of birth, courses, address)

The administrative office

PERSONNEL (N, names, marital status, faculty, department, address)


11

General schema of an IRS : Users and usage centered approach


IRS (What is seen by the end-user) Implemented for Database Management System (DBMS) Determined by

Developed using

Event Operation Data


Data modeling Real world
12

Entity-Relation model

13

Represent the real world elements with four main concepts


Entity Attributes Relation Cardinality Employs graphic representation Intuitive approach

14

Entity

The basic conceptual or real element Examples

A student A personnel A town

Entities have real existence (the instances) They are identifiable


Amos DAVID Charles ROBERT Ibadan


15

Entity

Each entity is associated with a set of attributes The instances of an entity have the same characteristics They have the same set of attributes Examples

All students have the same set of attributes All members of staff have the same set of attributes

16

Attributes

Attributes are used for describing the entities The entities and their attributes are determined according to the database project Taking into account the functions to be accomplished Examples

Represent students at the department for course registration Represents members of staff for salaries and promotions One of the attributes must be an IDENTIFIER Its value is unique for each entity
17

Attributes

How to reduce redundancy

Avoid structured attributes

Structured attributes should be decomposed

Example

Names First name, last name Address Street n, street name, town, local government, state

Decomposing structured attributes allows an easy access to the component elements

Example

The town element of an address can be easily extracted instead of performing string extraction on the structured element

18

Attributes

Examples

Name, Address Amos DAVID; Dept computer science, UI Ibadan, Ibadan, Oyo state Olu OJO; 23 Aderemo street, Agbowo, AgbowoLGA, Ibadan, Oyo state Uche KALU; 5 market road, Anambra, Anambra state

Problems with this representation

The addresses do not have the same number of elements, so how can one obtain a specific component ?

The nth element ? Starting from the nth character ? How can one locate the town within an address ?
19

Attributes

Examples Dissociate structured elements

Name, Street number, Street name, Town, Local government, State

Amos DAVID; Dept computer science; UI Ibadan; Ibadan; ;Oyo state Olu OJO; 23; Aderemo street; Agbowo; Ibadan; ; Oyo state Uche KALU; 5; market road; Anambra; ; Anambra state
Efficiency

Each entry has the same number of elements A component element can be easily extracted using its position Example

The town value is always at the 4th position The state value is always at the last position The position can be in string functions or for the colon numbers in tables

20

Attributes reducing redundancy

Avoid attributes whose value is a list ; a new entity should be created Example (memory redundancy)

Courses as attribute of Degree We do not know the number of courses for a degree

Create DEGREE and COURSE Associate the two entities (to be seen later)

21

Example (memory redundancy)


Computer science, course 1, course 2, course 3 Biology, course 3, course 6, course 7, course 20 Chemistry, course 7, course 3, course 8, course 9, course 10 In terms of memory allocation, how many courses should be anticipated ? Because of the unknown number of courses, the anticipated number will either be too few or too many

22

Attributes

How to ensure data integrity

Identify the functional dependency between attributes

Example of dependency

A town belongs to only one state Towns are unique there is dependency between town and state if the town is known, the state can be determined unambiguously

In a case of functional dependency, create a new entity to regroup the dependent attributes Create a relation between the new entity and the original one

23

Attributes

Examples

Name, Street number, Street name, Town, Local government, State 1.Amos DAVID; Dept computer science; UI Ibadan; Ibadan; Oyo state 2.Olu OJO; 23; Aderemo street; Agbowo; Ibadan; Oyo state 3.Samuel UCHE; 213 Sango road; Dugbe; Ibadan; Oyo state 4.Uche KALU; 5; market road; Anambra; ; Anambra state

Entities 1, 2 and 3 are redundant, prone to non integrity

Entering entities 1, 2 and 3 (town, state) three times may produce typographical error Should a town change from one state to another, all the entities are no longer valid

All the entities must be modified

24

Entity graphical representation

An entity is represented by a rectangle divided into two parts The name of the entity is represented at the upper part The names of the attributes are represented at the lower part The identifier is underlined

PERSON Number Last name First name Date of birth

TOWN Town name State Local government

25

Relation

A relation specifies the association between two or more entities Example

Town and Person

The relation should specify the semantic of the association

A person lives in a town


26

Relation

A relation is symbolized by an oval with its semantic inside the oval A relation is further specified by cardinalities that indicate the number of associated instances Example

A person lives in a minimum of one town and in a maximum of 1 town A town is inhabited by a minimum of one person and a maximum of n (indicating several)

TOWN Name Surface area State Local government

(1,1)

Lives in (1,n)

PERSON Number Last name First name Date of birth


27

Relation

A relation may sometimes have an attribute

The attribute describes the relation and not the entities associated Example

The number of an article bought by a client as well as the date are neither an attribute of the client nor that of the article, but an attribute of the association

The attributes of the relation are indicated at the lower part of the oval that represents the relation
28

Relation
CLIENT Number Last name First name Date of birth ARTICLE Name Unit price (1,m)

(1,n)

Bought Quantity Date

29

Relation

Maximum cardinality

This indicates the maximum cardinalities on either side of a relation

Example
[n:1] (1,1)

TOWN Name Surface area State Local government

Lives in (1,n)

PERSON Number Last name First name Date of birth


30

Relation

How to read the relations : recall


A person lives in a minimum of 1 town and in a maximum of 1 town A town is inhabited (is lived) by a minimum of one person and a maximum of n person (several)
[n:1]

TOWN Name Surface area State Local government

(1,1)

Lives in (1,n)

PERSON Number Last name First name Date of birth


31

Proposed methodology

Identify an entity Develop fully the entity and chose the identifier Associate the entity with existing ones if and where necessary Establish the cardinalities

32

Relation

A relation can be between two same entities Example

A person is the father of another person WARNING : A person is the father of 0 or many persons ; A person has as father 1 and only 1 person
(0,n) PERSON Number Last name First name Date of birth (1,1)

Father of

33

Exercise

In a shopping center, a client can buy one or more articles of various quantities. Propose an ER model for representing the elements of information necessary for billing the client.

34

REMARKS

The entity and the association as described above correspond to the description of the concepts. In the literature, they are termed entity-type and relationtype. Their instantiations (existence) are termed entity and relation.

For us we use entity and entity-type ; relation and relation-type indifferently.


35

Relational model

36

Relational model

The basic concepts

Relation Domain Attribute Key N-uplet

37

Relation (Table)
STUDENT
Last name

Attributes
First name Date of birth Degree

N-Uplets
Domain (same types of value : names)
38

Domain

Represents the data type of a column Can be defined in form of intention or extension In form of intention, it is specified by a formal definition

Example

Integer values Character set of less than 20 characters

In form of extension, it is specified as a finite list of values

Example

Town : {Oyo, Ibadan, Lagos}


39

Relation

A relation R is represented as R(A1, , An)

Where

A1 takes its values from D1 An takes its values from Dm m <= n

40

Attribute

An attribute specifies a constituent of the relation (a particular column of the table)

Attributes are unique within a relation (each column must be distinguished from the others)

Two columns should not have the same name

Two attributes may have the same domain

Example

First name, Last name : NAMES


41

Cartesian product of relation


Let R(A1, A2) be a relation The Cartesian product of the relation represents all the possible combinations of the values of the attributes Example

Cars-parked (Make, Color) Where


Make makes of cars (Toyota, Peugeot) Color colors of cars (red, black, white)
42

Cartesian product of a relation


Toyota, red Toyota, black Toyota, white Peugeot, red Peugeot, black Peugeot, white

43

The intention of a relation

The intention of a relation specifies how the relation should be interpreted Example

Cars-parked (Make, Color) Cars parked in front of the department of computer science, University of Ibadan
44

N-uplet

Represents the extension of a relation It is a Cartesian product of attributes


A line of the table Also called a record

Example

Person (First name, Surname)

Amos, David John, Olaoye


45

Schema of a relation

The schema of a relation specifies the intention of the relation and the associated integrity constraints

46

Constraint of data integrity

Constraint on a single attribute

Example

The values of vehicle makes should be German vehicles

Constraint based on two attributes

Example

The date of marriage should be date of birth

Constraint on the table n-uplets

Example

The number of registrations for a degree in one year should be limited to one
47

Maximum Key

A set of attributes of a relation whose values are distinct for each n-uplet Example

Person (Matriculation number, First name, Last name, Date of birth, email) All the attributes combined can form the maximum key

48

Minimum key (the key)

A key is the minimum set of attributes of a relation whose values are distinct for each n-uplet Examples

Student N, first name, last name

The Student N, first name combined can be used as key, but only the Student N is sufficient

Student N, first name, last name, email

Either Student N or email can be used as key


49

Functional dependency

In the relation R(X, Y, Z),

There is functional dependency between X and Y if and only if the value of Y is determined by the value of X and the value of X is unique

Example

Person (N, FN, LN, town, state)

There is FD between town and state


50

Relations in 1st normal form

A relation is said to be in 1st normal form if all the attributes are of single values

51

Relations in 2nd normal form

A relation is in 2nd normal form if and only if it is in 1st normal form and there is no FD between a subset of the key and the rest of the attributes

This mean that the key of the relation must be a minimum key

52

Relations in 3rd normal form

A relation is in 3rd normal form if it is in 2nd normal form and there is no FD between non key attributes

53

From ERM to RM
1. 2. 3. 4.

5.

To each entity corresponds a relation (Entity name relation name) To each attribute of an entity corresponds an attribute of the relation The identifier of the entity becomes the key of the relation For associations of maximum cardinality [1:n], add the key of the relation on the n side to the relation on the 1 side For associations of maximum cardinality [n:m], a new relation should be created using the concatenation of the keys of the associated relations as the key. The attributes of the association should be added as attributes of the new relation

54

REMARKS

A collection of relations obtained from an entity-relation model as described above will have the following characteristics

Each attribute is single-value The key contains the least number of attributes There are no dependencies between the attributes

A collection of relations that have the above characteristics are considered to be of 3rd normal form

55

Important problem

Some relations resulting from the translation may not have keys In this case, define a new key This happens occasionally particularly in the transformation of NM associations
56

Graph of relations

Specify by pointed arrows the origin of imported attributes Redraw the relations in form of rectangles Use the pointed arrows to link the relations REMARKS

There should be no linked circle


57

Practical

Model the following types of person in the university


the students the members of staff

Be sure to apply the methods for reducing redundancy and guaranteeing data integrity

58

Practical : University students

Description

Each student has a number, a name, an address A student is registered for a degree A student may not register for more than one degree simultaneously A student may take several degrees from the university To a degree is associated a set of courses A degree is managed by a department A course is offered by only one department

Example of questions

What are the courses associated with a degree ? What are the courses taken by a student for a degree ? What are the courses offered by a department ?

Propose an ER model
59

Practical Documentary information system

A library contains the following types of document


Books Journals that contain articles Proceedings that contain articles Write-ups for master and PhD works Books and write-ups are described using title, authors, and a list of keywords Journals and proceedings are described using the title, editor and year of publication Articles are represented using the title, authors, their addresses and a list of keywords The authorized keywords for describing the documents are represented using a thesaurus Propose an ER model and the associated MR to manage the information on the various types of document in the library as well as the thesaurus

60

A thesaurus

List of concepts linked by semantic links The semantic links are

Specific Generic link (hierarchy) See also (association) Used for (synonymous)

61

Example of a thesaurus
Transport
Specific/generic

Plane
Specific/generic

Boat
Specific/generic

Car

Vehicle
See also

Boeing

Airbus

Mercedes

Peugeot

62

You might also like