Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
0% found this document useful (0 votes)
4 views

Database Design - Normalization

Uploaded by

Andrew Light
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views

Database Design - Normalization

Uploaded by

Andrew Light
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 20

DBMS – Unit 1 Database Design Concept i

Unit 1 Database Design Concept

Table of Contents
Entity Relationship Diagram.......................................................................................................1
Entities and Attributes.........................................................................................................1
Relationships.......................................................................................................................1
Artist Database ERD and Tables.........................................................................................3
General Rules Governing Relationships Among Tables.....................................................3
Normalization Rules...................................................................................................................6
The Need for Normalization...............................................................................................6
Conversion to 1NF..............................................................................................................7
Conversion to 2NF..............................................................................................................8
Conversion to 3NF..............................................................................................................9
Boyce-Codd Normal Form (BCNF).................................................................................10
Database Design........................................................................................................................11
Changing Data into Information.......................................................................................11
The Information System....................................................................................................11
Systems Development Life Cycle.....................................................................................12
Database Lifecycle (DBLC).............................................................................................13
Phase 1: Database Initial Study.................................................................................13
Phase 2: Database Design.........................................................................................14
Phase 3: Implementation and Loading......................................................................16
Phase 4: Testing and Evaluation...............................................................................16
Phase 5: Operation....................................................................................................17
Phase 6: Maintenance and Evaluation......................................................................17
DB Design Strategy Notes........................................................................................................17
Centralized vs. Decentralized Design.......................................................................................17
Database Management System.................................................................................................17
Installation and Uninstallation of Oracle Database Server...............................................18
DBMS – Unit 1 Database Design Concept 1

Entity Relationship Diagram


– Represents the conceptual database as viewed by the end user.

Entities and Attributes


 Entity –
 Corresponds to entire table, not to a row in the relational environment.
 Represented by rectangle containing the entity name, which is a noun written in
capital letters.
 Attributes –
 Characteristics of entities – the STUDENT entity could include the attributes
STU_LNAME, STU_FNAME and STU_INITIAL.
 Domain – domain is set of possible values
 For the numeric attribute GPA is written (0,4) because the lowest and highest
possible GPA is 0 and 4, respectively.
 For the character attribute SEX consists of only two possibilities, M or F.
 For the date attribute HIRE_DATE consists of all dates from startup date to
current date.
 Primary Keys – primary keys are underlined in the E-R diagram.
 Format: TABLE NAME (KEY ATTRIBUTE 1, ATTRIBUTE 2, …)
 For example, the CLASS_CODE is a primary key, then
CLASS (CLASS_CODE, CRS_CODE, CLASS_SECTION, CLASS_TIME,
CLASS_ROOM, PROF_NUM)
If the CLASS_CODE is deleted, then may be represented by
CLASS (CRS_CODE, CLASS_SECTION, CLASS_TIME, CLASS_ROOM,
PROF_NUM)
 Simple Attribute – Cannot be subdivided; ex: age, sex, marital status would be
classified as simple attributes.
 Composite Attribute – Can be subdivided into additional attributes; ex: address into
street, city, zip; Phone number into area code and exchange number.
 Single-valued Attribute – Can have only a single value; ex: person has one social
security number.
 Multi-valued Attribute – Can have many values; ex: person may have several
college degrees or householder may own more phone numbers.
 Derived Attribute – Can be derived with algorithm, instead of being physically
stored within the database; ex: age can be derived from date of birth – by the integer
value of difference between the current date and the employee’s date of birth.
DBMS – Unit 1 Database Design Concept 2

Relationships
– A relationship is an association between entities. For example, a STUDENT takes a
CLASS, a PROFESSOR teaches a CLASS, a DEPARTMENT employs a PROFESSOR, a
DIVISION is managed by an EMPLOYEE, and AIRCRAFT is flown by a CREW, etc. The
entities that participate in a relationship, i.e., connected entities, are called participants.
 Connectivity and Cardinality
 Connectivity describes relationship classification. It may be classified as 1:1, 1:M,
M:N.
 Cardinality expresses the specific number of entity occurrences associated with one
occurrence of related entity.
 Connectivity and cardinality are established by very concise statements known as
business rules, which is derived from a precise and detailed description of an
organization’s data environment, also establish the E-R model’s entities, attributes,
relationships, connectivities and constraints.
 Relationship Strength
 Existence Dependence – Entity’s existence depends on existence of related entities
 Weak (non-identifying) Relationships
 PK of related entity doesn’t contain PK component of parent entity
 For example, the definition of the COURSE and CLASS entities is
COURSE(CRS_CODE, DEPT_CODE, CRS_DESCRIPTION,
CRS_CREDIT)
CLASS(CLASS_CODE, CRS_CODE, CLASS_SECTION, CLASS_TIME,...)
 Strong (identifying) Relationships
 PK of related entity contains PK component of parent entity
 For example, the definition of the COURSE and CLASS entities is
COURSE(CRS_CODE, DEPT_CODE, CRS_DESCRIPTION,
CRS_CREDIT)
CLASS(CRS_CODE, CLASS_SECTION, CLASS_TIME,…)
 Weak Entities – In database design terms, the existence of a strong relationship
between a parent entity and its related entity or entities is associated with weak
entities.
 Relationship Participation
 Optional
 Entity occurrence does not require a corresponding occurrence in related entity
 A PROFESSOR does not to teach a CLASS; i.e., CLASS is optional to
PROFESSOR
 It indicates that the minimum cardinality is 0 for the optional entity.
 Mandatory
DBMS – Unit 1 Database Design Concept 3

 Entity occurrence requires corresponding occurrence in related entity.


 The CLASS must be taught by a PROFESSOR; i.e., PROFESSOR is
mandatory to CLASS.
 Indicates the minimum cardinality is 1 for the mandatory entity.
 Composite Entities
 Used to ‘bridge’ between M:N relationships.
 Bridge Entities, known as Composite Entities, composed of primary keys of each
entity needing connection.

Artist Database ERD and Tables

General Rules Governing Relationships Among Tables

 M:N, Both Sides Mandatory


DBMS – Unit 1 Database Design Concept 4

 M:N, Both Sides Optional

 M:N, One Side Optional

 1:M, Both Sides Mandatory

 1:M, Both Sides Optional


DBMS – Unit 1 Database Design Concept 5

 1:M, Many Side Optional, One Side Mandatory

 1:M, One Side Optional, One Side Mandatory

 1:1, Both Sides Mandatory

 1:1, Both Sides Optional


DBMS – Unit 1 Database Design Concept 6

 1:1, One Side Optional, One Side Mandatory

 Weak Entity, Foreign Key Located in Weak Entity

 Multi-valued Attributes (1:M Relationship, Foreign Key CAR_VIN in the New Table)

Normalization Rules

The Need for Normalization


 The simplified case illustration – for a construction company.
DBMS – Unit 1 Database Design Concept 7

 Manages several building projects.


 Each project contains project number, name, employee, and so on.
 Each employee has ID, name, job title, and so on.
 The clients are charged by billing hours spent on each contract. The rate depends
on employee’s position, i.e., salary.
 The total charge is shown on the figure below.

Conversion to 1NF
 Repeating groups must be eliminated
 Repeating groups, for example, any project number (PROJ_NUM) in the Figure,
can have a group of several data entries.
 Make sure that each row defines a single entity.
 Proper primary key developed; for example, uniquely identifies attribute values
(rows), combination of PROJ_NUM and EMP_NUM.
DBMS – Unit 1 Database Design Concept 8

 Dependencies can be identified with help of the figure above, which is the dependency
diagram shown in first normal form (1NF).
 Desirable dependencies based on primary keys, which are bold, underlined, and
shared in a different color in the figure.
 Less desirable dependencies, which is indicated with arrows below the dependency
diagram. Two types of dependencies exist:
 Partial dependencies – based on part of composite primary key.
 Transitive dependencies – one nonprime attribute depends on another
nonprime attribute.
 The table structure can shown in the format:
TABLE_NAME(PRIMARY_KEY_ATTRIBUTE(S), DEPENDENT
ATTRIBUTES)
 Prime attribute (or key attribute), any attribute that is at least part of a key, e.g.,
PROJ_NUM and EMP_NUM.
 Nonprime attribute (or nonkey attribute), is not even part of a key.
 1NF Summarized
 All key attributes defined
 No repeating groups in table
 All attributes dependent on primary key

Conversion to 2NF
 Start with 1NF format:
 Write each key component on separate line
 Write original (composite) key on last line
PROJ_NUM
EMP_NUM
PROJ_NUM EMP_NUM
 Each component will become the key new table
 Write dependent attributes after each key. For example herein, the three new tables,
PROJECT, EMPLOYEE, and ASSIGN, are described by
PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
ASSIGN (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)
 The conversion results of this operation are displayed in next page. (It still shows a
transitive dependency, which can generate anomalies – the charge per hour changes for a
job classification that is held by many employees.)
 2NF Conversion Results
 In 1NF
DBMS – Unit 1 Database Design Concept 9

 Includes no partial dependencies


 No attribute dependent on a portion of primary key
 Still possible to exhibit transitive dependency
 Attributes may be functionally dependent on nonkey attributes

Conversion to 3NF
 Break off the piece(s) that are identified by the transitive dependency arrow(s) below the
dependency diagram and storing them in separate table(s) to eliminate transitive
functional dependencies
PROJECT (PROJ_NUM, PROJ_NAME)
DBMS – Unit 1 Database Design Concept 10

ASSIGN (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)


EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
JOB (JOB_CLASS, CHG_HOUR)
 3NF Summarized
 In 2NF
 Contains no transitive dependencies
 Improve the database’s ability to provide information and to enhance it operational
characteristics for a completed database shown in previous page.
 This conversion has eliminated the original EMPLOYEE table’s transitive
dependency; the tables are now said to be in third normal form (3NF).

Boyce-Codd Normal Form (BCNF)


 Every determinant in the table is a candidate key
 Determinant is attribute whose value determines other values in row.
 3NF table with one candidate key is already in BCNF
 3NF Table Not in BCNF, they both can be equivalent ONLY IF the table contains only
one candidate key.
 Most designers consider the BCNF as a special case of the 3NF. To determine a
table in 3NF but not in BCNF, note that
 A transitive dependency exists when one nonprime attribute is dependent on
another nonprime attribute.
 A table is in 3NF if it is in 2NF and there are no transitive dependencies.
DBMS – Unit 1 Database Design Concept 11

Database Design

Changing Data into Information


 Data
 Raw facts stored in databases
 Need additional processing to become useful
 Information
 Required by decision maker
 Data processed and presented in a meaningful form
 Transformation

The Information System


 Database
 Carefully designed and constructed repository of facts
 Part of an information system
 Information System
 Provides data collection, storage, and retrieval
 Facilitates data transformation
 Components include:
 People
 Hardware
 Software
 Database(s)
 Application programs
 Procedures
 System Analysis
 Establishes need and extent of an information system
 Systems development
 Process of creating information system
 Database development
 Process of database design and implementation
 Creation of database models
 Implementation
 Creating storage structure
DBMS – Unit 1 Database Design Concept 12

 Loading data into database


 Providing for data management

Systems Development Life Cycle (SDLC)

Figure 2
 Planning
 Should the existing system be continued, modified or replaced?
 The technical of hardware and software requirements.
 The system cost.
 Analysis
 What are the precise requirements of the current system’s end users?
 Do these requirements fit into the overall information requirements?
DBMS – Unit 1 Database Design Concept 13

 Detailed system design – the designer completes the design of the system’s processes
including all the necessary technical specifications for the screens, menus, reports, and
other devices that might e used in help make the system a more efficient information
generator.
 Implementation
 Install the hardware, the DBMS software, and application programs.
 The system enters into a cycle of coding, testing, and debugging, until it is readto
delivered.
 The actual database is created, and the system is customized, by using a variety of
methods and devices:
 Customized user programs
 Database interface programs
 Conversion programs that import the data from a different file structure, using
batch programs, a database utility, or both.
 Maintenance
 Corrective maintenance in response to system errors.
 Adaptive maintenance due to changes in the business environment.
 Perfect maintenance to enhance the system.

Database Lifecycle (DBLC)


DBMS – Unit 1 Database Design Concept 14

Figure 3

Phase 1: Database Initial Study


 Purposes
 Analyze company situation
 Operating environment
 Organizational structure
 Define problems and constraints
 Define objectives What is the proposed system’s initial objective?
 Will the system interface with other existing or future systems in the company?
 Will the system share the data with other systems or users?
 Define scope and boundaries
 Initial Study Activities – Figure 4

Phase 2: Database Design


 Most Critical DBLC phase
 Makes sure final product meets requirements
 Focus on data requirements
 Sub-phases:
 Create conceptual design
 DBMS software selection
 Create logical design
 Create physical design
 Tow Views of Data: Business Manager and Designer – Figure 5
 I. Conceptual Design
 Data modeling creates abstract data structure to represent real-world items
 High level of abstraction
 Four steps
 Data analysis and requirements
 Entity relationship modeling and normalization
 Data model verification
 Distributed database design
 Data analysis and Requirements
 Focus on:
DBMS – Unit 1 Database Design Concept 15

 Information needs
 Information users
 Information sources
 Information constitution
 Data sources
 Developing and gathering end-user data views
 Direct observation of current system
 Interfacing with systems design group
 Business rules
 Entity Relationship Modeling and Normalization
 Steps to develop the conceptual model using ERD
STEP ACTIVITY
1 Identify, analyze, and refine the business rules.
2 Identify the main entities, using the results of Step 1.
3 Define the relationships among the entities, using the results of Steps 1 and 2.
4 Define the attributes, primary keys, and foreign keys for each of the entities.
5 Normalize the entities.
6 Complete the initial ERD.
7 Have the main end users verify the model in Step 6 against the data, information and
processing requirements.
8 Modify the ERD, using the results of Step 7.
 E-R Modeling is Iterative – Figure 8
 Concept Design: Tools and Resource – Figure 9
 Data Model Verification
 E-R model is verified against proposed system processes
 End user views and required transactions
 Access paths, security, concurrency control
 Business-imposed data requirements and constraints
 E-R Model Verification Process
STEP ACTIVITY
1 Identify the E-R model’s central entity.
2 Identify each module and its components.
3 Identify each module’s transaction requirements;
Internal: Updates/Inserts/Deletes/Queries/Reports
External: Module interfaces
4 Verify all processes against the E-R model.
5 Make all necessary changes suggested in Step 4.
DBMS – Unit 1 Database Design Concept 16

6 Repeat Steps through 5 for all modules.


 Reveals additional entity and attribute details
 Define major components as modules
 Cohesivity
 Coupling
 Iterative Process of Verification – Figure 10
 Distributed Database Design
 Design portions in different physical locations
 Development of data distribution and allocation strategies
 II. DBMS Software Selection
 DBMS software selection is critical
 Advantages and disadvantages need study
 Factors affecting purchasing decision
 Cost
 DBMS features and tools
 Underlying model
 Portability
 DBMS hardware requirements
 III. Logical Design
 Translates conceptual design into internal model
 Maps objects in model to specific DBMS constructs
 Design components
 Tables
 Indexes
 Views
 Transactions
 Access authorities
 Others
 IV. Physical Design
 Selection of data storage and access characteristics
 Very technical
 More important in older hierarchical and network models
 Becomes more complex for distributed systems
 Designers favor software that hides physical details
 Physical Organization – Figure 12
DBMS – Unit 1 Database Design Concept 17

Phase 3: Implementation and Loading


 Creation of special storage-related constructs to house end-user tables
 Data loaded into tables
 Refer Figure 13 to compare parallel activities during implementation period in the DBLC
and the SDLC.
 Other issues
 Performance
 Security
 Physical security – allows physical access to areas by authorized personnel
only.
 Password security – allows the assignment of access rights to specific
authorized users.
 Access rights – restricts operations on predetermined objects such as
databases, tables, views, queries, and reports.
 Audit trails – provide an after-the-fact device to check for access violations.
 Data encryption – prevents unauthorized users who might have violated some
of the database security layers.
 Diskless workstations – allow end users to access the database without being
able to download the information from their workstation.
 Backup and recovery
 Integrity
 Company standards
 Concurrency controls

Phase 4: Testing and Evaluation


 Database is tested and fine-tuned for performance, integrity, concurrent access, and
security constraints
 Done in parallel with application programming
 Actions taken if tests fail
 Fine-tuning based on reference manuals
 Modification of physical design
 Modification of logical design
 Upgrade or change DBMS software or hardware

Phase 5: Operation
 Database considered operational
 Starts process of system evaluation
 Unforeseen problems may surface
DBMS – Unit 1 Database Design Concept 18

 Demand for change is constant

Phase 6: Maintenance and Evaluation


 Preventative maintenance
 Corrective maintenance
 Adaptive maintenance
 Assignment of access permissions
 Generation of database access statistics to monitor performance
 Periodic security audits based on system-generated statistics
 Periodic system usage-summaries

DB Design Strategy Notes


 Top-down Design
 1) Identify data sets
 2) Define data elements
 Bottom-up Design
 1) Identify data elements
 2) Group them into data sets
 Top-Down vs. Bottom-Up – Figure 6-14

Centralized vs. Decentralized Design


 Centralized design – Figure 15
 Typical of simple databases
 Conducted by single person or small team
 Decentralized design – Figure 16
 Larger numbers of entities and complex relations
 Spread across multiple sites
 Developed by teams

Database Management System


 Oracle Database Server
 MS SQL Server
 DB2
 MySQL

Oracle 9i download site


http://www.oracle.com/technology/software/products/oracle9i/index.html
Select package
DBMS – Unit 1 Database Design Concept 19

Oracle9i Release 2 (9.2.0.1)


1. Enterprise/Standard/Personal Edition (include Client)
Oracle9i Database Release 2 Enterprise/Standard/Personal Edition for Windows NT/2000/XP
2. Client Edition – works for remote control
Oracle9i Database Release 2 Client for Windows 98/NT/2000/XP

Install and Uninstall Oracle Database Server


Please refer Appendix 01 – Install and Remove Oracle 9i in Windows

You might also like