
Unit-1 Database Management System

The document provides an overview of Database Management Systems (DBMS), including definitions of data, databases, and the functions of DBMS such as data definition, updating, retrieval, and user administration. It discusses the advantages and disadvantages of DBMS, key concepts like primary keys, candidate keys, and foreign keys, as well as various database models including hierarchical, network, object-oriented, and relational models. Additionally, it covers database languages, relationships between entities, ER diagrams, and the process and importance of normalization in database design.

Uploaded by

bhattadhiraj74

UNIT-1 DATABASE MANAGEMENT SYSTEM

Prepared By:
Amrit Parajuli
Lecturer: Waling Multiple Campus
Lecturer: Andhikhola Polytechnic Institute
Lecturer: Pioneers Higher Education Academy
Owner: Integrated IT Solutions
Email: amritparazuli@gmail.com
Contact: 9841954069
INTRODUCTION
Data: Data are raw facts, figures, or bits of information, but not
information itself. When data are processed, interpreted, organized,
structured, or presented so as to make them meaningful or useful, they
are called information. Information provides context for data. For
example, the history of temperature readings all over the world for
the past 100 years is data. If these data are organized and analyzed to
find that global temperature is rising, then that is information.

Database: A database is a collection of related information about a subject,
organized in a useful manner that provides a base or foundation for
procedures such as retrieving information, drawing conclusions, and
making decisions. The database is a repository of all types of data.
Once data are stored, there must be a mechanism to access, manipulate,
and update them. Queries are the tool used to perform all these operations.
DATABASE MANAGEMENT SYSTEM
A DBMS is a set of programs that manages the database files. It allows accessing
the files, updating the records, and retrieving data as requested. In other words, a
DBMS is a collection of interrelated data and a set of programs to store and access
those data in an easy and effective manner. A Database Management System (DBMS)
refers to the technology for storing and retrieving users' data with utmost
efficiency along with appropriate security measures.
A DBMS allows users to perform the following tasks:
• Data Definition: creating, modifying, and removing the definitions that
describe the organization of data in the database.
• Data Update: inserting, modifying, and deleting the actual data in the
database.
• Data Retrieval: retrieving data from the database so that applications can
use it for various purposes.
• User Administration: registering and monitoring users, maintaining data
integrity, enforcing data security, dealing with concurrency control,
monitoring performance, and recovering information corrupted by unexpected
failure.
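The first three tasks can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The student table and its rows are hypothetical, chosen only for illustration. User administration is left out because SQLite has no user accounts.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# Data definition: create the structure that will hold the data.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Data update: insert, modify, and delete actual rows.
conn.execute("INSERT INTO student (roll_no, name) VALUES (1, 'Asha')")
conn.execute("INSERT INTO student (roll_no, name) VALUES (2, 'Bikram')")
conn.execute("UPDATE student SET name = 'Bikram K.' WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")

# Data retrieval: read the data back for use by an application.
rows = conn.execute("SELECT roll_no, name FROM student ORDER BY roll_no").fetchall()
print(rows)
```

Only the second student remains, with the updated name, because the first row was deleted.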
ADVANTAGES/DISADVANTAGES OF DBMS
Advantages:
• Controls data redundancy: It can control data redundancy because it stores all
the data in one single database, so each item needs to be recorded only once.
• Data sharing: In a DBMS, the authorized users of an organization can share the
data among multiple users.
• Easy maintenance: It is easily maintainable due to the centralized nature of the
database system.
• Reduces time: It reduces development time and maintenance needs.
• Backup: It provides backup and recovery subsystems which automatically back up
data to protect against hardware and software failures and restore the data if required.
• Multiple user interfaces: It provides different types of user interfaces, such as
graphical user interfaces and application program interfaces.
Disadvantages:
• Size: It occupies large disk space and large memory to run efficiently.
• Cost: A DBMS requires a high-speed data processor and larger memory to run the
DBMS software, so it is costly.
• Complexity: A DBMS creates additional complexity and requirements.
DBMS
Field: A field consists of a grouping of characters. A data field
represents an attribute (a characteristic or quality) of some entity
(object, person, place, or event).
Record: A record represents a collection of attributes that describe
a real-world entity. A record consists of fields, with each field
describing an attribute of the entity.
Object: Any defined object in the database which can be used to
reference or store data is known as a database object. Database
objects can be made using the CREATE command. These database
objects are used for holding and manipulating the data in the
database.
KEYS IN DBMS
Primary Key:
• It is the key chosen to identify one and only one instance of an entity
uniquely. An entity can contain multiple keys, as we saw in the PERSON table. The
key which is most suitable from that list becomes the primary key.
• In the EMPLOYEE table, ID can be the primary key since it is unique for each
employee. License_Number or Passport_Number could also be selected as the
primary key, since they are unique as well.
Candidate Key:
• A candidate key is an attribute or a minimal set of attributes which can
uniquely identify a tuple.
• The primary key is chosen from the candidate keys. The remaining candidate
keys are as strong as the primary key.
• For example: In the EMPLOYEE table, ID is best suited for the primary key.
Other unique attributes such as SSN, Passport_Number, and License_Number are
also candidate keys.
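As a minimal sketch of how these keys might be declared, the following uses Python's built-in sqlite3 module. The column names follow the EMPLOYEE example above, the sample values are hypothetical, and the try_insert helper is an assumption introduced only for this illustration. ID is declared as the primary key, and the other candidate keys are declared UNIQUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        id              INTEGER PRIMARY KEY,   -- the chosen primary key
        name            TEXT NOT NULL,         -- not a key: names may repeat
        ssn             TEXT NOT NULL UNIQUE,  -- candidate key
        passport_number TEXT NOT NULL UNIQUE,  -- candidate key
        license_number  TEXT NOT NULL UNIQUE   -- candidate key
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Sita', 'S-100', 'P-100', 'L-100')")

def try_insert(row):
    """Hypothetical helper: report whether the DBMS accepts the row."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

same_name = try_insert((2, "Sita", "S-200", "P-200", "L-200"))     # only the name repeats
dup_passport = try_insert((3, "Hari", "S-300", "P-100", "L-300"))  # passport already used
dup_id = try_insert((1, "Gita", "S-400", "P-400", "L-400"))        # id already used
print(same_name, dup_passport, dup_id)
```

A repeated name is accepted, while a repeated ID or passport number is rejected, since each of those columns must identify a row uniquely.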
SUPER KEYS
A super key is a set of one or more attributes which can uniquely identify a tuple.
Every super key is a superset of a candidate key.
• For example: In the EMPLOYEE table, for (EMPLOYEE_ID, EMPLOYEE_NAME),
the names of two employees can be the same, but their
EMPLOYEE_ID can't be the same. Hence, this combination can also be a key.
• The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.
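The difference between a super key and a candidate key can be sketched in plain Python. The sample rows and the function names is_super_key and is_candidate_key are assumptions made for illustration. Note that a check like this only shows uniqueness for the given sample data, whereas a real key is a rule that must hold for every possible row.

```python
from itertools import combinations

# Hypothetical sample rows: two employees share the same name.
employees = [
    {"employee_id": 1, "employee_name": "Ram", "department": "Sales"},
    {"employee_id": 2, "employee_name": "Ram", "department": "Accounts"},
    {"employee_id": 3, "employee_name": "Maya", "department": "Sales"},
]

def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, attrs):
    """A candidate key is a super key with no smaller subset that is also a super key."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_super_key(employees, ["employee_id", "employee_name"]))      # super key
print(is_candidate_key(employees, ["employee_id", "employee_name"]))  # but not minimal
print(is_candidate_key(employees, ["employee_id"]))                   # minimal
```

The pair (employee_id, employee_name) is a super key but not a candidate key, because employee_id alone already identifies each row.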
FOREIGN KEYS
• A foreign key is a column of a table which is used to point to the primary key
of another table.
• In a company, every employee works in a specific department, and employee and
department are two different entities. So we should not store the information of the
department in the employee table. Instead, we link these two tables through the
primary key of one table.
• We add the primary key of the DEPARTMENT table, Department_Id, as a new
attribute in the EMPLOYEE table.
• Now in the EMPLOYEE table, Department_Id is the foreign key, and both the
tables are related.
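The EMPLOYEE and DEPARTMENT link can be sketched with Python's built-in sqlite3 module. The sample rows are hypothetical. One SQLite-specific detail is that foreign keys are enforced only after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE department (department_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER REFERENCES department (department_id)  -- foreign key
    )
""")
conn.execute("INSERT INTO department VALUES (10, 'Accounts')")
conn.execute("INSERT INTO employee VALUES (1, 'Sita', 10)")

# An employee pointing at a department that does not exist is refused.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Hari', 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# The foreign key lets us navigate from an employee to the department row.
joined = conn.execute("""
    SELECT e.name, d.name
    FROM employee AS e
    JOIN department AS d ON d.department_id = e.department_id
""").fetchall()
print(orphan_allowed, joined)
```

Department details stay in one place, and each employee row carries only the Department_Id needed to find them.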
DATABASE LANGUAGES
Database languages are used to read, store, and update the data in the database. There are
four types of database languages:
1. Data Definition Language (DDL): DDL is used to define the database structure or
pattern. It is used to create schemas, tables, indexes, constraints, etc. in the
database. Using DDL statements, you can create the skeleton of the database. The
tasks that come under DDL are CREATE, ALTER, DROP, TRUNCATE, RENAME,
COMMENT, etc.
2. Data Manipulation Language (DML): DML is used for accessing and manipulating
data in a database. It handles user requests. The tasks that come under DML are
SELECT, INSERT, UPDATE, DELETE, MERGE, etc.
3. Data Control Language (DCL): DCL is used to control access to the data stored in
the database by giving users privileges and taking them away. The tasks that come
under DCL are GRANT and REVOKE.
4. Transaction Control Language (TCL): TCL is used to manage the changes made by
DML statements, which can be grouped into logical transactions. The tasks that
come under TCL include:
Commit: It is used to save the transaction on the database.
Rollback: It is used to restore the database to its state at the last Commit.
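A minimal sketch of DDL, DML, and TCL working together, using Python's built-in sqlite3 module with a hypothetical account table. DCL is not shown because SQLite does not implement GRANT and REVOKE. Here commit() and rollback() on the connection play the role of the COMMIT and ROLLBACK statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")

# DML followed by TCL: insert a row and make it permanent with a commit.
conn.execute("INSERT INTO account VALUES (1, 500)")
conn.commit()

# DML followed by TCL: change the row, then undo the change with a rollback.
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.rollback()

# DML (retrieval): the balance is back to its value at the last commit.
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)
```

The update is discarded by the rollback, so the stored balance is the committed value of 500.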
DATA MODELS
Data models define how the logical structure of a database is modeled. Data
models are the fundamental means of introducing abstraction in a DBMS. They
define how data items are connected to each other and how they are processed
and stored inside the system.
Hierarchical Data Model:
The Hierarchical model was essentially born from the first mainframe database
management system. It uses an upside-down tree to structure data. The top of
the tree is the parent and the branches are children. Each child can only have
one parent but a parent can have many children.
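The one-parent rule can be sketched with a plain Python dictionary in which every child stores exactly one parent. The records and the helper names children_of and path_to_root are hypothetical.

```python
# Hypothetical records: each key is a child, each value is its single parent.
parent_of = {
    "Sales": "Company",
    "Accounts": "Company",
    "Asha": "Sales",
    "Bikram": "Sales",
    "Chandra": "Accounts",
}

def children_of(node):
    """A parent can have many children."""
    return sorted(child for child, parent in parent_of.items() if parent == node)

def path_to_root(node):
    """Each child has one parent, so there is exactly one path up to the root."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

print(children_of("Sales"))
print(path_to_root("Asha"))
```

Because a dictionary key can hold only one value, a record such as "Asha" cannot belong to two parents at once, which is the limitation the network model removes.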
HIERARCHICAL MODEL
Advantages:
• Structures data in an upside-down tree, which simplifies the data overview.
• Manages large amounts of data.
• Improves data sharing.
• Distributes data in terms of relationships.
• Can have many different structures and forms.
Disadvantages:
• Complex (users require knowledge of the physical representation of the database).
• Data must be organized in a hierarchical way without compromising the
information.
• Many-to-many relationships are not supported.
• Only one parent per child is allowed.

NETWORK DATABASE MODEL
This is an extension of the hierarchical model. In this model, data is organized
more like a graph, and a child node is allowed to have more than one parent node.
Because more relationships are established in this model, the data is more closely
related, which can make accessing the data easier and faster. This database model
was used to map many-to-many data relationships.
This was the most widely used database model before the relational model was
introduced.
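A minimal sketch of the multi-parent idea in plain Python, where each record keeps a set of parents instead of a single one. The records and the helper name members_of are hypothetical.

```python
# Hypothetical records: each key is a child, each value is the set of its parents.
parents_of = {
    "Sales": {"Company"},
    "Projects": {"Company"},
    "Asha": {"Sales", "Projects"},  # two parents: not possible in a strict hierarchy
    "Bikram": {"Sales"},
}

def members_of(owner):
    """All records that list the given node among their parents."""
    return sorted(child for child, owners in parents_of.items() if owner in owners)

print(members_of("Sales"))
print(members_of("Projects"))
```

"Asha" appears under both "Sales" and "Projects", so the structure is a graph rather than a tree.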
NETWORK DATABASE MODEL
Advantages:
• Multi-parent support.
• More useful than the hierarchical data model.
• Deals with even larger amounts of information than the hierarchical model.
• Promotes data integrity.
• Supports many-to-many relationships.
• Improved data access.
Disadvantages:
• Data relationships must be predefined.
• Much more complex than the hierarchical data model.
• Users are still required to know the physical representation of the database.
• Information can be related in various and complicated ways.


OBJECT ORIENTED MODEL
In the object-oriented data model, data and their
relationships are contained in a single structure, which
is referred to as an object. Real-world problems are
represented as objects with different attributes, and
objects can have multiple relationships between them.
Basically, it is a combination of object-oriented
programming and the relational database model.
OBJECT ORIENTED DATABASE MODEL
Advantages:
• The object-oriented data model allows the 'real world' to be modeled more
closely.
• OODBMSs allow new data types to be built from existing types.
• Unlike traditional databases (such as hierarchical, network, or relational),
object-oriented databases are capable of storing different types of data, for
example pictures, voice, and video, as well as text, numbers, and so on.
• OODBMSs use a different protocol to handle the types of long-duration
transactions that are common in many advanced database applications.
Disadvantages:
• There is no universally agreed data model for an OODBMS, and most
models lack a theoretical foundation.
• In comparison to RDBMSs, the use of OODBMSs is still relatively limited.
• There is a general lack of standards for OODBMSs.
• Most OODBMSs do not provide a view mechanism.
• OODBMSs do not provide adequate security mechanisms.


RELATIONAL DATABASE MODEL
• The Relational Model (RM) represents the database as a collection of relations. A
relation is nothing but a table of values. Every row in the table represents a
collection of related data values. These rows in the table denote a real-world
entity or relationship.
• The table name and column names are helpful to interpret the meaning of values
in each row. The data are represented as a set of relations. In the relational model,
data are stored as tables. However, the physical storage of the data is independent
of the way the data are logically organized.
RELATIONAL DATABASE MANAGEMENT SYSTEM
Key Terms:
Domain: A domain is the set of values permitted for an attribute in a table. For
example, a domain of month-of-year can accept January, February, ..., December
as possible values, and a domain of integers can accept whole numbers that are
negative, positive, or zero.
Tuple: A single entry in a table is called a tuple, record, or row. A tuple in a
table represents a set of related data.
Entity: An entity in a DBMS can be a real-world object with an existence. For example,
in a College database, the entities can be Professor, Students, Courses, etc.
Attribute: An attribute is a property or characteristic of an entity. An entity may
contain any number of attributes. One of the attributes is considered as the
primary key.
Data type: The data type of an attribute defines what type of data is to be stored in
that attribute, for example integer, float, char, etc.
Redundancy: Data redundancy is a condition created within a database or data
storage technology in which the same piece of data is held in two or more separate
places.
RELATIONAL DATABASE MODEL
Advantages:
• A relational data model in a DBMS is simpler than the hierarchical and network
models.
• The relational database is only concerned with data and not with a structure.
• The relational model in a DBMS is easy, as tables consisting of rows and columns
are quite natural and simple to understand.
• It makes it possible for a high-level query language like SQL to avoid complex
database navigation.
• The structure of a relational database can be changed without having to change
any application.

Disadvantages:
• Some relational databases have limits on field lengths which can't be exceeded.
• Relational databases can sometimes become complex as the amount of data
grows and the relations between pieces of data become more complicated.
• Complex relational database systems may lead to isolated databases where the
RELATIONSHIP
A relationship describes the association between entities.
In an ER diagram, a diamond (rhombus) is used to represent a relationship.

Types of relationship are as follows:
1. One-to-one relationship
2. One-to-many relationship
3. Many-to-one relationship
4. Many-to-many relationship
ONE-TO-ONE RELATIONSHIP
When only one instance of each entity is associated with the
relationship, it is known as a one-to-one relationship.
For example, a female can marry one male, and a male can marry
one female.
ONE-TO-MANY RELATIONSHIP
When only one instance of the entity on the left and more than one
instance of the entity on the right are associated with the relationship,
it is known as a one-to-many relationship.
For example, a scientist can make many inventions, but each invention
is made by only one specific scientist.
MANY-TO-ONE RELATIONSHIP
When more than one instance of the entity on the left and only
one instance of the entity on the right are associated with the
relationship, it is known as a many-to-one relationship.
For example, a student enrolls in only one course, but a course
can have many students.
MANY-TO-MANY RELATIONSHIP
When more than one instance of the entity on the left and more than
one instance of the entity on the right are associated with the relationship,
it is known as a many-to-many relationship.
For example, an employee can be assigned to many projects, and a project can
have many employees.
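In a relational database, a many-to-many relationship is commonly stored in a third table that holds one row per pairing. The sketch below uses Python's built-in sqlite3 module. The assignment table name and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE project  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- One row per (employee, project) pairing.
    CREATE TABLE assignment (
        employee_id INTEGER NOT NULL REFERENCES employee (id),
        project_id  INTEGER NOT NULL REFERENCES project (id),
        PRIMARY KEY (employee_id, project_id)
    );

    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Bikram');
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Website');
    INSERT INTO assignment VALUES (1, 10), (1, 20), (2, 20);
""")

# One employee, many projects.
projects_of_asha = [row[0] for row in conn.execute(
    "SELECT p.title FROM project AS p "
    "JOIN assignment AS a ON a.project_id = p.id "
    "WHERE a.employee_id = 1 ORDER BY p.title")]

# One project, many employees.
staff_on_website = [row[0] for row in conn.execute(
    "SELECT e.name FROM employee AS e "
    "JOIN assignment AS a ON a.employee_id = e.id "
    "WHERE a.project_id = 20 ORDER BY e.name")]

print(projects_of_asha, staff_on_website)
```

The linking table turns one many-to-many relationship into two one-to-many relationships, one from each side.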
ER DIAGRAM
An ER (entity-relationship) diagram shows the relationships among entity sets. An
entity set is a group of similar entities, and these entities can have attributes. In
terms of a DBMS, an entity set maps to a table and its attributes map to the columns
of that table, so by showing the relationships among tables and their attributes, an ER
diagram shows the complete logical structure of a database.
ADVANTAGES OF ER DIAGRAM
• Conceptually it is very simple: The ER model is very simple because if
we know the relationships between entities and attributes, then we can
easily draw an ER diagram.
• Better visual representation: The ER model is a diagrammatic
representation of the logical structure of a database. By looking at an ER
diagram, we can easily understand the relationships among entities.
• Effective communication tool: It is an effective communication
tool for database designers.
• Highly integrated with the relational model: The ER model can be easily
converted into the relational model by simply converting the ER model into
tables.
• Easy conversion to any data model: The ER model can be easily
converted into another data model, such as the hierarchical data model,
the network data model, and so on.
NORMALIZATION
• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize the redundancy in a relation or set of
relations. It is also used to eliminate undesirable characteristics such as insertion,
update, and deletion anomalies.
• Normalization divides a larger table into smaller tables and links them using
relationships.
• Normal forms are used to reduce redundancy in database tables.
Types of Normal Forms
Four normal forms are covered here: 1NF, 2NF, 3NF, and BCNF.
ADVANTAGES/NEED OF NORMALIZATION
Advantages:
• Greater overall database organization
• Reduction of redundant data
• Data consistency within the database
• A much more flexible database design
• A better handle on database security
Need:
• It is used to remove duplicate data and database anomalies from
the relational table.
• Normalization helps to reduce redundancy and complexity by
examining new data types used in the table.
• It is helpful to divide a large database table into smaller tables and
link them using relationships.
• It avoids duplicate data and repeating groups in a table.

KEY TERMS
Prime attribute: An attribute that is part of a candidate key is
known as a prime attribute.
Non-prime attribute: An attribute that is not part of any candidate key
is said to be a non-prime attribute.
Primary key: A column or group of columns in a table
that uniquely identifies every row in that table.
Candidate key: A set of attributes that uniquely identifies
tuples in a table. A candidate key is a super key with no redundant attributes.
Foreign key: A column that creates a relationship between two
tables. The purpose of foreign keys is to maintain data integrity and allow
navigation between two different instances of an entity.
Non-key attribute: An attribute that does not uniquely identify an instance of an
entity. For example, a database can have multiple instances of the same customer
name, which means that "customer name" is not unique.
Functional dependency: A constraint that specifies the relationship
between two sets of attributes where one set can accurately determine the
value of the other set.
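A functional dependency X -> Y can be tested against sample data: whenever two rows agree on X, they must also agree on Y. The following plain-Python sketch uses hypothetical rows and a hypothetical function name fd_holds. Passing the test on sample data does not prove the dependency in general, but a single failing pair of rows disproves it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows.
customers = [
    {"customer_id": 1, "customer_name": "Asha", "city": "Pokhara"},
    {"customer_id": 2, "customer_name": "Asha", "city": "Waling"},
    {"customer_id": 1, "customer_name": "Asha", "city": "Pokhara"},
]

print(fd_holds(customers, ["customer_id"], ["customer_name"]))  # id determines name
print(fd_holds(customers, ["customer_name"], ["city"]))         # name does not determine city
```

Two different customers share the name "Asha" but live in different cities, so customer_name cannot determine city.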
1NF
A table is in First Normal Form (1NF) if all the attributes of the table
contain only atomic values. In other words, if a table holds multivalued
or composite data items in an attribute, the relation is not in first
normal form. To bring it into first normal form, we make the entries of
the table atomic.
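A minimal plain-Python sketch of making entries atomic. The student rows and phone numbers are made-up sample data, and the column names are assumptions.

```python
# Not in 1NF: the "phones" attribute holds several values in one entry.
unnormalized = [
    {"student_id": 1, "name": "Asha", "phones": "9800000001, 9800000002"},
    {"student_id": 2, "name": "Bikram", "phones": "9800000003"},
]

# In 1NF: one atomic phone value per row.
first_normal_form = [
    {"student_id": row["student_id"], "name": row["name"], "phone": phone.strip()}
    for row in unnormalized
    for phone in row["phones"].split(",")
]

for row in first_normal_form:
    print(row)
```

The student with two phone numbers now occupies two rows, so student_id alone no longer identifies a row and the key becomes (student_id, phone). The repeated name in those rows is the kind of redundancy that the higher normal forms address.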
2NF
A relation is in 2NF if it satisfies the following conditions:
• The table or relation should be in 1NF (First Normal Form).
• All the non-prime attributes should be fully functionally dependent on
every candidate key.
• In other words, the table should not contain any partial dependency, where a
non-prime attribute depends on only part of a candidate key.
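Removing a partial dependency can be sketched in plain Python. The enrollment relation below is hypothetical. Its key is assumed to be (student_id, course_id), while course_name depends on course_id alone, which is a partial dependency. The helper name project is also an assumption.

```python
# Hypothetical relation with key (student_id, course_id).
# course_name depends only on course_id: a partial dependency.
enrollment = [
    {"student_id": 1, "course_id": "C1", "course_name": "DBMS", "grade": "A"},
    {"student_id": 2, "course_id": "C1", "course_name": "DBMS", "grade": "B"},
    {"student_id": 1, "course_id": "C2", "course_name": "Networks", "grade": "B"},
]

def project(rows, attrs):
    """Keep only the given attributes and drop duplicate rows, keeping first-seen order."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result

# Decompose: course facts move to their own table, keyed by course_id.
course = project(enrollment, ["course_id", "course_name"])
grades = project(enrollment, ["student_id", "course_id", "grade"])

print(course)
print(grades)
```

After the split, the name "DBMS" is stored once in the course table instead of once per enrolled student.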
3NF
A table is in Third Normal Form (3NF) if it satisfies the following
conditions:
• The table or relation should be in 2NF.
• It should not contain any transitive dependency. A transitive dependency exists
when a non-prime attribute depends on another non-prime attribute rather than
directly on a key.
Equivalently, a relation is in 3NF if every functional dependency X -> Y satisfies
at least one of the following conditions:
• X -> Y is a trivial FD, i.e., Y is a subset of X.
• X is a super key.
• Every attribute in (Y - X) is a prime attribute.
BCNF
It stands for Boyce-Codd Normal Form, which is a
stricter version of 3NF. It is sometimes referred to
as 3.5NF. A relation is said to be in BCNF if it
satisfies the following conditions:
• The table or relation must be in 3NF.
• For every non-trivial functional dependency (FD)
A -> B in the relation R, A is a super key.
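The BCNF condition can be checked mechanically by computing attribute closures. The sketch below is plain Python. The function names and the example relation (student, course, teacher) with its two dependencies are assumptions chosen for illustration, not taken from the slides.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given FDs (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_super_key(attrs, all_attrs, fds):
    """A super key determines every attribute of the relation."""
    return closure(attrs, fds) == set(all_attrs)

def bcnf_violations(all_attrs, fds):
    """Non-trivial FDs whose left-hand side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not is_super_key(lhs, all_attrs, fds)
    ]

# Hypothetical relation: each teacher teaches exactly one course.
attributes = {"student", "course", "teacher"}
dependencies = [
    (frozenset({"student", "course"}), frozenset({"teacher"})),
    (frozenset({"teacher"}), frozenset({"course"})),
]

violations = bcnf_violations(attributes, dependencies)
print(violations)
```

Here teacher -> course violates BCNF because teacher alone is not a super key. The same relation still satisfies the 3NF conditions, since course is a prime attribute (it is part of the candidate key (student, course)).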
CENTRALIZED DATABASE SYSTEM
A centralized database is stored at a single location such as a mainframe computer.
It is maintained and modified from that location only and is usually accessed over
a network connection such as a LAN or WAN. The centralized database is used
by organizations such as colleges, companies, banks, etc.

As can be seen from the above diagram, all the information for the organization is
stored in a single database. This database is known as the centralized database.
ADVANTAGES/DISADVANTAGES OF CENTRALIZED DATABASE SYSTEM
Advantages:
• Data integrity is maximized, as the whole database is stored at a single physical
location.
• Data redundancy is minimal in the centralized database. All the data is stored
together and not scattered across different locations.
• Since all the data is in one place, there can be stronger security measures around it. So,
the centralized database is much more secure.
• The centralized database is cheaper than other types of databases, as it requires less
power and maintenance.
• All the information in the centralized database can be easily accessed from the same
location and at the same time.
Disadvantages:
• Since all the data is at one location, it takes more time to search and
access it.
• There is a lot of data access traffic for the centralized database.
• Since all the data is at the same location, it can create a problem if multiple
users try to access it simultaneously.
• If there are no database recovery measures in place and a system
failure occurs, then all the data in the database will be destroyed.
DISTRIBUTED DATABASE SYSTEM
In a distributed database management system, the database is not stored at a single location.
Rather, it may be stored in multiple computers at the same place or geographically spread far
away. Despite all this, the distributed database appears as a single database to the user. A diagram
to better explain this is as follows:

As seen in the figure, the components of the distributed database can be in multiple locations
such as India, Canada, Australia, etc. However, this is transparent to the user, i.e., the database
appears as a single entity.
ADVANTAGES/DISADVANTAGES OF DISTRIBUTED DATABASE SYSTEM
Advantages:
• If there were a natural catastrophe such as a fire or an earthquake, all the data would
not be destroyed, as it is stored at different locations.
• It is cheaper to create a network of systems, each containing a part of the database. This
database can also be easily increased or decreased in size.
• Even if some of the data nodes go offline, the rest of the database can continue its
normal functions.
Disadvantages:
• The distributed database is quite complex, and it is difficult to make sure that a user gets
a uniform view of the database because it is spread across multiple locations.
• It is difficult to provide security in a distributed database, as the database needs to be
secured at all the locations where it is stored. Moreover, the infrastructure connecting all the
nodes in a distributed database also needs to be secured.
• It is difficult to maintain data integrity in the distributed database because of its nature.
There can also be data redundancy in the database, as it is stored at multiple locations.
CENTRALIZED VS DISTRIBUTED DATABASE SYSTEM
Centralized Database System:
1. It is a database that is stored, located, and maintained at a single location
only.
2. The data access time in the case of multiple users is more in a centralized
database.
3. The management, modification, and backup of this database are easier, as the
entire data is present at the same location.
4. This database provides a uniform and complete view to the user.
5. This database has more data consistency in comparison to a distributed database.
6. A centralized database is less costly.
Distributed Database System:
1. It is a database which consists of multiple databases which are connected
with each other and are spread across different physical locations.
2. The data access time in the case of multiple users is less in a distributed
database.
3. The management, modification, and backup of this database are very difficult,
as it is spread across different physical locations.
4. Since it is spread across different locations, it is difficult to provide a
uniform view to the user.
5. This database may have some data replication, so data consistency is less.
6. This database is very expensive.
DATA SECURITY
Data security is the practice of protecting digital information from unauthorized
access, corruption, or theft throughout its entire lifecycle. It’s a concept that
encompasses every aspect of information security from the physical security of
hardware and storage devices to administrative and access controls, as well as the
logical security of software applications. It also includes organizational policies
and procedures.
Accidental loss of data may result from:
• Crashes during transaction processing.
• Logical errors in the program.
• Distribution of data over several computers.
Intentional loss of data may result from:
• Unauthorized reading of data.
• Unauthorized modification of data.
• Unauthorized destruction of data.
SECURITY MEASURES AT DIFFERENT LEVELS
To protect the database, we must take security measures at several levels:
• Physical: The sites containing the computer systems must be secured against
armed or surreptitious entry by intruders.
• Human: Users must be authorized carefully to reduce the chance of any such
user giving access to an intruder in exchange for a bribe or other favors.
• Operating System: No matter how secure the database system is, weakness
in operating system security may serve as a means of unauthorized access to
the database.
• Network: Since almost all database systems allow remote access through
terminals or networks, software-level security within the network software is
as important as physical security, both on the Internet and in networks private
to an enterprise.
• Database System: Some database-system users may be authorized to access
only a limited portion of the database. Other users may be allowed to issue
queries but may be forbidden to modify the data. It is the responsibility of the
database system to ensure that these authorization restrictions are not
violated.
GUIDELINES FOR DATA SECURITY
1. Use encryption to protect confidential data.
2. Back up important data and test the backup regularly.
3. Use strong passwords, keep them private, and change them regularly.
4. Activate password protection for unattended computing devices.
5. Beware of suspicious e-mails.
6. Configure your computer securely.
7. Turn off unnecessary wireless connections.
8. Observe and comply with the "Data Protection Principles".
9. Report information security incidents immediately.
10. Configure a firewall.
DATA ABSTRACTION
Database systems are made up of complex data structures. To ease user
interaction with the database, the developers hide irrelevant internal details from
users. This process of hiding irrelevant details from the user is called data abstraction.
LEVELS OF DATA ABSTRACTION
We have three levels of abstraction:
Physical level: This is the lowest level of data abstraction. It describes how
data is actually stored in database. You can get the complex data structure
details at this level.
Logical level: This is the middle level of 3-level data abstraction
architecture. It describes what data is stored in database.
View level: Highest level of data abstraction. This level describes the user
interaction with database system.
Example: Let's say we are storing customer information in a customer
table. At the physical level, these records can be described as blocks of storage
(bytes, gigabytes, terabytes, etc.) in memory. These details are often hidden
from the programmers.
At the logical level, these records can be described as fields and attributes
along with their data types, and the relationships among them can be
logically implemented. The programmers generally work at this level
because they are aware of such things about database systems.
At the view level, users just interact with the system with the help of a GUI and
enter the details on the screen; they are not aware of how the data is stored and
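The logical and view levels of the customer example can be sketched with Python's built-in sqlite3 module. The column names, the sample row, and the customer_contact view are assumptions for illustration. The physical level (how SQLite lays out pages on disk or in memory) is handled by the DBMS and does not appear in the code at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: what data is stored, with fields and data types.
conn.execute("""
    CREATE TABLE customer (
        id           INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        phone        TEXT,
        credit_limit INTEGER
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Asha', '9800000001', 50000)")

# View level: a particular user sees only the part of the data that concerns them.
conn.execute("CREATE VIEW customer_contact AS SELECT name, phone FROM customer")

cursor = conn.execute("SELECT * FROM customer_contact")
visible_columns = [column[0] for column in cursor.description]
visible_rows = cursor.fetchall()
print(visible_columns, visible_rows)
```

A user of the view sees names and phone numbers only; the id and credit_limit columns stay hidden behind it.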
DATABASE ADMINISTRATOR
A database administrator (DBA) is a specialized computer systems administrator
who maintains a successful database environment by directing or performing all
related activities to keep the data secure. The top responsibility of a DBA
professional is to maintain data integrity. This means the DBA will ensure that
data is secure from unauthorized access but is available to users.
Roles or functions of a DBA:
1. Installing, configuring, and upgrading database server software such as
Microsoft SQL Server, MySQL, and Oracle.
2. Evaluating the features of various databases.
3. Establishing and maintaining sound backup and recovery policies and
procedures.
4. Taking care of database design and implementation.
5. Implementing and maintaining database security.
6. Database tuning, application tuning, and performance monitoring.
7. Maintaining documentation and standards.
8. Providing technical troubleshooting and consultation to development
teams.
DATA INTEGRITY
Data Integrity is the overall completeness, accuracy and consistency of data. This
can be indicated by the absence of alteration between two instances or between
two updates of a data record, meaning data is intact and unchanged. Data integrity
is usually imposed during the database design phase through the use of standard
procedures and rules. It is maintained through the use of various error-checking
methods and validation procedures.
Types of Data Integrity:
1. Entity Integrity: The entity integrity constraint states that a primary key value
can't be null. This is because the primary key value is used to identify individual
rows in a relation, and if the primary key has a null value, then we can't identify
those rows. Fields other than the primary key may contain null values.
2. Referential Integrity: A referential integrity constraint is specified between two
tables. If a foreign key in Table 1 refers to the primary key of Table 2, then every
value of the foreign key in Table 1 must either be null or be available in Table 2.
3. Domain Integrity: A domain constraint defines the valid set of values for an
attribute. The data type of a domain includes string, character, integer, time,
date, currency, etc. The value of the attribute must be
available in the corresponding domain.
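The three integrity types can be sketched as constraints that the DBMS enforces, here with Python's built-in sqlite3 module. The tables, the age range of 18 to 65, and the attempt helper are assumptions for illustration. Two SQLite-specific details: foreign keys must be switched on with a PRAGMA, and a non-integer primary key needs an explicit NOT NULL to reject nulls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for referential integrity in SQLite
conn.executescript("""
    CREATE TABLE department (
        code TEXT NOT NULL PRIMARY KEY
    );
    CREATE TABLE employee (
        id              TEXT NOT NULL PRIMARY KEY,            -- entity integrity
        age             INTEGER CHECK (age BETWEEN 18 AND 65), -- domain integrity
        department_code TEXT REFERENCES department (code)      -- referential integrity
    );
    INSERT INTO department VALUES ('ACC');
""")

def attempt(sql, params):
    """Hypothetical helper: report whether the DBMS accepts the statement."""
    try:
        conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

insert = "INSERT INTO employee (id, age, department_code) VALUES (?, ?, ?)"
results = {
    "valid row": attempt(insert, ("E1", 30, "ACC")),
    "null foreign key": attempt(insert, ("E2", 40, None)),
    "null primary key": attempt(insert, (None, 30, "ACC")),
    "unknown department": attempt(insert, ("E3", 30, "HR")),
    "age outside domain": attempt(insert, ("E4", 12, "ACC")),
}
for case, outcome in results.items():
    print(case, "->", outcome)
```

The null foreign key is accepted, in line with the rule that a foreign key value must either be null or exist in the referenced table, while the other three violations are rejected.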