DBMS First Chapter
Introduction
Data are the known facts that can be recorded and have an implicit meaning.
A database is a collection of logically related data.
Properties of a database:
It represents some aspect of the real world (the miniworld).
It is a logically coherent collection of data with some inherent meaning.
It is designed, built and populated with data for a specific purpose.
A database can be of any size and complexity.
Overview
Ex1: A university database
Entities such as students, faculty, courses
Relationships between entities such as students’ enrollment in courses, faculty teaching
courses
Ex2: A Hospital database
Entities such as doctors, patients, nurses, wards
Relationships between entities such as doctors visiting patients, patients in rooms.
A database may be generated and maintained manually or it may be computerized.
A Database Management System (DBMS) is a software package (collection of programs) that
enables users to create, store and maintain a database.
The DBMS is a general purpose software system that facilitates the process of defining,
constructing, manipulating, and sharing databases among various users and applications.
Functionalities of DBMS
A DBMS is a general purpose software system facilitating each of the following (with respect to
a database):
Defining a database
o specifying data types, structures, and constraints of the data to be stored in the database.
Constructing the database
o the process of storing the data on some storage medium (e.g., magnetic disk) that is
controlled by the DBMS
Manipulating the database
o querying the database to retrieve specific data, updating the database to reflect changes in
the miniworld, and generating reports
Sharing a database
o allowing multiple users and programs to access the database "simultaneously"
Maintaining the database
o allowing the system to evolve as requirements change over time
Protection includes system protection and security protection.
System protection
o preventing the database from becoming corrupted when hardware or software failures occur
Security protection
o preventing unauthorized or malicious access to the database
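The defining, constructing, and manipulating steps listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration, not part of the original notes: the sample rows and the assumed GPA range of 0.0 to 4.0 are hypothetical, and the Students relation is a cut-down version of the one used later in this chapter.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Defining: declare structure, data types, and constraints.
conn.execute(
    """
    CREATE TABLE Students (
        sid   TEXT PRIMARY KEY,
        sname TEXT NOT NULL,
        gpa   REAL CHECK (gpa BETWEEN 0.0 AND 4.0)  -- assumed range
    )
    """
)

# Constructing: store the initial (hypothetical) data under DBMS control.
conn.executemany(
    "INSERT INTO Students (sid, sname, gpa) VALUES (?, ?, ?)",
    [("s1", "Asha", 3.6), ("s2", "Ravi", 2.9), ("s3", "Meera", 3.9)],
)

# Manipulating: update the data to reflect a change in the miniworld...
conn.execute("UPDATE Students SET gpa = 3.1 WHERE sid = 's2'")

# ...and query it to retrieve specific data.
honours = conn.execute(
    "SELECT sname FROM Students WHERE gpa >= 3.5 ORDER BY sname"
).fetchall()
print(honours)
conn.commit()
```

Sharing and protection are not shown here; they depend on the concurrency control and authorization facilities of the particular DBMS.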
The data in the database at a particular moment is called a database state or snapshot.
This state is also the current set of occurrences or instances in the database.
Many database states can be constructed to correspond to a particular database.
When a new database is defined, its database state is the empty state, with no data.
When the database is first populated, it enters its initial state.
The DBMS is partly responsible for ensuring that every state of the database is a valid state -
that is, a state that satisfies the structure and constraints specified in the schema.
Hence, specifying a correct schema to the DBMS is extremely important.
DBMS stores the descriptions of the schema constructs and constraints in the catalog (meta-
data).
The schema is called the intension, and a database state is called an extension of the schema.
Changes in application requirements result in schema evolution.
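The idea of empty, initial, and valid states can be illustrated with a small sketch using Python's sqlite3 module. The helper name state, the sample course, and the constraint credits > 0 are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema (intension): defined once, with an assumed constraint on credits.
conn.execute(
    "CREATE TABLE Courses ("
    " cid TEXT PRIMARY KEY,"
    " cname TEXT NOT NULL,"
    " credits INTEGER CHECK (credits > 0))"
)


def state(connection):
    """Return the current database state (extension) as a list of rows."""
    return connection.execute("SELECT * FROM Courses ORDER BY cid").fetchall()


empty_state = state(conn)  # newly defined database: no data yet

conn.execute("INSERT INTO Courses VALUES ('c1', 'Databases', 4)")
initial_state = state(conn)  # first population: the initial state

# An update that would lead to an invalid state is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Courses VALUES ('c2', 'Broken', -1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(empty_state, initial_state, rejected)
```

The schema stays the same throughout; only the state changes, and the rejected insert leaves the previous valid state in place.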
Three-Schema Architecture
The schema in DBMS can be described at three levels:
Internal level has an internal schema
Conceptual level has a conceptual schema
External level includes a number of external schemas or user views
The information about all three schemas is stored in the system catalog.
Three-Schema Architecture: Diagram
Internal Schema
The internal schema specifies complete details of storage and access paths for the database.
File organization on the disk should be decided e.g. hashing, indexing etc.
The process of arriving at a good physical database schema is called physical database design.
Conceptual Schema
The conceptual schema (or logical schema) describes the structure of the database.
The conceptual schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and constraints.
A representational data model is used to describe the conceptual schema when a database is
implemented.
The process of arriving at a good conceptual database schema is called conceptual database
design.
Conceptual Schema (example)
Students (sid: string, sname: string, login: string, age: integer, gpa: real)
Faculty (fid: string, fname: string, sal: real)
Courses (cid: string, cname: string, credits: integer)
Rooms (rno: integer, address: string, capacity: integer)
Enrolled (sid: string, cid: string, grade: string)
Teaches (fid: string, cid: string)
Meets_In (cid: string, rno: integer, time: string)
External Schema
The external schemas allow data access to be customized at the level of individual users and hide
rest of the details from the users.
Any database has exactly one conceptual schema and one physical schema, but it may have many
external schemas, each tailored to a particular group of users.
Each external schema consists of a collection of one or more views and tables.
The external schema design is guided by end user requirements e.g. courseinfo (cid, fname, sid)
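As a rough sketch of how an external schema can be layered on the conceptual schema, the courseinfo (cid, fname, sid) example can be written as a view in SQLite via Python's sqlite3 module. The join used to define the view and all sample rows are assumptions; the notes give only the view's attribute list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual schema (a subset of the example relations).
    CREATE TABLE Faculty  (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
    CREATE TABLE Teaches  (fid TEXT, cid TEXT);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Hypothetical sample data.
    INSERT INTO Faculty  VALUES ('f1', 'Dr. Rao', 90000.0);
    INSERT INTO Teaches  VALUES ('f1', 'c1');
    INSERT INTO Enrolled VALUES ('s1', 'c1', 'A'), ('s2', 'c1', 'B');

    -- External schema: a view exposing course, teacher name, and student,
    -- while hiding salaries and grades. The join is an assumed definition.
    CREATE VIEW Courseinfo AS
        SELECT t.cid, f.fname, e.sid
        FROM Teaches t
        JOIN Faculty f  ON f.fid = t.fid
        JOIN Enrolled e ON e.cid = t.cid;
    """
)

rows = conn.execute(
    "SELECT cid, fname, sid FROM Courseinfo ORDER BY sid"
).fetchall()
print(rows)
```

A user of Courseinfo never sees the sal or grade columns, which is the sense in which the external schema hides the rest of the details.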
Three-Schema Architecture (more)
Three schemas are only descriptions of data; the stored data that actually exists is at the physical
level.
DBMS must transform a request specified on an external schema into a request against the
conceptual schema, and then into a request on the internal schema for processing over the stored
database.
The processes of transforming requests and results between levels are called mappings.
Data Independence
One of the most important benefits of using a DBMS is its support for data independence.
Applications are insulated from how data are structured and stored.
Data independence is the capacity to change the schema at one level of a database without
changing the schema at the next higher level.
Data Independence (types)
Logical data independence:
o Protection of user views from changes in logical structure of data.
o Logical data independence is the capacity to change the conceptual schema without having to
change external schemas or application programs.
Physical data independence:
o Protection of logical structure from changes in physical structure of data.
o Physical data independence is the capacity to change the internal schema without having to
change conceptual schemas.
Physical data independence exists in most databases.
But logical data independence is hard to achieve.
Data independence occurs because when the schema is changed at some level, the schema at
higher level remains unchanged; only the mapping between the two levels is changed.
The two levels of mapping create an overhead during compilation or execution of a query or
program, leading to inefficiencies in the DBMS.
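Both kinds of data independence can be demonstrated in miniature with Python's sqlite3 module. In this sketch the view StudentNames, the index name, the split into StudentPublic and StudentPrivate, and the sample rows are all hypothetical; the point is only that the application query never changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Students (sid TEXT PRIMARY KEY, sname TEXT, gpa REAL);
    INSERT INTO Students VALUES ('s1', 'Asha', 3.6), ('s2', 'Ravi', 2.9);
    CREATE VIEW StudentNames AS SELECT sid, sname FROM Students;
    """
)

# The application only ever issues this query against the external schema.
APP_QUERY = "SELECT sid, sname FROM StudentNames ORDER BY sid"
before = conn.execute(APP_QUERY).fetchall()

# Physical change: add an access path. No schema above it changes.
conn.execute("CREATE INDEX idx_students_sname ON Students (sname)")
after_physical = conn.execute(APP_QUERY).fetchall()

# Logical change: split Students into two relations, then redefine only
# the mapping (the view) so that the external schema stays the same.
conn.executescript(
    """
    CREATE TABLE StudentPublic  (sid TEXT PRIMARY KEY, sname TEXT);
    CREATE TABLE StudentPrivate (sid TEXT PRIMARY KEY, gpa REAL);
    INSERT INTO StudentPublic  SELECT sid, sname FROM Students;
    INSERT INTO StudentPrivate SELECT sid, gpa   FROM Students;
    DROP VIEW StudentNames;
    DROP TABLE Students;
    CREATE VIEW StudentNames AS SELECT sid, sname FROM StudentPublic;
    """
)
after_logical = conn.execute(APP_QUERY).fetchall()

print(before == after_physical == after_logical)
```

The logical change is the harder case: it works here only because the view could be redefined to produce the same columns from the new relations.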
Database Languages
o Data Definition Language (DDL)
o Storage Definition Language (SDL)
o View Definition Language (VDL)
o Data Manipulation Language (DML)
   o A high level or nonprocedural DML
   o A low level or procedural DML
Data Definition Language (DDL)
In most DBMSs there is no separate language for the internal and conceptual schemas; a single
DDL is used by the DBA and the database designers to define both.
The DBMS will have a DDL compiler which processes the DDL statements.
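As an illustration of what processing a DDL statement leaves behind, the sketch below uses Python's sqlite3 module. SQLite is only one possible system: its catalog table sqlite_master and the PRAGMA table_info command are specific to SQLite, and other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A DDL statement: the DBMS processes it and records the schema description.
conn.execute("CREATE TABLE Faculty (fid TEXT PRIMARY KEY, fname TEXT, sal REAL)")

# SQLite keeps schema descriptions in a catalog table named sqlite_master.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'Faculty'"
).fetchall()

# PRAGMA table_info lists the column metadata recorded for a table;
# positions 1 and 2 of each row hold the column name and declared type.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Faculty)")]

print(catalog)
print(columns)
```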
Storage Definition Language (SDL)
If distinct languages are used, DDL is used to specify conceptual schema and SDL is used to
specify internal schema.
The mapping between the two schemas may be specified in either of these languages.
View Definition Language (VDL)
For a true three-schema architecture, VDL is required for external schema and its mapping with
conceptual schema.
In relational DBMSs, SQL is used in the role of VDL to define views as results of predefined
queries.
Data Manipulation Language (DML)
The DBMS provides a set of operations such as retrieval, insertion, modification, and deletion
through the DML.
A high level or nonprocedural DML
o It can be used on its own to specify complex database operations.
o These statements can be entered interactively from a display monitor or terminal.
o These can be embedded in a general purpose programming language where they can be
extracted by a precompiler and processed by the DBMS.
o It can specify and retrieve many records in a single DML statement, so such languages are
known as set-at-a-time or set-oriented DMLs (e.g., SQL).
o It is also known as a declarative language.
A low level or procedural DML
o It must be embedded in a general purpose programming language.
o This retrieves individual records from the database and processes each separately.
o Thus it needs programming language constructs like loops.
o This is also known as a record-at-a-time DML (e.g., DL/1).
Whenever a DML is embedded in a general purpose language, that language is called the host
language and the DML is called the data sublanguage.
A high level DML used in a standalone interactive manner is called a query language.
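The contrast between set-at-a-time and record-at-a-time processing can be sketched with Python as the host language and SQL as the data sublanguage, using the standard sqlite3 module. The second half only imitates the record-at-a-time style with a loop over fetched rows; it is not DL/1, and the helper names and salary figures are hypothetical.

```python
import sqlite3


def fresh_db():
    """Build a small Faculty table with hypothetical salaries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Faculty (fid TEXT PRIMARY KEY, fname TEXT, sal REAL)")
    conn.executemany(
        "INSERT INTO Faculty VALUES (?, ?, ?)",
        [("f1", "Rao", 1000.0), ("f2", "Iyer", 2000.0), ("f3", "Khan", 3000.0)],
    )
    return conn


def salaries(conn):
    """Return all salaries ordered by faculty id."""
    return [row[0] for row in conn.execute("SELECT sal FROM Faculty ORDER BY fid")]


# Set-at-a-time: one declarative statement affects every qualifying record.
set_db = fresh_db()
set_db.execute("UPDATE Faculty SET sal = sal * 2 WHERE sal < 2500")

# Record-at-a-time style: the host language loops over individual records
# and decides what to do with each one separately.
rec_db = fresh_db()
for fid, sal in rec_db.execute("SELECT fid, sal FROM Faculty").fetchall():
    if sal < 2500:
        rec_db.execute("UPDATE Faculty SET sal = ? WHERE fid = ?", (sal * 2, fid))

print(salaries(set_db), salaries(rec_db))
```

Both versions reach the same state, but the first states what is wanted and leaves the how to the DBMS, while the second spells out the iteration in the host language.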
Naïve and parametric users generally interact with database through user friendly interfaces.
Database Interfaces
User-friendly interfaces provided by a DBMS include:
Menu-based Interfaces for Web Clients or Browsing
Form-based Interfaces
Graphical Interfaces
Natural Language Interfaces
Speech Input and Output
Interfaces for parametric Users
Interfaces for DBA
Menu-based Interfaces for Web Clients or Browsing
These interfaces present the users with lists of options (called menus) that help the user in
formulation of query request.
The query is composed step by step by picking options from a menu that is displayed by the
system.
Pull-down menus are a popular technique in Web-based user interfaces.
They are used in browsing interfaces, which allow a user to browse the content of a database
in an exploratory and unstructured manner.
Form-based Interfaces
Forms are designed and programmed for naïve users.
Users can fill out all of the form entries to insert new data into the database.
Users can also fill out only some of the entries, in which case the DBMS retrieves the matching
data for the remaining entries.
Many DBMSs have specification languages, which help programmers specify such forms.
Oracle Forms, a component of Oracle product suite, provides an extensive set of features to
design and build applications using forms.
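The retrieval side of a form-based interface can be approximated as follows: the fields the user filled in become equality conditions, and blank fields are ignored. This is a minimal sketch using Python's sqlite3 module, not how Oracle Forms or any particular product works; the function search_by_form, the ALLOWED_FIELDS whitelist, and the sample courses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Courses (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER)")
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [("c1", "Databases", 4), ("c2", "Networks", 3), ("c3", "Compilers", 4)],
)

# Only these column names may appear in the generated query text.
ALLOWED_FIELDS = ("cid", "cname", "credits")


def search_by_form(connection, form):
    """Return rows matching the fields the user filled in; blanks are ignored."""
    filled = {
        field: value
        for field, value in form.items()
        if field in ALLOWED_FIELDS and value not in (None, "")
    }
    sql = "SELECT cid, cname, credits FROM Courses"
    if filled:
        # Values are passed as parameters, never pasted into the SQL string.
        sql += " WHERE " + " AND ".join(f"{field} = ?" for field in filled)
    sql += " ORDER BY cid"
    return connection.execute(sql, tuple(filled.values())).fetchall()


# The user typed only the number of credits and left the other entries blank.
matches = search_by_form(conn, {"cid": "", "cname": "", "credits": 4})
print(matches)
```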
Graphical Interfaces
A GUI displays a schema to the user in diagrammatic form.
Users can specify a query by manipulating the diagram.
GUIs utilize both menus and forms.
A pointing device, such as a mouse, can be used to select parts of the displayed diagram.
Natural Language Interfaces
A natural language interface has its own schema and a dictionary of important words.
The natural language interface refers to the words in its schema and to the set of standard
words in dictionary.
If the interpretation is successful, the interface generates a high level query corresponding to
the natural language request and submits it to the DBMS for further processing.
If interpretation is not successful, a dialogue is started with the user to clarify the request.
Speech Input and Output
It provides limited use of speech as an input query and speech as an answer to a question or
result of a request, e.g., telephone directory, flight arrival/departure, and bank account
information.
The speech input is detected using a library of predetermined words and used to set up the
parameters that are supplied to the queries.
For output, similar conversion from text or numbers into speech takes place.
Interfaces for parametric Users
A special interface is implemented for each known class of naive users.
Special function keys can be programmed to do repeated work fast.
Interfaces for DBA
DBA staff can use privileged commands for creating accounts, setting system parameters,
granting account authorization, changing a schema and reorganizing the storage structure of
the database.
Based on cost:
o Open source DBMS: products such as MySQL and PostgreSQL are free to use, and many vendors
sell them with additional facilities and support. Some commercial RDBMS products are
available as 30-day trial versions. Giant systems are sold in modular form according to the
configuration required.
o License based
   A site license allows unlimited use of the database system with any number of
   copies running at the customer site.
   Another type of license limits the number of concurrent users at a location.
o Standalone single-user versions, such as Microsoft Access, are sold per copy or included in a
desktop or laptop configuration.
o Additional features can be made available in any of the above kinds at extra cost.
Object-oriented databases
o The object data model is based on the object-oriented approach, i.e., objects and classes with
their attributes and operations. The ODMG object model provides a standard for commercial
object data models.
o Example: Some object-oriented databases are designed to work well with object-oriented
programming languages such as Python, Java, C#, Visual Basic .NET, C++, Objective-C and
Smalltalk; others have their own programming languages. The early commercial products
were integrated with various languages: GemStone (Smalltalk), Gbase (LISP), Vbase (COP)
and VOSS (Virtual Object Storage System for Smalltalk).
End of Chapter 1