Chapter 1-Database System Introduction
Database Systems
"The numbers have no way of speaking for themselves. We speak for them.
We imbue them with meaning." —Statistician Nate Silver in the book
The Signal and the Noise
Knowledge
The human mind purposefully organizes information and evaluates it to
produce knowledge. In other words, knowledge is a person's ability to
recall and use information and experience.
For example,
"386" is data,
"your marks are 386" is information,
and
"It is the result of your hard work" is knowledge.
Terminologies cont…
Data, Information and Knowledge
Database (a large collection of data)
• A collection of related data and its metadata, organized in a
structured format for optimized information management
• Examples: databases of customers, products, ...
• A database usually models (some part of) a real-world enterprise.
Entities (e.g., students, courses)
Relationships (e.g., John Doe is taking DBS)
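As a rough illustration of entities and relationships, the sketch below stores students and courses as tables and records "John Doe is taking DBS" as a row in a linking table. It uses Python's built-in sqlite3 module. The table and column names (Students, Courses, Takes, sID, cID) and the helper courses_taken_by are assumptions made for this example only.

```python
import sqlite3

# In-memory database, so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (sID INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE Courses  (cID TEXT PRIMARY KEY, title TEXT NOT NULL);
    -- The relationship "student takes course" is itself stored as data.
    CREATE TABLE Takes (
        sID INTEGER NOT NULL REFERENCES Students(sID),
        cID TEXT    NOT NULL REFERENCES Courses(cID),
        PRIMARY KEY (sID, cID)
    );
""")

# Entities
conn.execute("INSERT INTO Students VALUES (1, 'John Doe')")
conn.execute("INSERT INTO Courses VALUES ('DBS', 'Database Systems')")
# Relationship: John Doe is taking DBS
conn.execute("INSERT INTO Takes VALUES (1, 'DBS')")
conn.commit()


def courses_taken_by(conn, student_name):
    """Return the course IDs linked to the named student."""
    rows = conn.execute(
        """SELECT c.cID
           FROM Students s
           JOIN Takes t   ON t.sID = s.sID
           JOIN Courses c ON c.cID = t.cID
           WHERE s.name = ?
           ORDER BY c.cID""",
        (student_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(courses_taken_by(conn, "John Doe"))
```

Because the relationship is a row rather than something hard-coded in an application, one student can take many courses and one course can have many students.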
Database System
is an integrated system of hardware, software, people, procedures,
and data that define and regulate the collection, storage,
management, and use of data within a database environment
Data Hierarchy: the systematic organization of data
Database System versus File System
DBMS | File Processing System
Data redundancy is minimal in a DBMS | Data redundancy problem exists
Data inconsistency does not exist | Data inconsistency exists here
Accessing the database is easier | Accessing is comparatively difficult
The problem of data isolation is not found in a database | Data is scattered in various files, and files may be of different formats, so the data isolation problem exists
Transactions like insert, delete, view, update, etc. are possible in a database | In a file system, transactions are not possible
Concurrent access and recovery are possible in a database | Concurrent access and recovery are not possible
Security of data is good | Security of data is not good
A database manager (administrator) stores the relationship in form of | A file manager is used to store all relationships in directories in file
Why learn about databases?
Data mining
Integrating information
BICTE?
How could we find this using a conventional
search within a file system?
Do we get what we want?
Hardware
Software
- OS
- DBMS
- Applications
People
Procedures
Data
Compliance
The user has the right to a system that performs exactly as promised.
Instruction
The user has the right to easy-to-use instructions (user guides, online
or contextual help, error messages) for understanding and utilizing a
system to achieve desired goals and recover efficiently and gracefully
from problem situations.
Usability
The user should be the master of software and hardware technology,
not vice-versa. Products should be natural and intuitive to use.
Database: Data Models
Importance
Abstraction of complex real-world data structures into relatively simple
(graphical) representations
Facilitate interaction among the designer, the applications
programmer, and the end user
Data independence
You don’t need to know the implementation of the database to
access the data
Applications are insulated from how data is structured and stored
e.g., the physical order of tuples can change
Note that a query does not change when the physical structure changes
One of the most important benefits of using a DBMS
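A minimal sketch of data independence, assuming Python's built-in sqlite3 module and a hypothetical Students table: the same query text is run before and after a physical change (adding an index), and the application code does not change. The sample rows and the index name idx_students_age are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sID INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(1, "Asha", 20), (2, "Bikram", 22), (3, "Chen", 20)],
)
conn.commit()

# The application only knows this query text, not how rows are stored.
QUERY = "SELECT name FROM Students WHERE age = 20 ORDER BY name"


def run_query(conn):
    """Run the fixed query and return the matching names."""
    return [row[0] for row in conn.execute(QUERY)]


before = run_query(conn)

# Change the physical structure: add an index on the age column.
conn.execute("CREATE INDEX idx_students_age ON Students(age)")

after = run_query(conn)

# Same query, same answer, even though the storage structures changed.
print(before == after)
```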
Why Use a DBMS? Cont…
Efficient access
queries are optimized.
Reduced application development time
Queries can be expressed declaratively; we do not need to
indicate how to execute them
Data integrity and security
Some constraints on the data are enforced automatically.
Data Consistency
Data Constraints:
All students must have a student ID (sID)
Etc.
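The constraint "all students must have a student ID" can be sketched as follows, assuming Python's built-in sqlite3 module. The table layout and the helper try_insert are hypothetical; the point is only that the DBMS, not the application, rejects the invalid rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sID must be present (NOT NULL) and unique (PRIMARY KEY).
conn.execute(
    "CREATE TABLE Students (sID TEXT NOT NULL PRIMARY KEY, name TEXT NOT NULL)"
)


def try_insert(conn, sid, name):
    """Attempt an insert; return True if the DBMS accepted the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Students (sID, name) VALUES (?, ?)", (sid, name)
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(conn, "S001", "John Doe")     # valid row
missing_id = try_insert(conn, None, "No ID")         # violates NOT NULL
duplicate_id = try_insert(conn, "S001", "Someone")   # violates PRIMARY KEY

print(accepted, missing_id, duplicate_id)
```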
Concurrent access, recovery from crashes
Many users can access/update the database at the same time.
Concurrency is important for DBMS performance,
because disk accesses are frequent and relatively slow.
Uncontrolled concurrent access can lead to inconsistency:
A cheque is cleared while the account balance is being computed.
The DBMS ensures that such problems do not arise.
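The sketch below illustrates one mechanism behind this guarantee, the all-or-nothing transaction, using Python's built-in sqlite3 module. It runs on a single connection, so it shows atomicity and rollback rather than true concurrency. The Accounts table, its CHECK constraint, and the transfer helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as a single transaction."""
    try:
        with conn:  # commit if both updates succeed, otherwise roll back
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative; the credit is undone too.
        return False


ok = transfer(conn, "A", "B", 30)       # succeeds
failed = transfer(conn, "A", "B", 500)  # rejected and rolled back
balances = dict(conn.execute("SELECT id, balance FROM Accounts"))
print(ok, failed, balances)
```

The second transfer credits account B before the debit fails, yet the final balances show no trace of that partial update.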
Yahoo: 2 PB (1 PB ≈ 10^15 B)
Operations on data
Constraints
SELECT *
FROM Students
WHERE age = 20;
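To make the declarative style concrete, the sketch below runs the query above with Python's built-in sqlite3 module and compares it with a hand-written filter over the same rows. The column layout (sID, name, age) and the sample data are assumptions, and an ORDER BY clause is added only to make the result order predictable.

```python
import sqlite3

# Hypothetical sample data: (sID, name, age)
rows = [(1, "Asha", 20), (2, "Bikram", 22), (3, "Chen", 20)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sID INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", rows)
conn.commit()

# Declarative: say WHAT is wanted; the DBMS decides how to find it.
declarative = conn.execute(
    "SELECT * FROM Students WHERE age = 20 ORDER BY sID"
).fetchall()

# Procedural: spell out HOW to scan and test every record.
procedural = [record for record in rows if record[2] == 20]

print(declarative == procedural)
```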
File-based
Hierarchical
Object-oriented
Network
Relational
Web-based
Entity-Relationship
Database: Historical Roots
Weakness
“Islands of data” in scattered file systems.
Problems
Duplication
same data may be stored in multiple files
Inconsistency
same data may be stored under different names and in different formats
Rigidity
requires customized programming to implement any changes
cannot do ad-hoc queries
Implications
Waste of space
Data inaccuracies
High overhead of data manipulation and maintenance
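The duplication and inconsistency problems can be sketched in plain Python. The two "files" below are hypothetical: they hold the same customers under different field names, and one phone number has drifted out of sync. The helper find_conflicts is the kind of custom code a file-based system needs just to detect the problem.

```python
# Two hypothetical departmental files holding overlapping customer data.
# Note the different field names for the same facts.
sales_file = [
    {"cust_name": "R. Sharma", "phone": "555-0101"},
    {"cust_name": "M. Lee", "phone": "555-0102"},
]
billing_file = [
    {"customer": "R. Sharma", "tel": "555-0199"},  # updated here only
    {"customer": "M. Lee", "tel": "555-0102"},
]


def find_conflicts(sales, billing):
    """Return customers whose phone number differs between the two files."""
    billing_phones = {rec["customer"]: rec["tel"] for rec in billing}
    conflicts = []
    for rec in sales:
        name = rec["cust_name"]
        if name in billing_phones and billing_phones[name] != rec["phone"]:
            conflicts.append(name)
    return conflicts


print(find_conflicts(sales_file, billing_file))
```

Every new file or renamed field would require another change to this code, which is the rigidity described above.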
File System: Problem Case
Hierarchical Database
Background
Developed to manage large amounts of data for complex manufacturing
projects
e.g., Information Management System (IMS)
IBM-Rockwell joint venture
clustered related data together
hierarchically associated data clusters using pointers
Disadvantages
Limited representation of data relationships
did not allow Many-to-Many (M:N) relationships
Complex implementation
required in-depth knowledge of physical data storage
Structural Dependence
data access requires physical storage path
Lack of Standards
limited portability
Network Database
Objectives
Represent more complex data relationships
Improve database performance
Impose a database standard
Advantages
More data relationship types
More efficient and flexible data access
“network” vs. “tree” path traversal
Conformance to standards
enhanced database administration and portability
Disadvantages
System complexity
requires familiarity with the internal structure for data access
Lack of structural independence
small structural changes require significant program changes
Relational Database
Advantages
Structural independence
Separation of database design and physical data storage/access
Easier database design, implementation, management, and use
Ad hoc query capability with Structured Query Language (SQL)
the DBMS translates SQL queries into executable code
Disadvantages
Substantial hardware and system software overhead
more complex system
Poor design and implementation are made easy
ease-of-use allows careless use of RDBMS
Entity Relationship Model
Entity
represented by a rectangle with its name in capital
letters.
Relationships
represented by an active or passive verb inside the
diamond that connects the related entities.
Connectivities
i.e., types of relationship
written next to each entity box.
Entity
represented by a rectangle with its name in
capital letters.
Relationships
represented by an active or passive verb
that connects the related entities.
Connectivities
indicated by symbols next to entities.
2 vertical lines for 1
“crow’s foot” for M
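One common way to carry a 1:M connectivity from an ER diagram into a relational database is a foreign key on the "many" side. The sketch below assumes Python's built-in sqlite3 module; the entities PAINTER and PAINTING, their columns, and the helper functions are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- "1" side of the relationship
    CREATE TABLE PAINTER (
        painter_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    -- "M" side: each painting points to exactly one painter
    CREATE TABLE PAINTING (
        painting_id INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        painter_id  INTEGER NOT NULL REFERENCES PAINTER(painter_id)
    );
""")
conn.execute("INSERT INTO PAINTER VALUES (1, 'A. Painter')")
conn.executemany(
    "INSERT INTO PAINTING VALUES (?, ?, ?)",
    [(10, "Sunrise", 1), (11, "Harbour", 1)],
)
conn.commit()


def paintings_by(conn, painter_id):
    """Return the titles of all paintings by one painter."""
    rows = conn.execute(
        "SELECT title FROM PAINTING WHERE painter_id = ? ORDER BY title",
        (painter_id,),
    )
    return [row[0] for row in rows]


def add_painting(conn, painting_id, title, painter_id):
    """Insert a painting; return False if the painter does not exist."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO PAINTING VALUES (?, ?, ?)",
                (painting_id, title, painter_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


titles = paintings_by(conn, 1)
orphan_accepted = add_painting(conn, 12, "Orphan", 99)  # no painter 99
print(titles, orphan_accepted)
```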
Advantages
Disadvantages
Incomplete model on its own
Limited representational power
cannot model data constraints not tied to entity relationships
e.g. attribute constraints
cannot represent relationships between attributes within entities
No data manipulation language (e.g. SQL)
Object-Oriented Database
Disadvantages
Lack of standards
no standard data access method
Complex navigational data access
class hierarchy traversal
Steep learning curve
difficult to design and implement properly
More system-oriented than user-centered
High system overhead
slow transactions
Web Database