DB_notes
Database Systems
CHAPTER 1
1.1 The Evolution of Database Systems
What is a database system?
What are its components?
DATABASE SYSTEM
A database system is an application or platform used to create, manage,
and organize data. It allows for efficient storage, retrieval, and
manipulation of data in a structured manner.
COMPONENTS:
1. Database
2. DBMS
3. Users
4. Query Language
Database:
A database is an organized collection of data that is stored
and managed to allow for easy access, retrieval, and
manipulation.
DBMS:
The software that manages the database, allowing users
to interact with and manipulate the data.
Users:
The individuals or applications that interact with the
database and DBMS.
Query Language:
A language like SQL (Structured Query Language) used to
interact with the database.
Features of a DBMS (Database Management System):
The key features of a DBMS include:
1. Allow users to create databases and define their logical
structure using a data-definition language.
2. Enable users to query and modify the data through a query or
data-manipulation language.
3. Support storage of large amounts of data with efficient
access for queries and modifications.
4. Durability: Ensures recovery of the database in case of
failures, errors, or misuse.
5. Isolation: Control access to data from many users at once,
without allowing unexpected interactions among users.
6. Atomicity: A set of actions on the database happens either in its
entirety or not at all.
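Features 1 and 2 above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample rows are made up for illustration; they are not from these notes.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")

# Data-definition language (DDL): define the logical structure.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT)"
)

# Data-manipulation language (DML): insert, modify, and query the data.
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("Asha", "Sales"), ("Bilal", "Engineering"), ("Chen", "Engineering")],
)
conn.execute("UPDATE employee SET dept = ? WHERE name = ?", ("Marketing", "Asha"))
conn.commit()

engineers = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employee WHERE dept = ? ORDER BY name", ("Engineering",)
    )
]
print(engineers)
```

The CREATE TABLE statement is DDL, while INSERT, UPDATE, and SELECT are DML. The same split applies in other SQL-based systems, although the exact dialect differs.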
Problems before DBMS:
• Data Searching
• Lack of Data Security
• Limited Multi-User Support
• No Efficient Data Access
• Data Redundancy
• Limited Storage
How DBMS Improved Things?
The company might want to see an overall picture, like how many
employees it has or what products it makes, but the differences in
databases make this difficult. Older applications rely on these
databases, so they can’t just throw them away and start fresh.
SOLUTIONS:
Data Warehouses:
All the data from different databases is copied into one central
database. While copying, the data is cleaned and standardized so
everything uses the same terms and formats.
Middleware (Mediators):
Middleware works as a translator between the databases. It lets users
search and analyze data as if it’s in one system, even though it’s still
in separate databases. Middleware supports an integrated model of the
data of various databases, while translating between this model and the
actual models used by each database.
Hence, the middleware approach is clearly the easier and more efficient
one.
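The difference between the two solutions can be sketched in a few lines of Python. Everything here is hypothetical: the two source "databases" are plain lists, and the field names and the translation functions from_hr and from_payroll are invented for illustration.

```python
# Two hypothetical source databases that describe employees differently.
hr_db = [{"emp_name": "Asha", "salary_usd": 5000}]
payroll_db = [{"name": "Bilal", "monthly_pay_cents": 420000}]

def from_hr(row):
    """Translate an HR row into the shared (integrated) model."""
    return {"name": row["emp_name"], "salary": row["salary_usd"]}

def from_payroll(row):
    """Translate a payroll row into the shared model (cents to dollars)."""
    return {"name": row["name"], "salary": row["monthly_pay_cents"] // 100}

def build_warehouse():
    """Data warehouse: copy everything once, standardizing on the way in."""
    return [from_hr(r) for r in hr_db] + [from_payroll(r) for r in payroll_db]

def mediator_query(predicate):
    """Mediator: leave the data in place and translate rows at query time."""
    for row in hr_db:
        unified = from_hr(row)
        if predicate(unified):
            yield unified
    for row in payroll_db:
        unified = from_payroll(row)
        if predicate(unified):
            yield unified

def high_paid(employee):
    return employee["salary"] > 4500

warehouse = build_warehouse()
hr_db.append({"emp_name": "Chen", "salary_usd": 6100})  # a source changes later

from_warehouse = [e["name"] for e in warehouse if high_paid(e)]
from_mediator = [e["name"] for e in mediator_query(high_paid)]
print(from_warehouse, from_mediator)
```

In this sketch the warehouse answers from its copy, so it misses the row added afterwards until it is rebuilt, while the mediator sees the change because it reads the sources at query time.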
1.2 Overview of a Database Management System
To read the system diagram, we need to remember a few things:
• Single boxes represent different components of the system.
• Double boxes represent in-memory data structures.
• Solid lines show how both control and data move through the system.
• Dashed lines show data movement only.
Users and Application Programs: These request data from the
database or make changes to the data stored in it.
Atomicity
Consistency
Isolation
Durability
Terminologies to Know
Blocks:
A block refers to the smallest unit of data that can be read from or
written to the disk. A block corresponds to a fixed-size chunk of data on
the storage device (magnetic disk), typically ranging from 512 bytes to a
few kilobytes.
Buffers:
A buffer is a designated area in main memory (RAM) where data from disk
blocks is temporarily stored before processing.
Buffer Manager:
The buffer manager is a software component of the DBMS responsible for
allocating and managing buffers in main memory.
Storage Manager:
It is the job of the storage manager to control the placement of data on
disk and its movement between disk and main memory.
1.2.3 Storage and Buffer Management:
In a database system, most data is stored on secondary storage, like
hard drives or SSDs.
Why Secondary storage devices?
1. Secondary storage can hold large amounts of data, unlike main memory
(RAM), which is smaller and temporary.
2. However, to perform any useful operation (e.g., searching or updating
data), the data must first be loaded into main memory (RAM),
because the CPU cannot directly work with data on disk.
The buffer manager is responsible for dividing the available main
memory into buffers, which are page-sized regions into which disk
blocks can be transferred. Thus, all DBMS components that need
information from the disk will interact with the buffers and the buffer
manager, either directly or through the execution engine. The kinds of
information that various components may need include:
1. Data: the contents of the database itself.
2. Metadata: the database schema that describes the structure of, and
constraints on, the database.
3. Log Records: information about recent changes to the database;
these support durability of the database.
4. Statistics: information gathered and stored by the DBMS about data
properties such as the sizes of, and values in, various relations or other
components of the database.
5. Indexes: data structures that support efficient access to the data.
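The buffer manager's role described above can be sketched as a toy buffer pool. This is only an illustration: the class name BufferManager, the method get_block, and the least-recently-used (LRU) replacement policy are assumptions, since the notes do not say how a buffer is chosen for reuse. Writing modified buffers back to disk is left out.

```python
from collections import OrderedDict

class BufferManager:
    """Toy buffer pool: a fixed number of buffers with LRU replacement (assumed)."""

    def __init__(self, disk, num_buffers):
        self.disk = disk                # block_id -> bytes; stands in for the disk
        self.num_buffers = num_buffers  # how many blocks fit in main memory
        self.buffers = OrderedDict()    # block_id -> contents, oldest use first
        self.disk_reads = 0

    def get_block(self, block_id):
        if block_id in self.buffers:
            self.buffers.move_to_end(block_id)  # mark as most recently used
            return self.buffers[block_id]
        if len(self.buffers) >= self.num_buffers:
            self.buffers.popitem(last=False)    # evict the least recently used
        self.disk_reads += 1                    # block must come from disk
        self.buffers[block_id] = self.disk[block_id]
        return self.buffers[block_id]

disk = {i: f"block-{i}".encode() for i in range(4)}
bm = BufferManager(disk, num_buffers=2)
for block_id in [0, 1, 0, 2, 0, 1]:
    bm.get_block(block_id)
print(bm.disk_reads, list(bm.buffers))
```

Six requests cause only four disk reads here, because block 0 is found in a buffer twice. Avoiding repeated disk reads is the point of keeping blocks in main memory.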
Storage Hierarchy From Diagram:
• Main memory with buffer pools
• Secondary storage (disk) with data
blocks
• Data movement between layers
Transaction properties
Recall that we learnt about the ACID test.
The ACID properties include:
ATOMICITY
CONSISTENCY
ISOLATION
DURABILITY
Atomicity
A transaction is atomic, meaning it is an all-or-nothing operation. Either
all the changes made during the transaction are committed, or none are.
If any part of the transaction fails, all changes are rolled back.
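Example:
If the transaction involves two steps, such as the debit and the credit in
the transfer below, either both steps take effect or neither does.
Consistency:
Example: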
Before Transaction:
Account A has $500.
Account B has $300.
Transaction:
Debit $100 from Account A.
Credit $100 to Account B.
After Transaction:
Account A should have $400 (after debit).
Account B should have $400 (after credit).
Now, let’s say that the transaction is supposed to ensure:
Account A cannot have a negative balance.
Account B cannot have more money than allowed (let’s say the maximum limit is $500).
If the transaction is consistent, after transferring $100, Account A will have $400 and Account
B will have $400. Both accounts are valid and consistent with the rules. If there was a rule
violation, for example, if the transaction tried to take out more money from Account A than it
had (e.g., $600 instead of $100), the transaction would violate consistency, and the DBMS would
reject the transaction.
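The transfer example above can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the table account and the helper functions transfer and balances are hypothetical, and a single CHECK constraint applies both rules (no negative balance, maximum of $500) to every account for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: one CHECK constraint encodes both rules for every account.
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance BETWEEN 0 AND 500))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 300)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Debit src and credit dst as one transaction; return True if committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # a rule was violated, so neither update survives

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

ok_first = transfer(conn, "A", "B", 100)   # A: 500 -> 400, B: 300 -> 400
ok_second = transfer(conn, "A", "B", 200)  # debit fits, credit would make B $600
print(ok_first, ok_second, balances(conn))
```

In the second call the debit from A succeeds on its own, but the credit would push B over the limit. The rollback undoes the debit as well, so both accounts stay at $400. That covers atomicity (all or nothing) and consistency (the rules always hold).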
Isolation:
Transactions are isolated from each other. One transaction should not
affect the execution of another transaction, even if they occur
concurrently. In a DBMS, isolation is commonly implemented through locking.
How It Works:
Isolation ensures that intermediate states of a transaction are not
visible to other transactions. Each transaction appears to execute in
isolation, even if multiple transactions are running concurrently.
Without Isolation (Initial state):
• Transaction 1 reads balance ($1000)
• Transaction 2 reads balance ($1000) before T1 finishes
• Transaction 1 writes new balance ($800)
• Transaction 2 writes new balance ($700)
• The $200 deduction from T1 is lost!
• Final balance is wrong ($700 instead of $500)
With Isolation:
• Transaction 1 acquires lock and reads balance ($1000)
• Transaction 2 must wait (shown by delayed position)
• Transaction 1 writes new balance ($800) and releases lock
• Transaction 2 can now proceed, reads $800
• Transaction 2 writes final balance ($500)
• Correct final state achieved
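The two scenarios above can be replayed in a short Python sketch. The first part reproduces the lost update by hand, step by step. The second part uses a hypothetical Account class with a threading.Lock to stand in for the lock a DBMS would take. A real DBMS uses its own lock manager, so this is only an analogy.

```python
import threading

# Without isolation: replay the interleaving step by step.
balance = 1000
t1_read = balance        # T1 reads $1000
t2_read = balance        # T2 reads $1000 before T1 finishes
balance = t1_read - 200  # T1 writes $800
balance = t2_read - 300  # T2 writes $700, overwriting T1's update
lost_update_balance = balance

# With isolation: a lock makes each read-modify-write step indivisible.
class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        with self._lock:  # the other transaction waits here
            current = self.balance
            self.balance = current - amount

account = Account(1000)
threads = [
    threading.Thread(target=account.withdraw, args=(amount,))
    for amount in (200, 300)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(lost_update_balance, account.balance)
```

Whichever thread gets the lock first, the other one reads the already updated balance, so the final balance is $500 rather than $700.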
Durability:
Durability ensures that once a transaction is committed, its changes are
permanent, even in the face of failures like power outages or crashes. It
is typically achieved by recording changes in a transaction log and
writing them to non-volatile storage (e.g., disk or SSD) before the
transaction completes.
Example:
A transaction deposits $500 into an account. Once it is committed,
the database ensures the $500 is stored in durable storage.
Even if the system crashes immediately after, the $500 will be
available upon recovery.
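The deposit example can be approximated with a file-backed SQLite database. This sketch does not simulate a real crash. It only shows that a committed change is still there after the connection is closed and reopened, while an uncommitted one is not. The file name, the table, and the extra $999 update are made up for illustration.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bank.db")  # hypothetical database file

    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES ('A', 0)")
    conn.commit()

    conn.execute("UPDATE account SET balance = balance + 500 WHERE name = 'A'")
    conn.commit()  # the deposit is committed at this point

    conn.execute("UPDATE account SET balance = balance + 999 WHERE name = 'A'")
    conn.close()   # closed without commit: this change is discarded

    # "Recovery": open the same file again and read the balance.
    conn = sqlite3.connect(path)
    recovered = conn.execute(
        "SELECT balance FROM account WHERE name = 'A'"
    ).fetchone()[0]
    conn.close()

print(recovered)
```

Only the committed $500 deposit is visible after reopening the file.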
1.2.5 Query Processing:
Query Optimizer:
Finds the most efficient execution plan. It transforms the initial query plan
into the best available sequence of operations on the actual data. The
optimizer uses information about the database, such as:
Metadata: Describes the structure of the database, including tables,
columns, indexes, and constraints.
Statistics: Information about the data, like the number of rows in a
table, the distribution of values in a column, or the presence of
unique values.
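One way to watch an optimizer use metadata and statistics is SQLite's EXPLAIN QUERY PLAN statement, shown here through Python's sqlite3 module. The table, the index name idx_employee_dept, and the helper plan are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [(f"emp{i}", f"dept{i % 50}") for i in range(5000)],
)

def plan(conn, sql, params=()):
    """Return the optimizer's chosen plan as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # 4th column describes each step

query = "SELECT name FROM employee WHERE dept = ?"
plan_before = plan(conn, query, ("dept7",))

# New metadata (an index) and fresh statistics give the optimizer a better option.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
conn.execute("ANALYZE")
plan_after = plan(conn, query, ("dept7",))

print(plan_before)
print(plan_after)
```

Before the index exists, the only available plan is a scan of the whole table. Afterwards the plan text names idx_employee_dept, which shows that the optimizer switched to an index lookup for the same query.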