Basic RDBMS Concepts
BY KINGSHUK SRIVASTAVA
Agenda
Introduction
SQL
Introduction
Shift from computation to information
  always true for corporate computing
  Web made this point for personal computing
  more and more true for scientific computing
Need for DBMS has exploded in recent years
  Corporate: retail swipe/clickstreams, customer relationship mgmt, supply chain mgmt, data warehouses, etc.
  Scientific: digital libraries, Human Genome Project, NASA Mission to Planet Earth, physical sensors, grid physics network
DBMS encompasses much of CS in a practical discipline
  OS, languages, theory, AI, multimedia, logic
  Yet traditional focus on real-world apps
modeling languages and systems for querying data
  complex queries with real semantics* over massive data sets
concurrency control for data manipulation
  controlling concurrent access
  ensuring transactional semantics
reliable data storage
  maintain data semantics even if you pull the plug
* semantics: the meaning or relationship of meanings of a sign or set of signs
data
Data? Information?
cardboard file. Every business group has its own set of files
Data Redundancy
Data Inconsistency
Data cannot be shared
maintain information
A database is a repository for stored data and
Advantages of DBMS
Centralized control.
Controlled (minimal) data redundancy
Data Consistency
Data can be shared
Data Models
A data model is a collection of concepts for describing data
A schema is a description of a particular collection of data, using the given data model
The relational model is the most widely used model today
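As a small illustration of a schema expressed in the relational model, the sketch below declares one table and reads its description back. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the `students` table and its columns are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: a description of one particular collection of data (students),
# written in the relational model's terms: a table with named, typed columns.
conn.execute("CREATE TABLE students (sid INTEGER, name TEXT, gpa REAL)")
conn.execute("INSERT INTO students VALUES (1, 'Asha', 3.6)")

# The DBMS stores the schema itself and can report it back.
# Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
columns = [row[1] for row in conn.execute("PRAGMA table_info(students)")]
print(columns)
```

The data (the inserted row) and the schema (the column list) are kept apart: the schema describes what any row must look like.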
Levels of Abstraction
Many views, a single conceptual schema, and a single physical schema
Views describe how users see the data
Conceptual schema defines the logical structure
Physical schema defines the physical files and indexes
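The three levels can be sketched in a few statements. This is only an illustration, assuming SQLite through Python's sqlite3 module; the `employees` table, the `staff_directory` view, and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: the logical structure of the data.
conn.execute(
    "CREATE TABLE employees (eid INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ravi", "Sales", 50000.0), (2, "Meena", "HR", 45000.0)],
)

# View: how one group of users sees the data (salary is hidden from them).
conn.execute("CREATE VIEW staff_directory AS SELECT name, dept FROM employees")

# Physical schema: an index is a storage-level structure, invisible to queries.
conn.execute("CREATE INDEX idx_employees_dept ON employees (dept)")

directory = conn.execute(
    "SELECT name, dept FROM staff_directory ORDER BY name"
).fetchall()
print(directory)
```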
Data Independence
Applications are insulated from how data is structured and stored
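A minimal sketch of physical data independence, assuming SQLite through Python's sqlite3 module: the application's query is left untouched while the storage design changes underneath it. The `orders` table and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (oid INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", 120.0), (2, "globex", 75.5), (3, "acme", 30.0)],
)

# The application only ever knows this query text.
QUERY = "SELECT oid, amount FROM orders WHERE customer = ? ORDER BY oid"

before = conn.execute(QUERY, ("acme",)).fetchall()

# Change the physical design only: add an index on the filtered column.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

after = conn.execute(QUERY, ("acme",)).fetchall()
print(before == after)
```

The index may change how the rows are found, but not which rows come back, so the application needs no change.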
Structure of a DBMS
ACID Test
Atomicity
Consistency
Isolation
Durability
(reads/writes).
DBMS ensures atomicity (all-or-nothing property) even if the system crashes in the middle of a transaction.
Each transaction, executed completely, must take the DB from one consistent state to another, or must not run at all.
DBMS ensures that concurrent transactions appear to run in isolation.
DBMS ensures durability of committed transactions even if the system crashes.
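The all-or-nothing property can be observed with a simulated failure in the middle of a transfer. This is a sketch, assuming SQLite through Python's sqlite3 module; the `accounts` table, the `transfer` helper, and the `fail_midway` flag are hypothetical, and a raised exception stands in for a real crash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(amount, fail_midway=False):
    """Move `amount` from account a to account b as one transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'a'", (amount,)
            )
            if fail_midway:
                raise RuntimeError("simulated crash between the two writes")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'b'", (amount,)
            )
    except RuntimeError:
        pass  # the debit above has already been rolled back


transfer(30, fail_midway=True)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

After the failed transfer neither write is visible, so the total across both accounts is unchanged.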
Note: you can specify simple integrity constraints on the data, and the DBMS enforces these. Beyond this, the DBMS does not understand the semantics of the data. Ensuring that a single transaction (run alone) preserves consistency is largely the user's responsibility!
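A simple integrity constraint of the kind the note mentions might look like the sketch below, which assumes SQLite through Python's sqlite3 module. The `accounts` table and the rule that balances cannot go negative are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A declared constraint: the DBMS itself rejects negative balances.
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('a', 100)")

try:
    conn.execute("UPDATE accounts SET balance = balance - 500 WHERE name = 'a'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the violating update was refused

balance = conn.execute("SELECT balance FROM accounts WHERE name = 'a'").fetchall()[0][0]
print(rejected, balance)
```

What the DBMS cannot check is meaning it was never told about, for example that a transfer must credit exactly what it debits. That part stays with whoever writes the transaction.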
Types of DBMS
Hierarchical
Network
Relational
Network DBMS
Data is represented by records and pointers
Relational DBMS
Based on Relational Mathematics principles
a table
Addresses all types of relations
Easy to design
No insert/delete/update anomalies (in a normalized design)
Relational Terminology
Tuple (Row)
Attribute (Column)
Relation (Table)
Integrity Constraints
Primary Key
Alternate Key
Foreign Key
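The three kinds of key can be declared and exercised as follows. This is a sketch, assuming SQLite through Python's sqlite3 module; the tables, columns, and the `violates` helper are hypothetical. An alternate key is modelled here as a `NOT NULL UNIQUE` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    """CREATE TABLE departments (
           dept_id   INTEGER PRIMARY KEY,      -- primary key
           dept_name TEXT NOT NULL UNIQUE      -- alternate key
       )"""
)
conn.execute(
    """CREATE TABLE employees (
           emp_id  INTEGER PRIMARY KEY,
           name    TEXT NOT NULL,
           dept_id INTEGER NOT NULL REFERENCES departments (dept_id)  -- foreign key
       )"""
)
conn.execute("INSERT INTO departments VALUES (10, 'Sales')")
conn.execute("INSERT INTO employees VALUES (1, 'Ravi', 10)")


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


dup_primary = violates("INSERT INTO departments VALUES (10, 'HR')")
dup_alternate = violates("INSERT INTO departments VALUES (20, 'Sales')")
dangling_foreign = violates("INSERT INTO employees VALUES (2, 'Meena', 99)")
print(dup_primary, dup_alternate, dangling_foreign)
```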
Normalization
Normalization - process of removing data
and the relation is decomposed into a larger number of relations to remove insert, delete, and update anomalies.
approach.
Unnormalized Form
A relation is said to be in unnormalized form (0NF) if the values of any of its attributes are non-atomic; in other words, more than one value is associated with each instance of the attribute.
Unnormalized Relation
S#    PQ
      P#    QTY
S1    P1    300
      P2    200
      P3    400
      P4    200
S2    P1    300
      P2    400
S3    P2    200
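Removing the repeating group gives one row per supplier and part, with every value atomic. The sketch below uses plain Python. It assumes the grouping S1: P1 to P4, S2: P1 and P2, S3: P2, which is read off the order of the P# values in the table; the function name is hypothetical.

```python
# Unnormalized: each supplier carries a repeating group of (P#, QTY) pairs.
unnormalized = {
    "S1": [("P1", 300), ("P2", 200), ("P3", 400), ("P4", 200)],
    "S2": [("P1", 300), ("P2", 400)],
    "S3": [("P2", 200)],
}


def to_first_normal_form(relation):
    """Return one (S#, P#, QTY) tuple per entry of each repeating group."""
    return [
        (supplier, part, qty)
        for supplier, parts in relation.items()
        for part, qty in parts
    ]


rows_1nf = to_first_normal_form(unnormalized)
for row in rows_1nf:
    print(row)
```

Every attribute in the result holds a single value, and the pair (S#, P#) identifies a row.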
Functional Dependency
Given a relation R, attribute Y of R is functionally dependent on attribute X if and only if each X-value in R has associated with it precisely one Y-value in R (at any one time)
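The definition translates directly into a check over the rows of a relation. This is a minimal sketch in plain Python; the function name and the sample rows are hypothetical, and the check only speaks about the rows present at one point in time.

```python
def functionally_determines(rows, x, y):
    """Return True if every x-value in rows is paired with exactly one y-value."""
    seen = {}
    for row in rows:
        # Remember the first y seen for this x; any different y breaks the dependency.
        if seen.setdefault(row[x], row[y]) != row[y]:
            return False
    return True


rows = [
    {"student": "Smith", "subject": "Math", "teacher": "White"},
    {"student": "Smith", "subject": "Physics", "teacher": "Green"},
    {"student": "Jones", "subject": "Math", "teacher": "White"},
    {"student": "Jones", "subject": "Physics", "teacher": "Brown"},
]

teacher_to_subject = functionally_determines(rows, "teacher", "subject")
subject_to_teacher = functionally_determines(rows, "subject", "teacher")
print(teacher_to_subject, subject_to_teacher)
```

In these rows each teacher appears with one subject only, while Physics appears with two teachers, so the dependency holds in one direction and not the other.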
teacher. Each teacher teaches only one subject. Each subject is taught by several teachers.
Position 1 2 2 1
subject.
Codd's Rules
Proposed in 1985 to test DBMSs for conformance to Codd's relational model
Hardly any commercial product follows all of them
Rule Zero
For a system to qualify as an RDBMS, it must be able to manage its databases entirely through its relational capabilities
The other 12 rules derive from this rule
Example: SQL
If the file supporting a table can be accessed in any manner other than through a SQL interface, the rule is violated
Rule 6: View updating
All views that are theoretically updatable should be updatable
View = "virtual table", temporarily derived from base tables
Example: if a view is formed as a join of 3 tables, changes to the view should be reflected in the base tables
Not updatable:
  View does not have a NOT NULL attribute of the base table
  Problems with computed fields in the view, e.g. Total Income = White income + Black income
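The computed-field problem can be tried out with a small view. This sketch assumes SQLite through Python's sqlite3 module, which treats every view as read-only unless INSTEAD OF triggers are defined, so it shows one engine's behaviour rather than the rule itself. The `income` table and `income_totals` view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE income (person TEXT PRIMARY KEY, salary INTEGER, bonus INTEGER)"
)
conn.execute("INSERT INTO income VALUES ('ravi', 900, 100)")

# A view with a computed field: total is derived, not stored.
conn.execute(
    "CREATE VIEW income_totals AS SELECT person, salary + bonus AS total FROM income"
)

# The view is virtual: a change to the base table shows up in it immediately.
conn.execute("UPDATE income SET bonus = 200 WHERE person = 'ravi'")
total = conn.execute("SELECT total FROM income_totals").fetchall()[0][0]

# Writing to the computed field has no unambiguous meaning for the base table
# (which of salary and bonus should change?), and SQLite refuses view updates.
try:
    conn.execute("UPDATE income_totals SET total = 5000 WHERE person = 'ravi'")
    view_update_ok = True
except sqlite3.Error:
    view_update_ok = False

print(total, view_update_ok)
```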
Rule 7: Relational level operations
There must be insert, update, and delete operations at the level of relations
Rule 10: Integrity independence
The database should be able to enforce its own integrity rather than relying on other programs
Integrity rules = filter that allows only correct data; they should be stored in the data dictionary
Key and check constraints, triggers, etc. should be stored in the data dictionary
This also makes the RDBMS independent of the front end
Rule 12: Nonsubversion
If low-level access to a system is allowed, it should not be able to subvert or bypass the integrity rules to change data
This may be achieved by some sort of locking or encryption
Some vendors provide low-level access tools that violate this rule for extra speed
DDL Data Definition Language
DML Data Manipulation Language
DCL Data Control Language
DDL
Create
Alter
Drop
Truncate
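The DDL statements can be tried in sequence as below. This is a sketch, assuming SQLite through Python's sqlite3 module. SQLite has no TRUNCATE statement, so an unqualified DELETE stands in for it here; the `parts` table and the `table_names` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names():
    """List the user tables currently defined in the database."""
    sql = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


# CREATE: define a new table.
conn.execute("CREATE TABLE parts (pno TEXT PRIMARY KEY, pname TEXT)")

# ALTER: change the definition of an existing table.
conn.execute("ALTER TABLE parts ADD COLUMN weight REAL")
columns = [row[1] for row in conn.execute("PRAGMA table_info(parts)")]

# TRUNCATE removes all rows but keeps the table. SQLite does not provide it,
# so DELETE without a WHERE clause is used as a substitute.
conn.execute("INSERT INTO parts VALUES ('P1', 'Nut', 12.0)")
conn.execute("DELETE FROM parts")
conn.commit()
remaining = conn.execute("SELECT COUNT(*) FROM parts").fetchall()[0][0]
after_create = table_names()

# DROP: remove the table definition together with its data.
conn.execute("DROP TABLE parts")
after_drop = table_names()
print(columns, remaining, after_create, after_drop)
```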
DML
Insert
Update
Delete
Select
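A short run through the four DML statements, assuming SQLite through Python's sqlite3 module; the `shipments` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (sno TEXT, pno TEXT, qty INTEGER)")

# INSERT: add rows.
conn.executemany(
    "INSERT INTO shipments VALUES (?, ?, ?)",
    [("S1", "P1", 300), ("S1", "P2", 200), ("S2", "P1", 300)],
)

# UPDATE: change values in the rows that match a condition.
conn.execute("UPDATE shipments SET qty = qty + 50 WHERE sno = 'S2'")

# DELETE: remove the rows that match a condition.
conn.execute("DELETE FROM shipments WHERE pno = 'P2'")

# SELECT: read rows back.
result = conn.execute(
    "SELECT sno, pno, qty FROM shipments ORDER BY sno, pno"
).fetchall()
print(result)
```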
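DCL
Commit
Rollback
Savepoint
Set transaction
Note: these four are transaction control statements, often classified separately as TCL. DCL in the strict sense covers Grant and Revoke.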
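Commit, rollback, and savepoint can be seen working together in the sketch below. It assumes SQLite through Python's sqlite3 module, opened with `isolation_level=None` so that the transaction statements are issued explicitly. SQLite has no SET TRANSACTION statement, so that one is not shown; the `log` table and savepoint name are hypothetical.

```python
import sqlite3

# isolation_level=None: the module opens no transactions on its own,
# so every BEGIN, COMMIT, and ROLLBACK below is explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (entry TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('kept')")
conn.execute("SAVEPOINT before_risky_step")
conn.execute("INSERT INTO log VALUES ('undone by rollback to savepoint')")
conn.execute("ROLLBACK TO SAVEPOINT before_risky_step")  # undo back to the savepoint only
conn.execute("COMMIT")  # make the remaining work permanent

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('undone by rollback')")
conn.execute("ROLLBACK")  # undo the whole transaction

entries = [row[0] for row in conn.execute("SELECT entry FROM log")]
print(entries)
```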
Integrity Constraints
Primary key (PK)
Check
Data Types
Character
Varchar2
Number
Date
BLOB
BFILE
Arithmetic Operators
+
-
*
/
Mod
ABS
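The operators can be evaluated directly in a SELECT. This sketch assumes SQLite through Python's sqlite3 module, where the remainder is written with the `%` operator instead of a MOD function and where dividing two integers gives an integer. Other systems may differ on both points.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

row = conn.execute(
    "SELECT 7 + 2, 7 - 2, 7 * 2, 7 / 2, 7.0 / 2, 7 % 2, ABS(-7)"
).fetchall()[0]

labels = ["7 + 2", "7 - 2", "7 * 2", "7 / 2", "7.0 / 2", "7 % 2", "ABS(-7)"]
for label, value in zip(labels, row):
    print(label, "=", value)
```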
Set Operators
UNION
UNION ALL
INTERSECT
MINUS
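The four set operators can be compared on two small tables. This is a sketch, assuming SQLite through Python's sqlite3 module. MINUS is Oracle's keyword; SQLite and standard SQL spell the same operation EXCEPT, which is what the code uses. The tables `a` and `b` and the `run` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,), (4,)])


def run(operator):
    """Combine the two tables with the given set operator, sorted by x."""
    sql = f"SELECT x FROM a {operator} SELECT x FROM b ORDER BY x"
    return [row[0] for row in conn.execute(sql)]


union = run("UNION")          # rows in either table, duplicates removed
union_all = run("UNION ALL")  # rows in either table, duplicates kept
intersect = run("INTERSECT")  # rows present in both tables
minus = run("EXCEPT")         # rows in a that are not in b (MINUS in Oracle)
print(union, union_all, intersect, minus)
```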
Thank You