A2Z Dbms
A database is a collection of inter-related data organized so that it can be retrieved, inserted,
and deleted efficiently. The data is organized into structures such as tables, schemas, views, and
reports. For example, a college database organizes data about the admin, staff, students, and
faculty.
A database management system (DBMS) is the software used to manage a database. It supports the
following tasks:
o Data Definition: It is used for creation, modification, and removal of definition that defines
the organization of data in the database.
o Data Update: It is used for the insertion, modification, and deletion of the actual data in
the database.
o Data Retrieval: It is used to retrieve the data from the database which can be used by
applications for various purposes.
o User Administration: It is used for registering and monitoring users, maintaining data
integrity, enforcing data security, handling concurrency control, monitoring performance, and
recovering information corrupted by unexpected failures.
Characteristics of DBMS
o It uses a digital repository established on a server to store and manage the information.
o It can provide a clear and logical view of the process that manipulates data.
o It contains automatic backup and recovery procedures.
o It supports ACID properties, which keep data in a consistent state in case of failure.
o It can reduce the complex relationships between data.
o It is used to support manipulation and processing of data.
o It is used to provide security of data.
o It can present the database from different viewpoints according to the requirements of the
user.
Advantages of DBMS
o Controls database redundancy: It can control data redundancy because all the data is stored
in a single database rather than duplicated across separate files.
o Data sharing: In a DBMS, the authorized users of an organization can share the data among
multiple users.
o Easy maintenance: It is easy to maintain due to the centralized nature of the database
system.
o Reduces time: It reduces development time and maintenance needs.
o Backup: It provides backup and recovery subsystems that automatically back up data to protect
against hardware and software failures and restore the data if required.
o Multiple user interfaces: It provides different types of user interfaces, such as graphical
user interfaces and application program interfaces.
Disadvantages of DBMS
o Cost of Hardware and Software: It requires a high-speed processor and a large memory size to
run DBMS software.
o Size: It occupies a large amount of disk space and memory to run efficiently.
o Complexity: A database system creates additional complexity and requirements.
o Higher impact of failure: A failure has a large impact because, in most organizations, all the
data is stored in a single database; if that database is damaged by a power failure or by
corruption, the data may be lost forever.
Types of Databases
There are various types of databases used for storing different varieties of data:
1) Centralized Database
It is the type of database that stores data in one central database system. Users can access the
stored data from different locations through several applications. These applications include an
authentication process so that users access data securely. An example of a centralized database
is a central library that holds the combined catalog of every library in a college/university.
Advantages of Centralized Database
o It reduces the risk in data management, i.e., manipulation of data does not affect the core
data.
o Data consistency is maintained as it manages data in a central repository.
o It provides better data quality, which enables organizations to establish data standards.
o It is less costly because fewer vendors are required to handle the data sets.
2) Distributed Database
Unlike a centralized database system, in distributed systems, data is distributed among different
database systems of an organization. These database systems are connected via communication
links. Such links help the end-users to access the data easily. Examples of the Distributed database
are Apache Cassandra, HBase, Ignite, etc. We can further divide a distributed database system
into:
o Homogeneous DDB: Database systems that run on the same operating system, use the same
application process, and run on the same hardware devices.
o Heterogeneous DDB: Database systems that run on different operating systems, under different
application procedures, and on different hardware devices.
Advantages of Distributed Database
o Modular development is possible in a distributed database, i.e., the system can be expanded
by including new computers and connecting them to the distributed system.
o One server failure will not affect the entire data set.
3) Relational Database
This database is based on the relational data model, which stores data in the form of rows
(tuples) and columns (attributes) that together form a table (relation). A relational database
uses SQL for storing, manipulating, and maintaining the data. E. F. Codd proposed the relational
model in 1970. Each table in the database carries a key that uniquely identifies each row.
Examples of relational databases are MySQL, Microsoft SQL Server, Oracle, etc.
Transactions in a relational database are expected to satisfy four properties, known as the ACID
properties:
A means Atomicity: A transaction either completes in full or has no effect at all. It follows
the 'all or nothing' strategy. For example, a transaction will either be committed or will abort.
C means Consistency: A transaction must take the database from one valid state to another, so
that every rule defined on the data still holds afterwards. For example, when money is
transferred between two accounts, the total balance before and after the transaction should be
the same, i.e., it should remain conserved.
I means Isolation: Several users can access data in the database at the same time, so concurrent
transactions must be isolated from each other. For example, when multiple transactions occur at
the same time, the intermediate effects of one transaction should not be visible to the other
transactions in the database.
D means Durability: Once a transaction completes and commits its data, the changes remain
permanent, even after a system failure.
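Atomicity can be observed with a small experiment. The following is a minimal sketch using
Python's built-in sqlite3 module; the account table, its CHECK constraint, and the transfer
helper are hypothetical names chosen for illustration, not part of these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it aborts
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer, neither account has changed and the total balance is still 150,
which is the conservation described under Consistency.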
4) NoSQL Database
Non-SQL/Not Only SQL is a type of database that is used for storing a wide range of data sets. It
is not a relational database as it stores data not only in tabular form but in several different ways.
It came into existence when the demand for building modern applications increased. Thus, NoSQL
presented a wide variety of database technologies in response to the demands. We can further
divide a NoSQL database into the following four types:
a. Key-value storage: The simplest type of database storage, where every item is stored as a
key (or attribute name) together with its value.
b. Document-oriented Database: A type of database used to store data as JSON-like
documents. It helps developers store data in the same document-model format that is used in
the application code.
c. Graph Databases: Used for storing vast amounts of data in a graph-like structure. Most
commonly, social networking websites use graph databases.
d. Wide-column stores: Similar to the way data is represented in relational databases, except
that data is stored in large columns grouped together instead of in rows.
5) Cloud Database
A type of database where data is stored in a virtual environment and runs on a cloud computing
platform. It provides users with various cloud computing services (SaaS, PaaS, IaaS, etc.) for
accessing the database. There are numerous cloud platforms, but the best options are:
7) Hierarchical Databases
It is the type of database that stores data as nodes in parent-child relationships, organized in
a tree-like structure.
Data is stored in the form of records that are connected via links. Each child record in the tree
has exactly one parent, whereas each parent record can have multiple child records.
8) Network Databases
It is the database that typically follows the network data model. Here, the representation of data is
in the form of nodes connected via links between them. Unlike the hierarchical database, it allows
each record to have multiple children and parent nodes to form a generalized graph structure.
9) Personal Database
Collecting and storing data on the user's system defines a Personal Database. This database is
basically designed for a single user.
Functional dependencies
In relational database management, a functional dependency is a constraint between two sets of
attributes in which the value of one set determines the value of the other. It is denoted as
X → Y, where the attribute set on the left side of the arrow, X, is called the determinant, and
Y is called the dependent.
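Whether X → Y holds in a given set of rows can be checked mechanically: no two rows may agree on
X while differing on Y. The sketch below assumes rows are Python dicts; the function name
fd_holds and the sample student rows are illustrative and not taken from the notes.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in `rows` (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in determinant)
        y = tuple(row[a] for a in dependent)
        # setdefault stores y for a new x, or returns the y stored earlier.
        if seen.setdefault(x, y) != y:
            return False  # same X value maps to two different Y values
    return True


# Hypothetical sample data.
students = [
    {"roll_no": 42, "name": "Asha", "age": 17},
    {"roll_no": 43, "name": "Ravi", "age": 18},
    {"roll_no": 44, "name": "Asha", "age": 18},
]

roll_to_name = fd_holds(students, ["roll_no"], ["name"])
name_to_age = fd_holds(students, ["name"], ["age"])
```

Note that a check like this can only refute a dependency on the data at hand. Whether a
dependency is meant to hold for every possible row is a design decision about the schema.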
[Table: student records with columns roll_no, name, age]
Here, roll_no → name is a non-trivial functional dependency, since the dependent name is
not a subset of determinant roll_no. Similarly, {roll_no, name} → age is also a non-trivial
functional dependency, since age is not a subset of {roll_no, name}
[Table: student records with columns roll_no, name, age]
Here, roll_no → {name, age} is a multivalued functional dependency, since the dependents
name and age do not depend on each other (i.e., neither name → age nor age → name holds).
4. Transitive Functional Dependency
In transitive functional dependency, dependent is indirectly dependent on determinant. i.e.
If a → b & b → c, then according to axiom of transitivity, a → c. This is a transitive
functional dependency.
[Table: student records with columns enrol_no, dept, building_no]
Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of
transitivity, enrol_no → building_no is a valid functional dependency. This is an indirect
functional dependency, hence called a transitive functional dependency.
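Transitive dependencies can be derived automatically by computing the closure of an attribute
set, i.e., everything it determines under the given dependencies. The following is a minimal
sketch of the standard closure computation; the representation of a dependency as a
(left side, right side) pair of sets is an assumption made for this example.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know all of rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [
    ({"enrol_no"}, {"dept"}),
    ({"dept"}, {"building_no"}),
]

from_enrol = closure({"enrol_no"}, fds)
from_dept = closure({"dept"}, fds)
```

Because building_no appears in the closure of enrol_no, the dependency
enrol_no → building_no follows from the two given dependencies.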
5. Fully Functional Dependency
In full functional dependency, an attribute or set of attributes Y depends on the whole of a
determinant X and on no proper non-empty subset of X. If a relation R has attributes X, Y, Z
with the dependencies X → Y and X → Z, where X is a single attribute, then both dependencies
are fully functional, because no smaller part of X exists that could determine Y or Z.
6. Partial Functional Dependency
In partial functional dependency, a non-key attribute depends on a part of the composite
key rather than on the whole key. Suppose a relation R has attributes X, Y, Z, where {X, Y} is
the composite key and Z is a non-key attribute. Then X → Z is a partial functional dependency.
1. Data Normalization
2. Query Optimization
With the help of functional dependencies, we can decide how the tables are connected and which
attributes need to be projected to retrieve the required data from the tables. This helps in
query optimization and improves performance.
3. Consistency of Data
Functional dependencies help ensure the consistency of the data by exposing redundancies
or inconsistencies that may exist in it. Enforcing a functional dependency ensures that a
change made to one attribute does not cause an inconsistency in another set of attributes, and
thus maintains the consistency of the data in the database.
Normalization
A large database defined as a single relation may result in data duplication. This repetition of data
may result in:
To handle these problems, we should analyze the relations that contain redundant data and
decompose them into smaller, simpler, and well-structured relations that satisfy desirable
properties. Normalization is the process of decomposing relations into relations with fewer
attributes.
What is Normalization?
o Normalization is the process of organizing the data in the database.
o Normalization is used to minimize the redundancy from a relation or set of relations. It is
also used to eliminate undesirable characteristics like insertion, update, and deletion
anomalies.
o Normalization divides a larger table into smaller tables and links them using
relationships.
o The normal form is used to reduce redundancy from the database table.
The main reason for normalizing the relations is to remove these anomalies. Failure to eliminate
anomalies leads to data redundancy and can cause data integrity and other problems as the database
grows. Normalization consists of a series of guidelines that help you create a good database
structure.
o Insertion Anomaly: An insertion anomaly occurs when a new tuple cannot be inserted into a
relation because some other data is missing.
o Deletion Anomaly: A deletion anomaly occurs when deleting some data results in the unintended
loss of other important data.
o Update Anomaly: An update anomaly occurs when changing a single data value requires
multiple rows of data to be updated.
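The update and deletion anomalies can be reproduced on a small unnormalized table. This is a
minimal sketch using Python's built-in sqlite3 module; the teacher table and its rows are
illustrative, and each teacher's age is deliberately repeated on every subject row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teacher (teacher_id INT, subject TEXT, teacher_age INT)")
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?, ?)",
    [(25, "Chemistry", 30), (25, "Biology", 30), (47, "English", 35)],
)

# Update anomaly: changing the age on only one of teacher 25's rows
# leaves the table with two different ages for the same teacher.
conn.execute(
    "UPDATE teacher SET teacher_age = 31 "
    "WHERE teacher_id = 25 AND subject = 'Chemistry'"
)
ages_for_25 = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT teacher_age FROM teacher WHERE teacher_id = 25"
    )
)

# Deletion anomaly: dropping the only subject taught by teacher 47
# also removes every trace of that teacher, including the age.
conn.execute("DELETE FROM teacher WHERE subject = 'English'")
rows_for_47 = conn.execute(
    "SELECT COUNT(*) FROM teacher WHERE teacher_id = 47"
).fetchone()[0]
```

An insertion anomaly shows up in the same table: a new teacher who has no subject yet can only
be stored with an empty subject value.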
Advantages of Normalization
o Normalization helps to minimize data redundancy.
o Greater overall database organization.
o Data consistency within the database.
o Much more flexible database design.
o Enforces the concept of relational integrity.
Disadvantages of Normalization
o You cannot start building the database before knowing what the user needs.
o The performance degrades when normalizing the relations to higher normal forms, i.e.,
4NF, 5NF.
o It is very time-consuming and difficult to normalize relations of a higher degree.
o Careless decomposition may lead to a bad database design, leading to serious problems.
EMPLOYEE table:
14 John 7272826385, 9064738238 UP
12 Sam 7390372389, 8589830302 Punjab
The decomposition of the EMPLOYEE table into 1NF has been shown below:
14 John 7272826385 UP
14 John 9064738238 UP
12 Sam 7390372389 Punjab
12 Sam 8589830302 Punjab
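The conversion to 1NF shown above amounts to emitting one row per phone number. A minimal
sketch, assuming the four columns are employee id, name, phone numbers, and state (the column
meanings and the function name to_1nf are assumptions made for this example):

```python
# Rows from the EMPLOYEE table; the third field holds several phone numbers.
employee = [
    (14, "John", "7272826385, 9064738238", "UP"),
    (12, "Sam", "7390372389, 8589830302", "Punjab"),
]


def to_1nf(rows):
    """Emit one row per phone number so every column holds a single atomic value."""
    flat = []
    for emp_id, name, phones, state in rows:
        for phone in phones.split(","):
            flat.append((emp_id, name, phone.strip(), state))
    return flat


employee_1nf = to_1nf(employee)
```

Each employee now appears once per phone number, and no field contains a list of values.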
Example: Let's assume, a school can store the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject.
TEACHER table:
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
In this table, the non-prime attribute TEACHER_AGE depends on TEACHER_ID alone, which is a
proper subset of the candidate key {TEACHER_ID, SUBJECT}, so the table violates 2NF. To convert
it into second normal form, we decompose it into two tables:
TEACHER_DETAIL table:
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
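A decomposition is only useful if joining the pieces gives back the original table. The sketch
below loads the two tables above into an in-memory SQLite database through Python's sqlite3
module and checks that property; the lowercase table names and the primary key declarations are
choices made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher_detail (teacher_id INT PRIMARY KEY, teacher_age INT);
CREATE TABLE teacher_subject (
    teacher_id INT,
    subject TEXT,
    PRIMARY KEY (teacher_id, subject)
);
""")
conn.executemany(
    "INSERT INTO teacher_detail VALUES (?, ?)",
    [(25, 30), (47, 35), (83, 38)],
)
conn.executemany(
    "INSERT INTO teacher_subject VALUES (?, ?)",
    [(25, "Chemistry"), (25, "Biology"), (47, "English"),
     (83, "Math"), (83, "Computer")],
)

# Rows of the original TEACHER table, for comparison.
original = {
    (25, "Chemistry", 30), (25, "Biology", 30), (47, "English", 35),
    (83, "Math", 38), (83, "Computer", 38),
}

join_sql = (
    "SELECT s.teacher_id, s.subject, d.teacher_age "
    "FROM teacher_subject s "
    "JOIN teacher_detail d ON d.teacher_id = s.teacher_id"
)
rejoined = set(conn.execute(join_sql))

# The age now lives in exactly one row, so a single UPDATE is enough.
conn.execute("UPDATE teacher_detail SET teacher_age = 31 WHERE teacher_id = 25")
ages_for_25 = {row[2] for row in conn.execute(join_sql) if row[0] == 25}
```

The join reproduces all five original rows, and after the update both of teacher 25's subjects
report the same age.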
A relation is in third normal form if it satisfies at least one of the following conditions for
every non-trivial functional dependency X → Y.
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
Example:
EMPLOYEE_DETAIL table:
Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.
That's why we need to move EMP_CITY and EMP_STATE to the new
<EMPLOYEE_ZIP> table, with EMP_ZIP as the primary key.
EMPLOYEE table:
EMPLOYEE_ZIP table:
EMP_ZIP EMP_STATE EMP_CITY
201010 UP Noida
02228 US Boston
60007 US Chicago
06389 UK Norwich
462007 MP Bhopal
Example: Let's assume there is a company where employees work in more than one department.
EMPLOYEE table:
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.
To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
264 India
EMP_DEPT table:
EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
Functional dependencies:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
Now, this is in BCNF because the left-hand side of each functional dependency is a key of its
table.
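The BCNF condition can be tested with attribute closures: a non-trivial dependency X → Y
violates BCNF in a relation when the closure of X does not cover all of that relation's
attributes. The sketch below is a simplified check, not a full algorithm: it only considers the
listed dependencies that lie entirely inside a relation, and the function names are chosen for
this example.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """List the given dependencies whose left side is not a superkey of `attrs`."""
    attrs = set(attrs)
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        if not (lhs <= attrs and rhs <= attrs):
            continue  # simplification: skip dependencies not fully inside attrs
        if rhs <= lhs:
            continue  # trivial dependency, never a violation
        if not attrs <= closure(lhs, fds):
            bad.append((lhs, rhs))
    return bad


fds = [
    ({"EMP_ID"}, {"EMP_COUNTRY"}),
    ({"EMP_DEPT"}, {"DEPT_TYPE", "EMP_DEPT_NO"}),
]

original = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
decomposed = [
    {"EMP_ID", "EMP_COUNTRY"},                 # EMP_COUNTRY table
    {"EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"},  # EMP_DEPT table
    {"EMP_ID", "EMP_DEPT"},                    # EMP_DEPT_MAPPING table
]

violations_before = bcnf_violations(original, fds)
violations_after = [bcnf_violations(rel, fds) for rel in decomposed]
```

Both dependencies violate BCNF in the original table, while none of the three decomposed tables
reports a violation.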
Structured Query Language
SQL (Structured Query Language) is a standard programming language used for managing
relational databases and performing various operations on the data stored within them. SQL
provides a comprehensive set of commands and syntax for interacting with databases. It allows
users to create, modify, and query databases, as well as perform other tasks such as data
manipulation, data definition, and data control. SQL is designed to be easy to read and write,
making it accessible to both technical and non-technical users.
SQL operates on a set-based approach, where queries are executed on sets of data rather than
individual records. This allows for efficient and concise data retrieval and manipulation. SQL
queries can be used to perform a wide range of operations, including data filtering, sorting,
aggregations, and joining multiple tables.
1. SQL Statements:
- SQL is based on a set of statements that are used to communicate with a database. These
statements can be broadly categorized into three types:
- Data Manipulation Language (DML) statements: Used to retrieve, insert, update, and delete
data in the database. Common DML statements include SELECT, INSERT, UPDATE, and
DELETE.
- Data Definition Language (DDL) statements: Used to create, modify, and delete
database objects such as tables, views, indexes, and constraints. Common DDL statements
include CREATE, ALTER, and DROP.
- Data Control Language (DCL) statements: Used to manage user access and permissions,
granting or revoking privileges on the database objects. Common DCL statements include
GRANT and REVOKE.
3. Querying Data:
- SQL provides the SELECT statement for querying data from one or more tables. It allows
you to specify the columns to retrieve, apply filtering conditions using WHERE clause, perform
sorting using ORDER BY, and group data using GROUP BY. You can also use aggregate
functions like SUM, COUNT, AVG, etc., to perform calculations on the data.
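The clauses described above can be combined in one query. The following sketch runs such a
query against an in-memory SQLite database using Python's sqlite3 module; the dept column and
the sample rows are hypothetical additions to the employees table used elsewhere in these
notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INT PRIMARY KEY, name VARCHAR(50), age INT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "John Doe", 30, "Sales"),
        (2, "Ann Lee", 41, "Sales"),
        (3, "Raj Rao", 24, "IT"),
        (4, "Mia Fox", 35, "IT"),
    ],
)

# WHERE filters rows first, GROUP BY forms one group per department,
# the aggregates are computed per group, and ORDER BY sorts the result.
summary = conn.execute(
    """
    SELECT dept, COUNT(*) AS headcount, AVG(age) AS avg_age
    FROM employees
    WHERE age > 25
    GROUP BY dept
    ORDER BY dept
    """
).fetchall()
```

Raj Rao is removed by the WHERE clause before grouping, so the IT group contains one employee
and the Sales group contains two.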
4. Data Manipulation:
- SQL enables you to insert new data into a table using the INSERT statement. You can update
existing data using the UPDATE statement to modify specific columns or rows. The DELETE
statement is used to remove data from a table.
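A minimal sketch of the three statements, again using Python's sqlite3 module with an in-memory
database. The second employee row is made up for illustration, and the "?" placeholders are
sqlite3's way of passing values separately from the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INT PRIMARY KEY, name VARCHAR(50), age INT)")

# INSERT adds new rows.
conn.execute("INSERT INTO employees (id, name, age) VALUES (1, 'John Doe', 30)")
conn.execute(
    "INSERT INTO employees (id, name, age) VALUES (?, ?, ?)",
    (2, "Jane Roe", 24),
)

# UPDATE modifies the rows selected by WHERE; rowcount reports how many changed.
updated = conn.execute(
    "UPDATE employees SET age = age + 1 WHERE id = ?", (1,)
).rowcount

# DELETE removes the rows selected by WHERE.
deleted = conn.execute("DELETE FROM employees WHERE age < ?", (25,)).rowcount

remaining = conn.execute("SELECT id, name, age FROM employees").fetchall()
```

Without a WHERE clause, UPDATE and DELETE apply to every row in the table, so the condition
deserves a second look before running either statement.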
5. Filtering and Joining Data:
- SQL provides various operators and clauses to filter and combine data. The WHERE clause
allows you to specify conditions to filter data based on specific criteria. The JOIN operation
allows you to combine rows from different tables based on common columns, enabling you to
retrieve related data from multiple tables.
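The difference between join types is easiest to see on data where one row has no match. The
sketch below uses Python's sqlite3 module; the departments table, the dept_id column, and all
sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_id INT PRIMARY KEY, dept_name TEXT);
CREATE TABLE employees (id INT PRIMARY KEY, name TEXT, age INT, dept_id INT);
INSERT INTO departments VALUES (10, 'Sales'), (20, 'IT');
INSERT INTO employees VALUES
    (1, 'John Doe', 30, 10),
    (2, 'Jane Roe', 24, 20),
    (3, 'Sam Poe', 41, NULL);
""")

# Inner join plus a filter: only employees with a matching department
# who are also older than 25.
inner = conn.execute(
    """
    SELECT e.name, d.dept_name
    FROM employees e
    JOIN departments d ON d.dept_id = e.dept_id
    WHERE e.age > 25
    ORDER BY e.id
    """
).fetchall()

# Left join: every employee is kept; a missing department shows up as NULL.
left = conn.execute(
    """
    SELECT e.name, d.dept_name
    FROM employees e
    LEFT JOIN departments d ON d.dept_id = e.dept_id
    ORDER BY e.id
    """
).fetchall()
```

Sam Poe has no department, so he is absent from the inner join but present in the left join
with an empty department name (None in Python).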
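7. Views:
- SQL allows you to create virtual tables called views. Views are based on one or more tables
and provide a way to present a subset of data or a customized perspective of the data. Views can
be used to simplify complex queries and to provide data security by exposing only selected
columns or rows. An ordinary view stores no data of its own, although some database systems
offer materialized views that can speed up repeated queries.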
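A short sketch of a view that hides a sensitive column, run through Python's sqlite3 module.
The salary column, the view name employee_directory, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INT PRIMARY KEY, name TEXT, age INT, salary INT);
INSERT INTO employees VALUES (1, 'John Doe', 30, 5000), (2, 'Jane Roe', 24, 4200);

-- The view exposes only the id and name columns.
CREATE VIEW employee_directory AS
    SELECT id, name FROM employees;
""")

# Column names visible through the view.
cursor = conn.execute("SELECT * FROM employee_directory")
columns = [col[0] for col in cursor.description]

# A view is evaluated when queried, so it reflects later changes to the table.
conn.execute("INSERT INTO employees VALUES (3, 'Sam Poe', 41, 6100)")
names = [
    row[0]
    for row in conn.execute("SELECT name FROM employee_directory ORDER BY id")
]
```

The row added after the view was created is visible through it, and the age and salary columns
are not.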
SQL is supported by a wide range of relational database management systems, such as Oracle,
MySQL, Microsoft SQL Server, PostgreSQL, and SQLite. Although SQL is a standardized
language, each database system may have its own specific variations and extensions.
In summary, SQL is a versatile and widely used language for working with relational
databases. It provides a concise and powerful way to manage data, define database structures,
and control access. SQL's popularity and standardization make it an essential skill for anyone
working with databases and data management.
Types of SQL
SQL statements fall into several categories, each with its own purpose and functionality. The
commonly used categories are:
- DDL statements are used to define and manage the structure of a database. They include
commands for creating, altering, and dropping database objects such as tables, indexes, views,
and constraints.
Syntax examples:
- CREATE TABLE: Creates a new table in the database.
CREATE TABLE employees (id INT PRIMARY KEY, name VARCHAR(50), age INT);
- DML statements are used to manipulate data within the database. They include commands for
inserting, updating, deleting, and retrieving data from tables.
Syntax examples:
- INSERT INTO: Inserts new records into a table.
INSERT INTO employees (id, name, age) VALUES (1, 'John Doe', 30);
- DCL statements are used to manage access and permissions within the database. They control
user privileges and permissions for executing certain operations.
Syntax examples:
- GRANT: Grants specific privileges to a user.
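- TCL (Transaction Control Language) statements are used to manage transactions. They start a
transaction, make its changes permanent, or undo them.
Syntax examples: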
BEGIN TRANSACTION;
COMMIT;
ROLLBACK;
- DQL (Data Query Language) statements are used to retrieve data from the database.
Syntax examples:
- SELECT: Retrieves data from one or more tables.
SELECT * FROM employees WHERE age > 25;
These are the main types of SQL statements along with their syntax and examples. Each type
serves a specific purpose in managing and manipulating data within a relational database.