Dbms Interview
A Database Management System (DBMS) is software that helps you store, organize, and manage
data in a structured way. It allows you to create, read, update, and delete data while keeping it
secure and easy to access. In simpler terms, it's like a digital filing cabinet where you can efficiently
manage large amounts of information, such as customer records, sales data, or personal contacts,
without getting lost in paperwork. The working of a DBMS is illustrated in the figure below.
2) What is a database?
A database is an organized collection of information or data that is stored in a way that makes it easy
to find, update, and manage. Think of it like a digital version of a filing cabinet, where each file or
piece of information (like names, addresses, or sales records) is neatly stored so you can quickly find
or update it when needed.
A database system is a combination of software (like a Database Management System or DBMS) and
the actual database itself that works together to store, manage, and organize data. It includes the
tools and methods needed to handle large amounts of information, allowing users to create, update,
and retrieve data efficiently. Essentially, a database system makes sure data is stored in a structured
way and is easily accessible when needed.
o Redundancy control
o Easy accessibility
o Easy data extraction and data processing due to the use of queries
5) What is a checkpoint in DBMS?
A checkpoint is a mechanism in which all changes recorded in the earlier logs are written
permanently to disk, so that those earlier log records can be removed from the system.
In simple terms, a checkpoint in a database is like a "save point" in a video game. It saves all the
recent changes made to the database, so if something goes wrong (like a crash), the system can start
again from that saved point without losing too much data. This makes it easier and faster to recover
the database after an issue.
There are two techniques that help the DBMS recover from failures and maintain the ACID properties:
maintaining a log of each transaction and maintaining shadow pages. Checkpoints belong to the
log-based recovery technique. A checkpoint is a point to which the database engine can return after
a crash: it marks the minimal point from which the transaction log records must be used to recover
all the data committed up to the moment of the crash.
A checkpoint is like a snapshot of the DBMS state. By taking checkpoints, the DBMS reduces the
amount of work to be done during a restart after a subsequent crash. When the system has to be
restarted after a crash, recovery begins from the most recent checkpoint, so the transactions do
not have to be replayed from the very beginning.
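The recovery idea described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the log record format (kind, transaction, key, value) and the function name recover are invented for this example, and it assumes that no transaction is still running when the checkpoint is taken. Real systems also have to undo or redo transactions that span the checkpoint.

```python
def recover(disk_state, log):
    """Rebuild the database state after a crash.

    disk_state: the data that had been flushed to disk at the last checkpoint.
    log: list of (kind, txn, key, value) records, oldest first.
    """
    # Find the position of the most recent CHECKPOINT record, if any.
    last_checkpoint = -1
    for position, record in enumerate(log):
        if record[0] == "CHECKPOINT":
            last_checkpoint = position

    # Only the records written after the checkpoint have to be examined.
    tail = log[last_checkpoint + 1:]
    committed = {txn for kind, txn, _, _ in tail if kind == "COMMIT"}

    # Redo the writes of committed transactions; ignore uncommitted ones.
    state = dict(disk_state)
    for kind, txn, key, value in tail:
        if kind == "WRITE" and txn in committed:
            state[key] = value
    return state


log = [
    ("WRITE", "T1", "x", 10),
    ("COMMIT", "T1", None, None),
    ("CHECKPOINT", None, None, None),
    ("WRITE", "T2", "y", 20),
    ("COMMIT", "T2", None, None),
    ("WRITE", "T3", "x", 99),  # T3 never committed before the crash
]
recovered = recover({"x": 10}, log)
print(recovered)  # {'x': 10, 'y': 20}
```

Only the three records after the checkpoint are replayed; the work of T1 is already on disk, and the uncommitted write of T3 is discarded.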
A transparent DBMS is a type of DBMS that keeps its physical structure hidden from users. The
physical structure, or physical storage structure, refers to the memory manager of the DBMS and
describes how the data is stored on disk.
OR
A transparent DBMS means that the database system works in the background without the users
needing to worry about how it manages the data. Users can interact with the database, like adding
or retrieving information, without knowing the technical details of how the database stores,
organizes, or retrieves that data.
In simple terms, it's like driving a car without needing to understand how the engine works—
everything happens smoothly behind the scenes while you just focus on getting the job done.
PROJECTION and SELECTION are unary operations in relational algebra. Unary operations are
operations that take a single operand. The unary operations are SELECTION, PROJECTION, and
RENAME.
An RDBMS (Relational Database Management System) is a type of database system that organizes
data into tables with rows and columns. Each table represents a different entity (like "Customers" or
"Orders"), and the relationships between these tables are defined using keys. This allows the data to
be stored efficiently and related logically.
In simple terms, an RDBMS is like a collection of interconnected spreadsheets where each sheet
holds different kinds of data, and you can easily link them together to get the information you need.
Popular examples of RDBMS are MySQL, PostgreSQL, and Oracle.
o Data Definition Language (DDL), e.g., CREATE, ALTER, DROP, TRUNCATE, RENAME, etc. These
commands define or change the structure of database objects, which is why they are known
as Data Definition Language.
o Data Manipulation Language (DML), e.g., SELECT, UPDATE, INSERT, DELETE, etc. These
commands retrieve and manipulate the data already stored in the database, which is why
they are part of Data Manipulation Language.
o Data Control Language (DCL), e.g., GRANT and REVOKE. These commands are used for giving
and removing user access to the database, so they are part of Data Control Language.
o Transaction Control Language (TCL), e.g., COMMIT, ROLLBACK, and SAVEPOINT. These are the
commands used for managing transactions in the database. TCL is used for managing the
changes made by DML.
NOTE:-
• COMMIT is like saving your changes. When you make updates to a database and then issue a
COMMIT, it permanently saves those changes, making them final.
• ROLLBACK is like undoing your changes. If you made some updates but realize there's a
mistake, you can use ROLLBACK to undo everything since your last save (COMMIT), returning
the database to its previous state.
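As a small illustration of these command families, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The accounts table is made up for the example. SQLite has no GRANT or REVOKE, so DCL is not shown.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN / COMMIT / ROLLBACK statements below control transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)

# DDL: define the structure of a (hypothetical) table.
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")

# DML inside a transaction that is made permanent with COMMIT (TCL).
conn.execute("BEGIN")
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
conn.execute("COMMIT")

# DML inside a transaction that is undone with ROLLBACK (TCL).
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK")

# DML: read the data back. The rolled-back update left no trace.
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(balance)  # 100
```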
A database language consists of the queries that are used to update, modify, and manipulate the
data.
11) What do you understand by Data Model?
The Data model is specified as a collection of conceptual tools for describing data, data relationships,
data semantics and constraints. These models are used to describe the relationship between the
entities and their attributes.
o network model
o relational model
A relation schema is specified as a set of attributes and is also known as a table schema. It defines
the name of the table and its attributes. A relation schema is the blueprint that explains how the
data is organized into tables. This blueprint contains no data.
A relation is specified as a set of tuples. A relation is a set of related attributes with identifying
key attributes.
Let r be a relation that contains a set of tuples (t1, t2, t3, ..., tm). Each tuple is an ordered list of
n values, t = (v1, v2, ..., vn).
The degree of a relation is the number of attributes in its relation schema. It should not be
confused with cardinality: the mapping cardinality of a relationship is the number of occurrences of
one entity that are connected to occurrences of another entity. There are three common mapping
cardinalities: one-to-one (1:1), one-to-many (1:M), and many-to-many (M:N).
A relationship is defined as an association among two or more entities. There are three types of
relationships in DBMS:
One-To-One: One record of an object can be related to only one record of another object.
One-To-Many (many-to-one): One record of an object can be related to many records of another
object, and vice versa.
Many-to-many: More than one record of an object can be related to any number of records of
another object.
15) What are the disadvantages of file processing systems?
o Inconsistent
o Not secure
o Data redundancy
o Data isolation
o Data integrity
o Atomicity problem
Data abstraction in DBMS is the process of hiding irrelevant details from users. Because database
systems are built on complex data structures, abstraction makes user interaction with the database
easier.
For example: Most users prefer systems with a simple GUI that involves no complex processing. To
keep users engaged and to make access to the data easy, data abstraction is necessary. In addition,
data abstraction divides the system into different layers so that the work of each layer is specific
and well defined.
Physical level: It is the lowest level of abstraction. It describes how data are stored.
Logical level: It is the next higher level of abstraction. It describes what data are stored in the
database and what the relationship among those data is.
View level: It is the highest level of data abstraction. It describes only part of the entire database.
For example: A user interacts with the system through the GUI and fills in the required details, but
the user has no idea how the data is being used. So, the level of abstraction is highest at the VIEW
LEVEL.
The next level is for PROGRAMMERS: at this level the fields and records are visible, and the
programmers have knowledge of this layer. So, the level of abstraction here is a little lower than at
the VIEW LEVEL.
Data Definition Language (DDL) is a standard for commands that define the different structures in
a database. The most common DDL statements are CREATE, ALTER, and DROP. These commands are
used to create, modify, and remove database structures rather than the data stored in them.
Data Manipulation Language (DML) is a language that enables the user to access or manipulate data
as organized by the appropriate data model. For example- SELECT, UPDATE, INSERT, DELETE.
Procedural DML or Low level DML: It requires a user to specify what data are needed and how to get
those data.
Non-Procedural DML or High level DML: It requires a user to specify what data are needed without
specifying how to get those data.
The DML compiler translates the DML statements of a query language into low-level instructions that
the query evaluation engine can understand. A DML compiler is required because DML is a family of
syntax elements, much like other programming languages that require compilation. The statements
must therefore be compiled into a form that the query evaluation engine can understand before the
engine can process the queries and produce the proper output.
Relational algebra is a procedural query language that contains a set of operations which take one
or two relations as input and produce a new relation as output. Relational algebra is the basic set of
operations for the relational model. Its key characteristic is that it is similar to ordinary algebra,
which operates on numbers.
o select
o project
o set difference
o union
o rename,etc.
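The operations listed above can be sketched in a few lines of Python. This is only an illustration: a relation is modelled as a list of dictionaries without duplicate rows, and the function names and the students data are invented for the example.

```python
def select(relation, predicate):
    """SELECTION: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """PROJECTION: keep only the named attributes and drop duplicate rows."""
    result = []
    for row in relation:
        projected = {name: row[name] for name in attributes}
        if projected not in result:
            result.append(projected)
    return result


def rename(relation, old, new):
    """RENAME: give one attribute a new name."""
    return [{(new if k == old else k): v for k, v in row.items()} for row in relation]


def union(r, s):
    """UNION: rows that appear in r, in s, or in both."""
    return r + [row for row in s if row not in r]


def difference(r, s):
    """SET DIFFERENCE: rows of r that do not appear in s."""
    return [row for row in r if row not in s]


students = [
    {"id": 1, "name": "Asha", "dept": "CS"},
    {"id": 2, "name": "Ben", "dept": "EE"},
    {"id": 3, "name": "Chen", "dept": "CS"},
]
cs_students = select(students, lambda row: row["dept"] == "CS")
departments = project(students, ["dept"])
other_students = difference(students, cs_students)
print(departments)  # [{'dept': 'CS'}, {'dept': 'EE'}]
```

Each function takes one or two relations and returns a new relation, so the operations can be nested freely, for example project(select(students, ...), ["name"]).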
Query optimization is the process of choosing an efficient execution plan for evaluating a query,
namely the plan with the least estimated cost. The concept arose because a number of methods and
algorithms can exist for the same task, which raises the question of which one is the most
efficient. The process of determining the most efficient way is known as query optimization.
o More queries can be executed in the same amount of time, because each optimized query takes
comparatively less time.
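To see an optimizer choose between plans, one option is SQLite's EXPLAIN QUERY PLAN statement, used here through Python's sqlite3 module. The orders table and the index name are made up for the example, and the exact wording of the plan text differs between SQLite versions, so the sketch only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)


def plan(sql):
    """Return the optimizer's plan for a query as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the last column of each row.
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM orders WHERE customer = 'alice'"

# Without an index the only available plan is to scan the whole table.
plan_before = plan(query)

# Once an index exists, the optimizer can pick a cheaper plan that uses it.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_after = plan(query)

print(plan_before)
print(plan_after)
```

The query text is the same in both cases; only the plan selected by the optimizer changes.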
Once the DBMS informs the user that a transaction has completed successfully, its effect should
persist even if the system crashes before all its changes are reflected on disk. This property is called
durability. Durability ensures that once the transaction is committed into the database, it will be
stored in the non-volatile memory and after that system failure cannot affect that data anymore.
Normalization is the process of analysing the given relation schemas according to their functional
dependencies. It is used to minimize redundancy and to minimize insertion, deletion, and update
anomalies. Normalization is considered an essential process because it avoids data redundancy and
the insertion, update, and deletion anomalies.
Or
Normalization is the process of organizing data in a database to reduce redundancy and improve
data integrity. It involves structuring the database in a way that minimizes duplicate data and ensures
that relationships between different pieces of data are logical and efficient.
• Improves Data Integrity: Helps ensure that changes to data (like updating a phone number)
are consistent across the database.
• Simplifies Queries: Makes it easier to retrieve and manage data efficiently.
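A minimal sketch of the idea, using Python's sqlite3 module: the customer's phone number is stored once in a customers table instead of being repeated on every order. The table and column names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Customer details live in exactly one place.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        phone       TEXT
    );
    -- Orders refer to the customer by key instead of copying the phone number.
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers (customer_id),
        item        TEXT
    );
    INSERT INTO customers VALUES (1, 'Asha', '555-0100');
    INSERT INTO orders VALUES (10, 1, 'Keyboard'), (11, 1, 'Mouse');
""")

# One UPDATE changes the phone number for every order of that customer.
conn.execute("UPDATE customers SET phone = '555-0199' WHERE customer_id = 1")

phones = conn.execute("""
    SELECT DISTINCT c.phone
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
""").fetchall()
print(phones)  # [('555-0199',)]
```

Had the phone number been copied into every order row, the update would have needed to touch each copy, and missing one would leave the data inconsistent.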
Denormalization is the process of improving database performance by adding redundant data, which
removes the need for complex queries. Denormalization is a database optimization technique that
is used to avoid complex and costly joins. It does not mean skipping normalization; rather,
denormalization takes place after normalization. First, the redundancy of the data is removed
through the normalization process; then, through denormalization, redundant data is added back
as required so that the costly joins can be avoided.
Or
Functional dependency is the starting point of normalization. It exists when the relationship between
two attributes allows you to uniquely determine the value of one attribute from the other.
Functional dependency is also known as database dependency and is defined as the relationship
that occurs when one attribute in a relation uniquely determines another attribute. It is written as
A->B, which means B is functionally dependent on A.
Or
A functional dependency indicates that if you know the value of one piece of data, you can
determine the value of another piece of data. It can be thought of as a rule that defines how data
relates to each other.
• Data Integrity: They help ensure that the data remains accurate and consistent.
• Normalization: They are used in the process of normalization to organize data in a way that
reduces redundancy.
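Whether a functional dependency A->B holds in a given set of rows can be checked mechanically. The sketch below is illustrative; the function name holds and the sample rows are invented. Note that sample data can only disprove a dependency: a dependency that holds in these rows is still a design decision about all possible rows.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in lhs)
        value = tuple(row[name] for name in rhs)
        # The same left-hand side must always map to the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"student_id": 1, "name": "Asha", "course": "DBMS"},
    {"student_id": 1, "name": "Asha", "course": "OS"},
    {"student_id": 2, "name": "Asha", "course": "DBMS"},
]

id_determines_name = holds(rows, ["student_id"], ["name"])
name_determines_id = holds(rows, ["name"], ["student_id"])
print(id_determines_name)  # True
print(name_determines_id)  # False: two students share the name "Asha"
```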
28) What is the E-R model?
The E-R model is short for the Entity-Relationship model. This model is based on the real world. It
contains the necessary objects (known as entities) and the relationships among these objects. The
primary objects are the entity, the attributes of that entity, the relationship set, and the attributes
of that relationship set, all of which can be mapped in the form of an E-R diagram.
In an E-R diagram, entities are represented by rectangles, relationships are represented by
diamonds, attributes are the characteristics of entities and are represented by ellipses, and straight
lines connect attributes to entities and entities to relationships.
An entity is described by a set of attributes in a database. An entity can be a real-world object that
physically exists. Every entity has attributes, which in the real world are considered the
characteristics of the object.
For example: In the employee database of a company, the employee, department, and the
designation can be considered as the entities. These entities have some characteristics which will be
the attributes of the corresponding entity.
An entity type is specified as a collection of entities that have the same attributes. An entity type
typically corresponds to one or several related tables in the database. An entity type is described by
the characteristics or traits that define or uniquely identify its entities.
For example, a student has student_id, department, and course as its characteristics.
The entity set specifies the collection of all entities of a particular entity type in the database. An
entity set is known as the set of all the entities which share the same properties.
An extension of an entity type is specified as a collection of entities of a particular entity type that
are grouped into an entity set.
An entity set that doesn't have sufficient attributes to form a primary key is referred to as a weak
entity set. A member of a weak entity set is known as a subordinate entity. A weak entity set does
not have a primary key, but we need a means to differentiate among all those entities in the entity
set that depend on one particular strong entity set.
For example: If a student is an entity in the table then age will be the attribute of that student.
Data integrity is a significant aspect of maintaining a database. It is enforced in the database
system by imposing a series of rules, which are known as the integrity rules.
Entity Integrity: It specifies that "a primary key cannot have a NULL value."
Referential Integrity: It specifies that "a foreign key can either be NULL or must match a primary
key value of the other relation."
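Both rules can be observed with Python's sqlite3 module. The sketch relies on two SQLite-specific details: foreign key checks must be switched on with PRAGMA foreign_keys, and primary key columns are declared NOT NULL explicitly because SQLite does not always imply it. The tables and the helper try_insert are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute(
    "CREATE TABLE departments (dept_code TEXT NOT NULL PRIMARY KEY, name TEXT)"
)
conn.execute(
    "CREATE TABLE employees ("
    " emp_code TEXT NOT NULL PRIMARY KEY,"
    " dept_code TEXT REFERENCES departments (dept_code))"
)
conn.execute("INSERT INTO departments VALUES ('RES', 'Research')")


def try_insert(sql):
    """Run an INSERT and report whether the integrity rules accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    # Entity integrity: a NULL primary key is refused.
    try_insert("INSERT INTO departments VALUES (NULL, 'Ghost')"),
    # Referential integrity: a matching foreign key is accepted...
    try_insert("INSERT INTO employees VALUES ('E1', 'RES')"),
    # ...a NULL foreign key is accepted...
    try_insert("INSERT INTO employees VALUES ('E2', NULL)"),
    # ...and a foreign key with no matching primary key is refused.
    try_insert("INSERT INTO employees VALUES ('E3', 'XYZ')"),
]
print(results)  # ['rejected', 'ok', 'ok', 'rejected']
```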
System R was designed and developed from 1974 to 1979 at the IBM San Jose Research Centre.
System R was the first implementation of SQL, which is the standard relational data query language,
and it was also the first to demonstrate that an RDBMS could provide good transaction processing
performance. It was a prototype built to show that it is possible to build a relational system that
can be used in a real-life environment to solve real-life problems.
o Research Storage
Data independence specifies that "the application is independent of the storage structure and access
strategy of data." It enables you to modify the schema definition at one level without altering the
schema definition at the next higher level.
There are two types of Data Independence:
Physical Data Independence: Physical data is the data stored in the database, in bit format. A
modification at the physical level should not affect the logical level.
For example: Changing how the data of a table is stored or indexed should not change the format
of the table.
Logical Data Independence: Logical data is the data about the database; it defines the structure,
such as the tables stored in the database. A modification at the logical level should not affect the
view level.
For example: Adding a new column to a table should not affect the views and applications that do
not use that column.
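A small sketch of both kinds of independence, using Python's sqlite3 module. The students table, the view student_names, and the index are assumptions made for the example: an index stands in for a physical-level change, and an added column for a logical-level change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students VALUES (1, 'Asha')")

# The view is what this (hypothetical) application reads.
conn.execute(
    "CREATE VIEW student_names AS SELECT student_id, name FROM students"
)
before = conn.execute("SELECT * FROM student_names").fetchall()

# Physical-level change: add an access path. The table format is unchanged.
conn.execute("CREATE INDEX idx_students_name ON students (name)")

# Logical-level change: add a column. The view is unchanged.
conn.execute("ALTER TABLE students ADD COLUMN email TEXT")

after = conn.execute("SELECT * FROM student_names").fetchall()
print(before == after)  # True
```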
Physical level: It is the lowest level of abstraction. It describes how data are stored.
Logical level: It is the next higher level of abstraction. It describes what data are stored in the
database and what the relationships among those data are.
View level: It is the highest level of data abstraction. It describes only part of the entire database.
For example: A user interacts with the system through the GUI and fills in the required details, but
the user has no idea how the data is being used. So, the level of abstraction is highest at the VIEW
LEVEL.
The next level is for PROGRAMMERS: at this level the fields and records are visible, and the
programmer has knowledge of this layer. So, the level of abstraction here is a little lower than at
the VIEW LEVEL.
The join operation is one of the most useful operations in relational algebra. It is the most commonly
used way to combine information from two or more relations. A join is performed on the basis of
the same or a related column. Most complex SQL queries involve the JOIN command.
o Theta join
o Natural join
o Equi join
o Outer joins: There are three types of outer join: left, right, and full.
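The difference between an inner (equi) join and an outer join shows up as soon as one row has no partner. The sketch below uses Python's sqlite3 module with made-up customers and orders tables; only the left outer join is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 'Keyboard');
""")

# Inner join on equal customer_id values: customers without orders disappear.
inner = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()

# Left outer join: every customer is kept, with NULL for the missing order.
left_outer = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    LEFT OUTER JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()

print(inner)       # [('Asha', 'Keyboard')]
print(left_outer)  # [('Asha', 'Keyboard'), ('Ben', None)]
```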
1NF is the First Normal Form. It is the simplest type of normalization that you can implement in a
database. The primary objectives of 1NF are to:
The table must also have a unique identifier (a primary key) for each row.
2NF is the Second Normal Form. A table is said to be in 2NF if it satisfies the following conditions:
o The table is in 1NF, i.e., the table must first follow the rules of 1NF.
o Every non-prime attribute is fully functionally dependent on the primary key, i.e., every
non-key attribute must depend on the whole primary key and not on only a part of a
composite key.
3NF stands for Third Normal Form. A database is called in 3NF if it satisfies the following conditions:
Where:
X->Y
Y does not -> X
Y->Z so, X->Z
BCNF stands for Boyce-Codd Normal Form. It is an advanced version of 3NF, so it is also referred to
as 3.5NF. BCNF is stricter than 3NF.
o It is in 3NF.
o For every non-trivial functional dependency X->Y, X must be a super key of the table. In
particular, X cannot be a non-prime attribute if Y is a prime attribute.
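The super key condition can be tested with the standard attribute-closure computation. The sketch below is illustrative: the function names closure and is_bcnf and the sample schema are invented, and functional dependencies are written as (left side, right side) pairs of Python sets.

```python
def closure(attributes, fds):
    """All attributes determined by `attributes` under the dependencies `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know lhs, we also know everything it determines.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_bcnf(schema, fds):
    """True if the left side of every non-trivial dependency is a super key."""
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if not set(schema) <= closure(lhs, fds):
            return False  # lhs does not determine the whole table
    return True


schema = {"student", "course", "instructor"}
fds = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),  # instructor alone is not a super key
]
enrolment_in_bcnf = is_bcnf(schema, fds)
print(enrolment_in_bcnf)  # False
```

In this example the dependency instructor->course violates the condition, because instructor by itself does not determine student.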
ACID properties are basic rules that every transaction has to satisfy to preserve integrity.
These properties and rules are:
ATOMICITY: Atomicity is more generally known as the "all or nothing" rule. It implies that all the
operations of a transaction are considered one unit, and they either run to completion or are not
executed at all.
CONSISTENCY: This property refers to the uniformity of the data. Consistency implies that the
database is consistent before and after the transaction.
ISOLATION: This property states that a number of transactions can be executed concurrently
without leading to an inconsistent database state.
DURABILITY: This property ensures that once the transaction is committed, it is stored in
non-volatile memory and a system crash cannot affect it anymore.
Or in detail
The ACID properties are a set of rules that ensure reliable and consistent transactions in a database.
Here's what each one means in simple terms:
1. Atomicity:
o All parts of a transaction must succeed or fail as a whole. If one part of the
transaction fails, the entire transaction is rolled back, and no changes are made.
o Example: If you're transferring money between bank accounts, either the entire
transfer happens, or none of it does.
2. Consistency:
o The database must remain in a valid state before and after a transaction. It ensures
that any changes follow the defined rules and constraints of the database.
o Example: After a money transfer, the total balance in both accounts should be
correct according to the rules.
3. Isolation:
o Transactions should not interfere with each other, even if they are happening at the
same time. Each transaction should run as if it's the only one happening.
o Example: If two people are withdrawing from the same account at the same time,
each transaction will happen independently and won't affect the other.
4. Durability:
o Once a transaction is completed and committed, the changes are permanent, even if
the system crashes.
o Example: After transferring money, even if the power goes out, the transfer will still
be saved in the system.
In short, ACID properties ensure that database transactions are safe, consistent, and reliable.
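The bank transfer example can be sketched with Python's sqlite3 module. The accounts table, the CHECK constraint that forbids negative balances, and the transfer function are assumptions made for this illustration. The credit is deliberately executed before the debit so that a failed debit leaves something to roll back.

```python
import sqlite3

# Autocommit mode: transactions are controlled with explicit BEGIN/COMMIT/ROLLBACK.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"  # consistency rule
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")


def transfer(src, dst, amount):
    """Move money between accounts; either both updates happen or neither."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint: undo the credit as well.
        conn.execute("ROLLBACK")
        return False


first = transfer("alice", "bob", 30)    # succeeds
second = transfer("alice", "bob", 500)  # fails and is rolled back as a whole
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(first, second, balances)  # True False {'alice': 70, 'bob': 80}
```

After the failed transfer the total is still 150, which is the atomicity and consistency guarantee in miniature.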
A stored procedure is a group of SQL statements that has been created and stored in the database.
A stored procedure increases reusability: because the procedure is stored in the system and can be
used again and again, it makes the work easier, takes less processing time, and decreases the
complexity of the system. So, if you have code that you need to use repeatedly, save it as a stored
procedure and call it whenever it is required.
47) What is the difference between a DELETE command and TRUNCATE command?
DELETE command: DELETE command is used to delete rows from a table based on the condition that
we provide in a WHERE clause.
o DELETE command deletes only those rows which are specified with the WHERE clause.
TRUNCATE command: TRUNCATE command is used to remove all rows (complete data) from a table.
It is similar to the DELETE command with no WHERE clause.
o The TRUNCATE command removes all the rows from the table.
• This is the user interface layer where users interact with the system. It displays information
and collects user input.
• Example: A web browser or mobile app where you view and enter data (like logging into a
website or app).
• Example: The server that processes your login credentials or calculates the total price in a
shopping cart.
• This is the data storage layer where all the actual data is stored, retrieved, and managed. It
responds to queries and stores updates.
• Example: The database that stores user accounts, product details, or order history.
In Simple Terms:
• Tier 1 (Presentation): What the user interacts with (like a website or app).
• Tier 2 (Application): The behind-the-scenes processing and logic (the "brain" of the system).
• Tier 3 (Database): Where the data is stored and managed (the "memory" of the system).
This structure helps organize the system into manageable parts and improves performance and
security.
The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier
architecture, applications on the client end communicate directly with the database on the server
side.
The 3-Tier architecture contains another layer between the client and the server. The 3-tier
architecture was introduced for the ease of users: it provides a GUI, which makes the system more
secure and much more accessible. In this architecture, the application on the client end interacts
with an application on the server, which in turn communicates with the database system.
You have to use Structured Query Language (SQL) to communicate with an RDBMS. Using SQL
queries, we give input to the database; after processing the queries, the database provides the
required output.
51) What is the difference between a shared lock and exclusive lock?
Shared lock: A shared lock is required for reading a data item. Many transactions may hold a shared
lock on the same data item at the same time, so more than one transaction is allowed to read the
data item.
Exclusive lock: When a transaction is about to perform a write operation, the lock on the data item
is an exclusive lock, because allowing more than one transaction to write the same item would lead
to inconsistency in the database.
OR
In a database, locks are used to control access to data when multiple users or transactions are
accessing it at the same time. Here's the difference between shared locks and exclusive locks in
simple terms:
1. Shared Lock (S-Lock):
• A shared lock allows multiple transactions to read the same data at the same time, but no
one can modify the data while it's being read.
• Example: If you're reading a book in a library, others can read the same book at the same
time, but no one can write in or change the book while you're reading.
2. Exclusive Lock (X-Lock):
• An exclusive lock allows a transaction to both read and modify the data, but no other
transactions can access (read or write) the data until the exclusive lock is released.
• Example: If you're writing or editing a book, no one else can read or write in it until you're
done with it.
In Simple Terms:
• Shared Lock: Multiple users can read the data, but no one can change it.
• Exclusive Lock: Only one user can read and modify the data, and no one else can access it
until the changes are done.
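The compatibility rules can be sketched as a toy lock table. This is an illustration only: the class LockTable is invented for the example, and it refuses a conflicting request immediately instead of making the transaction wait, as a real lock manager would.

```python
class LockTable:
    """Toy lock table: a request is either granted or refused at once."""

    def __init__(self):
        # item -> (mode, set of transactions holding the lock)
        self._locks = {}

    def acquire(self, txn, item, mode):
        """mode is "S" (shared) or "X" (exclusive). Returns True if granted."""
        held = self._locks.get(item)
        if held is None:
            self._locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # any number of readers may share the item
            return True
        if holders == {txn}:
            # The only holder may re-request or upgrade its own lock.
            if mode == "X":
                self._locks[item] = ("X", holders)
            return True
        return False  # an exclusive lock conflicts with every other lock

    def release(self, txn, item):
        held = self._locks.get(item)
        if held is None:
            return
        held[1].discard(txn)
        if not held[1]:
            del self._locks[item]


locks = LockTable()
read_1 = locks.acquire("T1", "row42", "S")   # True: first reader
read_2 = locks.acquire("T2", "row42", "S")   # True: readers can share
write_3 = locks.acquire("T3", "row42", "X")  # False: readers are still active
locks.release("T1", "row42")
locks.release("T2", "row42")
write_3_retry = locks.acquire("T3", "row42", "X")  # True: item is free now
read_1_again = locks.acquire("T1", "row42", "S")   # False: writer holds the item
print(read_1, read_2, write_3, write_3_retry, read_1_again)
```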
1. Primary Key:
• A primary key is a unique identifier for each record in a table. It ensures that no two rows
have the same value for that key.
• Example: In a "Students" table, the Student ID could be the primary key because each
student has a unique ID.
2. Candidate Key:
• A candidate key is any column or combination of columns that can uniquely identify a
record. A table can have more than one candidate key, but only one will be chosen as the
primary key.
• Example: In addition to Student ID, the email address could also be a candidate key because
it’s unique for each student.
3. Super Key:
• A super key is any set of one or more columns that can uniquely identify a row. All primary
and candidate keys are also super keys, but super keys can include extra unnecessary
columns.
• Example: (Student ID, Name) can be a super key because the Student ID already uniquely
identifies a student, and adding the Name doesn't change that.
4. Foreign Key:
• A foreign key is a column in one table that links to the primary key of another table. It
creates a relationship between the two tables.
• Example: In a "Grades" table, the Student ID might be a foreign key that links to the primary
key in the "Students" table.
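The difference between super keys and candidate keys can be explored on sample rows. The sketch below is illustrative: the function names and the students data are invented, and sample data can only rule a key out, since whether a column set is truly a key is a decision about all possible rows.

```python
from itertools import combinations


def is_super_key(rows, columns):
    """True if no two rows agree on all of the given columns."""
    seen = set()
    for row in rows:
        key = tuple(row[name] for name in columns)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, all_columns):
    """Minimal super keys: super keys with no unnecessary columns."""
    keys = []
    for size in range(1, len(all_columns) + 1):
        for combo in combinations(all_columns, size):
            # Skip any set that merely extends a key that was already found.
            if any(set(key) <= set(combo) for key in keys):
                continue
            if is_super_key(rows, combo):
                keys.append(combo)
    return keys


students = [
    {"student_id": 1, "email": "asha@example.com", "name": "Asha"},
    {"student_id": 2, "email": "ben@example.com", "name": "Ben"},
    {"student_id": 3, "email": "asha2@example.com", "name": "Asha"},
]

id_and_name = is_super_key(students, ["student_id", "name"])  # super key, not minimal
name_only = is_super_key(students, ["name"])                  # two students share a name
keys = candidate_keys(students, ["student_id", "email", "name"])
print(keys)  # [('student_id',), ('email',)]
```

One of the candidate keys found this way would be chosen as the primary key, and a column in another table that refers to it would be a foreign key.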