SD(Introduction to DBMS and RDBMS, SQL vs NoSQL) NOTES (1)
What is a Database?
What is DBMS?
Types of DBMS
1. Hierarchical DBMS:
○ Organizes data in a tree-like structure, with each record having one
parent.
○ Example: IBM IMS.
○ Real-World Use: File directories.
2. Network DBMS:
○ Uses a graph structure to represent many-to-many relationships.
○ Example: Integrated Data Store (IDS).
○ Real-World Use: Airline reservation systems.
3. Relational DBMS:
○ Stores data in tables (rows and columns) and uses keys to establish
relationships.
○ Example: MySQL, Oracle Database.
○ Real-World Use: Banking and e-commerce systems.
4. Object-Oriented DBMS:
○ Stores data as objects, supporting features like inheritance and
encapsulation.
○ Example: ObjectDB.
○ Real-World Use: CAD/CAM systems.
5. NoSQL DBMS:
○ Designed for unstructured or semi-structured data with high scalability.
○ Example: MongoDB, Cassandra.
○ Real-World Use: Social media platforms.
Components of DBMS
Advantages of DBMS
Challenges of DBMS
1. Cost: High initial investment in hardware, software, and skilled personnel.
2. Complexity: Requires expertise for database design and maintenance.
3. Performance Overhead: Additional processing layers can impact speed.
4. System Failures: Despite backup mechanisms, hardware or software failures
can still disrupt operations.
1. Tabular Structure:
○ Data is stored in tables, where each row represents a record, and each
column represents a field (attribute).
○ Example: A "Students" table with columns StudentID, Name, Age, and
Grade.
2. Primary Key:
○ A unique identifier for each record in a table.
○ Example: In a "Students" table, the StudentID can serve as the primary
key.
3. Foreign Key:
○ A column in one table that refers to the primary key of another table to
establish a relationship.
○ Example: In an "Enrollments" table, the StudentID column links to the
"Students" table.
4. Data Integrity:
○ Maintains the accuracy and consistency of data through constraints like
Primary Key, Foreign Key, and Unique constraints.
5. SQL (Structured Query Language):
○ Provides a standard language to query, manipulate, and manage
relational databases.
○ Example:
■ SELECT * FROM Students WHERE Age > 18; retrieves all
students older than 18.
6. Normalization:
○ A process that reduces data redundancy and keeps data dependencies
logical by dividing a database into smaller, related tables.
7. ACID Properties:
○ Ensures reliable transactions:
■ Atomicity: Transactions are all-or-nothing.
■ Consistency: Database remains consistent before and after the
transaction.
■ Isolation: Concurrent transactions do not interfere with one another.
■ Durability: Completed transactions are saved permanently.
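The features above can be tried out with a small sketch. This one uses Python's built-in sqlite3 module and an in-memory database; the "Students" and "Enrollments" tables follow the examples in the list, while the EnrollmentID and Course columns and the sample rows are assumptions added for illustration.

```python
import sqlite3

# In-memory database; table names follow the examples in the notes.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE Students (
        StudentID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        Age       INTEGER,
        Grade     TEXT
    )
""")
# EnrollmentID and Course are hypothetical columns added for this sketch.
conn.execute("""
    CREATE TABLE Enrollments (
        EnrollmentID INTEGER PRIMARY KEY,
        StudentID    INTEGER NOT NULL REFERENCES Students(StudentID),
        Course       TEXT NOT NULL
    )
""")

conn.executemany(
    "INSERT INTO Students (StudentID, Name, Age, Grade) VALUES (?, ?, ?, ?)",
    [(1, "Alice", 20, "A"), (2, "Bob", 17, "B")],
)
conn.execute("INSERT INTO Enrollments (StudentID, Course) VALUES (1, 'Databases')")

# The query from the SQL example: students older than 18.
adults = conn.execute("SELECT Name FROM Students WHERE Age > 18").fetchall()

# The foreign key rejects an enrollment for a student that does not exist.
try:
    conn.execute("INSERT INTO Enrollments (StudentID, Course) VALUES (99, 'Databases')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Keeping enrollments in their own table, linked by StudentID, is also a small instance of normalization: student details are stored once rather than repeated per enrollment.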
Advantages of RDBMS
Components of RDBMS
1. Tables:
○ The fundamental unit of storage, organized into rows (records) and
columns (fields).
2. Keys:
○ Primary Key: Unique identifier for a table.
○ Foreign Key: Links one table to another.
3. Indexes:
○ Enhance the speed of data retrieval operations.
4. Schemas:
○ Define the structure of the database (tables, columns, relationships).
5. SQL:
○ The query language used to interact with the database.
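To see what an index changes, the sketch below asks SQLite for its query plan before and after creating one. It assumes an in-memory SQLite database accessed through Python's sqlite3 module; the index name idx_students_name, the helper query_plan, and the generated rows are made up for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Students (Name, Age) VALUES (?, ?)",
    [(f"student{i}", 18 + i % 10) for i in range(1000)],
)

def query_plan(sql):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

lookup = "SELECT * FROM Students WHERE Name = 'student42'"

# Without an index on Name, SQLite has to scan the whole table.
plan_before = query_plan(lookup)

# With the index in place, the same query becomes an index search.
conn.execute("CREATE INDEX idx_students_name ON Students (Name)")
plan_after = query_plan(lookup)
```

Printing plan_before and plan_after shows the switch from a table scan to a search that names the index.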
Examples of RDBMS
1. MySQL:
○ Open-source RDBMS widely used for web applications.
○ Example: Managing e-commerce data for products, orders, and
customers.
2. PostgreSQL:
○ Advanced open-source RDBMS supporting complex queries.
○ Example: Data warehousing and analytics.
3. Oracle Database:
○ Enterprise-level RDBMS with robust performance and scalability.
○ Example: Banking systems for managing accounts and transactions.
4. Microsoft SQL Server:
○ RDBMS designed for enterprise applications and integration with Microsoft
tools.
○ Example: Healthcare systems for managing patient records.
5. SQLite:
○ Lightweight, file-based RDBMS often used in mobile and embedded
applications.
○ Example: Local storage for mobile apps.
Limitations of RDBMS
NoSQL Databases
NoSQL (Not Only SQL) databases are non-relational databases designed to handle
unstructured, semi-structured, or structured data. They are optimized for scalability,
flexibility, and high-performance operations, making them suitable for modern
applications like big data analytics, real-time systems, and distributed architectures.
1. Schema Flexibility:
○ Unlike SQL databases, NoSQL databases do not require a predefined
schema.
○ Data structures can evolve without requiring schema migration.
2. Horizontal Scalability:
○ Scales out by adding more servers (nodes) rather than upgrading
hardware.
3. Variety of Data Models:
○ Supports diverse data formats: key-value pairs, documents, graphs, and
column families.
4. High Performance:
○ Optimized for fast read/write operations, especially for large datasets.
5. Eventual Consistency:
○ Prioritizes availability and partition tolerance, often trading off immediate
consistency for speed.
6. Distributed Architecture:
○ Data is spread across multiple nodes, ensuring fault tolerance and
availability.
7. Big Data Compatibility:
○ Suitable for storing and processing large volumes of data.
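Horizontal scalability and distributed architecture both rest on splitting data across nodes. A minimal sketch of one way to do that, hash-based sharding, is shown below. The node names, the helper functions, and the sample keys are hypothetical, and plain Python dictionaries stand in for real servers.

```python
import hashlib

def shard_for(key, nodes):
    """Pick a node for a key by hashing it (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

# Hypothetical three-node cluster; each node is just a dict here.
nodes = ["node-a", "node-b", "node-c"]
cluster = {node: {} for node in nodes}

def put(key, value):
    """Store the value on whichever node owns the key."""
    cluster[shard_for(key, nodes)][key] = value

def get(key):
    """Read the value back from the node that owns the key."""
    return cluster[shard_for(key, nodes)].get(key)

for i in range(300):
    put(f"user{i}", {"visits": i})

# How many keys ended up on each node.
sizes = {node: len(store) for node, store in cluster.items()}
```

Modulo placement is the simplest option but moves most keys when a node is added or removed; production systems commonly use schemes such as consistent hashing to limit that reshuffling.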
Types of NoSQL Databases
1. Key-Value Stores
● Description:
○ Simplest NoSQL model where data is stored as key-value pairs.
○ Similar to a dictionary where a unique key maps to a value.
● Examples:
○ Redis: In-memory data store for caching and real-time analytics.
○ Amazon DynamoDB: Scalable, fully-managed NoSQL database.
● Use Cases:
○ Session management, user preferences, caching.
● Example:
○ Key: userID123
○ Value: { "name": "Alice", "age": 30, "city": "New York" }
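The key-value model can be sketched in a few lines of Python. This is a toy in-memory store, not the API of Redis or DynamoDB; the class name KeyValueStore and its methods are assumptions made for illustration. Values are serialized to JSON strings to mimic the way such stores treat a value as an opaque blob.

```python
import json

class KeyValueStore:
    """Toy in-memory key-value store; values are kept as opaque JSON strings."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Serialize so the store never looks inside the value.
        self._data[key] = json.dumps(value)

    def get(self, key, default=None):
        raw = self._data.get(key)
        return default if raw is None else json.loads(raw)

    def delete(self, key):
        # Returns True if the key existed.
        return self._data.pop(key, None) is not None

store = KeyValueStore()
store.set("userID123", {"name": "Alice", "age": 30, "city": "New York"})
profile = store.get("userID123")
```

Lookups go only through the key; finding "all users in New York" would require scanning every value, which is the main trade-off of this model.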
2. Document Stores
● Description:
○ Stores data as documents (e.g., JSON, BSON, or XML) with
key-value-like structures.
○ Each document is self-describing, allowing for complex and nested data.
● Examples:
○ MongoDB: Popular open-source document database.
○ Couchbase: Combines document storage with caching capabilities.
● Use Cases:
○ Content management systems, e-commerce catalogs, mobile app data.
● Example:
{
  "userID": "user123",
  "name": "Alice",
  "orders": [
    { "orderID": "order1", "amount": 200 },
    { "orderID": "order2", "amount": 350 }
  ]
}
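Because each document carries its own nested structure, queries can reach inside it. The sketch below models a collection as a list of Python dictionaries and filters on the nested orders. The second document and the helper names find and order_total are hypothetical; a real document database such as MongoDB has its own query language for this.

```python
# A "collection" of documents; the first one is the example from the notes.
users = [
    {
        "userID": "user123",
        "name": "Alice",
        "orders": [
            {"orderID": "order1", "amount": 200},
            {"orderID": "order2", "amount": 350},
        ],
    },
    # Hypothetical second document with no orders yet.
    {"userID": "user456", "name": "Bob", "orders": []},
]

def find(collection, predicate):
    """Return every document for which the predicate is true."""
    return [doc for doc in collection if predicate(doc)]

def order_total(doc):
    """Sum the amounts of the orders nested inside a user document."""
    return sum(order["amount"] for order in doc.get("orders", []))

# Users whose orders add up to more than 500.
big_spenders = find(users, lambda doc: order_total(doc) > 500)
```

No join is needed here because the orders live inside the user document, which is the usual reason for nesting related data in this model.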
3. Column-Family Stores
● Description:
○ Data is stored in rows and columns, but columns are grouped into
families.
○ Optimized for read/write operations on large datasets.
● Examples:
○ Cassandra: Highly scalable and fault-tolerant NoSQL database.
○ HBase: Built on Hadoop for distributed storage and processing.
● Use Cases:
○ Real-time analytics, IoT data storage, time-series data.
● Example:
○ Row Key: sensor123
○ Columns (grouped in one column family): Temperature, Humidity
○ Values: { "Temperature": 30, "Humidity": 60 }
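One way to picture the column-family layout is as nested maps: row key, then column family, then column. The sketch below uses that picture with Python dictionaries. The family names "readings" and "meta", the location column, and the helper functions are assumptions for illustration, not the data model of any specific product.

```python
from collections import defaultdict

# row key -> column family -> column -> value
table = defaultdict(lambda: defaultdict(dict))

def put(row_key, family, column, value):
    """Write one column value into a family of a row."""
    table[row_key][family][column] = value

def get_family(row_key, family):
    """Read all columns of one family for a row."""
    return dict(table[row_key][family])

# "readings" and "meta" are hypothetical family names.
put("sensor123", "readings", "Temperature", 30)
put("sensor123", "readings", "Humidity", 60)
put("sensor123", "meta", "location", "lab-1")

readings = get_family("sensor123", "readings")
```

Reading one family touches only the columns grouped in it, which is why related columns that are queried together are placed in the same family.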
4. Graph Databases
● Description:
○ Designed to store and analyze relationships between entities using nodes,
edges, and properties.
○ Suitable for interconnected data.
● Examples:
○ Neo4j: Popular graph database used for relationship-heavy datasets.
○ Amazon Neptune: Managed graph database service.
● Use Cases:
○ Social networks, fraud detection, recommendation engines.
● Example:
○ Node: User: Alice
○ Edge: FRIEND_OF
○ Connected Node: User: Bob
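The node and edge example can be sketched as an adjacency list with a breadth-first traversal, which is the kind of relationship query graph databases are built for. The user Carol, the helper names, and the undirected treatment of FRIEND_OF are assumptions made for this sketch; a graph database like Neo4j would express the same query in its own language.

```python
from collections import deque

# node -> list of (edge label, neighbouring node)
edges = {}

def add_edge(a, label, b):
    """Add an undirected, labelled edge between two nodes."""
    edges.setdefault(a, []).append((label, b))
    edges.setdefault(b, []).append((label, a))

add_edge("Alice", "FRIEND_OF", "Bob")
add_edge("Bob", "FRIEND_OF", "Carol")  # hypothetical extra user

def friends_within(start, max_hops):
    """Breadth-first search over FRIEND_OF edges, up to max_hops away."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for label, other in edges.get(node, []):
            if label == "FRIEND_OF" and other not in seen:
                seen.add(other)
                found.append(other)
                queue.append((other, hops + 1))
    return found

direct = friends_within("Alice", 1)           # ['Bob']
friends_of_friends = friends_within("Alice", 2)  # ['Bob', 'Carol']
```

A "friends of friends" query like this would take a self-join per hop in a relational schema, which is why relationship-heavy workloads often favor the graph model.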
1. Eventual Consistency:
○ Sacrifices immediate consistency in distributed setups for better
availability.
2. Less Standardization:
○ No unified query language like SQL, which can lead to compatibility
issues.
3. Complexity in Relationships:
○ Not as efficient as relational databases for applications with complex
relationships.
4. Learning Curve:
○ Requires understanding specific NoSQL models and their APIs.
1. MongoDB:
○ Type: Document Store
○ Use Case: Storing e-commerce product catalogs and user profiles.
2. Cassandra:
○ Type: Column-Family Store
○ Use Case: Real-time analytics for social media and IoT data.
3. Redis:
○ Type: Key-Value Store
○ Use Case: Caching for high-speed data retrieval in web applications.
4. Neo4j:
○ Type: Graph Database
○ Use Case: Fraud detection in financial transactions.
5. Amazon DynamoDB:
○ Type: Key-Value/Document Store
○ Use Case: Serverless applications and real-time user activity tracking.
Here is a detailed comparison between SQL and NoSQL databases based on various
factors:
1. Data Structure
● SQL:
○ Fixed Schema: SQL databases are relational and use a structured
schema defined by tables, rows, and columns. Data must follow this
schema, and changes to the schema require migrations.
○ Relational Model: Data is stored in tables with fixed relationships
between them using foreign keys.
● NoSQL:
○ Flexible Schema: NoSQL databases are non-relational and allow a
dynamic schema. Data does not need to follow a predefined structure,
making it easy to store different types of data (unstructured,
semi-structured).
○ Varied Data Models: NoSQL uses different models like key-value pairs,
documents, graphs, and column-family stores.
2. Scalability
● SQL:
○ Vertical Scaling: SQL databases generally scale vertically, meaning that
to increase performance, you would need to upgrade the hardware (e.g.,
more CPU, RAM, or storage) of the existing server.
● NoSQL:
○ Horizontal Scaling: NoSQL databases scale horizontally, which means
adding more servers (nodes) to distribute the load, providing greater
flexibility and scalability across distributed systems.
3. Consistency
● SQL:
○ Strong Consistency (ACID Properties): SQL databases adhere to ACID
(Atomicity, Consistency, Isolation, Durability) properties, ensuring data
consistency and integrity in transactions. This guarantees that the
database is always in a valid state, even in the case of errors or crashes.
● NoSQL:
○ Eventual Consistency: Under the CAP theorem, a distributed database
cannot guarantee both consistency and availability during a network
partition. Many NoSQL databases favor Availability and Partition Tolerance
over immediate consistency. In this model, updates may not be immediately
visible across all nodes but will eventually reach consistency.
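Eventual consistency is easier to see in a toy simulation. In the sketch below a write lands on one replica and is copied to the others only when replicate() is called, so a read from another replica can be stale in between. The Replica and Cluster classes and the explicit replicate() step are assumptions for illustration; real systems replicate in the background and handle conflicts, failures, and ordering.

```python
class Replica:
    """One copy of the data."""

    def __init__(self):
        self.data = {}

class Cluster:
    """Writes land on one replica and are copied to the others later."""

    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.pending = []  # queued (replica index, key, value) updates

    def write(self, key, value):
        # Acknowledge the write after updating only the first replica.
        self.replicas[0].data[key] = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def read(self, key, replica):
        return self.replicas[replica].data.get(key)

    def replicate(self):
        # Apply queued updates in order; afterwards all replicas agree.
        for i, key, value in self.pending:
            self.replicas[i].data[key] = value
        self.pending.clear()

cluster = Cluster(3)
cluster.write("likes", 10)
stale = cluster.read("likes", replica=2)   # replica 2 has not seen the write yet
cluster.replicate()
fresh = cluster.read("likes", replica=2)   # now it has
```

The write is available immediately on the first replica, and the other replicas converge once replication catches up.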
4. Query Language
● SQL:
○ Standardized Query Language (SQL): SQL databases use the
structured query language (SQL), a standardized programming language
for managing relational databases. SQL provides complex querying
capabilities with SELECT, INSERT, UPDATE, DELETE, and joins.
● NoSQL:
○ Custom APIs/Query Languages: NoSQL databases have their own query
languages or APIs specific to the type of database (e.g., MongoDB uses
its own query language, Redis uses commands). The query capabilities
vary widely depending on the NoSQL model.
5. Data Integrity
● SQL:
○ Data Integrity with Foreign Keys: SQL databases support referential
integrity with foreign keys and enforce constraints to maintain
consistency in relationships between tables. SQL databases are ideal
when complex relationships and transactions are involved.
● NoSQL:
○ Limited Data Integrity: NoSQL databases generally do not enforce
relationships between records the way SQL does, and the data model is
often flatter. Some NoSQL databases (like graph databases) model
relationships directly, but constraints such as referential integrity are
usually left to the application.
6. Transaction Support
● SQL:
○ ACID Transactions: SQL databases support ACID transactions,
ensuring that each transaction is processed reliably and all database
operations are consistent and complete.
● NoSQL:
○ Eventual Consistency & Limited Transactions: NoSQL databases
typically offer eventual consistency, with limited support for complex
transactions. Some NoSQL databases provide BASE (Basically Available,
Soft state, Eventually consistent) properties instead of strict ACID
transactions.
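The all-or-nothing behavior of an ACID transaction can be sketched with Python's sqlite3 module. The Accounts table, the balances, and the transfer helper are hypothetical. The credit is deliberately applied before the debit so that a failed debit has something to roll back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE Accounts ("
    "Name TEXT PRIMARY KEY, "
    "Balance INTEGER NOT NULL CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move money atomically: both updates apply, or neither does."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE Accounts SET Balance = Balance + ? WHERE Name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE Accounts SET Balance = Balance - ? WHERE Name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT Name, Balance FROM Accounts"))

ok = transfer("alice", "bob", 30)       # succeeds
failed = transfer("alice", "bob", 500)  # debit violates the CHECK, so the credit is undone
```

After the failed transfer the balances are unchanged from the successful one, which is atomicity and consistency working together.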
7. Performance
● SQL:
○ Optimized for Structured Data: SQL databases are optimized for
consistent querying of structured data with complex joins and
relationships. They can become slower when handling large volumes of
unstructured or rapidly changing data.
● NoSQL:
○ Optimized for Large Volumes of Unstructured Data: NoSQL databases
excel in handling high-speed read/write operations, especially with
unstructured or semi-structured data. They are built to perform well with
large datasets, including big data and real-time applications.
8. Use Cases
● SQL:
○ Best for Structured Data with Complex Relationships: Ideal for
applications that require structured data with relationships, such as
banking systems, inventory management, and CRM applications.
○ Examples: MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server.
● NoSQL:
○ Best for Big Data, Real-Time, and Flexible Data Models: Ideal for
applications that require a flexible schema and high scalability, and that
handle large amounts of semi-structured or unstructured data.
○ Examples: MongoDB, Cassandra, Redis, Couchbase, Neo4j.
9. Flexibility
● SQL:
○ Fixed Schema: Changes to the schema, such as adding or removing
columns, can be complex and require migrations, especially when the
database is in production.
● NoSQL:
○ Highly Flexible: NoSQL databases allow changes to the data model
without affecting existing data, making them ideal for evolving
applications.
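The difference in flexibility can be shown side by side. In the sketch below, adding a "City" attribute to a relational table needs an ALTER TABLE first, while a document collection (modeled as a list of Python dictionaries) simply accepts a document with the extra field. The Users table, the sample rows, and the field names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO Users (Name) VALUES ('Alice')")

# Relational side: a new attribute needs a schema change first.
conn.execute("ALTER TABLE Users ADD COLUMN City TEXT")
conn.execute("INSERT INTO Users (Name, City) VALUES ('Bob', 'New York')")
rows = conn.execute("SELECT Name, City FROM Users ORDER BY UserID").fetchall()

# Document side: new documents simply carry the extra field.
documents = [{"name": "Alice"}]
documents.append({"name": "Bob", "city": "New York"})
cities = [doc.get("city") for doc in documents]
```

In both cases the older record has no city; the relational table reports it as NULL in a known column, while the document store leaves it to the application to handle the missing field.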
1. Structured Data: When your data is structured and fits into predefined tables
and relationships.
2. Complex Queries and Transactions: When your application requires complex
queries (like joins) and transactions with full ACID compliance.
3. Data Integrity: When you need strong data integrity and consistency, especially
for applications like banking, healthcare, or financial systems.
4. Data Relationships: When you need to store and manage complex relationships
between entities (e.g., foreign keys, referential integrity).
Conclusion
The choice between SQL and NoSQL depends on the specific needs of your
application. SQL databases are best for applications that require structured data,
complex relationships, and ACID compliance. NoSQL databases are ideal for scalable,
flexible applications that deal with large volumes of unstructured or semi-structured
data and for which high availability and performance are critical. Understanding these differences
helps in choosing the right database solution based on data characteristics and
application requirements.