Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
0% found this document useful (0 votes)
20 views

SD(Introduction to DBMS and RDBMS, SQL vs NoSQL) NOTES (1)

Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
20 views

SD(Introduction to DBMS and RDBMS, SQL vs NoSQL) NOTES (1)

Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 15

Introduction to DBMS (Database Management System)

What is a Database?

● A structured collection of data stored in a way that facilitates easy access,


management, and updating.
● Examples:
○ Library database: Stores information about books, authors, and
borrowers.
○ Bank database: Maintains account holders' details, transactions, and
balances.

What is DBMS?

● Software that interacts with databases, users, and applications to efficiently


store, retrieve, and manipulate data.
● Examples of DBMS:
○ Relational DBMS: MySQL, PostgreSQL, Oracle Database.
○ NoSQL DBMS: MongoDB, Cassandra.

Difference Between Database and File System

1. Definition: A database is a structured collection of data managed by a DBMS,


while a file system is a method of storing raw data in files on a storage medium.
2. Data Organization: A database organizes data into tables, rows, columns, and
relationships, whereas a file system stores data in flat files with no inherent
structure.
3. Data Redundancy: Databases minimize redundancy using techniques like
normalization, while file systems often have duplicate data.
4. Data Integrity: A DBMS enforces rules to ensure consistency and accuracy,
while a file system lacks built-in integrity mechanisms.
5. Access Mechanism: Databases use query languages like SQL for efficient data
retrieval, while file systems require manual navigation or specific APIs.
6. Concurrency: DBMS supports simultaneous data access by multiple users,
while file systems struggle with concurrency.
7. Security: Databases provide advanced security features like role-based access
control, while file systems rely on basic permissions.
8. Backup and Recovery: DBMS includes automated tools for recovery, whereas
file systems require manual backup processes.
9. Scalability: Databases scale better with large datasets and users, while file
systems are less efficient as data grows.

Key Features of DBMS

1. Data Abstraction: Simplifies complex data structures by presenting different


views (e.g., physical, logical).
2. Data Independence: Enables changes in data storage without affecting
application programs.
3. Security: Restricts access to authorized users using authentication and
encryption.
4. Concurrency Control: Allows multiple users to access data simultaneously
without conflicts.
5. Backup and Recovery: Provides mechanisms to restore data after system
failures.
6. Data Integrity: Maintains the accuracy and consistency of stored data using
constraints.
7. Transaction Management: Ensures reliable execution of database transactions
adhering to ACID properties.
8. Data Sharing: Facilitates sharing of data among multiple users or applications.

Types of DBMS

1. Hierarchical DBMS:
○ Organizes data in a tree-like structure, with each record having one
parent.
○ Example: IBM IMS.
○ Real-World Use: File directories.
2. Network DBMS:
○ Uses a graph structure to represent many-to-many relationships.
○ Example: Integrated Data Store (IDS).
○ Real-World Use: Airline reservation systems.
3. Relational DBMS:
○ Stores data in tables (rows and columns) and uses keys to establish
relationships.
○ Example: MySQL, Oracle Database.
○ Real-World Use: Banking and e-commerce systems.
4. Object-Oriented DBMS:
○ Stores data as objects, supporting features like inheritance and
encapsulation.
○ Example: ObjectDB.
○ Real-World Use: CAD/CAM systems.
5. NoSQL DBMS:
○ Designed for unstructured or semi-structured data with high scalability.
○ Example: MongoDB, Cassandra.
○ Real-World Use: Social media platforms.

Components of DBMS

1. Database: The collection of data itself.


2. DBMS Software: Provides the tools and interface to manage the database.
3. Hardware: The physical infrastructure like servers and storage devices.
4. Users:
○ End Users: Interact with the database through applications.
○ DBA (Database Administrator): Maintains and secures the database.
○ Application Developers: Build software to access the database.
5. Query Processor: Converts user queries into instructions for data retrieval.
6. Transaction Manager: Ensures all operations in a transaction are completed
successfully.
7. Metadata: Data that describes the structure, relationships, and constraints of the
database.

Advantages of DBMS

1. Efficient Data Management: Organizes data systematically for faster retrieval.


2. Data Security: Restricts unauthorized access with robust mechanisms.
3. Minimized Redundancy: Reduces duplication using normalization.
4. Improved Data Integrity: Enforces rules to maintain accuracy and consistency.
5. Concurrent Access: Allows multiple users to access data simultaneously.
6. Backup and Recovery: Protects against data loss due to system failures.
7. Scalability: Handles growing datasets and user demands effectively.

Challenges of DBMS
1. Cost: High initial investment in hardware, software, and skilled personnel.
2. Complexity: Requires expertise for database design and maintenance.
3. Performance Overhead: Additional processing layers can impact speed.
4. System Failures: Despite backup mechanisms, hardware or software failures
can still disrupt operations.

Relational Database Management System (RDBMS)

A Relational Database Management System (RDBMS) is a type of DBMS based on


the relational model introduced by E.F. Codd in 1970. In this model, data is organized
into tables (relations) consisting of rows and columns. Each table is uniquely identified
by a name, and relationships between tables are established through keys.

Key Features of RDBMS

1. Tabular Structure:
○ Data is stored in tables, where each row represents a record, and each
column represents a field (attribute).
○ Example: A "Students" table with columns StudentID, Name, Age, and
Grade.
2. Primary Key:
○ A unique identifier for each record in a table.
○ Example: In a "Students" table, the StudentID can serve as the primary
key.
3. Foreign Key:
○ A column in one table that refers to the primary key of another table to
establish a relationship.
○ Example: In an "Enrollments" table, the StudentID column links to the
"Students" table.
4. Data Integrity:
○ Maintains the accuracy and consistency of data through constraints like
Primary Key, Foreign Key, and Unique constraints.
5. SQL (Structured Query Language):
○ Provides a standard language to query, manipulate, and manage
relational databases.
○ Example:
■ SELECT * FROM Students WHERE Age > 18; retrieves all
students older than 18.
6. Normalization:
○ A process to reduce data redundancy and ensure data dependency by
dividing a database into smaller, related tables.
7. ACID Properties:
○ Ensures reliable transactions:
■ Atomicity: Transactions are all-or-nothing.
■ Consistency: Database remains consistent before and after the
transaction.
■ Isolation: Transactions are executed independently.
■ Durability: Completed transactions are saved permanently.

Advantages of RDBMS

1. Data Consistency and Integrity:


○ Built-in constraints ensure accurate and consistent data.
2. Data Redundancy Reduction:
○ Normalization eliminates duplicate data.
3. Ease of Use:
○ SQL makes querying and managing data straightforward.
4. Scalability:
○ Suitable for large datasets with multiple users.
5. Data Security:
○ Role-based access control ensures restricted data access.
6. Multi-User Access:
○ Handles simultaneous operations without data corruption.
7. Backup and Recovery:
○ Facilitates data recovery in case of system failure.

Components of RDBMS

1. Tables:
○ The fundamental unit of storage, organized into rows (records) and
columns (fields).
2. Keys:
○ Primary Key: Unique identifier for a table.
○ Foreign Key: Links one table to another.
3. Indexes:
○ Enhance the speed of data retrieval operations.
4. Schemas:
○ Define the structure of the database (tables, columns, relationships).
5. SQL:
○ The query language used to interact with the database.

Examples of RDBMS

1. MySQL:
○ Open-source RDBMS widely used for web applications.
○ Example: Managing e-commerce data for products, orders, and
customers.
2. PostgreSQL:
○ Advanced open-source RDBMS supporting complex queries.
○ Example: Data warehousing and analytics.
3. Oracle Database:
○ Enterprise-level RDBMS with robust performance and scalability.
○ Example: Banking systems for managing accounts and transactions.
4. Microsoft SQL Server:
○ RDBMS designed for enterprise applications and integration with Microsoft
tools.
○ Example: Healthcare systems for managing patient records.
5. SQLite:
○ Lightweight, file-based RDBMS often used in mobile and embedded
applications.
○ Example: Local storage for mobile apps.

Limitations of RDBMS

1. Complexity in Large Systems:


○ Difficult to manage when data relationships become overly complex.
2. Performance Issues:
○ Slower for unstructured or semi-structured data compared to NoSQL
databases.
3. Scalability Challenges:
○ Vertical scaling (adding more resources to one server) is limited compared
to horizontal scaling in NoSQL.
4. Rigid Schema:
○ Requires predefined structure, making it less flexible for dynamic datasets.

NoSQL Databases

NoSQL (Not Only SQL) databases are non-relational databases designed to handle
unstructured, semi-structured, or structured data. They are optimized for scalability,
flexibility, and high-performance operations, making them suitable for modern
applications like big data analytics, real-time systems, and distributed architectures.

Key Characteristics of NoSQL Databases

1. Schema Flexibility:
○ Unlike SQL databases, NoSQL databases do not require a predefined
schema.
○ Data structures can evolve without requiring schema migration.
2. Horizontal Scalability:
○ Scales out by adding more servers (nodes) rather than upgrading
hardware.
3. Variety of Data Models:
○ Supports diverse data formats: key-value pairs, documents, graphs, and
column families.
4. High Performance:
○ Optimized for fast read/write operations, especially for large datasets.
5. Eventual Consistency:
○ Prioritizes availability and partition tolerance, often trading off immediate
consistency for speed.
6. Distributed Architecture:
○ Data is spread across multiple nodes, ensuring fault tolerance and
availability.
7. Big Data Compatibility:
○ Suitable for storing and processing large volumes of data.
Types of NoSQL Databases

1. Key-Value Stores

● Description:
○ Simplest NoSQL model where data is stored as key-value pairs.
○ Similar to a dictionary where a unique key maps to a value.
● Examples:
○ Redis: In-memory data store for caching and real-time analytics.
○ Amazon DynamoDB: Scalable, fully-managed NoSQL database.
● Use Cases:
○ Session management, user preferences, caching.
● Example:
○ Key: userID123
○ Value: { "name": "Alice", "age": 30, "city": "New York"
}

2. Document Stores

● Description:
○ Stores data as documents (e.g., JSON, BSON, or XML) with
key-value-like structures.
○ Each document is self-describing, allowing for complex and nested data.
● Examples:
○ MongoDB: Popular open-source document database.
○ Couchbase: Combines document storage with caching capabilities.
● Use Cases:
○ Content management systems, e-commerce catalogs, mobile app data.

Example:
{
"userID": "user123",
"name": "Alice",
"orders": [
{ "orderID": "order1", "amount": 200 },
{ "orderID": "order2", "amount": 350 }
]
}

3. Column-Family Stores

● Description:
○ Data is stored in rows and columns, but columns are grouped into
families.
○ Optimized for read/write operations on large datasets.
● Examples:
○ Cassandra: Highly scalable and fault-tolerant NoSQL database.
○ HBase: Built on Hadoop for distributed storage and processing.
● Use Cases:
○ Real-time analytics, IoT data storage, time-series data.
● Example:
○ Row Key: sensor123
○ Column Family: Temperature, Humidity
○ Values: { "Temperature": 30, "Humidity": 60 }

4. Graph Databases

● Description:
○ Designed to store and analyze relationships between entities using nodes,
edges, and properties.
○ Suitable for interconnected data.
● Examples:
○ Neo4j: Popular graph database used for relationship-heavy datasets.
○ Amazon Neptune: Managed graph database service.
● Use Cases:
○ Social networks, fraud detection, recommendation engines.
● Example:
○ Node: User: Alice
○ Edge: FRIEND_OF
○ Connected Node: User: Bob

Advantages of NoSQL Databases


1. Schema Flexibility:
○ Adapts easily to changing data requirements without downtime.
2. Scalability:
○ Handles large-scale applications by scaling horizontally.
3. High Availability:
○ Distributed design ensures minimal downtime and fault tolerance.
4. Efficient for Big Data:
○ Handles massive volumes of unstructured and semi-structured data.
5. Diverse Data Models:
○ Supports a wide range of applications with specialized data formats.

Limitations of NoSQL Databases

1. Eventual Consistency:
○ Sacrifices immediate consistency in distributed setups for better
availability.
2. Less Standardization:
○ No unified query language like SQL, which can lead to compatibility
issues.
3. Complexity in Relationships:
○ Not as efficient as relational databases for applications with complex
relationships.
4. Learning Curve:
○ Requires understanding specific NoSQL models and their APIs.

Popular NoSQL Databases and Real-World Use Cases

1. MongoDB:
○ Type: Document Store
○ Use Case: Storing e-commerce product catalogs and user profiles.
2. Cassandra:
○ Type: Column-Family Store
○ Use Case: Real-time analytics for social media and IoT data.
3. Redis:
○ Type: Key-Value Store
○ Use Case: Caching for high-speed data retrieval in web applications.
4. Neo4j:
○ Type: Graph Database
○ Use Case: Fraud detection in financial transactions.
5. Amazon DynamoDB:
○ Type: Key-Value/Document Store
○ Use Case: Serverless applications and real-time user activity tracking.

SQL vs NoSQL: Key Differences

Here is a detailed comparison between SQL and NoSQL databases based on various
factors:

1. Data Structure

● SQL:
○ Fixed Schema: SQL databases are relational and use a structured
schema defined by tables, rows, and columns. Data must follow this
schema, and changes to the schema require migrations.
○ Relational Model: Data is stored in tables with fixed relationships
between them using foreign keys.
● NoSQL:
○ Flexible Schema: NoSQL databases are non-relational and allow a
dynamic schema. Data does not need to follow a predefined structure,
making it easy to store different types of data (unstructured,
semi-structured).
○ Varied Data Models: NoSQL uses different models like key-value pairs,
documents, graphs, and column-family stores.

2. Scalability

● SQL:
○ Vertical Scaling: SQL databases generally scale vertically, meaning that
to increase performance, you would need to upgrade the hardware (e.g.,
more CPU, RAM, or storage) of the existing server.
● NoSQL:
○ Horizontal Scaling: NoSQL databases scale horizontally, which means
adding more servers (nodes) to distribute the load, providing greater
flexibility and scalability across distributed systems.

3. Consistency

● SQL:
○ Strong Consistency (ACID Properties): SQL databases adhere to ACID
(Atomicity, Consistency, Isolation, Durability) properties, ensuring data
consistency and integrity in transactions. This guarantees that the
database is always in a valid state, even in the case of errors or crashes.
● NoSQL:
○ Eventual Consistency: Most NoSQL databases follow the CAP theorem,
prioritizing Availability and Partition Tolerance over immediate
consistency. In this model, updates may not be immediately visible across
all nodes but will eventually reach consistency.

4. Query Language

● SQL:
○ Standardized Query Language (SQL): SQL databases use the
structured query language (SQL), a standardized programming language
for managing relational databases. SQL provides complex querying
capabilities with SELECT, INSERT, UPDATE, DELETE, and joins.
● NoSQL:
○ Custom APIs/Query Languages: NoSQL databases have proprietary
query languages or APIs specific to the type of database (e.g., MongoDB
uses its query language, Redis uses commands). The query capabilities
vary widely depending on the NoSQL model.

5. Data Integrity & Relationships

● SQL:
○ Data Integrity with Foreign Keys: SQL databases support referential
integrity with foreign keys and enforce constraints to maintain
consistency in relationships between tables. SQL databases are ideal
when complex relationships and transactions are involved.
● NoSQL:
○ Limited Data Integrity: NoSQL databases generally do not enforce
relationships like SQL, and the data model is often flatter. While some
NoSQL databases (like graph databases) support relationships, they are
not as rigorous or structured as in SQL.

6. Transaction Support

● SQL:
○ ACID Transactions: SQL databases support ACID transactions,
ensuring that each transaction is processed reliably and all database
operations are consistent and complete.
● NoSQL:
○ Eventual Consistency & Limited Transactions: NoSQL databases
typically support eventual consistency, with limited support for complex
transactions. Some NoSQL databases provide base (Basically Available,
Soft state, Eventually consistent) properties instead of strict ACID
transactions.

7. Performance

● SQL:
○ Optimized for Structured Data: SQL databases are optimized for
consistent querying of structured data with complex joins and
relationships. They can become slower when handling large volumes of
unstructured or rapidly changing data.
● NoSQL:
○ Optimized for Large Volumes of Unstructured Data: NoSQL databases
excel in handling high-speed read/write operations, especially with
unstructured or semi-structured data. They are built to perform well with
large datasets, including big data and real-time applications.

8. Use Cases
● SQL:
○ Best for Structured Data with Complex Relationships: Ideal for
applications that require structured data with relationships, such as
banking systems, inventory management, and CRM applications.
○ Examples: MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server.
● NoSQL:
○ Best for Big Data, Real-Time, and Flexible Data Models: Ideal for
applications that require flexible schema, high scalability, and handle large
amounts of semi-structured or unstructured data.
○ Examples: MongoDB, Cassandra, Redis, Couchbase, Neo4j.

9. Flexibility

● SQL:
○ Fixed Schema: Changes to the schema, such as adding or removing
columns, can be complex and require migrations, especially when the
database is in production.
● NoSQL:
○ Highly Flexible: NoSQL databases allow changes to the data model
without affecting existing data, making them ideal for evolving
applications.

When to Use SQL vs NoSQL

When to Use SQL:

1. Structured Data: When your data is structured and fits into predefined tables
and relationships.
2. Complex Queries and Transactions: When your application requires complex
queries (like joins) and transactions with full ACID compliance.
3. Data Integrity: When you need strong data integrity and consistency, especially
for applications like banking, healthcare, or financial systems.
4. Data Relationships: When you need to store and manage complex relationships
between entities (e.g., foreign keys, referential integrity).

When to Use NoSQL:


1. Unstructured or Semi-Structured Data: When you have large amounts of
unstructured, semi-structured, or rapidly evolving data (e.g., JSON, XML).
2. Horizontal Scaling: When you need a highly scalable system to handle large
data volumes across distributed environments.
3. Real-Time Applications: For applications that require real-time data processing,
such as gaming, social media, or IoT.
4. Flexibility: When the application requires fast iterations and changes to the
schema without downtime or migrations.
5. High Availability and Fault Tolerance: When you need a system that can
tolerate partitioning and maintain high availability, even in distributed networks.

Conclusion

The choice between SQL and NoSQL depends on the specific needs of your
application. SQL databases are best for applications that require structured data,
complex relationships, and ACID compliance. NoSQL databases are ideal for scalable,
flexible applications that deal with large, unstructured, or semi-structured data, and
when high availability and performance are critical. Understanding these differences
helps in choosing the right database solution based on data characteristics and
application requirements.

You might also like