DDMUNIT5

UNIT 5: DATABASE DESIGN AND MANAGEMENT
MONGODB:
✔ NoSQL (often interpreted as "Not only SQL") database.
✔ It provides a mechanism for storage and retrieval of data that is modeled in
means other than the tabular relations used in relational databases.
SQL                                            | NoSQL
Relational Database Management System (RDBMS)  | Non-relational or distributed database system
Fixed, static, or predefined schema            | Dynamic schema
Best suited for complex queries                | Not so good for complex queries
Vertically scalable                            | Horizontally scalable
Follows ACID properties                        | Follows BASE properties
What is MongoDB?
✔ MongoDB is an open source, document-oriented database designed with both
scalability and developer agility in mind.
✔ Instead of storing your data in tables and rows as you would with a relational
database, in MongoDB you store JSON-like documents with dynamic
schemas (schema-free, schemaless).
MongoDB is a Schema Free DB.
MongoDB does not need any pre-defined data schema
Every document could have different data!
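As a minimal illustration of the schema-free idea, the Python sketch below models a collection as a plain list of dictionaries. The collection name and field names are made up for illustration, and no MongoDB driver is involved.

```python
# A "collection" modelled as a list of JSON-like documents (plain dicts).
# All names and fields here are illustrative assumptions, not from the notes.
students = [
    {"name": "Asha", "dept": "CSE"},
    {"name": "Ravi", "dept": "IT", "email": "ravi@example.com"},
    {"name": "Meena", "skills": ["python", "sql"], "year": 3},
]

def field_names(collection):
    """Return the sorted list of field names used by each document."""
    return [sorted(doc.keys()) for doc in collection]

# Every document may carry a different set of fields.
for doc, fields in zip(students, field_names(students)):
    print(doc["name"], "->", fields)
```

No schema had to be declared before the three differently shaped documents were stored together.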
MongoDB Architecture
Sharding is a method for distributing data across multiple machines.
• Partition your data
• Scale write throughput
• Increase capacity
• Auto-balancing
MongoDB uses sharding to support deployments with very large data sets and high
throughput operations.
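The partitioning idea can be sketched in a few lines of Python. This is a simplified, hypothetical model of hash-based routing only: the function names and the `user_id` shard key are assumptions, and a real MongoDB cluster splits key ranges into chunks and rebalances them automatically, which this sketch does not attempt.

```python
import hashlib

def shard_for(key, num_shards):
    """Map a shard-key value to a shard index using a stable hash."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def partition(documents, shard_key, num_shards):
    """Group documents by the shard their shard-key value hashes to."""
    shards = {i: [] for i in range(num_shards)}
    for doc in documents:
        shards[shard_for(doc[shard_key], num_shards)].append(doc)
    return shards

# Twelve illustrative documents spread over three shards.
docs = [{"user_id": i, "name": f"user{i}"} for i in range(12)]
for shard, members in partition(docs, "user_id", 3).items():
    print(shard, [d["user_id"] for d in members])
```

Because each document lands on exactly one shard, writes for different keys can go to different machines, which is how sharding scales write throughput and capacity.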
Features of MongoDB
1. Document-Oriented storage
2. Full Index Support
3. Replication & High Availability
4. Auto-Sharding
5. Aggregation
6. MongoDB Atlas
7. Various APIs: JavaScript, Python, Ruby, Perl, Java, Scala, C#, C++,
Haskell, Erlang
8. Community
MONGODB CRUD OPERATIONS
• Create
    • db.collection.insert( <document> )
    • db.collection.save( <document> )
    • db.collection.update( <query>, <update>, { upsert: true } )
• Read
    • db.collection.find( <query>, <projection> )
    • db.collection.findOne( <query>, <projection> )
• Update
    • db.collection.update( <query>, <update>, <options> )
• Delete
    • db.collection.remove( <query>, <justOne> )
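To make the semantics of these four operations concrete, here is a small in-memory sketch in Python. `TinyCollection` is a hypothetical class written for these notes, not a MongoDB or driver API. It supports only exact-match queries, and its `update` merges fields in the way a `$set` update would.

```python
class TinyCollection:
    """Hypothetical in-memory stand-in for a collection (not a MongoDB API)."""

    def __init__(self):
        self.docs = []

    def _matches(self, doc, query):
        # Only exact-match queries are supported in this sketch.
        return all(doc.get(k) == v for k, v in query.items())

    def insert(self, document):                      # Create
        self.docs.append(dict(document))

    def find(self, query=None):                      # Read
        query = query or {}
        return [d for d in self.docs if self._matches(d, query)]

    def update(self, query, changes, upsert=False):  # Update
        matched = self.find(query)
        for doc in matched:
            doc.update(changes)
        if not matched and upsert:
            # No match: create a new document from the query plus the changes.
            self.insert({**query, **changes})
        return len(matched)

    def remove(self, query, just_one=False):         # Delete
        victims = self.find(query)
        if just_one:
            victims = victims[:1]
        ids = {id(d) for d in victims}
        self.docs = [d for d in self.docs if id(d) not in ids]
        return len(victims)


people = TinyCollection()
people.insert({"name": "Alex", "dept": "IT"})
people.insert({"name": "Bob", "dept": "Sales"})
people.update({"name": "Alex"}, {"dept": "HR"})
people.update({"name": "Cara"}, {"dept": "IT"}, upsert=True)
people.remove({"name": "Bob"})
print(people.find())
```

The second `update` call matches nothing, so with `upsert=True` it creates a new document, which is why an upsert is listed under Create as well as Update.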
REFER : EXAMPLES DONE IN THE CLASS

CAP THEOREM
Consistency: sequential consistency (a data item behaves as if there is only one copy)
Availability: node failures do not prevent survivors from continuing to operate
Partition-tolerance: the system continues to operate despite network partitions
CAP says that “a distributed system can satisfy any two of these guarantees
at the same time, but not all three.”
C in CAP != C in ACID
They are different!
CAP’s C(onsistency) = sequential consistency. This is similar to ACID’s A(tomicity):
a completed write is visible to all future operations.
ACID’s C(onsistency) = whether the data satisfies the schema constraints.
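The trade-off can be illustrated with a toy two-replica model in Python. This is a sketch under strong simplifying assumptions, not a proof: the `Replica` class, the `write` function, and the "CP" and "AP" mode labels are all introduced here for illustration.

```python
class Replica:
    """One copy of a single data item."""
    def __init__(self):
        self.value = None

def write(replicas, value, partitioned, mode):
    """Write through replica 0. mode is "CP" or "AP" (labels used in this sketch)."""
    if not partitioned:
        for r in replicas:
            r.value = value       # every copy updated: consistent and available
        return True
    if mode == "CP":
        return False              # refuse the write: stays consistent, not available
    replicas[0].value = value     # accept locally: stays available, copies diverge
    return True

def consistent(replicas):
    """True when all replicas hold the same value."""
    return len({r.value for r in replicas}) == 1

for mode in ("CP", "AP"):
    nodes = [Replica(), Replica()]
    write(nodes, "v1", partitioned=False, mode=mode)
    accepted = write(nodes, "v2", partitioned=True, mode=mode)
    print(mode, "accepted:", accepted, "consistent:", consistent(nodes))
```

During a partition the system must either reject the write (giving up availability) or accept it on one side only (giving up consistency).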
CAP THEOREM PROOF
CASE 1:
HBASE
Apache Hadoop is an open source framework that is used to efficiently store and process
large datasets ranging in size from gigabytes to petabytes of data.
● HBase is an open source, sparse, consistent, distributed, sorted map modeled after Google’s
BigTable.
● Began as a project by Powerset to process massive amounts of data for natural language
processing.
● Developed as part of Apache’s Hadoop project and runs on top of the Hadoop Distributed
File System (HDFS).
HBase: HBase is an open source, column-oriented NoSQL database from Apache that runs on a
Hadoop cluster. It falls under the non-relational database management systems.
HBase can be used without Hadoop: running HBase in standalone mode uses the
local file system. What Hadoop contributes here is HDFS, a distributed file system with
redundancy and the ability to scale to very large sizes.
Architecture of HBase
HBase is a columnar data store, also called a tabular data store. The main
difference between a column-oriented database and a row-oriented database
(RDBMS) is how data is stored on disk. Compare how the following table would be
serialized using a row-oriented and a column-oriented approach
(Source: Columnar Database, Wikipedia).
EmpId  Lastname  Firstname  Salary
1      Smith     Joe        40000
2      Jones     Mary       50000
3      Johnson   Cathy      44000
Row-oriented
1,Smith,Joe,40000;
2,Jones,Mary,50000;
3,Johnson,Cathy,44000;
Column-oriented
1,2,3;
Smith,Jones,Johnson;
Joe,Mary,Cathy;
40000,50000,44000;
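The two layouts above can be reproduced with a short Python sketch. The function names are chosen here for illustration. The row-oriented form writes one record after another, while the column-oriented form transposes the table so that all values of one column sit together.

```python
# The employee table from the example above, one tuple per row.
rows = [
    (1, "Smith", "Joe", 40000),
    (2, "Jones", "Mary", 50000),
    (3, "Johnson", "Cathy", 44000),
]

def row_oriented(table):
    """Serialize record by record."""
    return [",".join(str(v) for v in row) + ";" for row in table]

def column_oriented(table):
    """Serialize column by column (zip(*table) transposes rows into columns)."""
    return [",".join(str(v) for v in col) + ";" for col in zip(*table)]

print("\n".join(row_oriented(rows)))
print("\n".join(column_oriented(rows)))
```

A query that only needs the Salary column reads a single contiguous line in the column-oriented layout, but has to touch every record in the row-oriented one.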
HBase CRUD Operations
General Commands
HBase provides shell commands to interact directly with the database. Below are a few of the most
commonly used shell commands.
status: This command displays the cluster information and the health of the cluster.
hbase(main):> status
hbase(main):> status "detailed"
version: This provides information about the version of HBase.
hbase(main):> version
whoami: This shows the current user.
hbase(main):> whoami
table_help: This gives the reference for the table-related shell commands in HBase.
hbase(main):009:> table_help
Create
Let’s create an HBase table and insert data into it. When creating a table, the user must
declare the required column families.
Here we create two column families for the table ‘employee’. The first column family is
‘Personal info’ and the second is ‘Professional Info’.
create 'employee', 'Personal info', 'Professional Info'
0 row(s) in 1.4750 seconds

=> Hbase::Table - employee
Upon successful creation of the table, the shell reports 0 rows.
Create a table with Namespace:
A namespace is a logical grouping of tables. ‘company_empinfo’ is the namespace in the
command below. The namespace must already exist; it can be created with the create_namespace command.
create 'company_empinfo:employee', 'Personal info', 'Professional Info'
Create a table with version:
By default, recent HBase releases keep only a single version of each cell, so the number of versions
to retain must be specified when the table is created. Given below is the syntax for creating an HBase
table that keeps multiple versions.
create 'tableName', {NAME => "CF1", VERSIONS => 5}, {NAME => "CF2", VERSIONS => 5}
create 'bankdetails', {NAME => "address", VERSIONS => 5}
Put:
The put command is used to insert records into HBase.
put 'employee', 1, 'Personal info:empId', 10
put 'employee', 1, 'Personal info:Name', 'Alex'
put 'employee', 1, 'Professional Info:Dept', 'IT'
In the above example, all the cells having row key 1 are considered one row in HBase. To
add more rows, use a different row key:
put 'employee', 2, 'Personal info:empId', 20
put 'employee', 2, 'Personal info:Name', 'Bob'
put 'employee', 2, 'Professional Info:Dept', 'Sales'
As discussed earlier, the user can add any number of columns as part of the row.
Read
The ‘get’ and ‘scan’ commands are used to read data from HBase. Let’s first discuss the ‘get’ operation.
get: The ‘get’ operation returns a single row from the HBase table. Given below is the syntax for the
‘get’ command.
get 'table Name', 'Row Key'
hbase(main):022:> get 'employee', 1
COLUMN                    CELL
Personal info:Name        timestamp=1504600767520, value=Alex
Personal info:empId       timestamp=1504600767491, value=10
Professional Info:Dept    timestamp=1504600767540, value=IT
3 row(s) in 0.0250 seconds
To retrieve a specific column of a row:
Use the following commands to read one or more specific columns of a row.
get 'table Name', 'Row Key', {COLUMN => 'column family:column'}
get 'table Name', 'Row Key', {COLUMN => ['c1', 'c2', 'c3']}
get 'employee', 1, {COLUMN => 'Personal info:empId'}
COLUMN                    CELL
Personal info:empId       timestamp=1504600767491, value=10
Note: There is a timestamp attached to each cell. Every write to a cell is stored with a new
timestamp. Older values are retained (up to the number of versions configured for the column family),
but only the value with the latest timestamp is displayed by default.
Get all version of a column
The command below retrieves several versions of a column. Here ‘VERSIONS => 3’ sets the number of
versions to be retrieved.
get 'Table Name', 'Row Key', {COLUMN => 'Column Family', VERSIONS => 3}
scan:
The ‘scan’ command is used to retrieve multiple rows.
Select all:
The below command is an example of a basic scan of the entire table.
scan 'Table Name'
hbase(main):074:> scan 'employee'
ROW    COLUMN+CELL
1      column=Personal info:Name, timestamp=1504600767520, value=Alex
1      column=Personal info:empId, timestamp=1504606480934, value=15
1      column=Professional Info:Dept, timestamp=1504600767540, value=IT
2      column=Personal info:Name, timestamp=1504600767588, value=Bob
2      column=Personal info:empId, timestamp=1504600767568, value=20
2      column=Professional Info:Dept, timestamp=1504600768266, value=Sales
Note: All the rows are arranged by row key, along with the columns in each row.
Column Selection:
The below command scans only a particular column.
hbase(main):001:> scan 'employee', {COLUMNS => 'Personal info:Name'}
ROW    COLUMN+CELL
2      column=Personal info:Name, timestamp=1504600767588, value=Bob
Limit Query:
The below command limits the number of rows returned by the scan.
hbase(main):002:> scan 'employee', {COLUMNS => 'Personal info:Name', LIMIT => 1}
ROW    COLUMN+CELL
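How a scan with column selection and a row limit behaves can be modelled in Python. This is a hypothetical in-memory sketch, not the HBase client API: the `scan` function and the nested-dictionary layout are assumptions, and the row keys are strings so that they sort lexicographically, as HBase row keys do.

```python
# row key -> {"family:qualifier": value}; data taken from the employee example.
table = {
    "1": {"Personal info:Name": "Alex", "Personal info:empId": "15",
          "Professional Info:Dept": "IT"},
    "2": {"Personal info:Name": "Bob", "Personal info:empId": "20",
          "Professional Info:Dept": "Sales"},
}

def scan(tbl, columns=None, limit=None):
    """Return (row_key, column, value) tuples in row-key order."""
    results = []
    rows_seen = 0
    for row_key in sorted(tbl):
        # Keep only the requested columns (all columns when none are given).
        cells = [(c, v) for c, v in sorted(tbl[row_key].items())
                 if columns is None or c in columns]
        if not cells:
            continue
        if limit is not None and rows_seen >= limit:
            break                      # the limit counts rows, not cells
        rows_seen += 1
        results.extend((row_key, c, v) for c, v in cells)
    return results

print(scan(table, columns=["Personal info:Name"]))
print(scan(table, columns=["Personal info:Name"], limit=1))
```

In this model the limit counts rows rather than cells, so a limit of 1 returns every selected column of the first matching row.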
Update
To update a record, HBase uses the ‘put’ command. To update a column value, the user puts the new
value and HBase stores it with the latest timestamp.
put 'employee', 1, 'Personal info:empId', 30
The old value is not deleted from the HBase table immediately (it is kept up to the number of versions
configured for the column family). Only the record with the latest timestamp is shown as query output.
To check the old values of a cell, use the command below.
get 'Table Name', 'Row Key', {COLUMN => 'Column Family', VERSIONS => 3}
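The versioning behaviour can be sketched with a small in-memory model in Python. `VersionedTable` is a hypothetical class introduced for illustration, not an HBase API. A simple counter stands in for real timestamps, and the sample values are arbitrary.

```python
import itertools

class VersionedTable:
    """Hypothetical in-memory model of HBase cell versioning (not an HBase API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.cells = {}                    # (row, column) -> [(ts, value), ...]
        self._clock = itertools.count(1)   # stand-in for real timestamps

    def put(self, row, column, value):
        """Store a new version; a put on an existing cell acts as an update."""
        versions = self.cells.setdefault((row, column), [])
        versions.insert(0, (next(self._clock), value))   # newest first
        del versions[self.max_versions:]                 # drop the oldest

    def get(self, row, column, versions=1):
        """Return up to `versions` (timestamp, value) pairs, newest first."""
        return self.cells.get((row, column), [])[:versions]


t = VersionedTable(max_versions=3)
for emp_id in (10, 15, 30):
    t.put("1", "Personal info:empId", emp_id)
print(t.get("1", "Personal info:empId"))               # latest version only
print(t.get("1", "Personal info:empId", versions=3))   # full retained history
```

A plain `get` returns only the newest value, while asking for more versions exposes the older ones until they exceed the configured maximum.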
Delete
The ‘delete’ command is used to delete individual cells of a record.
The below command is the syntax of the delete command in the HBase shell.
delete 'Table Name', 'Row Key', 'ColumnFamily:Column'
delete 'employee', 1, 'Personal info:Name'
Drop Table:
To drop a table in HBase, the table must first be disabled. The shell returns an error if the
user tries to drop a table without disabling it. Disable removes the indexes from memory.
The below command is used to disable the table.
disable 'employee'
Once the table is disabled, the user can drop it using the syntax below.
drop 'employee'
You can check whether a table exists with the ‘exists’ command. To bring a disabled table back into
use, run the ‘enable’ command.
OBJECT ORIENTED DATABASE / OBJECT RELATIONAL DATABASE
TYPES AND ROW TYPES IN ORDBMS
REFER EX10 WRITE UP AND EXAMPLES.
