Experiment No 4
Lab Outcome : Program application using tools like Hive, Pig, NoSQL and MongoDB for Big Data Application.
Date of Submission:21/3/22
Program formation/ Execution / Ethical practices (07) | Documentation (02) | Timely Submission (03) | Viva Answer (03) | Experiment Marks (15) | Teacher Signature with date
EXPERIMENT NO : 4
THEORY :
What is NoSQL:
NoSQL (Not Only SQL) is a non-relational database management system that differs from traditional relational database management systems in some significant ways. It is
designed for distributed data stores that need to hold data at a very large scale. These data stores may not require a fixed schema, avoid join operations, and typically
scale horizontally.
Why NoSQL?
Today, data is becoming easier to access and capture through third parties such as Facebook, Google+ and others. Personal user information, social graphs, geo location
data, user-generated content and machine logging data are just a few examples where the data has been increasing exponentially. To provide these services properly, it is
necessary to process huge amounts of data, which SQL databases were never designed for. NoSQL databases evolved to handle such large volumes of data properly.
unstructured data.
● Scales Horizontally: In contrast to SQL databases, which scale vertically, NoSQL scales horizontally by adding more servers and using the concepts of sharding
and replication. This behavior of NoSQL fits with cloud computing services such as Amazon Web Services (AWS), which allow you to handle virtual servers which can be
Document-oriented databases treat a document as a whole and avoid splitting a document into its constituent name/value pairs. At the collection level, this allows
a diverse set of documents to be put together in a single collection. Document databases allow documents to be indexed not only on their primary identifier but also on their
properties. Different open-source document databases are available today, but the most prominent among the available options are MongoDB and CouchDB. In fact, MongoDB
free adjacency. This means that every element contains a direct pointer to its adjacent element and no index lookups are necessary. General graph databases that can store any
graph are distinct from specialized graph databases such as triple-stores and network databases. Indexes are used for traversing the graph.
● Column Based Databases
The column-oriented storage allows data to be stored effectively. It avoids consuming space when storing nulls by simply not storing a column when a value doesn’t exist for
that column. Each unit of data can be thought of as a set of key/value pairs, where the unit itself is identified with the help of a primary identifier, often referred to as the
primary key. Bigtable and its clones tend to call this primary key the row-key.
● Key Value Databases
The key of a key/value pair is a unique value in the set and can be easily looked up to access the data. Key/value pairs are of varied types: some keep the data in memory and
some provide the capability to persist the data to disk. A simple, yet powerful, key/value store is Oracle’s BerkeleyDB.
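The key/value and column-based models described above can be illustrated with a minimal Python sketch. It assumes plain in-memory dictionaries rather than a real store such as BerkeleyDB or Bigtable. The names kv_store, column_store and read_column, and all sample keys and values, are hypothetical.

```python
# Key/value store: each key is unique and maps directly to its value.
kv_store = {}
kv_store["user:1001"] = "Asha"       # put
kv_store["user:1002"] = "Ravi"
name = kv_store.get("user:1001")     # get by key, no scan of the data needed

# Column-oriented rows: the row-key identifies the unit of data, and a
# column is stored only when it has a value (no placeholder for nulls).
column_store = {
    "row-1": {"name": "Asha", "city": "Mumbai"},
    "row-2": {"name": "Ravi"},        # no "city" column is stored at all
}


def read_column(row_key, column):
    """Return the stored value, or None when the column was never stored."""
    return column_store.get(row_key, {}).get(column)


print(name)
print(read_column("row-1", "city"))
print(read_column("row-2", "city"))
```

A missing column costs no space in row-2; the reader simply gets None back when asking for it.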
MongoDB
MongoDB is a cross-platform, document-oriented database that provides high performance, high availability, and easy scalability. MongoDB works on the concepts of collections and
documents.
● Database: Database is a physical container for collections. Each database gets its own set of files on the file system. A single MongoDB server typically has
multiple databases.
● Collection: Collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A collection exists within a single database. Collections do
not enforce a schema. Documents within a collection can have different fields. Typically, all documents in a collection are of similar or related purpose.
● Document: A document is a set of key-value pairs. Documents have a dynamic schema. Dynamic schema means that documents in the same collection do not
need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data.
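The relationship between database, collection and document, and the meaning of a dynamic schema, can be sketched in plain Python. This is only an in-memory analogy using dictionaries and lists, not MongoDB itself. The names database, students, courses and fields_in, and the sample fields, are hypothetical.

```python
# A database holds named collections; a collection holds documents.
database = {"students": [], "courses": []}

# Documents in the same collection need not share the same fields, and a
# common field ("roll_no") may hold different types in different documents.
database["students"].append({"name": "Asha", "roll_no": 12})
database["students"].append(
    {"name": "Ravi", "roll_no": "A-17", "email": "ravi@example.com"}
)


def fields_in(collection):
    """Collect every field name used by any document in the collection."""
    names = set()
    for document in collection:
        names.update(document.keys())
    return sorted(names)


print(fields_in(database["students"]))
print([type(doc["roll_no"]).__name__ for doc in database["students"]])
```

Both documents are accepted without declaring a schema first, which is the behavior the Collection and Document definitions above describe.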
Create a /etc/yum.repos.d/mongodb-org-4.2.repo file so that you can install MongoDB directly using yum:
gpgkey=https://www.mongodb.org/static/pgp/server-4.2.asc
Start MongoDB. A default configuration file is installed by yum, so you can run the following command to start the server on localhost at the default port 27017:
mongod -f /etc/mongod.conf
Step 1 : Download MongoDB from the official MongoDB website. Choose the Windows 32-bit or 64-bit build.
Step 2 : There are two ways to install MongoDB: either download and extract the zip file into the C drive, or download the msi package and install it by double-clicking on setup
for a developer like me who comes from a relational database background. MongoDB needs a folder (data directory) to store its data. By default, it stores data
in “C:\data\db”. Create this folder manually, because MongoDB won’t create it for you. You can also specify an alternate data directory with the --dbpath option.
Open a command prompt and type the following command to start the MongoDB server:
c:\mongodb\bin>mongod
To start the server with a configuration file, pass the --config option:
c:\mongodb\bin>mongod --config c:\mongodb\mongo.config
Step 5 : Connect a client to the MongoDB server by typing mongo in another command prompt.
CONCLUSION :
MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling. MongoDB obviates the need for
OUTPUT :
1. Show Databases:
2. Create Databases:
3. Create Collection:
4. Insert Document:
5. Query Document:
6. Projection:
7. Update Document:
8. Delete Document:
9. Sorting:
Descending Order:
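The original output for the steps listed above is not reproduced here. As a rough guide to what each step does, the following Python sketch mimics the nine operations on an in-memory list of dictionaries. The comments name the mongo shell commands commonly used for each step. The database name labdb, the collection name students and the sample documents are all hypothetical, and this is not necessarily the set of commands run in the experiment.

```python
# In-memory stand-in for a MongoDB server (an analogy, not a real client).
databases = {}

print(list(databases))                        # 1. show dbs
databases["labdb"] = {}                       # 2. use labdb
databases["labdb"]["students"] = []           # 3. db.createCollection("students")
students = databases["labdb"]["students"]

# 4. db.students.insertOne({...}), once per document
students.append({"name": "Asha", "marks": 82})
students.append({"name": "Ravi", "marks": 67})
students.append({"name": "Meera", "marks": 91})

# 5. db.students.find({marks: {$gt: 70}})
passed = [doc for doc in students if doc["marks"] > 70]

# 6. db.students.find({}, {name: 1, _id: 0})
names_only = [{"name": doc["name"]} for doc in students]

# 7. db.students.updateOne({name: "Ravi"}, {$set: {marks: 75}})
for doc in students:
    if doc["name"] == "Ravi":
        doc["marks"] = 75
        break

# 8. db.students.deleteOne({name: "Asha"})
for index, doc in enumerate(students):
    if doc["name"] == "Asha":
        del students[index]
        break

# 9. db.students.find().sort({marks: 1}), and {marks: -1} for descending order
ascending = sorted(students, key=lambda doc: doc["marks"])
descending = sorted(students, key=lambda doc: doc["marks"], reverse=True)

print([doc["name"] for doc in ascending])
print([doc["name"] for doc in descending])
```

One difference from the real shell: in MongoDB, use only switches the current database, and the database appears in show dbs once it holds data.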