The battle between NoSQL Databases and RDBMS databases
Sourav Mukherjee
University of the Cumberlands
All content following this page was uploaded by Sourav Mukherjee on 07 May 2019.
Abstract
NoSQL refers to a family of database management systems, many of them free and open source,
distributed, and including wide-column stores, that are intended to handle large amounts of data
across many commodity servers, providing high availability and accessibility with no single point
of failure. Such systems can scale and replicate data globally in a masterless configuration. A
NoSQL database provides a mechanism for the storage and retrieval of data that is modeled in
means other than the tabular relations used in relational databases. Engineers usually understand
NoSQL to mean ‘not only SQL’ rather than ‘no SQL’; it is an alternative to the most widely used
relational databases. As the name suggests, it is a substitute for SQL that can co-exist with SQL.
A relational database management system (RDBMS) is a database management system based on
the relational model of data, and it is a completely structured way of storing data, whereas NoSQL
is an unstructured way of storing data. Is NoSQL an RDBMS? In this article we discuss various
features of NoSQL, its advantages over RDBMS, and the future of NoSQL. The article also reviews
the basics of NoSQL and its uses.
Keywords: Apache, Cassandra, SQL, NoSQL, RDBMS, API, MongoDB, CouchDB, DynamoDB,
GraphDB
THE BATTLE BETWEEN NOSQL DATABASES AND RDBMS 2
Introduction
Is NoSQL an RDBMS?
NoSQL is not a relational database management system. NoSQL, or ‘Not only SQL’, is a
non-relational database that supports a very simple query language and has no fixed schema. It
can scale and replicate data globally in a masterless configuration. This kind of database covers
the storage of non-related data. NoSQL databases have a distributed structure and can handle
data at very high volume and at high speed. A NoSQL database scales horizontally. Examples of
NoSQL databases include Cassandra, MongoDB, and CouchDB.
An RDBMS essentially uses SQL, or Structured Query Language. This kind of database supports
a powerful query language with a fixed schema, which covers the storage of related data. It has a
compact, centralized structure and scales vertically. An RDBMS typically handles a more limited
volume of data at lower speed. Examples include MySQL, Oracle Database, and Microsoft SQL
Server.
NoSQL | RDBMS
NoSQL is used to deal with unstructured data. | RDBMS is used to deal with structured data.
In NoSQL, the row is a unit of replication. | In RDBMS, the row is a specific record.
In NoSQL, relationships are described using […] | In RDBMS, there are concepts of primary and […]
In NoSQL, a table is a list of "nested key-value […] | In RDBMS, a table is an array of arrays. (Row […]
[…] | […] attributes of a relation.
In NoSQL, tables or column families are the entities of a keyspace. | In RDBMS, tables are the entities of a database.
In NoSQL, the keyspace is the outermost container […] application. | In RDBMS, the database is the outermost […] an application.
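The contrast in the comparison above between a table as an "array of arrays" and a table as nested key-value pairs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the user records, column names, and both lookup helpers are hypothetical and do not reflect the storage format of any particular product.

```python
# Hypothetical "users" data, used only for illustration.

# RDBMS view: a table is an array of arrays; every row follows one fixed schema.
columns = ["id", "name", "city"]
rdbms_table = [
    [1, "Ada", "London"],
    [2, "Linus", "Helsinki"],
]

# NoSQL (wide-column) view: nested key-value pairs,
# row key -> {column key -> column value}; rows need not share the same columns.
nosql_table = {
    "user:1": {"name": "Ada", "city": "London"},
    "user:2": {"name": "Linus", "languages": "C"},
}


def rdbms_lookup(table, columns, row_id, column):
    """Scan rows for the id, then use the schema to find the column position."""
    position = columns.index(column)
    for row in table:
        if row[0] == row_id:
            return row[position]
    return None


def nosql_lookup(table, row_key, column):
    """Follow the row key, then the column key; a missing column yields None."""
    return table.get(row_key, {}).get(column)


city_from_rdbms = rdbms_lookup(rdbms_table, columns, 2, "city")
city_from_nosql = nosql_lookup(nosql_table, "user:2", "city")
```

In the fixed-schema table every row must supply a value for every column, whereas in the nested form the second user simply has no "city" entry.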
NoSQL databases have emerged in recent years as a response to the limitations of relational
databases and to deliver the performance, scalability, and flexibility essential to modern
applications. These NoSQL technologies differ significantly from one another and have little in
common except for the fact that they do not use a relational data model.
[Figure: NoSQL types: key-value store databases, column-oriented databases, document-oriented databases, graph databases, and object-oriented databases]
1. Key-value store database: The key-value store database is very efficient and […]
communication protocols and tools for building software. Key-value data can be stored
in such a way that no schema is needed, and the data can be stored as the data types of
the programming language or as an object. The data is divided into two parts, a string
that represents the key and the actual data that is associated with it as the value, thus
forming a key-value pair. The values are stored in hash tables in which the keys are the
indexes, which makes lookups by key faster than in an RDBMS. Hence the data model is
very simple and easily manageable. The data is stored with an emphasis on scalability
over consistency, and so querying features such as joins and aggregate operations have
been excluded. One drawback of the key-value store database is that, since there is no
schema, it is difficult to create custom views. Key-value data storage is mainly used for
user sessions or shopping carts, for retrieving saved lists of favorite products, and in
websites, forums, online shopping, etc. Some of the examples of a key-value store
database are […]
• Fast and flexible NoSQL database service for all applications that need consistent, […]
capacity make it a great fit for mobile, web, gaming, ad tech, IoT, and many other
applications.
[…] management tasks.
• Fine-grained access control. It integrates with AWS Identity and Access Management (IAM) […]
Riak: Riak is a distributed NoSQL key-value data store that offers high availability, […]
version.
• Distributes data across the cluster to ensure fast performance and fault tolerance.
• Faster reads and writes, making it easier to store, query, and analyze time and
location data.
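The key-value model described in this section, with values held in a hash table indexed by key and no joins or aggregate queries, can be sketched as follows. This is a minimal in-memory illustration, not the API of DynamoDB, Riak, or any other product; the class name `KeyValueStore` and the session keys are assumptions made for the example.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a hash table (a Python dict)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store or overwrite the value for a key; no schema is enforced."""
        self._data[key] = value

    def get(self, key, default=None):
        """Look a value up by key; this is the only query the model offers."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key and report whether it was present."""
        if key in self._data:
            del self._data[key]
            return True
        return False


# Hypothetical shopping-cart sessions, the use case named in the text.
store = KeyValueStore()
store.put("session:42", {"user": "alice", "cart": ["book", "pen"]})
store.put("session:43", {"user": "bob", "cart": []})
cart = store.get("session:42")["cart"]
```

Because the store knows nothing about what is inside a value, a question such as "which sessions contain a pen?" would require reading every value in application code, which is the trade-off behind the missing joins and custom views.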
Riak should be avoided for highly centralized data storage projects with secure, fixed data […]
[…] columns instead of rows. Column stores in NoSQL are hybrid row-column stores, […]
[…] databases, NoSQL column-oriented stores do not store data in tables. They store data in […]
[…] provides high flexibility and scalability in data storage. These types of database are
suitable for data mining and analytical applications. Some of the notable DBaaS […]
Cassandra: Apache Cassandra is a big-data database that can scale and replicate data globally
in a masterless configuration. What used to be in the hands of only the biggest companies in
Silicon Valley is now available to the masses as a mature database. It was originally created at
Facebook after its engineers studied Amazon's Dynamo and Google's Bigtable papers. The
Cassandra we know today is very different and has surpassed its ancestors in feature set, and its
wire protocol has been adopted by other databases such as ScyllaDB, YugaByte, and Azure's
Cosmos DB.
Apache Cassandra is written in Java. One possible reason for choosing Java over C++ is that
security was a prime concern. Another reason could be performance: a Java application may be
slower at startup, but once the code is running, the JVM continuously optimizes it, so for
long-running workloads it can be competitive with C++. There may be other reasons as well,
such […]
[…] designed with Apache Cassandra while achieving significantly higher throughput and […]
[…] large-scale distributed cloud services. It also supports APIs that are Cassandra compatible […]
The YugaByte DB core is written in C++, but the repository contains Java-based code that […]
• DataStax Enterprise offers a flavor of Apache Cassandra in a database platform that is
purpose-built to meet the performance and availability demands of IoT, web, and mobile […]
simple when scaled in a single data center or across multiple data centers and clouds.
Cassandra and DataStax Enterprise have supported customers' multi-datacenter and hybrid
cloud deployments since the beginning. It is written in Java.
Cassandra is a distinctive platform for handling huge amounts of unstructured data at scale. If
your relational database struggles to stay fast and reliable under such loads, Cassandra may be a
good alternative. It combines Amazon’s Dynamo storage system with Google’s Bigtable data
model, offering the near-constant availability required to support real-time querying for web and
mobile apps.
• It offers easy setup and maintenance (regardless of how large the dataset is).
Many large-scale organizations use Cassandra today. When dealing with distributed databases,
a key requirement is always to identify how the data and the workload will be distributed, and the
data model must be designed accordingly. For example, […]
The most important point to highlight is that, even though distributed databases fall under the
general category of databases, treating one as if it were a traditional relational database may
cause severe performance degradation and may even break the application.
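To make the point about data and workload distribution concrete, the sketch below places partition keys on a hash ring in the style of Dynamo-like systems. It is an illustration under stated assumptions rather than Cassandra's actual partitioner: the `HashRing` class, the node names, the number of virtual nodes, and the use of MD5 (chosen only because it ships with the Python standard library) are all hypothetical.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    """Map a string to a position on the ring (MD5 used only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (token, node name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        """Place several virtual tokens for the node on the ring."""
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_token(f"{node}#{i}"), node))

    def node_for(self, partition_key):
        """Owner is the first token at or after the key's token, wrapping around."""
        index = bisect.bisect_left(self._ring, (_token(partition_key), ""))
        if index == len(self._ring):
            index = 0
        return self._ring[index][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {key: ring.node_for(key) for key in keys}

ring.add_node("node-d")  # scale out by adding a node
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Only the keys taken over by the new node change owner, which is why the partition key, rather than a relational join path, is the main thing to get right when designing the data model.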
Bigtable: Google’s Bigtable is a high-performance, compressed data storage system built on
the Google File System, the Chubby lock service, SSTable, and a few other Google technologies.
Bigtable also underlies Google Cloud Datastore, which is available as part of the Google Cloud
Platform. Bigtable is used by many Google applications, such as web indexing, Google Maps,
Google Book Search, Google Earth, Google Code, YouTube, and Gmail. Google developed its
own database to increase scalability and performance. Google's Spanner RDBMS is layered on an
implementation of Bigtable, with a Paxos group for two-phase commits to each table. Google
Bigtable is a classic example of a wide-column store. Bigtable is designed to add […]
• Three main components: the library, the master server, and many tablet servers. The
library is linked into the clients, the master server manages schema changes, […]
• It is not a relational database and can be described as a sparse, distributed multi-[…]
• When a table's size threatens to grow beyond a specified limit, the tablets may be compressed […]
This offers great performance and horizontal scalability options. The data is stored in the form
of documents, and the documents are like records, but the data is more flexible because no
schema is used. Documents can be stored in formats such as PDF, JSON, and XML. […]
value pairs, also known as key-document pairs. Documents may contain special characters.
Data is easy to fetch even when it is partitioned across several documents. Some of […]
[…] open-source distributed document database, highly optimized for JSON. Apache
CouchDB: an open-source, Erlang-based database with a RESTful HTTP API.
[…] platform-as-a-service product, and in 2009 it shifted to an open-source development
model. In 2013, 10gen changed its name to MongoDB Inc. C++ was […]
[…] documents, which means that the columns may vary from document to document and the data […]
• Easy to work with, as the document model maps to the objects in the
application code.
• Real-time aggregation, indexing, and queries provide powerful ways to […]
horizontal scaling, and geographic distribution are built in and easy to
use.
• Free to use: versions released before October 16, 2018, are available under the
AGPL. All versions released after October 16, 2018, including patch fixes for
prior versions, are published under the Server Side Public License (SSPL).
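The document model described above, in which fields may vary from document to document and queries and aggregations run over JSON-like records, can be sketched with the Python standard library alone. This is not MongoDB's or CouchDB's API; the sample documents and the helpers `find` and `count_tags` are hypothetical stand-ins for the kind of operations a document database provides.

```python
import json
from collections import Counter

# Hypothetical documents: fields may vary from one document to the next.
raw_documents = [
    '{"_id": 1, "name": "Ada", "tags": ["math", "engines"]}',
    '{"_id": 2, "name": "Linus", "city": "Helsinki"}',
    '{"_id": 3, "name": "Grace", "city": "New York", "tags": ["compilers"]}',
]
collection = [json.loads(document) for document in raw_documents]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        document for document in collection
        if all(document.get(field) == value for field, value in criteria.items())
    ]


def count_tags(collection):
    """Aggregate tag frequencies; documents without 'tags' are skipped."""
    counts = Counter()
    for document in collection:
        counts.update(document.get("tags", []))
    return counts


in_helsinki = [document["name"] for document in find(collection, city="Helsinki")]
tag_counts = count_tags(collection)
```

Note that the first document has no "city" field and the second has no "tags" field, yet both sit in the same collection without any schema change.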
CouchDB: Apache CouchDB is open-source database software that focuses on ease of use and
on having a scalable architecture. It also has a document-oriented NoSQL database architecture.
CouchDB was first introduced in 2005 and became an Apache Software Foundation project in 2008.
[…] never stores data and relationships in tables. Each document maintains its own data and uses
its own schema. Unlike MongoDB, which is written in C++, CouchDB is developed in Erlang. The
data storing can be done […]
• HTTP-based REST interface through which documents can be easily created and managed.
• Easy to set up with multiple nodes. Easy replication of a database across multiple server
instances.
• Accepts data in JSON format, and storing the data takes only a few minutes.
• The data is stored in a format in which no space is wasted on empty fields in the
documents.
• When the frontend editing option is available, it is possible to set up an application very […]
• CouchDB has flexible schema designs and fast indexing and retrieval of data.
Choose the right database for your business based on the below factors -
4. Graph Databases: Graph databases are NoSQL databases that store data as a
graph. This data model consists of vertices (nodes), each of which is an object such as a
person, a place, or a relevant piece of data, and edges, which represent the connection
between two nodes. The graph also contains properties related to the nodes. The
relationships allow data in the store to be linked together directly and, in many cases,
retrieved with one operation. Graph databases hold the relationships between data as a
priority. Querying relationships within a graph database is fast because they are
persistently stored within the database itself. Relationships can be intuitively visualized
using graph databases, making them useful for heavily inter-connected data. In a graph
database, every node holds a direct pointer to each adjacent node. Millions of records
can be traversed using this method. Graph databases offer schema-less and efficient
storage of semi-structured data. Queries are expressed as traversals, which can make
graph databases faster than relational databases for such queries. While the graph model
explicitly lays out the dependencies between nodes of data, the relational model and
other NoSQL database models link the data by implicit connections.
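The idea that every node holds direct pointers to its neighbours, and that queries are expressed as traversals, can be sketched as a small in-memory graph. This is an illustration only and does not use Gremlin, SPARQL, or any graph database API; the `Node` class, the relationship labels, and the people in the example are hypothetical.

```python
from collections import deque


class Node:
    """A vertex with properties and direct references to adjacent nodes."""

    def __init__(self, name, **properties):
        self.name = name
        self.properties = properties
        self.edges = []  # list of (relationship label, target Node)

    def connect(self, label, other):
        """Add a directed, labelled edge that points straight at the other node."""
        self.edges.append((label, other))


def reachable(start, label, max_depth):
    """Breadth-first traversal along edges with the given label, up to max_depth hops."""
    seen = {start.name}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the hop limit
        for edge_label, target in node.edges:
            if edge_label == label and target.name not in seen:
                seen.add(target.name)
                found.append(target.name)
                queue.append((target, depth + 1))
    return found


alice = Node("alice", kind="person")
bob = Node("bob", kind="person")
carol = Node("carol", kind="person")
dave = Node("dave", kind="person")
chicago = Node("chicago", kind="place")

alice.connect("knows", bob)
bob.connect("knows", carol)
carol.connect("knows", dave)
alice.connect("lives_in", chicago)

friends_of_friends = reachable(alice, "knows", max_depth=2)
```

Each hop follows a stored reference instead of computing a join, which is the property the paragraph above credits for fast relationship queries.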
Some graph databases and the languages they support are listed below:
• Amazon Neptune
• Oracle Spatial and Graph; part of Oracle Database (Language: Java, PL/SQL)
Amazon Neptune: A fast, reliable, fully managed graph database service that makes it easy to
build and run applications that work with highly connected datasets.
• Fast, reliable, fully managed graph database service that makes it easy to build and run […]
engine that is optimized for storing billions of relationships and also to query the graph […]
• Supports open graph APIs for both Gremlin and SPARQL and provides high […]
compliant.
• Fully managed. Users need not worry about database management tasks such as hardware […]
[…] stored in the form of objects. It is different from a relational database, as this is not […]
the object-oriented programmers build the product, load them as objects, and modify the
existing objects to make new objects. Accessing data in an object-oriented database is
comparatively fast, as an object can be retrieved directly through its pointer. Some
object-oriented databases are intended to work well with object-oriented programming
languages such as C++, C#, Python, Ruby, Delphi, Perl, JavaScript, Java, Visual Basic .NET,
Objective-C, and Smalltalk. This kind of database can be used in applications involving
complex object relationships, modifying object structures when the application describes […]
design are some places where object-oriented databases are mainly used. The scalability […]
• db4o
• GemStone/S
• InterSystems Caché
• JADE
• ObjectDatabase++
• ObjectDB
• Objectivity/DB
• ObjectStore
• ODABA
• Perst
• OpenLink Virtuoso
• ZODB
distribute effortlessly.
Advantages of NoSQL:
• NoSQL is also very flexible. The scalability and flexibility of NoSQL combined […]
• Flexible database service for all applications that require consistent, single-digit […]
• If indexed data is required, a view is the best option, as a view can index data
automatically.
Disadvantages of NoSQL:
• Some NoSQL databases are not ACID (Atomicity, Consistency, Isolation,
Durability) compliant.
• An RDBMS uses a schema, which means the structure of the data is known in advance to ensure […]
• The banking, retail, and payroll sectors mostly use RDBMS because the tables are easy to
query. Though it has its own limits, engineers usually understand it easily, and data
validation becomes simple.
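The ACID property mentioned in the list above can be illustrated with SQLite, used here purely as a convenient stand-in for an RDBMS because the `sqlite3` module ships with Python. The `accounts` table, the `transfer` helper, and the amounts are hypothetical; the sketch shows atomicity only, under the assumption that a CHECK constraint rejects negative balances.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move money atomically: either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


succeeded = transfer(conn, "alice", "bob", 30)
overdraft = transfer(conn, "alice", "bob", 500)
final_balances = balances(conn)
```

In the failed transfer the credit to the target account has already been executed when the debit violates the constraint, and the rollback undoes it; a store without transactional guarantees would leave that half-applied credit for the application to repair.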
Future of Database with NoSQL: Nowadays every organization deals with a massive amount
of records from a variety of sources at unprecedented speeds. Relational databases are sometimes
ineffective for businesses that process and analyze vast amounts of complex, unstructured data.
Because NoSQL databases are schema-less rather than bound to a fixed schema, they handle large
amounts of data efficiently and suit data models that require real-time data access. Almost all
organizations are swamped every second with data retrieved from a variety of sources on the
internet. This data can be validated so that the best use can be made of it and so that future
predictions can be made. With NoSQL, real-time analytics are much faster, and response times
can be under a minute even for complex queries. In SQL databases, tables are tied together with
primary and foreign keys, whereas in NoSQL different data models can be used to deal with
gigantic amounts of data. When a user wants key-value pairs, a key-value database can be used;
for data linked by pointers, a graph database can be used; and instead of using bigger machines,
more nodes can be added to the cluster, which makes it easily scalable. NoSQL can be used by
many advanced applications. NoSQL can make machine learning much faster. Thousands of
transactions every second can easily be monitored using NoSQL, so it is useful for fraud detection
in banking transactions. Also, an estimated 2.5 quintillion bytes of data are generated from social
media, climate data, digital pictures and videos, and data exchanges, and that is just the
beginning. In these scenarios, numerous kinds of elements, e.g. pictures, video, and audio, are
integrated and stored in the database. Various NoSQL databases can scale to handle this
information.
With the growth of Big Data, the adoption of NoSQL technology is growing fast among web
organizations and enterprises. Its benefits include flexible design, scaling, and better failover
and availability. NoSQL databases are serving web organizations well in reaching their analytic
goals in a fast-paced world. Advocates of NoSQL solutions assert that they meet client
requirements well and serve varied, growing markets. The technology exceeds expectations at
processing large volumes of unstructured data, and organizations are adopting the most popular
open-source products, such as Cassandra, MongoDB, and Redis. NoSQL also provides a less
expensive alternative for data loading and retrieval. If we consider the pros and cons of both
NoSQL and SQL, the best approach will be to combine the two, to open further research horizons
and make them more productive in the future.
Technology is moving faster, and real-time analytics will help organizations to keep up to date.
NoSQL permits the rapid deployment of reliable, distributed databases that can scale with the
organization’s requirements. The article explains various types of NoSQL, its pros and cons, the […]
AUTHOR’S PROFILE
Sourav Mukherjee is a Senior Database Administrator and Data Architect based out of Chicago.
He has more than 12 years of experience working with the Microsoft SQL Server database
platform. His work with Microsoft SQL Server started with SQL Server 2000. As a consultant
architect, he has worked with various Chicago-based clients. He has helped many companies
design and maintain their high-availability solutions and develop appropriate security models,
and has provided query-tuning guidelines to improve overall SQL Server health and performance
and to simplify automation needs. He is passionate about SQL Server and the related community,
and contributes articles to various public SQL Server sites and forums to help community
members. He holds a bachelor's […]
Management. He is currently pursuing a Ph.D. in Information Technology at the University of the
Cumberlands. His areas of research interest include RDBMS, distributed databases, cloud
security, AI, and machine learning. He has been an MCT (Microsoft Certified Trainer) since 2017
and holds other premier certifications such as MCP, MCTS, MCDBA, MCITP, TOGAF, Prince2,