A Performance Comparison of SQL and NoSQL Databases
Yishan Li and Sathiamoorthy Manoharan
Department of Computer Science
University of Auckland
New Zealand
Abstract: With the current emphasis on “Big Data”, NoSQL databases have surged in popularity. These databases are claimed to perform better than SQL databases. In this paper we aim to independently investigate the performance of some NoSQL and SQL databases in the context of key-value stores. We compare read, write, delete, and instantiate operations on key-value stores implemented by NoSQL and SQL databases. We also investigate an additional operation: iterating through all keys. An abstract key-value pair framework supporting these basic operations is designed and implemented using all the databases tested. Experimental results measure the timing of these operations, and we summarize our findings on how the databases stack up against each other. Our results show that not all NoSQL databases perform better than SQL databases. Some are much worse. For each database, the performance varies with each operation. Some are slow to instantiate, but fast to read, write, and delete. Others are fast to instantiate but slow on the other operations. There is little correlation between performance and the data model each database uses.
Keywords: database performance, SQL, NoSQL databases.

I. INTRODUCTION
Traditional database systems for storage have been based on the relational model. These are widely known as SQL databases, named after the language used to query them [1]. In the last few years, however, non-relational databases have dramatically risen in popularity. These databases are commonly known as NoSQL databases, a name that clearly marks them as different from the traditional SQL databases. Most of these are based on storing simple key-value pairs on the premise that simplicity leads to speed.
With the increase in accessibility of the Internet and the availability of cheap storage, huge amounts of structured, semi-structured, and unstructured data are captured and stored for a variety of applications. Such data is commonly referred to as Big Data [2]. Processing such vast amounts of data requires speed, flexible schemas, and distributed (i.e. non-centralized) databases. NoSQL databases became the preferred currency for operating on Big Data because they claim to satisfy these requirements. This also led to a surge in the number of NoSQL database offerings. There are several commercial and open-source implementations of NoSQL databases (such as BigTable [3] and HBase [4]).
The large number of NoSQL offerings then leads to questions on differences between these offerings and their suitability in particular applications. A number of survey papers (such as that of Tudorica and Bucur [5] or Han et al. [6]) have therefore been published to address some of these questions. In addition, there are several online resources and blogs addressing these aspects as well.
The focus of our paper is to compare key-value store implementations on NoSQL and SQL databases. While NoSQL databases are generally designed for optimized key-value stores, SQL databases are not. Yet, our findings suggest that not all NoSQL databases perform better than SQL databases. We compare read, write, delete, and instantiate operations on the key-value storage. We observe that even within NoSQL databases there is a wide variation in the performance of these operations. We also observe little correlation between performance and the data model each database uses.
The rest of this paper is organized as follows. Section II introduces a selection of NoSQL databases. Section III reviews related work. In particular, it looks at other surveys comparing NoSQL offerings. Section IV discusses our experimental setup and what we evaluate. Section V presents and discusses our experimental results. The final section concludes with a summary.

II. NOSQL DATABASES
It is widely believed that Google’s BigTable [3] was the first of the NoSQL databases. BigTable is based on three keys: the row key, the column key, and the timestamp. This is effectively a multidimensional map. Column keys can be classified into groups. A group is accessed as a single unit.
The success of proprietary non-relational databases such as BigTable and Amazon’s Dynamo [7] initiated a number of other open-source and closed-source non-relational database developments. These NoSQL databases grew in popularity because of their ease of access, speed, and scalability.
Most of the NoSQL databases are based on storing key-value pairs. The values may themselves be a set of secondary keys, which in turn map to values.
A special type of key-value pair database is a column-family database. This consists of columns and super columns, addressable with keys. A super column groups a number of related columns and is accessed as a single unit.
The other special type of key-value pair database is a document-oriented database. Document-oriented databases go beyond simple values and have the ability to store objects. The
Authorized licensed use limited to: UNIVERSIDAD POLITECNICA DE VALENCIA. Downloaded on October 19,2022 at 22:21:07 UTC from IEEE Xplore. Restrictions apply.
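As a rough illustration of the three-key addressing described for BigTable in Section II, the following Python sketch models a multidimensional map keyed by row key, column key, and timestamp. It is a toy in-memory structure, not BigTable's implementation or API: the class name ThreeKeyMap, its methods, and the "group:name" convention used to classify column keys into groups are all assumptions made for this example.

```python
from collections import defaultdict


class ThreeKeyMap:
    """Toy map addressed by (row key, column key, timestamp). Hypothetical."""

    def __init__(self):
        # row key -> column key -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        """Store one versioned cell."""
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the value at `timestamp`, or the newest one if omitted."""
        versions = self._rows.get(row, {}).get(column, {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)
        return versions.get(timestamp)

    def get_group(self, row, group):
        """Fetch a whole column group as a unit (newest value per column).

        Assumption: a column key of the form "group:name" belongs to `group`.
        """
        prefix = group + ":"
        result = {}
        for column, versions in self._rows.get(row, {}).items():
            if column.startswith(prefix) and versions:
                result[column] = versions[max(versions)]
        return result


if __name__ == "__main__":
    table = ThreeKeyMap()
    table.put("user1", "contact:email", 1, "old@example.org")
    table.put("user1", "contact:email", 2, "new@example.org")
    table.put("user1", "contact:phone", 1, "555-0100")
    print(table.get("user1", "contact:email"))
    print(table.get_group("user1", "contact"))
```

Keeping the timestamp as the innermost key lets several versions of the same cell coexist, and the group lookup shows what accessing a group "as a single unit" could mean in practice.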
it updates the value for the given key in the storage. This operation therefore combines the Create and Update operations of the CRUD model.
4) Delete. This deletes the record (i.e. key-value pair) from the storage. This is the same as the Delete operation of the CRUD model.
In addition to the above fundamental operations, there are two supplementary operations that are commonly used: iterating through all the keys and iterating through all the values. To enable testing these, we time one more operation: GetAllkeys, which fetches all the keys from the storage. Note that, in

TABLE I
VERSION DETAIL OF DATABASE IMPLEMENTATIONS

Table I lists the versions of the database implementations used in the experiments in this paper. For each of these databases, we run the chosen operation (such as Read or Write) five times and take the average time.
The data set for the experiment consists of auto-generated key-value pairs of the form $(k_N, v_N)$, where $N$ is a sequence number.

V. RESULTS AND EVALUATION
Our first experiment measures the time taken to instantiate a database bucket. Figure 2 summarizes the results of this experiment.

[Figure; vertical axis: Average time (ms)]

TABLE II
TIME FOR READING (MS)

Sorted by read performance, from fastest to slowest, the databases are: Couchbase, MongoDB, SQL Express, Hypertable, CouchDB, Cassandra, and RavenDB. Of these, Cassandra and Hypertable are column-family databases, while Couchbase, MongoDB, CouchDB, and RavenDB are document-oriented databases. There is no observable correlation between the data model and performance. We also see that the read performance of SQL Express is better than some, but not all, of the NoSQL databases.
Our third experiment measures the time taken to write key-value pairs to the bucket. If the key-value pair already exists in the bucket, this amounts to updating the existing value.
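The create-or-update behaviour of Write, together with the other timed operations, can be pictured as a small abstract interface. The sketch below is a minimal Python illustration under stated assumptions, not the framework used in the paper: the names KeyValueStore, InMemoryStore, make_pairs, and average_ms are hypothetical, a plain dict stands in for a real database backend, and the sequence numbers are assumed to start at 1. Each measurement is averaged over five runs, as in the experimental setup.

```python
import time
from abc import ABC, abstractmethod


class KeyValueStore(ABC):
    """Hypothetical abstract interface for the operations that are timed."""

    @abstractmethod
    def read(self, key): ...

    @abstractmethod
    def write(self, key, value): ...

    @abstractmethod
    def delete(self, key): ...

    @abstractmethod
    def get_all_keys(self): ...


class InMemoryStore(KeyValueStore):
    """Stand-in backend; constructing it plays the role of instantiating a bucket."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        return self._data.get(key)

    def write(self, key, value):
        # Adds the pair if the key is absent, otherwise updates the value.
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def get_all_keys(self):
        return list(self._data)


def make_pairs(n):
    """Auto-generate n pairs of the form (kN, vN); N starting at 1 is an assumption."""
    return [(f"k{i}", f"v{i}") for i in range(1, n + 1)]


def average_ms(operation, runs=5):
    """Run `operation` `runs` times and return the mean wall-clock time in ms."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        total += time.perf_counter() - start
    return total / runs * 1000.0


if __name__ == "__main__":
    store = InMemoryStore()
    pairs = make_pairs(1000)

    def write_all():
        for key, value in pairs:
            store.write(key, value)

    print(f"write x{len(pairs)}: {average_ms(write_all):.3f} ms")
```

Because the timing helper depends only on the abstract interface, the same measurement loop could in principle be pointed at any backend that implements these four methods.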
If the key-value pair does not yet exist, the write amounts to adding it to the bucket. Table III summarizes the result.

TABLE III
TIME FOR WRITING (MS)

                Number of operations
Database              10      50     100    1000   10000  100000
MongoDB               61      75      84     387    2693   23354
RavenDB              570     898    1213    6939   71343  740450
CouchDB               90     374     616    6211   67216  932038
Cassandra            117     160     212    1200    9801   88197
Hypertable            55      90     184    1035   10938  114872
Couchbase             60      76      63     142     936    8492
MS SQL Express        30      94     129    1790   15588  216479

Sorted by write performance, from fastest to slowest, the databases are: Couchbase, MongoDB, Cassandra, Hypertable, SQL Express, RavenDB, and CouchDB. We see that the write performance of RavenDB and CouchDB is worse than that of SQL Express, but the other NoSQL databases perform better than SQL Express.
Our fourth experiment measures the time taken to delete key-value pairs from the bucket. Table IV summarizes the result.

TABLE IV
TIME FOR DELETING (MS)

                Number of operations
Database              10      50     100    1000   10000  100000
MongoDB                4      15      29     235    2115   18688
RavenDB               90     499     809    8342   87562  799409
CouchDB               71     260     597    5945   67952  705684
Cassandra             33      95     130    1061    9230   83694
Hypertable            19      63     110    1001   10324  130858
Couchbase              6      12      14      81     805    7634
MS SQL Express        11      32      57     360    3571   32741

Sorted by delete performance, from fastest to slowest, the databases are: Couchbase, MongoDB, SQL Express, Cassandra, Hypertable, CouchDB, and RavenDB. We see that the delete performance of SQL Express is better than that of all NoSQL databases except Couchbase and MongoDB.
Boicea et al. reported that, for 100000 records, insertion time is a factor more in Oracle than in MongoDB, and update and delete times are several factors more in Oracle [18]. We did not observe such large performance gaps between MongoDB (see footnote 7) and SQL Express in our experiments here.
Footnote 7: Boicea et al. did not include the versions of the databases tested, and we therefore do not know whether our version of MongoDB is different from theirs.
Our final experiment measures the time taken to fetch all the keys in the bucket. Table V summarizes the result.

TABLE V
TIME FOR FETCHING ALL KEYS (MS)

                Number of keys to fetch
Database              10      50     100    1000   10000  100000
MongoDB                4       4       5      19      98     702
RavenDB              101     113     115     116     136     591
CouchDB               67     196      19     173    1063    9512
Cassandra             47      50      55      76     237     709
Hypertable             3       3       3       5      25     159
MS SQL Express         4       4       4       4      11      76

Except for CouchDB, all databases are quick to fetch the keys. SQL Express was the fastest of all. Couchbase has no API to support fetching all the keys, and was thus excluded from this experiment.
Note that fetching all the keys is substantially faster than reading values one after the other. This difference is evidently due to the database connection overhead.
Fetching all values would take a similar amount of time to fetching all keys so long as the values are small in size (i.e. comparable to the size of the keys).

VI. SUMMARY AND CONCLUSION
This paper compares key-value store implementations on NoSQL and SQL databases. While NoSQL databases are generally optimized for key-value stores, SQL databases are not. Yet, we find that not all NoSQL databases perform better than the SQL database we tested. We observe that even within NoSQL databases there is a wide variation in performance based on the type of operation (such as read and write). We also observe little correlation between performance and the data model each database uses.
Of the NoSQL databases, RavenDB and CouchDB do not perform well in the read, write, and delete operations. Cassandra is slow on read operations, but is reasonably good for write and delete operations. Couchbase and MongoDB are the fastest two overall for read, write, and delete operations. Couchbase, however, does not support fetching all the keys (or values). If iterating through keys and values is not required for an application, then Couchbase will be a good choice. Otherwise one may choose MongoDB, which comes a close second to Couchbase in the read, write, and delete operations.
Note that we did not test the databases for more complex operations. The database rankings we noted may not hold when it comes to complex operations.
Like any application software, NoSQL implementations go through changes, and thus performance improvements and degradations are likely to happen through these changes. Consequently, one will need to compare databases not only at the application design stage but also at regular intervals to enable switching to the most suitable database implementation. Tiered software development processes would help isolate the database backend to facilitate such switching when deemed necessary.

REFERENCES
[1] K. Kline, SQL in a nutshell, 3rd ed. O’Reilly Media, November 2008.
[2] P. Warden, Big Data Glossary. O’Reilly Media, September 2011.
[3] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, “Bigtable: a distributed storage system for structured data,” in Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7, ser. OSDI ’06. Berkeley, CA, USA: USENIX Association, 2006, pp. 15–15.
[4] L. George, HBase: The Definitive Guide. O’Reilly Media, August 2011.
[5] B. Tudorica and C. Bucur, “A comparison between several NoSQL databases with comments and notes,” in Roedunet International Conference (RoEduNet), 2011 10th, June 2011, pp. 1–5.
[6] J. Han, E. Haihong, G. Le, and J. Du, “Survey on NoSQL database,” in Pervasive Computing and Applications (ICPCA), 2011 6th International Conference on, Oct. 2011, pp. 363–366.
[7] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels, “Dynamo: Amazon’s highly available key-value store,” SIGOPS Oper. Syst. Rev., vol. 41, no. 6, pp. 205–220, Oct. 2007.
[8] K. Chodorow and M. Dirolf, MongoDB: The Definitive Guide. O’Reilly Media, September 2010.
[9] J. C. Anderson, J. Lehnardt, and N. Slater, CouchDB: The Definitive Guide. O’Reilly Media, January 2010.
[10] E. Hewitt, Cassandra: The Definitive Guide. O’Reilly Media, November 2010.
[11] M. Brown, Getting Started with Couchbase Server. O’Reilly Media, June 2012.
[12] N. Leavitt, “Will NoSQL databases live up to their promise?” Computer, vol. 43, no. 2, pp. 12–14, Feb. 2010.
[13] D. Bartholomew, “SQL vs. NoSQL,” Linux Journal, no. 195, July 2010.
[14] S. Sakr, A. Liu, D. Batista, and M. Alomari, “A survey of large scale data management approaches in cloud environments,” Communications Surveys Tutorials, IEEE, vol. 13, no. 3, pp. 311–336, 2011.
[15] S. Tiwari, Professional NoSQL. Wiley/Wrox, August 2011.
[16] M. Indrawan-Santiago, “Database research: Are we at a crossroad? Reflection on NoSQL,” in Network-Based Information Systems (NBiS), 2012 15th International Conference on, Sept. 2012, pp. 45–51.
[17] R. Hecht and S. Jablonski, “NoSQL evaluation: A use case oriented survey,” in Cloud and Service Computing (CSC), 2011 International Conference on, Dec. 2011, pp. 336–341.
[18] A. Boicea, F. Radulescu, and L. I. Agapin, “MongoDB vs Oracle – database comparison,” in Emerging Intelligent Data and Web Technologies (EIDWT), 2012 Third International Conference on, Sept. 2012, pp. 330–335.
[19] B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears, “Benchmarking cloud serving systems with YCSB,” in Proceedings of the 1st ACM symposium on Cloud computing, ser. SoCC ’10. ACM, 2010, pp. 143–154.