Abstract
This paper describes a novel method for coupling a standalone database management system (DBMS) with a highly scalable key-value store. The system employs Apache Cassandra as data storage and the extensible DBMS Secondo as the query processing engine. The resulting system is a distributed, general-purpose DBMS that is highly scalable and fault-tolerant. The logical ring of Cassandra is used to split the input data into smaller units of work (UOWs), which can be processed independently. A decentralized algorithm assigns the UOWs to query processing nodes. In case of a node failure, UOWs are recalculated on a different node. All the data models (e.g. relational, spatial and spatio-temporal) and functions (e.g. filter, aggregates, joins and spatial joins) implemented in Secondo can be used in a scalable way without changing their implementation. Many aspects of the distribution are hidden from the user, and existing sequential queries can easily be converted into parallel ones.
Notes
Some special functions, like the interaction with other distributed systems, are excluded.
In Secondo, nested lists are used at some points to interchange structured data. For example: ((value1 value2) (value3)).
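The nested-list format can be illustrated with a small parser. The tokenizer and parser below are an illustrative sketch, not Secondo's actual implementation; they only show how a string such as `((value1 value2) (value3))` maps onto a nested structure.

```python
# Minimal sketch of parsing a Secondo-style nested list into Python lists.
# This is illustrative code, not part of Secondo itself.

def parse_nested_list(text):
    """Parse a nested list such as '((value1 value2) (value3))'."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(parse())
            pos += 1  # consume the closing ')'
            return items
        return token  # an atom

    return parse()

print(parse_nested_list("((value1 value2) (value3))"))
# → [['value1', 'value2'], ['value3']]
```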
The two cases \(begin = p_0\) and \(end = p_n\) are ignored in the description to keep the examples clear.
Phase 3 is influenced by the speculative task execution of Hadoop [8, p. 3]. The table system_pending prevents all idle QPNs from processing the same UOW at the same time. This would lead to hot spots (parts of the logical ring that are read or written by many nodes simultaneously) and to longer query processing times.
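The role of the system_pending table can be sketched as follows. This is a simplified illustration, not the paper's implementation: a dictionary stands in for the table, and `claim_uow` is a hypothetical helper that lets an idle QPN register a claim so that other nodes pick different UOWs.

```python
# Illustrative sketch of how a table of pending UOWs keeps idle query
# processing nodes (QPNs) from all selecting the same unit of work.
# The dictionary stands in for the system_pending table.

import random

system_pending = {}  # uow_id -> node_id that claimed it

def claim_uow(node_id, unprocessed_uows):
    """Pick a UOW that no other node has registered as pending."""
    candidates = [u for u in unprocessed_uows if u not in system_pending]
    if not candidates:
        return None  # all UOWs claimed; speculative re-execution could start here
    uow = random.choice(candidates)  # randomization spreads nodes over the ring
    system_pending[uow] = node_id
    return uow

first = claim_uow("qpn-1", ["uow-1", "uow-2"])
second = claim_uow("qpn-2", ["uow-1", "uow-2"])
assert first != second  # the two nodes never work on the same UOW concurrently
```

Without such a table, every idle QPN would independently choose among the same remaining UOWs, producing the hot spots described above.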
The part of the logical ring that is read is determined by the UOW currently being processed.
Each line contains 5000 characters + 4 field separators (e.g. ',') + 1 newline character (e.g. '\n') = 5005 bytes per line. By creating 10,000,000 lines of 5005 bytes each, 46.61 GB of data is generated in total.
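The arithmetic of this note can be checked directly. The snippet below assumes that "GB" in the note denotes GiB (2^30 bytes), which is the interpretation that reproduces the stated 46.61:

```python
# Verifying the data-size note: 10,000,000 lines of 5005 bytes each.
# "GB" is interpreted as GiB (2**30 bytes), which matches the stated 46.61.

bytes_per_line = 5000 + 4 + 1            # characters + separators + newline
total_bytes = 10_000_000 * bytes_per_line
print(round(total_bytes / 2**30, 2))     # → 46.61
```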
The data generator creates 46.61 GB of data, of which 38.84 GB needs to be transferred. With a 1 Gbit/s network link, the transfer takes 333.63 s.
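The transfer-time estimate follows the same arithmetic; again the sketch assumes "GB" means GiB (2^30 bytes) and a nominal link rate of 10^9 bits per second:

```python
# Verifying the transfer-time note: 38.84 GB over a 1 Gbit/s link.
# "GB" is interpreted as GiB (2**30 bytes); 1 Gbit/s = 1e9 bits/s.

transfer_bytes = 38.84 * 2**30
seconds = transfer_bytes * 8 / 1e9
print(round(seconds, 2))                 # → 333.63
```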
The parallel version executes multiple Secondo threads on one hardware node. This is why the parallel version cannot use 6 GB of memory for each thread. However, UOWs are small, and with 1.5 GB of memory only one MMR-tree needs to be created. As a consequence, the second relation needs to be analyzed only once.
References
Abouzeid, A., Bajda-Pawlikowski, K., Abadi, D., Silberschatz, A., Rasin, A.: Hadoopdb: an architectural hybrid of mapreduce and dbms technologies for analytical workloads. Proc. VLDB Endow. 2(1), 922–933 (2009)
Apache license, version 2.0. http://www.apache.org/licenses/ (2004). Accessed 30 Jul 2015
Ceri, S., Pelagatti, G.: Distributed Databases Principles and Systems. McGraw-Hill Inc, New York (1984)
Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: Bigtable: a distributed storage system for structured data. In: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation, OSDI’06, vol. 7, pp. 15–15. USENIX Association, Berkeley (2006)
Corbett, J.C., Dean, J., Epstein, M., Fikes, A., Frost, C., Furman, J.J., Ghemawat, S., Gubarev, A., Heiser, C., Hochschild, P., Hsieh, W., Kanthak, S., Kogan, E., Li, H., Lloyd, A., Melnik, S., Mwaura, D., Nagle, D., Quinlan, S., Rao, R., Rolig, L., Saito, Y., Szymaniak, M., Taylor, C., Wang, R., Woodford, D.: Spanner: Google’s globally-distributed database. In: Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI’12, pp. 251–264. USENIX Association, Berkeley (2012)
Dean, J., Ghemawat, S.: Mapreduce: Simplified data processing on large clusters. In: Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation, OSDI’04, vol. 6, p. 10. USENIX Association, Berkeley (2004)
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. SIGOPS Oper. Syst. Rev. 41(6), 205–220 (2007)
Dinun, F., Ng, T.S.E.: Understanding the effects and implications of compute node related failures in hadoop. In: Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing, HPDC’12, pp. 187–198. ACM, New York (2012)
Dittrich, J.P., Seeger, B.: Data redundancy and duplicate detection in spatial join processing. In: ICDE, pp. 535–546 (2000)
Düntgen, C., Behr, T., Güting, R.H.: Berlinmod: a benchmark for moving object databases. VLDB J. 18(6), 1335–1368 (2009)
Eldawy, A., Mokbel, M.F.: Pigeon: a spatial mapreduce language. In: IEEE 30th International Conference on Data Engineering, ICDE 2014, Chicago, IL, USA, March 31–April 4, 2014, pp. 1242–1245 (2014)
Eldawy, A., Mokbel, M.F.: SpatialHadoop: a mapreduce framework for spatial data. In: 31st IEEE International Conference on Data Engineering, ICDE 2015, Seoul, South Korea, pp. 1352–1363, 13–17 April 2015
Gantz, J.F., Reinsel, D.: The digital universe in 2020: big data, bigger digital shadows, and biggest growth in the far east. In: IDC (2012)
George, L.: HBase: The Definitive Guide. O’Reilly Media Inc, Sebastopol (2011)
Ghemawat, S., Gobioff, H., Leung, S.T.: The google file system. In: Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles. SOSP’03, pp. 29–43. ACM, New York (2003)
Güting, R.H.: Operator Based Query Progress Estimation. FernUniversität in Hagen, Hagen (2008)
Güting, R.H., Behr, T., Düntgen, C.: Secondo: a platform for moving objects database research and for publishing and integrating research implementations. IEEE Data Eng. Bull. 33(2), 56–63 (2010)
Idreos, S., Liarou, E., Koubarakis, M.: Continuous multi-way joins over distributed hash tables. In: Proceedings of the 11th International Conference on Extending Database Technology: Advances in Database Technology. EDBT’08, pp. 594–605. ACM, New York (2008)
Karger, D., Lehman, E., Leighton, T., Panigrahy, R., Levine, M., Lewin, D.: Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the world wide web. In: Proceedings of the Twenty-ninth Annual ACM Symposium on Theory of Computing, STOC’97, pp. 654–663. ACM, New York (1997)
Lakshman, A., Malik, P.: Cassandra: a decentralized structured storage system. SIGOPS Oper. Syst. Rev. 44(2), 35–40 (2010)
Lamport, L.: The part-time parliament. ACM Trans. Comput. Syst. 16(2), 133–169 (1998)
Leach, P., Mealling, M., Salz, R.: RFC 4122: A Universally Unique IDentifier (UUID) URN Namespace (2005)
Lu, J., Güting, R.H.: Parallel secondo: boosting database engines with hadoop. In: 2012 IEEE 18th International Conference on Parallel and Distributed Systems, pp. 738–743 (2012)
Nidzwetzki, J.K.: Entwicklung eines skalierbaren und verteilten Datenbanksystems. Springer, Berlin (2016)
Nidzwetzki, J.K., Güting, R.H.: Distributed SECONDO: a highly available and scalable system for spatial data processing. In: Advances in spatial and temporal databases—14th international symposium, SSTD 2015, Hong Kong, China, pp. 491–496, 26–28 August 2015
Olston, C., Reed, B., Srivastava, U., Kumar, R., Tomkins, A.: Pig latin: a not-so-foreign language for data processing. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. SIGMOD’08, pp. 1099–1110. ACM, New York (2008)
Özsu, M.T., Valduriez, P.: Principles of Distributed Database Systems, 3rd edn. Springer, New York (2011)
Palma, W., Akbarinia, R., Pacitti, E., Valduriez, P.: Distributed processing of continuous join queries using DHT networks. In: Proceedings of the 2009 EDBT/ICDT Workshops. EDBT/ICDT’09, pp. 34–41. ACM, New York (2009)
Patel, J.M., DeWitt, D.J.: Partition based spatial-merge join. SIGMOD Rec. 25(2), 259–270 (1996)
Rothnie, J.B., Goodman, N.: A survey of research and development in distributed database management. In: Proceedings of the Third International Conference on Very Large Data Bases, VLDB’77, vol. 3, pp. 48–62. VLDB Endowment (1977)
Rothnie, J.B., Bernstein, P.A., Fox, S., Goodman, N., Hammer, M., Landers, T.A., Reeve, C., Shipman, D.W., Wong, E.: Introduction to a system for distributed databases (SDD-1). ACM Trans. Database Syst. 5(1), 1–17 (1980)
Shute, J., Oancea, M., Ellner, S., Handy, B., Rollins, E., Samwel, B., Vingralek, R., Whipkey, C., Chen, X., Jegerlehner, B., Littlefield, K., Tong, P.: F1: the fault-tolerant distributed RDBMS supporting Google’s ad business. In: SIGMOD 2012. Talk given at SIGMOD (2012)
Stoica, I., Morris, R., Karger, D., Kaashoek, M.F., Balakrishnan, H.: Chord: a scalable peer-to-peer lookup service for internet applications. SIGCOMM Comput. Commun. Rev. 31(4), 149–160 (2001)
Tanenbaum, A.S., van Steen, M.: Distributed Systems: Principles and Paradigms, 2nd edn. Prentice-Hall Inc., Upper Saddle River (2006)
Thusoo, A., Sarma, J.S., Jain, N., Shao, Z., Chakka, P., Anthony, S., Liu, H., Wyckoff, P., Murthy, R.: Hive: a warehousing solution over a map-reduce framework. Proc. VLDB Endow. 2(2), 1626–1629 (2009)
Transaction Processing Performance Council: TPC Benchmark H (Decision Support) Standard Specification. http://www.tpc.org/tpch/. Accessed 15 May 2015
Vogels, W.: Eventually consistent. Commun. ACM 52(1), 40–44 (2009)
Website of Apache Drill. http://drill.apache.org (2015). Accessed 20 July 2015
Website of Apache Spark. http://spark.apache.org/ (2015). Accessed 20 Jul 2015
Website of cpp-driver for Cassandra. https://github.com/datastax/cpp-driver (2015). Accessed 15 Sept 2015
Website of Distributed Secondo. http://dna.fernuni-hagen.de/secondo/DSecondo/DSECONDO-Website/index.html (2015). Accessed 15 Nov 2015
Website of the Open Street Map Project. http://www.openstreetmap.org (2015). Accessed 09 July 2015
White, T.: Hadoop: The Definitive Guide, 1st edn. O’Reilly Media Inc., Sebastopol (2009)
Xie, D., Li, F., Yao, B., Li, G., Zhou, L., Guo, M.: Simba: efficient in-memory spatial analytics. In: Proceedings of the 2016 International Conference on Management of Data. SIGMOD’16, pp. 1071–1085. ACM, New York (2016)
You, S., Zhang, J., Gruenwald, L.: Large-scale spatial join query processing in cloud. Technical Report http://www-cs.ccny.cuny.edu/~jzhang/papers/spatial_cc_tr.pdf (2016). Accessed 14 Mar 2017
Zhang, S., Han, J., Liu, Z., Wang, K., Xu, Z.: SJMR: parallelizing spatial join with mapreduce on clusters. In: Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31–September 4, 2009, New Orleans, Louisiana, USA, pp. 1–8 (2009)
Queries of the experiments
Cite this article
Nidzwetzki, J.K., Güting, R.H. Distributed secondo: an extensible and scalable database management system. Distrib Parallel Databases 35, 197–248 (2017). https://doi.org/10.1007/s10619-017-7198-9