Understanding the effect of data center resource disaggregation on production DBMSs

Published: 01 May 2020

Abstract

Resource disaggregation is a new architecture for data centers in which resources like memory and storage are decoupled from the CPU, managed independently, and connected through a high-speed network. Recent work has shown that although disaggregated data centers (DDCs) provide operational benefits, applications running on DDCs experience degraded performance due to extra network latency between the CPU and their working sets in main memory. DBMSs are an interesting case study for DDCs for two main reasons: (1) DBMSs normally process data-intensive workloads and require data movement between different resource components; and (2) disaggregation drastically changes the assumption that DBMSs can rely on their own internal resource management.
We take the first step to thoroughly evaluate the query execution performance of production DBMSs in disaggregated data centers. We evaluate two popular open-source DBMSs (MonetDB and PostgreSQL) and test their performance with the TPC-H benchmark in a recently released operating system for resource disaggregation. We evaluate these DBMSs with various configurations and compare their performance with that of single-machine Linux with the same hardware resources. Our results confirm that significant performance degradation does occur, but, perhaps surprisingly, we also find settings in which the degradation is minor or where DDCs actually improve performance.

Published In

Proceedings of the VLDB Endowment, Volume 13, Issue 9
May 2020
295 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Qualifiers

  • Research-article

Cited By

  • (2024) DEX: Scalable Range Indexing on Disaggregated Memory. Proceedings of the VLDB Endowment 17:10 (2603-2616). DOI: 10.14778/3675034.3675050. Online publication date: 1-Jun-2024
  • (2024) DiStore: A Fully Memory Disaggregation Friendly Key-Value Store with Improved Tail Latency and Space Efficiency. Proceedings of the 53rd International Conference on Parallel Processing (607-617). DOI: 10.1145/3673038.3673088. Online publication date: 12-Aug-2024
  • (2024) Scalable Distributed Inverted List Indexes in Disaggregated Memory. Proceedings of the ACM on Management of Data 2:3 (1-27). DOI: 10.1145/3654974. Online publication date: 30-May-2024
  • (2024) Towards Buffer Management with Tiered Main Memory. Proceedings of the ACM on Management of Data 2:1 (1-26). DOI: 10.1145/3639286. Online publication date: 26-Mar-2024
  • (2023) Marlin: A Concurrent and Write-Optimized B+-tree Index on Disaggregated Memory. Proceedings of the 52nd International Conference on Parallel Processing (695-704). DOI: 10.1145/3605573.3605576. Online publication date: 7-Aug-2023
  • (2023) Building Write-Optimized Tree Indexes on Disaggregated Memory. ACM SIGMOD Record 52:1 (45-52). DOI: 10.1145/3604437.3604448. Online publication date: 8-Jun-2023
  • (2023) Cowbird: Freeing CPUs to Compute by Offloading the Disaggregation of Memory. Proceedings of the ACM SIGCOMM 2023 Conference (1060-1073). DOI: 10.1145/3603269.3604833. Online publication date: 10-Sep-2023
  • (2023) Invited Paper: Disaggregating Applications Using Uniservices. Proceedings of the 5th workshop on Advanced tools, programming languages, and PLatforms for Implementing and Evaluating algorithms for Distributed systems (1-10). DOI: 10.1145/3584684.3597271. Online publication date: 19-Jun-2023
  • (2023) Persistent Memory Disaggregation for Cloud-Native Relational Databases. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (498-512). DOI: 10.1145/3582016.3582055. Online publication date: 25-Mar-2023
  • (2023) Disaggregated Database Systems. Companion of the 2023 International Conference on Management of Data (37-44). DOI: 10.1145/3555041.3589403. Online publication date: 4-Jun-2023
