ByteHTAP: ByteDance's HTAP System with High Data Freshness and Strong Data Consistency

Published: 01 August 2022

Abstract

In recent years at ByteDance, we have seen more and more business scenarios that require complex analysis over freshly imported data, together with transaction support and strong data consistency. In this paper, we describe our journey of building ByteHTAP, an HTAP system with high data freshness and strong data consistency. It adopts a separate-engine, shared-storage architecture. Its modular design makes full use of an existing ByteDance OLTP system and an open-source OLAP system. This choice saves considerable resources and development time, and allows easy future extensions such as replacing the query processing engine with an alternative.
ByteHTAP provides high data freshness, with less than one second of delay, which enables many new business opportunities for our customers. Customers can also configure different data freshness thresholds based on their business needs. ByteHTAP also provides strong data consistency through global timestamps across its OLTP and OLAP systems, which relieves application developers of handling complex data consistency issues themselves. In addition, we introduce several important performance optimizations, such as pushing computations down to the storage layer and using delete bitmaps to handle deletes efficiently. Lastly, we share our lessons and best practices from developing and running ByteHTAP in production.
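To make the idea of delete bitmaps combined with timestamp-based reads more concrete, the following is a minimal illustrative sketch and not ByteHTAP's actual implementation. The class `ColumnBlock`, its fields, and the integer timestamps are all hypothetical names chosen for this example. It assumes a delete is recorded as a marker on an otherwise immutable block, and that a reader at a given timestamp skips rows whose delete is visible at that timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnBlock:
    """Hypothetical immutable block of rows with per-row commit timestamps."""
    rows: list
    commit_ts: list
    # Hypothetical bookkeeping: row index -> timestamp at which it was deleted.
    delete_ts: dict = field(default_factory=dict)

    def delete(self, idx, ts):
        # Record the delete without rewriting the stored rows.
        self.delete_ts.setdefault(idx, ts)

    def delete_bitmap(self, read_ts):
        # Bit i is set when row i is deleted as of read_ts.
        bitmap = 0
        for idx, ts in self.delete_ts.items():
            if ts <= read_ts:
                bitmap |= 1 << idx
        return bitmap

    def scan(self, read_ts):
        # Return rows committed at or before read_ts and not deleted by then.
        bitmap = self.delete_bitmap(read_ts)
        return [
            row
            for i, row in enumerate(self.rows)
            if self.commit_ts[i] <= read_ts and not (bitmap >> i) & 1
        ]


block = ColumnBlock(rows=["a", "b", "c"], commit_ts=[1, 2, 5])
block.delete(1, ts=4)
print(block.scan(read_ts=3))
print(block.scan(read_ts=5))
```

In this sketch a reader at timestamp 3 still sees row "b", because the delete at timestamp 4 is not yet visible to it, while a reader at timestamp 5 sees the later insert "c" but not "b". The stored rows are never modified, which is the property that makes a bitmap attractive for read-optimized storage.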

Cited By

  • (2024) Two Birds With One Stone: Designing a Hybrid Cloud Storage Engine for HTAP. Proceedings of the VLDB Endowment 17:11 (3290-3303). DOI: 10.14778/3681954.3682001. Online publication date: 1-Jul-2024
  • (2023) Krypton: Real-Time Serving and Analytical SQL Engine at ByteDance. Proceedings of the VLDB Endowment 16:12 (3528-3542). DOI: 10.14778/3611540.3611545. Online publication date: 1-Aug-2023
  • (2023) Rethink Query Optimization in HTAP Databases. Proceedings of the ACM on Management of Data 1:4 (1-27). DOI: 10.1145/3626750. Online publication date: 12-Dec-2023
  • (2023) PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba. Proceedings of the ACM on Management of Data 1:2 (1-25). DOI: 10.1145/3589785. Online publication date: 20-Jun-2023

Published In

Proceedings of the VLDB Endowment, Volume 15, Issue 12
August 2022
551 pages
ISSN: 2150-8097

Publisher

VLDB Endowment
