DOI: 10.1145/3448016.3457270
research-article
Open access

Instance-Optimized Data Layouts for Cloud Analytics Workloads

Published: 18 June 2021

Abstract

Today, businesses rely on efficiently running analytics on large amounts of operational and historical data to gain business insights and competitive advantage. Increasingly, such analytics are run using cloud-based data analytics services, such as Google BigQuery, Microsoft Azure Synapse, Amazon Redshift, and Snowflake. These services persist and process data in compressed, columnar formats, stored in large blocks, each of which contains thousands or millions of records. For these services, disk I/O from (remote) cloud storage is often one of the dominant costs for query processing. To reduce the amount of I/O, services often maintain per-block metadata, such as zone maps, which are used to skip blocks that are irrelevant to the query, leading to lower query execution times. However, the effectiveness of block skipping via zone maps is dependent on how the records are assigned to blocks. Recent work on instance-optimized data layouts aims to maximize block skipping by specializing the block assignment strategy to a specific dataset and workload. However, these existing approaches only optimize the layout for a single table.
In this paper, we propose MTO, an instance-optimized data layout framework that determines the blocking strategy for all tables in a multi-table database in the presence of joins, such as in a star or snowflake schema common in real-world workloads. MTO takes advantage of sideways information passing through joins to jointly optimize the layout for all tables, which results in better block skipping and hence reduced query execution times. Experiments on a commercial cloud-based analytics service show that MTO achieves up to 93% reduction in blocks accessed and 75% reduction in end-to-end query times compared to alternative blocking strategies.
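As a rough illustration of the point that zone-map skipping depends on how records are assigned to blocks, the sketch below builds min/max zone maps over a single integer column under two block assignments and counts the blocks a range query would have to read. This is not MTO or any service's implementation; the data, the block size, and all names (`ZoneMap`, `build_blocks`, `blocks_to_read`) are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ZoneMap:
    """Per-block metadata: min and max of one column within the block."""
    lo: int
    hi: int


def build_blocks(values: Sequence[int], block_size: int) -> List[List[int]]:
    """Assign records to fixed-size blocks in the order they are given."""
    return [list(values[i:i + block_size])
            for i in range(0, len(values), block_size)]


def zone_maps(blocks: List[List[int]]) -> List[ZoneMap]:
    """Compute one min/max zone map per block."""
    return [ZoneMap(min(b), max(b)) for b in blocks]


def blocks_to_read(maps: List[ZoneMap], q_lo: int, q_hi: int) -> int:
    """Count blocks whose [lo, hi] range overlaps the query range [q_lo, q_hi].

    A block can be skipped only if its zone map proves that no record
    in it can satisfy the predicate.
    """
    return sum(1 for m in maps if m.hi >= q_lo and m.lo <= q_hi)


# Hypothetical column values in arrival order (small and large values mixed).
arrival_order = [0, 15, 1, 14, 2, 13, 3, 12, 4, 11, 5, 10, 6, 9, 7, 8]
# An alternative assignment: sort on the column before cutting blocks.
sorted_order = sorted(arrival_order)

arrival_maps = zone_maps(build_blocks(arrival_order, block_size=4))
sorted_maps = zone_maps(build_blocks(sorted_order, block_size=4))

# Query predicate: col BETWEEN 6 AND 7
read_arrival = blocks_to_read(arrival_maps, 6, 7)
read_sorted = blocks_to_read(sorted_maps, 6, 7)
print(read_arrival, read_sorted)  # prints "4 1"
```

With the arrival-order assignment every block's range covers the predicate, so nothing is skipped; with the sorted assignment only one of four blocks is read. Sorting on one column is just the simplest possible layout choice and is used here only to make the dependence on block assignment visible.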

Published In

SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
June 2021
2969 pages
ISBN:9781450383431
DOI:10.1145/3448016
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cloud analytics
  2. instance-optimized databases

Qualifiers

  • Research-article

Funding Sources

  • NSF IIS

Conference

SIGMOD/PODS '21
Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 371
  • Downloads (last 6 weeks): 90
Reflects downloads up to 03 Feb 2025.

Cited By

  • (2025) LRP: learned robust data partitioning for efficient processing of large dynamic queries. Frontiers of Computer Science: Selected Publications from Chinese Universities 19:9. DOI: 10.1007/s11704-024-40509-4. Online publication date: 1-Sep-2025
  • (2024) Blueprinting the Cloud: Unifying and Automatically Optimizing Cloud Data Infrastructures with BRAD. Proceedings of the VLDB Endowment 17:11 (3629-3643). DOI: 10.14778/3681954.3682026. Online publication date: 1-Jul-2024
  • (2024) Partition, Don't Sort! Compression Boosters for Cloud Data Ingestion Pipelines. Proceedings of the VLDB Endowment 17:11 (3456-3469). DOI: 10.14778/3681954.3682013. Online publication date: 1-Jul-2024
  • (2024) The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-Actions. Proceedings of the VLDB Endowment 17:11 (3373-3387). DOI: 10.14778/3681954.3682007. Online publication date: 30-Aug-2024
  • (2024) Automated Clustering Recommendation With Database Zone Maps. Companion of the 2024 International Conference on Management of Data (68-79). DOI: 10.1145/3626246.3653397. Online publication date: 9-Jun-2024
  • (2024) Automated Multidimensional Data Layouts in Amazon Redshift. Companion of the 2024 International Conference on Management of Data (55-67). DOI: 10.1145/3626246.3653379. Online publication date: 9-Jun-2024
  • (2024) Dynamic Data Layout Optimization with Worst-Case Guarantees. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (4288-4301). DOI: 10.1109/ICDE60146.2024.00327. Online publication date: 13-May-2024
  • (2024) Enhancing Storage Efficiency and Performance: A Survey of Data Partitioning Techniques. Journal of Computer Science and Technology 39:2 (346-368). DOI: 10.1007/s11390-024-3538-1. Online publication date: 6-Jun-2024
  • (2024) AVPS: Automatic Vertical Partitioning for Dynamic Workload. Advanced Intelligent Computing Technology and Applications (146-157). DOI: 10.1007/978-981-97-5618-6_13. Online publication date: 1-Aug-2024
  • (2023) Anser: Adaptive Information Sharing Framework of AnalyticDB. Proceedings of the VLDB Endowment 16:12 (3636-3648). DOI: 10.14778/3611540.3611553. Online publication date: 1-Aug-2023
