research-article

TraNCE: transforming nested collections efficiently

Published: 01 July 2021

Abstract

Nested relational query languages have long been seen as an attractive tool for scenarios involving large hierarchical datasets. There has been a resurgence of interest in nested relational languages. One driver has been the affinity of these languages for large-scale processing platforms such as Spark and Flink.
This demonstration gives a tour of TraNCE, a new system for processing nested data on top of distributed processing systems. The core innovation of the system is a compiler that processes nested relational queries in a series of transformations; these include variants of two prior techniques, shredding and unnesting, as well as a materialization transformation that customizes the way levels of the nested output are generated. The TraNCE platform builds on these techniques by adding components for users to create and visualize queries, as well as data exploration and notebook execution targets to facilitate the construction of large-scale data science applications. The demonstration will both showcase the system from the viewpoint of usability by data scientists and illustrate the data management techniques employed.
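
To make the two transformation names concrete, here is a minimal sketch in Python on a toy one-level nested collection. It is not TraNCE's compiler or API: the function names (`shred`, `unshred`, `unnest`), the integer labels, and the sample data are all hypothetical and chosen only for illustration. The sketch assumes that shredding means replacing each inner bag with a label that points into a separate flat relation, and that unnesting means pairing each parent with each of its children.

```python
from collections import defaultdict

# Toy nested collection (hypothetical data): each customer carries a bag of orders.
customers = [
    {"name": "alice", "orders": [{"item": "pen", "qty": 2}, {"item": "ink", "qty": 1}]},
    {"name": "bob", "orders": []},
]


def shred(rows, nested_field):
    """Split a one-level nested collection into two flat relations.

    The top-level relation stores a label in place of each inner bag;
    the child relation stores every inner element tagged with that label.
    """
    top, children = [], []
    for label, row in enumerate(rows):
        flat = {k: v for k, v in row.items() if k != nested_field}
        flat[nested_field] = label  # the label stands in for the inner bag
        top.append(flat)
        for child in row[nested_field]:
            children.append({"label": label, **child})
    return top, children


def unshred(top, children, nested_field):
    """Rebuild the nested collection from the two flat relations."""
    grouped = defaultdict(list)
    for child in children:
        payload = {k: v for k, v in child.items() if k != "label"}
        grouped[child["label"]].append(payload)
    return [{**row, nested_field: grouped[row[nested_field]]} for row in top]


def unnest(rows, nested_field):
    """Flatten by pairing each parent with each of its children.

    This inner-style variant drops parents whose bag is empty.
    """
    out = []
    for row in rows:
        parent = {k: v for k, v in row.items() if k != nested_field}
        for child in row[nested_field]:
            out.append({**parent, **child})
    return out


top, children = shred(customers, "orders")
flat = unnest(customers, "orders")
restored = unshred(top, children, "orders")
```

In this sketch the shredded form keeps `bob` in the top-level relation even though his bag is empty, while the inner-style `unnest` loses him; that difference is one reason the choice of flat representation matters when nested output has to be rebuilt.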



Published In

Proceedings of the VLDB Endowment, Volume 14, Issue 12
July 2021
587 pages
ISSN: 2150-8097

Publisher

VLDB Endowment


