DOI: 10.1145/3318464.3384677
short-paper
Open access

RASQL: A Powerful Language and its System for Big Data Applications

Published: 31 May 2020

Abstract

There is growing interest in supporting advanced Big Data applications on distributed data processing platforms. Most of these systems support SQL or one of its dialects as the query interface because of its portability and declarative nature. However, the current SQL standard cannot effectively express advanced analytical queries because of its limited support for recursive queries. In this demonstration, we show that this problem can be resolved via a simple SQL extension that delivers greater expressive power by allowing aggregates in recursion. To this end, we propose the Recursive-aggregate-SQL (RASQL) language and its system, built on top of Apache Spark, to express and execute complex queries and declarative algorithms in many applications, such as graph search and machine learning. With a variety of examples, we will (i) show how complicated analytic queries can be expressed with RASQL; (ii) illustrate the formal semantics of the powerful new constructs; and (iii) present a user-friendly interface for interacting with the RASQL system and monitoring the query results.
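
The idea of "aggregates in recursion" can be illustrated outside RASQL. The sketch below is neither RASQL syntax nor the authors' implementation; it is a plain-Python emulation of a recursive shortest-path query in which a min aggregate is applied at every iteration rather than after the recursion finishes. The function name `shortest_paths`, the `(src, dst, cost)` edge layout, and the delta-based evaluation loop are all assumptions made for illustration, and the loop is only guaranteed to terminate when the graph has no negative-cost cycles.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # hypothetical layout: (src, dst, cost)


def shortest_paths(edges: List[Edge], source: str) -> Dict[str, float]:
    """Emulate a recursive query with a min aggregate inside the recursion.

    Conceptually: path(dst, min(cost)) is the base fact (source, 0)
    unioned with path joined to edge, iterated until nothing changes.
    """
    # Index edges by source node so the recursive "join" becomes a lookup.
    out_edges: Dict[str, List[Tuple[str, float]]] = {}
    for src, dst, cost in edges:
        out_edges.setdefault(src, []).append((dst, cost))

    best: Dict[str, float] = {source: 0.0}  # base case: the source costs 0
    delta: Dict[str, float] = dict(best)    # facts that improved last round

    while delta:  # stop at the fixpoint, when no fact improved
        new_delta: Dict[str, float] = {}
        for node, cost in delta.items():
            for dst, edge_cost in out_edges.get(node, []):
                candidate = cost + edge_cost
                # The min aggregate runs inside the recursion: a candidate
                # survives only if it beats the best cost known so far.
                if candidate < best.get(dst, float("inf")):
                    best[dst] = candidate
                    new_delta[dst] = candidate
        delta = new_delta
    return best
```

Calling `shortest_paths([("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "a", 1)], "a")` explores the cycle through `a` without looping forever, because a path that does not improve on the current minimum produces no new fact. Without the aggregate inside the loop, the same recursion would keep generating ever-longer paths around the cycle.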

    Published In

    SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
    June 2020
    2925 pages
    ISBN: 9781450367356
    DOI: 10.1145/3318464
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. big data
    2. query language
    3. recursive query

    Qualifiers

    • Short-paper

    Conference

    SIGMOD/PODS '20
    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 121
    • Downloads (last 6 weeks): 11
    Reflects downloads up to 30 Aug 2024.

    Cited By

    • (2023) Bring Your Own Data Structures to Datalog. Proceedings of the ACM on Programming Languages 7:OOPSLA2 (1198-1223). DOI: 10.1145/3622840. Online publication date: 16-Oct-2023
    • (2023) Communication-Avoiding Recursive Aggregation. 2023 IEEE International Conference on Cluster Computing (CLUSTER) (197-208). DOI: 10.1109/CLUSTER52292.2023.00024. Online publication date: 31-Oct-2023
    • (2021) KDDLog: Performance and Scalability in Knowledge Discovery by Declarative Queries with Aggregates. 2021 IEEE 37th International Conference on Data Engineering (ICDE) (1260-1271). DOI: 10.1109/ICDE51399.2021.00113. Online publication date: Apr-2021
    • (2021) Formal semantics and high performance in declarative machine learning using Datalog. The VLDB Journal: The International Journal on Very Large Data Bases 30:5 (859-881). DOI: 10.1007/s00778-021-00665-6. Online publication date: 31-May-2021
    • (2020) Proceedings 36th International Conference on Logic Programming (Technical Communications). Electronic Proceedings in Theoretical Computer Science 325 (35-37). DOI: 10.4204/EPTCS.325.9. Online publication date: 19-Sep-2020
