DOI: 10.1145/3318464.3384677
short-paper
Open access

RASQL: A Powerful Language and its System for Big Data Applications

Published: 31 May 2020

    Abstract

    There is growing interest in supporting advanced Big Data applications on distributed data processing platforms. Most of these systems offer SQL or one of its dialects as the query interface because of its portability and declarative nature. However, the current SQL standard cannot effectively express advanced analytical queries because of its limited support for recursive queries. In this demonstration, we show that this problem can be resolved with a simple SQL extension that delivers greater expressive power by allowing aggregates in recursion. To this end, we propose the Recursive-aggregate-SQL (RASQL) language and its system, built on top of Apache Spark, to express and execute complex queries and declarative algorithms in many applications, such as graph search and machine learning. With a variety of examples, we will (i) show how complicated analytic queries can be expressed with RASQL; (ii) illustrate the formal semantics of the powerful new constructs; and (iii) present a user-friendly interface for interacting with the RASQL system and monitoring the query results.
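To make "aggregates in recursion" concrete, the following is a minimal Python sketch of single-source shortest paths, a typical graph search query, evaluated as a fixpoint in which a `min` aggregate is applied at every recursive step rather than after the recursion finishes. It is not RASQL syntax and not the RASQL system's implementation; the function name `shortest_paths`, the `(src, dst, cost)` edge layout, and the delta-based loop are assumptions chosen for illustration, and edge costs are assumed to be non-negative so that the loop terminates.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical edge layout for this sketch: (src, dst, cost).
Edge = Tuple[str, str, float]


def shortest_paths(edges: Iterable[Edge], source: str) -> Dict[str, float]:
    """Fixpoint evaluation with a min aggregate applied inside the recursion.

    Assumes non-negative edge costs; a negative cycle would loop forever.
    """
    adjacency: Dict[str, List[Tuple[str, float]]] = {}
    for src, dst, cost in edges:
        adjacency.setdefault(src, []).append((dst, cost))

    best: Dict[str, float] = {source: 0.0}   # base case: the source costs 0
    delta: Dict[str, float] = {source: 0.0}  # facts improved in the last round

    while delta:
        new_delta: Dict[str, float] = {}
        # Recursive case: extend only the paths that improved last round.
        for node, cost in delta.items():
            for dst, weight in adjacency.get(node, []):
                candidate = cost + weight
                # The min aggregate: keep a candidate only if it beats both
                # the best known cost and any candidate found this round.
                if candidate < best.get(dst, float("inf")) and candidate < new_delta.get(
                    dst, float("inf")
                ):
                    new_delta[dst] = candidate
        best.update(new_delta)
        delta = new_delta  # an empty delta means the fixpoint is reached

    return best


if __name__ == "__main__":
    sample_edges = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 5.0), ("c", "a", 1.0)]
    print(shortest_paths(sample_edges, "a"))
```

Because the aggregate prunes dominated paths in every round, the loop terminates on the cyclic sample graph, whereas enumerating all paths first and aggregating afterwards would not.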


      Published In

      SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
      June 2020
      2925 pages
      ISBN:9781450367356
      DOI:10.1145/3318464
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. big data
      2. query language
      3. recursive query

      Qualifiers

      • Short-paper

      Conference

      SIGMOD/PODS '20
      Acceptance Rates

      Overall Acceptance Rate 785 of 4,003 submissions, 20%
