A demonstration of the BigDAWG polystore system

Published: 01 August 2015

Abstract

This paper presents BigDAWG, a reference implementation of a new architecture for "Big Data" applications. Such applications call not only for large-scale analytics but also for real-time streaming support, smaller analytics at interactive speeds, data visualization, and cross-storage-system queries. Guided by the principle that "one size does not fit all", we build on top of a variety of storage engines, each designed for a specialized use case. To illustrate the promise of this approach, we demonstrate its effectiveness on a hospital application using data from an intensive care unit (ICU). This complex application serves the needs of doctors and researchers and provides real-time support for streams of patient data. It showcases novel approaches for querying across multiple storage engines, data visualization, and scalable real-time analytics.

Published In

Proceedings of the VLDB Endowment, Volume 8, Issue 12
Proceedings of the 41st International Conference on Very Large Data Bases, Kohala Coast, Hawaii
August 2015
728 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Qualifiers

Research-article