DOI: 10.1145/1951365.1951432

Big data and cloud computing: current state and future opportunities

Published: 21 March 2011

Abstract

Scalable database management systems (DBMS), both for update-intensive application workloads and for decision support systems performing descriptive and deep analytics, are a critical part of the cloud infrastructure and play an important role in ensuring the smooth transition of applications from traditional enterprise infrastructures to next-generation cloud infrastructures. Though scalable data management has been a vision for more than three decades and much research has focused on large-scale data management in the traditional enterprise setting, cloud computing brings its own set of novel challenges that must be addressed to ensure the success of data management solutions in the cloud environment. This tutorial presents an organized picture of the challenges faced by application developers and DBMS designers in developing and deploying internet-scale applications. Our background study encompasses both classes of systems: (i) those supporting update-heavy applications, and (ii) those for ad hoc analytics and decision support. We then provide an in-depth analysis of systems for supporting update-intensive web applications and survey the state of the art in this domain. We crystallize the design choices made by some successful large-scale database management systems, analyze the application demands and access patterns, and enumerate the desiderata for a cloud-bound DBMS.


    Published In

    EDBT/ICDT '11: Proceedings of the 14th International Conference on Extending Database Technology
    March 2011
    587 pages
    ISBN:9781450305280
    DOI:10.1145/1951365

    Sponsors

    • Microsoft Research

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article

    Conference

    EDBT/ICDT '11 joint conference
    March 21 - 24, 2011
    Uppsala, Sweden

    Acceptance Rates

    Overall Acceptance Rate 7 of 10 submissions, 70%
