
The next information architecture evolution: the data lake wave

Published: 01 November 2016
  • Abstract

    Data warehouses and data marts have long been considered the only solution for providing end users with decisional information. More recently, data lakes have been proposed in order to govern data swamps. However, no formal definition has been proposed in the literature. Existing works are incomplete and miss important parts of the topic. In particular, they do not address the influence of data gravity or the infrastructure role of these solutions, and they propose divergent definitions and positionings regarding the usage of data lakes and their interaction with existing decision support systems.
    In this paper, we propose a novel definition of data lakes, together with a comparison with other solutions over several criteria, such as how they are populated, how they are used, and the profile of the data lake end user. We claim that data lakes are complementary components in decisional information systems, and we discuss their position and interactions with respect to the other components by proposing an interaction model.

      Published In

      MEDES: Proceedings of the 8th International Conference on Management of Digital EcoSystems
      November 2016
      243 pages
      ISBN:9781450342674
      DOI:10.1145/3012071
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. data governance
      2. data lab
      3. data laboratory
      4. data lakes
      5. data reservoirs
      6. data warehouses
      7. digital transformation
      8. internet of things

      Qualifiers

      • Research-article

      Conference

      MEDES'16

      Acceptance Rates

      Overall Acceptance Rate 267 of 682 submissions, 39%
