- research-article, December 2024
Yugen SDL: Semantic Data Lake Design for Relational Database from Enterprise Data Platforms
WSSE '24: Proceedings of the 2024 The 6th World Symposium on Software Engineering (WSSE), Pages 54–61, https://doi.org/10.1145/3698062.3698070
Although data lake technology has received increasing research attention in recent years with the popularity of Big Data and heterogeneous data technologies, it still receives little attention in the enterprise space and relation to relational databases. ...
- Article, June 2024
Implementation Patterns for Zone Architectures in Enterprise-Grade Data Lakes
Abstract: In industry practice, zone models have been established as data lake architectures of choice to enable the reuse of data preparation, data modeling, and analytical results across the entire platform. However, when implementing a zone-based data ...
- research-article, September 2024
Implementation of Data Lake: Narratives of Quasi-outsourcing
Procedia Computer Science (PROCS), Volume 238, Issue C, Pages 550–557, https://doi.org/10.1016/j.procs.2024.06.059
Abstract: This study explores implementation of Data Lake for Big data analytics marked by intervention of quasi-outsourcing personnel. Though an agentic interventionist, it takes place in an active Multinational Consumer Goods company (MCGC) of socially ...
- Article, October 2023
Multi-dimensional Complex Query Optimization for Disease-Specific Data Exploration Based on Data Lake
Abstract: In the medical field, huge amounts of multi-modal medical data are generated daily from various smart devices. Besides EMRs, medical data include a large amount of unstructured data such as MRI scans, CT scans, and X-rays. These massive, ...
- Article, September 2023
Assessment of Data Quality Through Multi-granularity Data Profiling
Advances in Databases and Information Systems, Pages 195–209, https://doi.org/10.1007/978-3-031-42914-9_14
Abstract: The management of modern solutions for Big Data management and analytics, most notably Data Lakes and Data Lakehouses, is faced with new challenges stemming from the versatility offered by such technologies, as well as the continuously evolving ...
- Article, July 2023
Precision Learning Through Data Intelligence
Abstract: Simulations and learning environments often generate massive amounts of human performance data, which can be a challenge to manage and interpret in a meaningful manner. Lacking a context that provides meaning to data, it exists as information ...
- research-article, January 2023
Data Platforms for Real-time Insights in Healthcare: Systematic Review
Procedia Computer Science (PROCS), Volume 220, Issue C, Pages 826–831, https://doi.org/10.1016/j.procs.2023.03.110
Abstract: The ever-growing usage and popularity of Internet of Things devices, coupled with Big Data technologies and machine learning algorithms, have allowed for data engineers to explore new opportunities in healthcare and continuous care. Furthermore, ...
- research-article, January 2023
Lessons learnt in industrial data platform integration
- Sylvain Lacroix,
- Emeric Ostermeyer,
- Julien Le Duigou,
- Florent Bornard,
- Sylvain Rival,
- Marie-France Mary,
- Benoit Eynard
Procedia Computer Science (PROCS), Volume 217, Issue C, Pages 1660–1669, https://doi.org/10.1016/j.procs.2022.12.366
Abstract: In a paradigm of value creation from data collection and analysis, using Big Data, and particularly mining techniques, industrial actors are confronted to a series of issue regarding data collection, storage, distribution, and management.
In this ...
- Article, September 2022
A Knowledge-Based Approach to Support Analytic Query Answering in Semantic Data Lakes
Advances in Databases and Information Systems, Pages 179–192, https://doi.org/10.1007/978-3-031-15740-0_14
Abstract: The increased flexibility brought by Data Lake technologies, along with size and heterogeneity of quickly changing data sources, bring novel challenges to their management. Making sense of disparate data and supporting users to identify the most ...
- research-article, July 2022
DLToDW: Transferring Relational and NoSQL Databases from a Data Lake
Abstract: Over the past decade, digital transformation has led to the evolution of databases towards Big Data. A need to collect and analyze data from different sources has emerged. At the same time, traditional decision support systems are unable to meet ...
- article, June 2022
A Big Data Pipeline and Machine Learning for Uniform Semantic Representation of Data and Documents From IT Systems of the Italian Ministry of Justice
- Beniamino Di Martino,
- Luigi Colucci Cante,
- Salvatore D'Angelo,
- Antonio Esposito,
- Mariangela Graziano,
- Fiammetta Marulli,
- Pietro Lupi,
- Alessandra Cataldi
International Journal of Grid and High Performance Computing (IJGHPC-IGI), Volume 14, Issue 1, Pages 1–31, https://doi.org/10.4018/IJGHPC.301579
In this paper a Big Data Pipeline is presented, taking in consideration both structured and unstructured data made available by the Italian Ministry of Justice, regarding their Telematic Civil Process. Indeed, the complexity and volume of the data ...
- research-article, January 2022
A Scalable framework for data lakes ingestion
Procedia Computer Science (PROCS), Volume 215, Issue C, Pages 809–814, https://doi.org/10.1016/j.procs.2022.12.083
Abstract: In the age of big data, the way we store and analyze heterogeneous data has changed. The complexity of various data inputs in the lakes indicates the significant importance of data ingestion that aids companies in making sense and getting more ...
- research-article, January 2022
Data Mesh: Concepts and Principles of a Paradigm Shift in Data Architectures
Procedia Computer Science (PROCS), Volume 196, Issue C, Pages 263–271, https://doi.org/10.1016/j.procs.2021.12.013
Abstract: Inherent to the growing use of the most varied forms of software (e.g., social applications), there is the creation and storage of data that, due to its characteristics (volume, variety, and velocity), make the concept of Big Data emerge. Big Data ...
- research-article, December 2021
A Semantic Data Lake Model for Analytic Query-Driven Discovery
iiWAS2021: The 23rd International Conference on Information Integration and Web Intelligence, Pages 183–186, https://doi.org/10.1145/3487664.3487783
Data Lake (DL) architectures have recently emerged as an effective solution to the problem of data analytics with big, highly heterogeneous, and quickly changing data sources. However, novel challenges arise too, including how to make sense of ...
- Article, October 2021
MKGB: A Medical Knowledge Graph Construction Framework Based on Data Lake and Active Learning
Abstract: Medical knowledge graph (MKG) provides ideal technical support for integrating multi-source heterogeneous data and enhancing graph-based services. These multi-source data are usually huge, heterogeneous, and difficult to manage. To ensure that the ...
- Article, September 2021
MHDP: An Efficient Data Lake Platform for Medical Multi-source Heterogeneous Data
- Peng Ren,
- Shuaibo Li,
- Wei Hou,
- Wenkui Zheng,
- Zhen Li,
- Qin Cui,
- Wang Chang,
- Xin Li,
- Chun Zeng,
- Ming Sheng,
- Yong Zhang
Abstract: In medical domain, huge amounts of data are generated at all times. These data are usually difficult to access, with poor data quality and many data islands. Besides, with a wide range of sources and complex structure, these data contain essential ...
- Article, September 2021
Intelligent Visualization System for Big Multi-source Medical Data Based on Data Lake
Abstract: With the rapid development of information technology, large amounts of multi-source data are constantly being generated in medical field. The automatic visualization system based on them has gained a lot of attention, since the intuitive data ...
- research-article, September 2021
Analysis-oriented Metadata for Data Lakes
IDEAS '21: Proceedings of the 25th International Database Engineering & Applications Symposium, Pages 194–203, https://doi.org/10.1145/3472163.3472273
Data lakes are supposed to enable analysts to perform more efficient and efficacious data analysis by crossing multiple existing data sources, processes and analyses. However, it is impossible to achieve that when a data lake does not have a metadata ...
- research-article, September 2021
A Zone-Based Data Lake Architecture for IoT, Small and Big Data
IDEAS '21: Proceedings of the 25th International Database Engineering & Applications Symposium, Pages 94–102, https://doi.org/10.1145/3472163.3472185
Data lakes are supposed to enable analysts to perform more efficient and efficacious data analysis by crossing multiple existing data sources, processes and analyses. However, it is impossible to achieve that when a data lake does not have a metadata ...
- research-article, April 2020
iStory: Intelligent Storytelling with Social Data
WWW '20: Companion Proceedings of the Web Conference 2020, Pages 253–256, https://doi.org/10.1145/3366424.3383553
The production of knowledge from ever increasing amount of social data is seen by many organizations as an increasingly important capability that can complement the traditional analytics sources. Examples include extracting knowledge and deriving ...