- Article, September 2024
A Deep Learning Multi-omics Framework to Combine Microbiome and Metabolome Profiles for Disease Classification
Artificial Neural Networks and Machine Learning – ICANN 2024, Pages 3–14, https://doi.org/10.1007/978-3-031-72353-7_1
Abstract: Microbiome and metabolome contain information about host disease. Therefore, a multi-omics analysis of these data types can provide key constraints for disease classification. However, due to multi-omics data’s complex and high-dimensional nature, ...
- research-article, September 2024
Data integration from traditional to big data: main features and comparisons of ETL approaches
The Journal of Supercomputing (JSCO), Volume 80, Issue 19, Pages 26687–26725, https://doi.org/10.1007/s11227-024-06413-1
Abstract: Data integration combines information from different sources to provide a comprehensive view for making informed business decisions. The ETL (Extract, Transform, and Load) process is essential in data integration. In the past two decades, modeling ...
- research-article, August 2024
Empirical Bayes linked matrix decomposition
Machine Learning (MALE), Volume 113, Issue 10, Pages 7451–7477, https://doi.org/10.1007/s10994-024-06599-8
Abstract: Data for several applications in diverse fields can be represented as multiple matrices that are linked across rows or columns. This is particularly common in molecular biomedical research, in which multiple molecular “omics” technologies may ...
- Article, September 2024
High-Dimensional Nearest Neighbor Search-Based Blocking in Entity Resolution
Abstract: Entity resolution is a key task in data integration and fusion, aiming to find all records describing the same real-world entity from multiple data sources. Blocking is an important step in entity resolution tasks to address the secondary time ...
- review-article, June 2024
Situational Data Integration in Question Answering systems: a survey over two decades
- Maria Helena Franciscatto,
- Luis Carlos Erpen de Bona,
- Celio Trois,
- Marcos Didonet Del Fabro,
- João Carlos Damasceno Lima
Knowledge and Information Systems (KAIS), Volume 66, Issue 10, Pages 5875–5918, https://doi.org/10.1007/s10115-024-02136-0
Abstract: Question Answering (QA) systems provide accurate answers to questions; however, they lack the ability to consolidate data from multiple sources, making it difficult to manage complex questions that could be answered with additional data retrieved ...
- review-article, July 2024
Deep generative models in single-cell omics
Computers in Biology and Medicine (CBIM), Volume 176, Issue C, https://doi.org/10.1016/j.compbiomed.2024.108561
Abstract: Deep Generative Models (DGMs) are becoming instrumental for inferring probability distributions inherent to complex processes, such as most questions in biomedical research. For many years, there was a lack of mathematical methods that would ...
Highlights:
- Deep generative models propose a latent probability distribution to explain the data.
- Deep generative models rely on abundant data and good task definition.
- Deep generative models expand integrative single-cell -omics data ...
- research-article, June 2024
Integrating social media data: Venues, groups and activities
Expert Systems with Applications: An International Journal (EXWA), Volume 243, Issue C, https://doi.org/10.1016/j.eswa.2023.122902
Abstract: Social media has been fuelling necessary research in different areas, including the large-scale study of urban societies. Most research is done with a single source of information. Integrating data from multiple sources provides several benefits; ...
Highlights:
- Integration solution from the perspective of venues that improves state-of-the-art.
- Group of users integration solution; no other effort in this direction was found.
- Integration solution based on the activity performed by users in ...
- research-article, July 2024
Developing a goal-driven data integration framework for effective data analytics
Abstract: Data integration plays a crucial role in business intelligence, aiding decision-makers by consolidating data from heterogeneous sources to provide deep insights into business operations and performance. In the big data era, automated data ...
Highlights:
- This study designs and instantiates a goal-driven data integration framework for data analytics.
- The proposed innovative design automates data integration for non-technical data users.
- Our artifact shows promising performance in ...
- research-article, April 2024
How opportunistic mobile monitoring can enhance air quality assessment?
Geoinformatica (KLU-GEIN), Volume 28, Issue 4, Pages 679–710, https://doi.org/10.1007/s10707-024-00516-w
Abstract: The deteriorating air quality in urban areas, particularly in developing countries, has led to increased attention being paid to the issue. Daily reports of air pollution are essential to effectively manage public health risks. Pollution ...
- survey, April 2024
Semantic Data Integration and Querying: A Survey and Challenges
- Maroua Masmoudi,
- Sana Ben Abdallah Ben Lamine,
- Mohamed Hedi Karray,
- Bernard Archimede,
- Hajer Baazaoui Zghal
ACM Computing Surveys (CSUR), Volume 56, Issue 8, Article No.: 209, Pages 1–35, https://doi.org/10.1145/3653317
Digital revolution produces massive, heterogeneous and isolated data. These latter remain underutilized, unsuitable for integrated querying and knowledge discovering. Hence the importance of this survey on data integration which identifies challenging ...
- research-article, April 2024
Multimodal Machine Learning in Image-Based and Clinical Biomedicine: Survey and Prospects
- Elisa Warner,
- Joonsang Lee,
- William Hsu,
- Tanveer Syeda-Mahmood,
- Charles E. Kahn Jr.,
- Olivier Gevaert,
- Arvind Rao
International Journal of Computer Vision (IJCV), Volume 132, Issue 9, Pages 3753–3769, https://doi.org/10.1007/s11263-024-02032-8
Abstract: Machine learning (ML) applications in medical artificial intelligence (AI) systems have shifted from traditional and statistical methods to increasing application of deep learning models. This survey navigates the current landscape of multimodal ...
- research-article, May 2024
Balancing observational data and experiential knowledge in environmental flows modeling
Environmental Modelling & Software (ENMS), Volume 173, Issue C, https://doi.org/10.1016/j.envsoft.2024.105943
Abstract: Environmental flow (e-flow) decision making relies on flow-ecology models to predict ecological outcomes under different flow regimes. While expert knowledge has traditionally informed these models, there is increasing use of data-driven ...
Highlights:
- Exploration of a method to incorporate monitoring data into expert-opinion built Bayesian CPNs.
- Synthetic data records of varying characteristics were used to assess impact on model outcomes.
- Demonstrate that complementary use of ...
- review-article, March 2024
GSM: A generalized approach to Supervised Meta-blocking for scalable entity resolution
Abstract: Entity Resolution (ER) constitutes a core data integration task that relies on Blocking in order to tame its quadratic time complexity. Schema-agnostic blocking achieves very high recall, requires no domain knowledge and applies to data of any ...
Highlights:
- Formalization of meta-blocking as a probabilistic classification task.
- A supervised meta-blocking algorithm that requires only 50 examples for training.
- Four new weighting schemes that enhance the meta-blocking performance.
- ...
- research-article, January 2024
Record Fusion via Inference and Data Augmentation
ACM / IMS Journal of Data Science (JDS), Volume 1, Issue 1, Article No.: 2, Pages 1–23, https://doi.org/10.1145/3593579
We introduce a learning framework for the problem of unifying conflicting data in multiple records referring to the same entity; we call this problem “record fusion.” Record fusion generalizes two known problems: “data fusion” and “golden record.” Our ...
Highlights:
Problem statement
Record fusion involves merging duplicate records from different sources into a single, unified record. It helps to improve data quality, reduce redundancy, and enable more accurate analysis and decision-making. However, the ...
- research-article, December 2023
Heterogeneous multi-task feature learning with mixed regularization
- Article, December 2023
Provenance-Aware Data Integration and Summarization Querying for Knowledge Graphs
Information Integration and Web Intelligence, Pages 293–308, https://doi.org/10.1007/978-3-031-48316-5_29
Abstract: Knowledge graphs are an increasingly popular choice for integrating heterogeneous data that have been collected independently from multiple sources. In many scenarios, including research and industry collaborations, the source data can be provided ...
- research-article, November 2023
A multi-facet analysis of BERT-based entity matching models
The VLDB Journal: The International Journal on Very Large Data Bases (VLDB), Volume 33, Issue 4, Pages 1039–1064, https://doi.org/10.1007/s00778-023-00824-x
Abstract: State-of-the-art Entity Matching approaches rely on transformer architectures, such as BERT, for generating highly contextualized embeddings of terms. The embeddings are then used to predict whether pairs of entity descriptions refer to the same ...
- Article, November 2023
Managing Personal Information
Abstract: There is an increasing awareness of the potential that our own self-gathered personal information has for our wellness and our health. This is partly because of our increasing awareness of what others – the major internet companies mainly – have ...
- research-article, February 2024
Integration of incomplete multi-omics data using Knowledge Distillation and Supervised Variational Autoencoders for disease progression prediction
Journal of Biomedical Informatics (JOBI), Volume 147, Issue C, https://doi.org/10.1016/j.jbi.2023.104512
Abstract (Objective): The rapid advancement of high-throughput technologies in the biomedical field has resulted in the accumulation of diverse omics data types, such as mRNA expression, DNA methylation, and microRNA expression, for studying various ...
Highlights:
- Proposed method distinguishes long and short-term breast and kidney cancer survivors.
- Knowledge distillation-based VAE enables the utilization of all modalities.
- Predicting disease progression leads to more personalized and ...