DOI: 10.1145/3650203
DEEM '24: Proceedings of the Eighth Workshop on Data Management for End-to-End Machine Learning
ACM 2024 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
SIGMOD/PODS '24: International Conference on Management of Data, Santiago, Chile, 9 June 2024
ISBN:
979-8-4007-0611-0
Published:
09 June 2024

Abstract

No abstract available.

research-article
Open Access
Croissant: A Metadata Format for ML-Ready Datasets

Data is a critical resource for Machine Learning (ML), yet working with data remains a key friction point. This paper introduces Croissant, a metadata format for datasets that simplifies how data is used by ML tools and frameworks. Croissant makes ...

research-article
Open Access
Towards Interactively Improving ML Data Preparation Code via "Shadow Pipelines"

Data scientists develop ML pipelines in an iterative manner: they repeatedly screen a pipeline for potential issues, debug it, and then revise and improve its code according to their findings. However, this manual process is tedious and error-prone. ...

research-article
Open Access
tailwiz: Empowering Domain Experts with Easy-to-Use, Task-Specific Natural Language Processing Models

Experts outside the field of machine learning (ML) are interested in using ML techniques to analyze their textual data, but they are inhibited by a lack of convenient natural language processing (NLP) tools. To address this issue, we present tailwiz, an ...

research-article
Open Access
AIDB: a Sparsely Materialized Database for Queries using Machine Learning

Analysts and scientists are interested in automatically analyzing the semantic contents of unstructured, non-tabular data (videos, images, text, and audio). These analysts have turned to unstructured data systems leveraging machine learning (ML). The ...

research-article
Open Access
Reaching the Edge of the Edge: Image Analysis in Space

Satellites have become more widely available due to the reduction in size and cost of their components. As a result, there has been an advent of smaller organizations having the ability to deploy satellites with a variety of data-intensive applications ...

research-article
Open Access
Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly

With the emergence of AI regulations, such as the EU AI Act, requirements for simple data lineage, enforcement of low data bias, and energy efficiency have become a priority for everyone offering AI services. Being pre-trained on versatile and a vast ...

research-article
Open Access
Reactive Dataflow for Inflight Error Handling in ML Workflows

Modern data analytics pipelines comprise traditional data transformation operations and pre-trained ML models deployed as user-defined functions (UDFs). Such pipelines, which we call ML workflows, generally produce erroneous results due to data errors ...

research-article
Open Access
Towards Efficient Data Wrangling with LLMs using Code Generation

While LLM-based data wrangling approaches that process each row of data have shown promising benchmark results, computational costs still limit their suitability for real-world use cases on large datasets. We revisit code generation using LLMs for ...

research-article
Reproducible data science over data lakes: replayable data pipelines with Bauplan and Nessie

As the Lakehouse architecture becomes more widespread, ensuring the reproducibility of data workloads over data lakes emerges as a crucial concern for data practitioners. However, achieving reproducibility remains challenging. The size of data pipelines ...

research-article
Open Access
Nautilus: A Benchmarking Platform for DBMS Knob Tuning

Recent research has shown the importance of tuning DBMS configuration knobs to achieve high performance. As a result, a large number of search-based and learning-based auto-tuning methods have been proposed. However, despite the promising results, we ...

research-article
DLProv: A Data-Centric Support for Deep Learning Workflow Analyses

The Deep Learning (DL) workflow involves several steps of data transformation. Evaluating various configurations at each step of the workflow may be a complex task when it comes to selecting DL models. This decision-making process requires basing ...

Acceptance Rates

DEEM '24 Paper Acceptance Rate 12 of 17 submissions, 71%;
Overall Acceptance Rate 44 of 67 submissions, 66%
Year        Submitted   Accepted   Rate
DEEM '24    17          12         71%
DEEM '23    13          9          69%
DEEM '22    13          9          69%
DEEM '20    8           4          50%
DEEM '18    16          10         63%
Overall     67          44         66%