Author: Rezig, el Kindi : Search

research-article

Open Access

Eastwood-Tidy: C Linting for Automated Code Style Assessment in Programming Courses

SIGCSE 2023: Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1, March 2023, Pages 799–805, https://doi.org/10.1145/3545945.3569817

Computer Science students receive significant instruction towards writing functioning code that correctly satisfies requirements. Auto-graders have been shown effective at scalably running student code and determining whether the code correctly ...

proceeding

Heterogeneous Data Management, Polystores, and Analytics for Healthcare: VLDB Workshops, Poly 2022 and DMAH 2022, Virtual Event, September 9, 2022, Revised Selected Papers

Article

A Survey of Data Challenges Across a Modernizing Bureaucracy: A New Perspective on Examining Old Government Problems

Heterogeneous Data Management, Polystores, and Analytics for Healthcare, Sep 2022, Pages 10–23, https://doi.org/10.1007/978-3-031-23905-2_2

Abstract

The introduction and increasing popularity of artificial intelligence (AI) and machine learning (ML) technologies allow organizations to gain valuable insights from their copious amounts of data. However, legacy organizations often struggle to ...

proceeding

Heterogeneous Data Management, Polystores, and Analytics for Healthcare: VLDB Workshops, Poly 2021 and DMAH 2021, Virtual Event, August 20, 2021, Revised Selected Papers

research-article

DICE: data discovery by example

Proceedings of the VLDB Endowment (PVLDB), Volume 14, Issue 12, Pages 2819–2822, https://doi.org/10.14778/3476311.3476353

In order to conduct analytical tasks, data scientists often need to find relevant data from an avalanche of sources (e.g., data lakes, large organizational databases). This effort is typically made in an ad hoc, non-systematic manner, which makes it a ...

research-article

Horizon: scalable dependency-driven data cleaning

Proceedings of the VLDB Endowment (PVLDB), Volume 14, Issue 11, Pages 2546–2554, https://doi.org/10.14778/3476249.3476301

A large class of data repair algorithms rely on integrity constraints to detect and repair errors. A well-studied class of constraints is Functional Dependencies (FDs, for short). Although there has been an increased interest in developing general data ...

Article

Towards Data Discovery by Example

Heterogeneous Data Management, Polystores, and Analytics for Healthcare, Sep 2020, Pages 66–71, https://doi.org/10.1007/978-3-030-71055-2_6

Abstract

Data scientists today have to query an avalanche of multi-source data (e.g., data lakes, company databases) for diverse analytical tasks. Data discovery is labor-intensive as users have to find the right tables, and the combination thereof to ...

research-article

Debugging large-scale data science pipelines using dagger

Proceedings of the VLDB Endowment (PVLDB), Volume 13, Issue 12, Pages 2993–2996, https://doi.org/10.14778/3415478.3415527

Data pipelines are the new code. Consequently, data scientists need new tools to support the often time-consuming process of debugging their pipelines. We introduce Dagger, an end-to-end system to debug and mitigate data-centric errors in data pipelines,...

research-article

Data Civilizer 2.0: a holistic framework for data preparation and analytics

Proceedings of the VLDB Endowment (PVLDB), Volume 12, Issue 12, Pages 1954–1957, https://doi.org/10.14778/3352063.3352108

Data scientists spend over 80% of their time (1) parameter-tuning machine learning models and (2) iterating between data cleaning and machine learning model execution. While there are existing efforts to support the first requirement, there is currently ...

research-article

Open Access

Towards an End-to-End Human-Centric Data Cleaning Framework

HILDA '19: Proceedings of the Workshop on Human-In-the-Loop Data Analytics, July 2019, Article No.: 1, Pages 1–7, https://doi.org/10.1145/3328519.3329133

Data Cleaning refers to the process of detecting and fixing errors in the data. Human involvement is instrumental at several stages of this process such as providing rules or validating computed repairs. There is a plethora of data cleaning algorithms ...

research-article

Tornado: a distributed spatio-textual stream processing system

Proceedings of the VLDB Endowment (PVLDB), Volume 8, Issue 12, Pages 2020–2023, https://doi.org/10.14778/2824032.2824126

The widespread use of location-aware devices together with the increased popularity of micro-blogging applications (e.g., Twitter) led to the creation of large streams of spatio-textual data. In order to serve real-time applications, the processing of ...

article

Leveraging human experts' knowledge to detect and publish compositions of Semantic Web services in a repository

International Journal of Business Information Systems (IJBIS), Volume 14, Issue 1, July 2013, Pages 83–95, https://doi.org/10.1504/IJBIS.2013.055548

Web services have added a considerable abstraction level to interact with applications regardless of their environment. Semantic Web services have augmented web services with rigorous models to describe web services' functionalities and how they ...

demonstration

U-MAP: a system for usage-based schema matching and mapping

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, June 2011, Pages 1287–1290, https://doi.org/10.1145/1989323.1989478

This demo shows how usage information buried in query logs can play a central role in data integration and data exchange. More specifically, our system U-Map uses query logs to generate correspondences between the attributes of two different schemas and ...
