Data Clustering using Expectation Maximization algorithm. To cite this Original Software Publication: https://www.sciencedirect.com/science/article/pii/S2352711021001771
Updated Oct 25, 2021 (R)
Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)
Implementation of data typology for imbalanced datasets.
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
TRIAGE: Characterizing and auditing training data for improved regression (NeurIPS 2023)
Enhancing Efficiency in Multidevice Federated Learning through Data Selection
A curated, but incomplete, list of data-centric AI resources.
You can’t handle the (dirty) truth: Data-centric insights improve pseudo-labeling
A multi-view panorama of Data-Centric AI: Techniques, Tools, and Applications (ECAI Tutorial 2024)
Collaboratively Learning Federated Models from Noisy Decentralized Data
Code for our paper "Towards Trustworthy Dataset Distillation" (Pattern Recognition 2025)
A Doctor for your data