Jun 29, 2022 · We present a scalable method for automatically distilling a model's failure modes. Specifically, we harness linear classifiers to identify consistent error ...
Feb 1, 2023 · We present a scalable method for automatically distilling and captioning a model's failure modes as directions in a latent space.
Distilling Model Failures as Directions in Latent Space - MadryLab/failure-directions.
Distilling Model Failures as Directions in Latent Space. Saachi Jain, Hannah ... At a high level, our approach aims to model failure modes as directions within a ...
Sep 8, 2024 · Distilling Model Failures as Directions in Latent Space. June 2022. DOI:10.48550/arXiv.2206.14754. Authors: Saachi Jain.
How can we detect model failures on these subpopulations? The guiding principle of our framework is to model such failure modes as directions within a certain ...
Jun 29, 2022 · Distilling Model Failures as Directions in Latent Space. 2022-06-29 ... distilling a model's failure modes. Specifically, we harness ...
Jul 4, 2022 · Saachi Jain, Hannah Lawrence, Ankur Moitra, Aleksander Madry: Distilling Model Failures as Directions in Latent Space.
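The snippets above describe modeling failure modes as directions in a latent space, found with linear classifiers. A minimal sketch of that general idea follows. It is not the authors' implementation: the 2-D Gaussian "embeddings", the rule that marks an example as an error, the logistic-regression stand-in for the linear classifier, and every function name are assumptions made for illustration.

```python
import math
import random


def fit_linear_classifier(xs, ys, lr=0.5, epochs=300):
    """Logistic regression by full-batch gradient descent; returns (w, b)."""
    dim = len(xs[0])
    w = [0.0] * dim
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(xs, ys):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


# Hypothetical data: 2-D points stand in for latent embeddings, and the
# (assumed) model under study fails whenever the first coordinate is large.
rng = random.Random(0)
embeddings = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
is_error = [1 if e[0] > 0.5 else 0 for e in embeddings]

# Fit a linear classifier that separates errors from correct predictions;
# its normalized weight vector serves as the candidate failure direction.
w, b = fit_linear_classifier(embeddings, is_error)
direction = unit(w)

# Project every example onto the direction; higher scores should align
# with the failure mode.
scores = [dot(direction, e) for e in embeddings]
err_scores = [s for s, y in zip(scores, is_error) if y == 1]
ok_scores = [s for s, y in zip(scores, is_error) if y == 0]
mean_err = sum(err_scores) / len(err_scores)
mean_ok = sum(ok_scores) / len(ok_scores)

print("failure direction:", direction)
print("mean projection (errors):", mean_err)
print("mean projection (correct):", mean_ok)
```

In this toy setup the recovered direction should point mostly along the first coordinate, and misclassified examples should project further along it than correct ones. Real embeddings and a different linear classifier would change the details but not the overall shape of the procedure.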