Showing 1–32 of 32 results for author: Ravichandran, A

Searching in archive cs.
  1. arXiv:2407.10958  [pdf, other]

    cs.CV

    InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models

    Authors: Nirat Saini, Navaneeth Bodla, Ashish Shrivastava, Avinash Ravichandran, Xiao Zhang, Abhinav Shrivastava, Bharat Singh

    Abstract: We introduce InVi, an approach for inserting or replacing objects within videos (referred to as inpainting) using off-the-shelf, text-to-image latent diffusion models. InVi targets controlled manipulation of objects and blending them seamlessly into a background video, unlike existing video editing methods that focus on comprehensive re-styling or entire scene alterations. To achieve this goal, we…

    Submitted 15 July, 2024; originally announced July 2024.

  2. arXiv:2406.10722  [pdf, other]

    cs.CV cs.AI cs.LG

    GenMM: Geometrically and Temporally Consistent Multimodal Data Generation for Video and LiDAR

    Authors: Bharat Singh, Viveka Kulharia, Luyu Yang, Avinash Ravichandran, Ambrish Tyagi, Ashish Shrivastava

    Abstract: Multimodal synthetic data generation is crucial in domains such as autonomous driving, robotics, augmented/virtual reality, and retail. We propose a novel approach, GenMM, for jointly editing RGB videos and LiDAR scans by inserting temporally and geometrically consistent 3D objects. Our method uses a reference image and 3D bounding boxes to seamlessly insert and blend new objects into target video…

    Submitted 15 June, 2024; originally announced June 2024.

  3. arXiv:2303.15591  [pdf, other]

    cs.CV

    Learning Expressive Prompting With Residuals for Vision Transformers

    Authors: Rajshekhar Das, Yonatan Dukler, Avinash Ravichandran, Ashwin Swaminathan

    Abstract: Prompt learning is an efficient approach to adapt transformers by inserting a learnable set of parameters into the input and intermediate representations of a pre-trained model. In this work, we present Expressive Prompts with Residuals (EXPRES), which modifies the prompt learning paradigm specifically for effective adaptation of vision transformers (ViT). Our method constructs downstream representat…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: Accepted at CVPR (2023)

  4. arXiv:2303.14814  [pdf, other]

    cs.CV cs.AI cs.CL

    WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation

    Authors: Jongheon Jeong, Yang Zou, Taewan Kim, Dongqing Zhang, Avinash Ravichandran, Onkar Dabeer

    Abstract: Visual anomaly classification and segmentation are vital for automating industrial quality inspection. The focus of prior research in the field has been on training custom models for each quality inspection task, which requires task-specific images and annotation. In this paper we move away from this regime, addressing zero-shot and few-normal-shot anomaly classification and segmentation. Recently…

    Submitted 26 March, 2023; originally announced March 2023.

    Comments: Accepted to Conference on Computer Vision and Pattern Recognition (CVPR) 2023

  5. arXiv:2303.04105  [pdf, other]

    cs.LG cs.CV

    Your representations are in the network: composable and parallel adaptation for large scale models

    Authors: Yonatan Dukler, Alessandro Achille, Hao Yang, Varsha Vivek, Luca Zancato, Benjamin Bowman, Avinash Ravichandran, Charless Fowlkes, Ashwin Swaminathan, Stefano Soatto

    Abstract: We propose InCA, a lightweight method for transfer learning that cross-attends to any activation layer of a pre-trained model. During training, InCA uses a single forward pass to extract multiple activations, which are passed to external cross-attention adapters, trained anew and combined or selected for downstream tasks. We show that, even when selecting a single top-scoring adapter, InCA achieve…

    Submitted 31 October, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

    Comments: Accepted to NeurIPS 2023

  6. arXiv:2303.01598  [pdf, other]

    cs.CV cs.LG

    A Meta-Learning Approach to Predicting Performance and Data Requirements

    Authors: Achin Jain, Gurumurthy Swaminathan, Paolo Favaro, Hao Yang, Avinash Ravichandran, Hrayr Harutyunyan, Alessandro Achille, Onkar Dabeer, Bernt Schiele, Ashwin Swaminathan, Stefano Soatto

    Abstract: We propose an approach to estimate the number of samples required for a model to reach a target performance. We find that the power law, the de facto principle to estimate model performance, leads to large error when using a small dataset (e.g., 5 samples per class) for extrapolation. This is because the log-performance error against the log-dataset size follows a nonlinear progression in the few-…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  7. arXiv:2209.05654  [pdf, other]

    cs.CV

    ComplETR: Reducing the cost of annotations for object detection in dense scenes with vision transformers

    Authors: Achin Jain, Kibok Lee, Gurumurthy Swaminathan, Hao Yang, Bernt Schiele, Avinash Ravichandran, Onkar Dabeer

    Abstract: Annotating bounding boxes for object detection is expensive, time-consuming, and error-prone. In this work, we propose a DETR-based framework called ComplETR that is designed to explicitly complete missing annotations in partially annotated dense scene datasets. This reduces the need to annotate every object instance in the scene, thereby reducing annotation cost. ComplETR augments object queries i…

    Submitted 12 September, 2022; originally announced September 2022.

  8. arXiv:2208.05688  [pdf, other]

    cs.CV cs.AI cs.LG

    Semi-supervised Vision Transformers at Scale

    Authors: Zhaowei Cai, Avinash Ravichandran, Paolo Favaro, Manchen Wang, Davide Modolo, Rahul Bhotika, Zhuowen Tu, Stefano Soatto

    Abstract: We study semi-supervised learning (SSL) for vision transformers (ViT), an under-explored topic despite the wide adoption of the ViT architectures to different tasks. To tackle this problem, we propose a new SSL pipeline, consisting of first un/self-supervised pre-training, followed by supervised fine-tuning, and finally semi-supervised fine-tuning. At the semi-supervised fine-tuning stage, we adop…

    Submitted 11 August, 2022; originally announced August 2022.

  9. arXiv:2208.02131  [pdf, other]

    cs.CV cs.CL cs.LG

    Masked Vision and Language Modeling for Multi-modal Representation Learning

    Authors: Gukyeong Kwon, Zhaowei Cai, Avinash Ravichandran, Erhan Bas, Rahul Bhotika, Stefano Soatto

    Abstract: In this paper, we study how to use masked signal modeling in vision and language (V+L) representation learning. Instead of developing masked language modeling (MLM) and masked image modeling (MIM) independently, we propose to build joint masked vision and language modeling, where the masked signal of one modality is reconstructed with the help of another modality. This is motivated by the nature…

    Submitted 14 March, 2023; v1 submitted 3 August, 2022; originally announced August 2022.

    Comments: International Conference on Learning Representations (ICLR) 2023

  10. arXiv:2207.11169  [pdf, other]

    cs.CV

    Rethinking Few-Shot Object Detection on a Multi-Domain Benchmark

    Authors: Kibok Lee, Hao Yang, Satyaki Chakraborty, Zhaowei Cai, Gurumurthy Swaminathan, Avinash Ravichandran, Onkar Dabeer

    Abstract: Most existing works on few-shot object detection (FSOD) focus on a setting where both pre-training and few-shot learning datasets are from a similar domain. However, few-shot algorithms are important in multiple domains; hence evaluation needs to reflect the broad applications. We propose a Multi-dOmain Few-Shot Object Detection (MoFSOD) benchmark consisting of 10 datasets from a wide range of dom…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022

  11. arXiv:2206.06029  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Mediators: Conversational Agents Explaining NLP Model Behavior

    Authors: Nils Feldhus, Ajay Madhavan Ravichandran, Sebastian Möller

    Abstract: The human-centric explainable artificial intelligence (HCXAI) community has raised the need for framing the explanation process as a conversation between human and machine. In this position paper, we establish desiderata for Mediators, text-based conversational agents which are capable of explaining the behavior of neural models interactively using natural language. From the perspective of natural…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: Accepted to IJCAI-ECAI 2022 Workshop on Explainable Artificial Intelligence (XAI)

  12. arXiv:2204.05626  [pdf, other]

    cs.CV cs.CL cs.LG

    X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks

    Authors: Zhaowei Cai, Gukyeong Kwon, Avinash Ravichandran, Erhan Bas, Zhuowen Tu, Rahul Bhotika, Stefano Soatto

    Abstract: In this paper, we study the challenging instance-wise vision-language tasks, where the free-form language is required to align with the objects instead of the whole image. To address these tasks, we propose X-DETR, whose architecture has three major components: an object detector, a language encoder, and vision-language alignment. The vision and language streams are independent until the end and t…

    Submitted 12 April, 2022; originally announced April 2022.

  13. arXiv:2204.03634  [pdf, other]

    cs.CV cs.LG

    Class-Incremental Learning with Strong Pre-trained Models

    Authors: Tz-Ying Wu, Gurumurthy Swaminathan, Zhizhong Li, Avinash Ravichandran, Nuno Vasconcelos, Rahul Bhotika, Stefano Soatto

    Abstract: Class-incremental learning (CIL) has been widely studied under the setting of starting from a small number of classes (base classes). Instead, we explore an understudied real-world setting of CIL that starts with a strong model pre-trained on a large number of base classes. We hypothesize that a strong base model can provide a good representation for novel classes and incremental learning can be d…

    Submitted 12 September, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: Accepted at CVPR 2022, code is available at https://github.com/amazon-research/sp-cil

  14. arXiv:2203.16708  [pdf, other]

    cs.LG cs.CV

    Task Adaptive Parameter Sharing for Multi-Task Learning

    Authors: Matthew Wallingford, Hao Li, Alessandro Achille, Avinash Ravichandran, Charless Fowlkes, Rahul Bhotika, Stefano Soatto

    Abstract: Adapting pre-trained models with broad capabilities has become standard practice for learning a wide range of downstream tasks. The typical approach of fine-tuning different models for each task is performant, but incurs a substantial memory cost. To efficiently learn multiple downstream tasks, we introduce Task Adaptive Parameter Sharing (TAPS), a general method for tuning a base model to a new ta…

    Submitted 30 March, 2022; originally announced March 2022.

    Comments: CVPR 2022 Camera Ready. 15 pages, 11 figures

  15. arXiv:2111.09785  [pdf, other]

    cs.LG

    DIVA: Dataset Derivative of a Learning Task

    Authors: Yonatan Dukler, Alessandro Achille, Giovanni Paolini, Avinash Ravichandran, Marzia Polito, Stefano Soatto

    Abstract: We present a method to compute the derivative of a learning task with respect to a dataset. A learning task is a function from a training set to the validation error, which can be represented by a trained deep neural network (DNN). The "dataset derivative" is a linear operator, computed around the trained model, that informs how perturbations of the weight of each training sample affect the valida…

    Submitted 18 November, 2021; originally announced November 2021.

  16. arXiv:2108.01662  [pdf, other]

    cs.LG cs.AI cs.CV

    Uniform Sampling over Episode Difficulty

    Authors: Sébastien M. R. Arnold, Guneet S. Dhillon, Avinash Ravichandran, Stefano Soatto

    Abstract: Episodic training is a core ingredient of few-shot learning to train models on tasks with limited labelled data. Despite its success, episodic training remains largely understudied, prompting us to ask the question: what is the best way to sample episodes? In this paper, we first propose a method to approximate episode sampling distributions based on their difficulty. Building on this method, we p…

    Submitted 15 January, 2022; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: NeurIPS'21 camera ready

  17. arXiv:2107.08039  [pdf, other]

    cs.CV cs.LG

    Representation Consolidation for Training Expert Students

    Authors: Zhizhong Li, Avinash Ravichandran, Charless Fowlkes, Marzia Polito, Rahul Bhotika, Stefano Soatto

    Abstract: Traditionally, distillation has been used to train a student model to emulate the input/output functionality of a teacher. A more useful goal than emulation, yet under-explored, is for the student to learn feature representations that transfer well to future tasks. However, we observe that standard distillation of task-specific teachers actually *reduces* the transferability of student representat…

    Submitted 16 July, 2021; originally announced July 2021.

  18. arXiv:2102.00084  [pdf, other]

    cs.CV cs.LG

    A linearized framework and a new benchmark for model selection for fine-tuning

    Authors: Aditya Deshpande, Alessandro Achille, Avinash Ravichandran, Hao Li, Luca Zancato, Charless Fowlkes, Rahul Bhotika, Stefano Soatto, Pietro Perona

    Abstract: Fine-tuning from a collection of models pre-trained on different domains (a "model zoo") is emerging as a technique to improve test accuracy in the low-data regime. However, model selection, i.e. how to pre-select the right model to fine-tune from a model zoo without performing any training, remains an open topic. We use a linearized framework to approximate fine-tuning, and introduce two new base…

    Submitted 29 January, 2021; originally announced February 2021.

    Comments: 14 pages

  19. arXiv:2101.11058  [pdf, other]

    cs.CV cs.LG

    Supervised Momentum Contrastive Learning for Few-Shot Classification

    Authors: Orchid Majumder, Avinash Ravichandran, Subhransu Maji, Alessandro Achille, Marzia Polito, Stefano Soatto

    Abstract: Few-shot learning aims to transfer information from one task to enable generalization on novel tasks given a few examples. This information is present both in the domain and the class labels. In this work we investigate the complementary roles of these two sources of information by combining instance-discriminative contrastive learning and supervised learning in a single framework called Supervise…

    Submitted 21 June, 2021; v1 submitted 26 January, 2021; originally announced January 2021.

    Comments: V2 version; updated with new experiments and figures

  20. arXiv:2101.08482  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    Exponential Moving Average Normalization for Self-supervised and Semi-supervised Learning

    Authors: Zhaowei Cai, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Zhuowen Tu, Stefano Soatto

    Abstract: We present a plug-in replacement for batch normalization (BN) called exponential moving average normalization (EMAN), which improves the performance of existing student-teacher based self- and semi-supervised learning techniques. Unlike the standard BN, where the statistics are computed within each batch, EMAN, used in the teacher, updates its statistics by exponential moving average from the BN s…

    Submitted 18 June, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Comments: accepted by CVPR21 as Oral presentation

  21. arXiv:2101.06640  [pdf, other]

    cs.LG stat.ML

    Estimating informativeness of samples with Smooth Unique Information

    Authors: Hrayr Harutyunyan, Alessandro Achille, Giovanni Paolini, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

    Abstract: We define a notion of information that an individual sample provides to the training of a neural network, and we specialize it to measure both how much a sample informs the final weights and how much it informs the function computed by the weights. Though related, we show that these quantities have a qualitatively different behavior. We give efficient approximations of these quantities using a lin…

    Submitted 28 March, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

    Comments: ICLR 2021, 22 pages

  22. arXiv:2012.13431  [pdf, other]

    cs.LG cs.AI cs.CV

    Mixed-Privacy Forgetting in Deep Networks

    Authors: Aditya Golatkar, Alessandro Achille, Avinash Ravichandran, Marzia Polito, Stefano Soatto

    Abstract: We show that the influence of a subset of the training samples can be removed -- or "forgotten" -- from the weights of a network trained on large-scale image classification tasks, and we provide strong computable bounds on the amount of remaining information after forgetting. Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy se…

    Submitted 20 June, 2021; v1 submitted 24 December, 2020; originally announced December 2020.

    Comments: CVPR 2021

  23. arXiv:2012.11140  [pdf, other]

    cs.LG cs.CV stat.ML

    LQF: Linear Quadratic Fine-Tuning

    Authors: Alessandro Achille, Aditya Golatkar, Avinash Ravichandran, Marzia Polito, Stefano Soatto

    Abstract: Classifiers that are linear in their parameters, and trained by optimizing a convex loss function, have predictable behavior with respect to changes in the training data, initial conditions, and optimization. Such desirable properties are absent in deep neural networks (DNNs), typically trained by non-linear fine-tuning of a pre-trained model. Previous attempts to linearize DNNs have led to intere…

    Submitted 21 December, 2020; originally announced December 2020.

  24. arXiv:2008.12478  [pdf, other]

    cs.LG stat.ML

    Predicting Training Time Without Training

    Authors: Luca Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

    Abstract: We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function. To do so, we leverage the fact that the training dynamics of a deep network during fine-tuning are well approximated by those of a linearized model. This allows us to approximate the training loss and accuracy at any point during training by…

    Submitted 28 August, 2020; originally announced August 2020.

  25. arXiv:2002.11770  [pdf, other]

    cs.CV cs.LG stat.ML

    Rethinking the Hyperparameters for Fine-tuning

    Authors: Hao Li, Pratik Chaudhari, Hao Yang, Michael Lam, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

    Abstract: Fine-tuning from pre-trained ImageNet models has become the de facto standard for various computer vision tasks. Current practices for fine-tuning typically involve selecting an ad hoc choice of hyperparameters and keeping them fixed to values normally used for training from scratch. This paper re-examines several common practices of setting hyperparameters for fine-tuning. Our findings are based…

    Submitted 19 February, 2020; originally announced February 2020.

    Comments: Published as a conference paper at ICLR 2020

  26. arXiv:2002.05347  [pdf, other]

    cs.CV

    Multi-Task Incremental Learning for Object Detection

    Authors: Xialei Liu, Hao Yang, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

    Abstract: Multi-task learning learns multiple tasks while sharing knowledge and computation among them. However, it suffers from catastrophic forgetting of previous knowledge when learned incrementally without access to the old data. Most existing object detectors are domain-specific and static, while some are learned incrementally but only within a single domain. Training an object detector incrementally across va…

    Submitted 18 November, 2020; v1 submitted 12 February, 2020; originally announced February 2020.

  27. arXiv:2002.04162  [pdf, other]

    cs.LG cs.CV stat.ML

    Incremental Meta-Learning via Indirect Discriminant Alignment

    Authors: Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

    Abstract: The majority of modern meta-learning methods for few-shot classification tasks operate in two phases: a meta-training phase, where the meta-learner learns a generic representation by solving multiple few-shot tasks sampled from a large dataset, and a testing phase, where the meta-learner leverages its learnt internal representation for a specific few-shot task involving classes which were not seen d…

    Submitted 21 April, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

  28. arXiv:1911.12528  [pdf, other]

    cs.LG cs.CV stat.ML

    Unbiased Evaluation of Deep Metric Learning Algorithms

    Authors: Istvan Fehervari, Avinash Ravichandran, Srikar Appalaraju

    Abstract: Deep metric learning (DML) is a popular approach for image retrieval, solving verification (same or not) problems and addressing open set classification. Arguably, the most common DML approach is with triplet loss, despite significant advances in the area of DML. Triplet loss suffers from several issues such as collapse of the embeddings, high sensitivity to sampling schemes and more importantly…

    Submitted 27 November, 2019; originally announced November 2019.

  29. arXiv:1909.02729  [pdf, other]

    cs.LG cs.CV stat.ML

    A Baseline for Few-Shot Image Classification

    Authors: Guneet S. Dhillon, Pratik Chaudhari, Avinash Ravichandran, Stefano Soatto

    Abstract: Fine-tuning a deep network trained with the standard cross-entropy loss is a strong baseline for few-shot learning. When fine-tuned transductively, this outperforms the current state-of-the-art on standard datasets such as Mini-ImageNet, Tiered-ImageNet, CIFAR-FS and FC-100 with the same hyper-parameters. The simplicity of this approach enables us to demonstrate the first few-shot learning results…

    Submitted 21 October, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

    Journal ref: International Conference on Learning Representations (ICLR), 2020

  30. arXiv:1905.04398  [pdf, other]

    cs.LG cs.CV stat.ML

    Few-Shot Learning with Embedded Class Models and Shot-Free Meta Training

    Authors: Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

    Abstract: We propose a method for learning embeddings for few-shot learning that is suitable for use with any number of ways and any number of shots (shot-free). Rather than fixing the class prototypes to be the Euclidean average of sample embeddings, we allow them to live in a higher-dimensional space (embedded class models) and learn the prototypes along with the model parameters. The class representation…

    Submitted 21 April, 2020; v1 submitted 10 May, 2019; originally announced May 2019.

    Comments: Accepted to ICCV 2019

  31. arXiv:1904.03758  [pdf, other]

    cs.CV cs.LG

    Meta-Learning with Differentiable Convex Optimization

    Authors: Kwonjoon Lee, Subhransu Maji, Avinash Ravichandran, Stefano Soatto

    Abstract: Many meta-learning approaches for few-shot learning rely on simple base learners such as nearest-neighbor classifiers. However, even in the few-shot regime, discriminatively trained linear predictors can offer better generalization. We propose to use these predictors as base learners to learn representations for few-shot learning and show they offer better tradeoffs between feature size and perfor…

    Submitted 23 April, 2019; v1 submitted 7 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPR 2019 (Oral)

  32. arXiv:1902.03545  [pdf, other]

    cs.LG cs.AI stat.ML

    Task2Vec: Task Embedding for Meta-Learning

    Authors: Alessandro Achille, Michael Lam, Rahul Tewari, Avinash Ravichandran, Subhransu Maji, Charless Fowlkes, Stefano Soatto, Pietro Perona

    Abstract: We introduce a method to provide vectorial representations of visual classification tasks which can be used to reason about the nature of those tasks and their relations. Given a dataset with ground-truth labels and a loss function defined over those labels, we process images through a "probe network" and compute an embedding based on estimates of the Fisher information matrix associated with the…

    Submitted 10 February, 2019; originally announced February 2019.