Showing 1–9 of 9 results for author: Kadian, A

Searching in archive cs.

  1. arXiv:2407.21783  [pdf, other]

    cs.AI cs.CL cs.CV

    The Llama 3 Herd of Models

    Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, et al. (508 additional authors not shown)

    Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical…

    Submitted 31 July, 2024; originally announced July 2024.

  2. arXiv:2309.15807  [pdf, other]

    cs.CV

    Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack

    Authors: Xiaoliang Dai, Ji Hou, Chih-Yao Ma, Sam Tsai, Jialiang Wang, Rui Wang, Peizhao Zhang, Simon Vandenhende, Xiaofang Wang, Abhimanyu Dubey, Matthew Yu, Abhishek Kadian, Filip Radenovic, Dhruv Mahajan, Kunpeng Li, Yue Zhao, Vladan Petrovic, Mitesh Kumar Singh, Simran Motwani, Yi Wen, Yiwen Song, Roshan Sumbaly, Vignesh Ramanathan, Zijian He, Peter Vajda, et al. (1 additional author not shown)

    Abstract: Training text-to-image models with web scale image-text pairs enables the generation of a wide range of visual concepts from text. However, these pre-trained models often face challenges when it comes to generating highly aesthetic images. This creates the need for aesthetic alignment post pre-training. In this paper, we propose quality-tuning to effectively guide a pre-trained model to exclusivel…

    Submitted 27 September, 2023; originally announced September 2023.

  3. arXiv:2301.02280  [pdf, other]

    cs.CV

    Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training

    Authors: Filip Radenovic, Abhimanyu Dubey, Abhishek Kadian, Todor Mihaylov, Simon Vandenhende, Yash Patel, Yi Wen, Vignesh Ramanathan, Dhruv Mahajan

    Abstract: Vision-language models trained with contrastive learning on large-scale noisy data are becoming increasingly popular for zero-shot recognition problems. In this paper we improve the following three aspects of the contrastive pre-training pipeline: dataset noise, model initialization and the training objective. First, we propose a straightforward filtering strategy titled Complexity, Action, and Te…

    Submitted 29 March, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: CVPR 2023

  4. arXiv:2301.01795  [pdf, other]

    cs.CV

    PACO: Parts and Attributes of Common Objects

    Authors: Vignesh Ramanathan, Anmol Kalia, Vladan Petrovic, Yi Wen, Baixue Zheng, Baishan Guo, Rui Wang, Aaron Marquez, Rama Kovvuri, Abhishek Kadian, Amir Mousavi, Yiwen Song, Abhimanyu Dubey, Dhruv Mahajan

    Abstract: Object models are gradually progressing from predicting just category labels to providing detailed descriptions of object instances. This motivates the need for large datasets which go beyond traditional object masks and provide richer annotations such as part masks and attributes. Hence, we introduce PACO: Parts and Attributes of Common Objects. It spans 75 object categories, 456 object-part cate…

    Submitted 4 January, 2023; originally announced January 2023.

  5. arXiv:2212.12667  [pdf, other]

    cs.LG

    Visualizing Information Bottleneck through Variational Inference

    Authors: Cipta Herwana, Abhishek Kadian

    Abstract: The Information Bottleneck theory provides a theoretical and computational framework for finding approximate minimum sufficient statistics. Analysis of the Stochastic Gradient Descent (SGD) training of a neural network on a toy problem has shown the existence of two phases, fitting and compression. In this work, we analyze the SGD training process of a Deep Neural Network on MNIST classification a…

    Submitted 24 December, 2022; originally announced December 2022.

    Comments: arXiv admin note: text overlap with arXiv:1703.00810, arXiv:2202.06749 by other authors

  6. arXiv:1912.06321  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Sim2Real Predictivity: Does Evaluation in Simulation Predict Real-World Performance?

    Authors: Abhishek Kadian, Joanne Truong, Aaron Gokaslan, Alexander Clegg, Erik Wijmans, Stefan Lee, Manolis Savva, Sonia Chernova, Dhruv Batra

    Abstract: Does progress in simulation translate to progress on robots? If one method outperforms another in simulation, how likely is that trend to hold in reality on a robot? We examine this question for embodied PointGoal navigation, developing engineering tools and a research paradigm for evaluating a simulator by its sim2real predictivity. First, we develop Habitat-PyRobot Bridge (HaPy), a library for s…

    Submitted 16 August, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

    Journal ref: IEEE Robotics and Automation Letters (RA-L) 2020

  7. arXiv:1911.00357  [pdf, other]

    cs.CV cs.AI cs.LG

    DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames

    Authors: Erik Wijmans, Abhishek Kadian, Ari Morcos, Stefan Lee, Irfan Essa, Devi Parikh, Manolis Savva, Dhruv Batra

    Abstract: We present Decentralized Distributed Proximal Policy Optimization (DD-PPO), a method for distributed reinforcement learning in resource-intensive simulated environments. DD-PPO is distributed (uses multiple machines), decentralized (lacks a centralized server), and synchronous (no computation is ever stale), making it conceptually simple and easy to implement. In our experiments on training virtua…

    Submitted 19 January, 2020; v1 submitted 1 November, 2019; originally announced November 2019.

  8. arXiv:1905.07512  [pdf, other]

    cs.CV

    SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation

    Authors: Daniel Gordon, Abhishek Kadian, Devi Parikh, Judy Hoffman, Dhruv Batra

    Abstract: We propose SplitNet, a method for decoupling visual perception and policy learning. By incorporating auxiliary tasks and selective learning of portions of the model, we explicitly decompose the learning objectives for visual navigation into perceiving the world and acting on that perception. We show dramatic improvements over baseline models on transferring between simulators, an encouraging step…

    Submitted 23 October, 2019; v1 submitted 17 May, 2019; originally announced May 2019.

  9. arXiv:1904.01201  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.RO

    Habitat: A Platform for Embodied AI Research

    Authors: Manolis Savva, Abhishek Kadian, Oleksandr Maksymets, Yili Zhao, Erik Wijmans, Bhavana Jain, Julian Straub, Jia Liu, Vladlen Koltun, Jitendra Malik, Devi Parikh, Dhruv Batra

    Abstract: We present Habitat, a platform for research in embodied artificial intelligence (AI). Habitat enables training embodied agents (virtual robots) in highly efficient photorealistic 3D simulation. Specifically, Habitat consists of: (i) Habitat-Sim: a flexible, high-performance 3D simulator with configurable agents, sensors, and generic 3D dataset handling. Habitat-Sim is fast -- when rendering a scen…

    Submitted 24 November, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

    Comments: ICCV 2019