
Showing 1–16 of 16 results for author: Tomar, M

Searching in archive cs.
  1. arXiv:2407.09533  [pdf, other]

    cs.CV cs.AI

    Video Occupancy Models

    Authors: Manan Tomar, Philippe Hansen-Estruch, Philip Bachman, Alex Lamb, John Langford, Matthew E. Taylor, Sergey Levine

    Abstract: We introduce a new family of video prediction models designed to support downstream control tasks. We call these models Video Occupancy models (VOCs). VOCs operate in a compact latent space, thus avoiding the need to make predictions about individual pixels. Unlike prior latent-space world models, VOCs directly predict the discounted distribution of future states in a single step, thus avoiding th…

    Submitted 25 June, 2024; originally announced July 2024.

  2. arXiv:2406.17688  [pdf, other]

    cs.CV cs.AI

    Unified Auto-Encoding with Masked Diffusion

    Authors: Philippe Hansen-Estruch, Sriram Vishwanath, Amy Zhang, Manan Tomar

    Abstract: At the core of both successful generative and self-supervised representation learning models there is a reconstruction objective that incorporates some form of image corruption. Diffusion models implement this approach through a scheduled Gaussian corruption process, while masked auto-encoder models do so by masking patches of the image. Despite their different approaches, the underlying similarit…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 19 Pages, 8 Figures, 3 Tables

    ACM Class: I.2.10

  3. arXiv:2405.11181  [pdf, other]

    cs.AI cs.CL

    Towards Knowledge-Infused Automated Disease Diagnosis Assistant

    Authors: Mohit Tomar, Abhisek Tiwari, Sriparna Saha

    Abstract: With the advancement of internet communication and telemedicine, people are increasingly turning to the web for various healthcare activities. With an ever-increasing number of diseases and symptoms, diagnosing patients becomes challenging. In this work, we build a diagnosis assistant to assist doctors, which identifies diseases based on patient-doctor interaction. During diagnosis, doctors utiliz…

    Submitted 18 May, 2024; originally announced May 2024.

  4. arXiv:2405.09999  [pdf, other]

    cs.LG cs.AI

    Reward Centering

    Authors: Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton

    Abstract: We show that discounted methods for solving continuing reinforcement learning problems can perform significantly better if they center their rewards by subtracting out the rewards' empirical average. The improvement is substantial at commonly used discount factors and increases further as the discount factor approaches one. In addition, we show that if a problem's rewards are shifted by a constant…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: In Proceedings of RLC 2024

  5. arXiv:2401.06807  [pdf, other]

    cs.CL cs.AI

    An EcoSage Assistant: Towards Building A Multimodal Plant Care Dialogue Assistant

    Authors: Mohit Tomar, Abhisek Tiwari, Tulika Saha, Prince Jha, Sriparna Saha

    Abstract: In recent times, there has been an increasing awareness about imminent environmental challenges, resulting in people showing a stronger dedication to taking care of the environment and nurturing green life. The current $19.6 billion indoor gardening industry, reflective of this growing sentiment, not only signifies a monetary value but also speaks of a profound human desire to reconnect with the n…

    Submitted 10 January, 2024; originally announced January 2024.

  6. arXiv:2309.13041  [pdf, other]

    cs.RO cs.CV cs.LG

    Robotic Offline RL from Internet Videos via Value-Function Pre-Training

    Authors: Chethan Bhateja, Derek Guo, Dibya Ghosh, Anikait Singh, Manan Tomar, Quan Vuong, Yevgen Chebotar, Sergey Levine, Aviral Kumar

    Abstract: Pre-training on Internet data has proven to be a key ingredient for broad generalization in many modern ML systems. What would it take to enable such capabilities in robotic reinforcement learning (RL)? Offline RL methods, which learn from datasets of robot experience, offer one way to leverage prior data into the robotic learning pipeline. However, these methods have a "type mismatch" with video…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: First three authors contributed equally

  7. arXiv:2303.06121  [pdf, other]

    cs.LG cs.AI

    Ignorance is Bliss: Robust Control via Information Gating

    Authors: Manan Tomar, Riashat Islam, Matthew E. Taylor, Sergey Levine, Philip Bachman

    Abstract: Informational parsimony provides a useful inductive bias for learning representations that achieve better generalization by being robust to noise and spurious correlations. We propose \textit{information gating} as a way to learn parsimonious representations that identify the minimal information required for a task. When gating information, we can learn to reveal as little information as possible…

    Submitted 8 December, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2023

  8. arXiv:2212.13835  [pdf, other]

    cs.LG

    Representation Learning in Deep RL via Discrete Information Bottleneck

    Authors: Riashat Islam, Hongyu Zang, Manan Tomar, Aniket Didolkar, Md Mofijul Islam, Samin Yeasar Arnob, Tariq Iqbal, Xin Li, Anirudh Goyal, Nicolas Heess, Alex Lamb

    Abstract: Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real-world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in th…

    Submitted 30 May, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

    Comments: AISTATS 2023

  9. arXiv:2211.00164  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information

    Authors: Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang, Aniket Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford

    Abstract: Learning to control an agent from data collected offline in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input information that is hard to model and irrelevant to controlling the agent. This problem has been approached by the theoretical RL community through the lens of exogenou…

    Submitted 13 August, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: ICML 2023

  10. arXiv:2111.07775  [pdf, other]

    cs.LG cs.AI cs.CV

    Learning Representations for Pixel-based Control: What Matters and Why?

    Authors: Manan Tomar, Utkarsh A. Mishra, Amy Zhang, Matthew E. Taylor

    Abstract: Learning representations for pixel-based control has garnered significant attention recently in reinforcement learning. A wide range of methods have been proposed to enable efficient learning, leading to sample complexities similar to those in the full state setting. However, moving beyond carefully curated pixel data sets (centered crop, appropriate lighting, clear background, etc.) remains chall…

    Submitted 15 November, 2021; originally announced November 2021.

  11. arXiv:2102.09850  [pdf, other]

    cs.LG cs.AI cs.RO

    Model-Invariant State Abstractions for Model-Based Reinforcement Learning

    Authors: Manan Tomar, Amy Zhang, Roberto Calandra, Matthew E. Taylor, Joelle Pineau

    Abstract: Accuracy and generalization of dynamics models is key to the success of model-based reinforcement learning (MBRL). As the complexity of tasks increases, so does the sample inefficiency of learning accurate dynamics models. However, many complex tasks also exhibit sparsity in the dynamics, i.e., actions have only a local effect on the system dynamics. In this paper, we exploit this property with a…

    Submitted 7 June, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

  12. arXiv:2005.09814  [pdf, other]

    cs.LG cs.AI stat.ML

    Mirror Descent Policy Optimization

    Authors: Manan Tomar, Lior Shani, Yonathan Efroni, Mohammad Ghavamzadeh

    Abstract: Mirror descent (MD), a well-known first-order method in constrained convex optimization, has recently been shown as an important tool to analyze trust-region algorithms in reinforcement learning (RL). However, there remains a considerable gap between such theoretically analyzed algorithms and the ones used in practice. Inspired by this, we propose an efficient RL algorithm, called {\em mirror desc…

    Submitted 7 June, 2021; v1 submitted 19 May, 2020; originally announced May 2020.

  13. arXiv:1910.02919  [pdf, other]

    cs.LG stat.ML

    Multi-step Greedy Reinforcement Learning Algorithms

    Authors: Manan Tomar, Yonathan Efroni, Mohammad Ghavamzadeh

    Abstract: Multi-step greedy policies have been extensively used in model-based reinforcement learning (RL), both when a model of the environment is available (e.g., in the game of Go) and when it is learned. In this paper, we explore their benefits in model-free RL, when employed using multi-step dynamic programming algorithms: $κ$-Policy Iteration ($κ$-PI) and $κ$-Value Iteration ($κ$-VI). These methods it…

    Submitted 12 July, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

    Comments: ICML 2020

  14. arXiv:1905.07193  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    MaMiC: Macro and Micro Curriculum for Robotic Reinforcement Learning

    Authors: Manan Tomar, Akhil Sathuluri, Balaraman Ravindran

    Abstract: Shaping in humans and animals has been shown to be a powerful tool for learning complex tasks as compared to learning in a randomized fashion. This makes the problem less complex and enables one to solve the easier sub task at hand first. Generating a curriculum for such guided learning involves subjecting the agent to easier goals first, and then gradually increasing their difficulty. This paper…

    Submitted 17 May, 2019; originally announced May 2019.

    Comments: To appear in the Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2019). (Extended Abstract)

  15. arXiv:1905.05731  [pdf, other]

    cs.LG cs.AI stat.ML

    Successor Options: An Option Discovery Framework for Reinforcement Learning

    Authors: Rahul Ramesh, Manan Tomar, Balaraman Ravindran

    Abstract: The options framework in reinforcement learning models the notion of a skill or a temporally extended sequence of actions. The discovery of a reusable set of skills has typically entailed building options, that navigate to bottleneck states. This work adopts a complementary approach, where we attempt to discover options that navigate to landmark states. These states are prototypical representative…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: To appear in the proceedings of the International Joint Conference on Artificial Intelligence 2019 (IJCAI)

  16. arXiv:1103.1205  [pdf]

    cs.AI

    A Directional Feature with Energy based Offline Signature Verification Network

    Authors: Minal Tomar, Pratibha Singh

    Abstract: Signature used as a biometric is implemented in various systems as well as every signature signed by each person is distinct at the same time. So, it is very important to have a computerized signature verification system. In offline signature verification system dynamic features are not available obviously, but one can use a signature as an image and apply image processing techniques to make an ef…

    Submitted 7 March, 2011; originally announced March 2011.

    Comments: 10 pages, 6 figures

    Journal ref: International Journal on Soft Computing (IJSC), Vol. 2, No. 1, February 2011