Devesh Jha

We propose a trust region method for policy optimization that employs a Quasi-Newton approximation for the Hessian, called Quasi-Newton Trust Region Policy Optimization (QNTRPO). Gradient descent is the de facto algorithm for reinforcement learning tasks with continuous controls. The algorithm has achieved state-of-the-art performance when used in reinforcement learning across a wide range of tasks. However, the algorithm suffers from a number of drawbacks, including the lack of a step-size selection criterion and slow convergence. We investigate the use of a trust region method with a dogleg step and a Quasi-Newton approximation of the Hessian for policy optimization. We demonstrate through numerical experiments over a wide range of challenging continuous control tasks that our particular choice is efficient in terms of the number of samples and improves performance.
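A minimal sketch of the dogleg trust-region step the abstract refers to, assuming a NumPy setting with an explicit gradient g and a positive-definite Quasi-Newton Hessian approximation B; how QNTRPO embeds this step in policy optimization (surrogate objective, KL-based trust region, Hessian updates) is not reproduced here.

import numpy as np

def dogleg_step(g, B, delta):
    # Approximately minimize g.T @ p + 0.5 * p.T @ B @ p subject to ||p|| <= delta.
    p_newton = -np.linalg.solve(B, g)               # full quasi-Newton step
    if np.linalg.norm(p_newton) <= delta:
        return p_newton
    p_cauchy = -(g @ g) / (g @ B @ g) * g           # unconstrained steepest-descent minimizer
    if np.linalg.norm(p_cauchy) >= delta:
        return -delta * g / np.linalg.norm(g)       # gradient step clipped to the boundary
    # Walk along the dogleg path p_cauchy + tau * (p_newton - p_cauchy) until it hits the boundary.
    d = p_newton - p_cauchy
    a, b, c = d @ d, 2 * p_cauchy @ d, p_cauchy @ p_cauchy - delta**2
    tau = (-b + np.sqrt(b**2 - 4 * a * c)) / (2 * a)
    return p_cauchy + tau * d

The step interpolates between a scaled gradient step (small trust radius) and the full Quasi-Newton step (large trust radius), which is what removes the need for a separate step-size selection rule.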
Time-varying network topology plays a key role in mobile sensor networks for the detection of an event of interest and the subsequent propagation of awareness within a monitoring and surveillance framework. While physical-space parameters such as communication range and mobility characteristics directly drive the network structure, feedback from the information space can be used to improve the network topology and facilitate efficient information management. In this context, the paper proposes a feedback control scheme for tuning key network topology parameters, such as the average degree and the degree distribution, under the recently proposed generalized gossip framework for distributed belief/awareness propagation in mobile sensor networks. The crux of this decentralized control policy is to modify the timelines of the asynchronous belief update protocol depending on the node-level belief/awareness. Using a proximity network representation for a mobile sensor network, the paper presents both analytical …
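As an illustration of the feedback idea only, the sketch below closes a proportional loop from a measured topology statistic (the average degree of a disc proximity network) back to a tunable parameter. The communication-radius knob and the gain are assumptions; the paper's actual control variable is the timing of the generalized-gossip belief updates, and that protocol is not reproduced here.

import numpy as np

def average_degree(positions, radius):
    # Average degree of the proximity (disc) network induced by a communication radius.
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    adjacency = (dist < radius) & ~np.eye(len(positions), dtype=bool)
    return adjacency.sum() / len(positions)

def tune_radius(positions, radius, target_degree, gain=0.05, steps=50):
    # Proportional feedback on the radius so the network tracks a target average degree.
    for _ in range(steps):
        error = target_degree - average_degree(positions, radius)
        radius = max(radius + gain * error, 0.0)    # expand when under-connected, shrink otherwise
    return radius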
We propose a trust region method for policy optimization that employs a Quasi-Newton approximation for the Hessian, called Quasi-Newton Trust Region Policy Optimization (QNTRPO). Gradient descent is the de facto algorithm for reinforcement learning tasks with continuous controls. The algorithm has achieved state-of-the-art performance when used in reinforcement learning across a wide range of tasks. However, the algorithm suffers from a number of drawbacks, including the lack of a step-size selection criterion and slow convergence. We investigate the use of a trust region method with a dogleg step and a Quasi-Newton approximation of the Hessian for policy optimization. We demonstrate through numerical experiments over a wide range of challenging continuous control tasks that our particular choice is efficient in terms of the number of samples and improves performance. Optimization Foundations for Reinforcement Learning Workshop at NeurIPS.
One of the main challenges in peg-in-a-hole (PiH) insertion tasks is handling the uncertainty in the location of the target hole. To address it, high-dimensional sensor inputs from modalities such as vision, force/torque sensing, and proprioception can be combined to learn control policies that are robust to this uncertainty in the target pose. Whereas deep learning has shown success in recognizing objects and making decisions with high-dimensional inputs, the learning procedure might damage the robot when trial-and-error algorithms are applied directly on the real system. At the same time, Learning from Demonstration (LfD) methods have been shown to achieve compelling performance on real robotic systems by leveraging demonstration data provided by experts. In this paper, we investigate the merits of multiple sensor modalities, such as vision, force/torque sensors, and proprioception, when combined to learn a controller for real-world assembly operation tasks using LfD …
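A minimal sketch of the kind of multimodal policy the abstract describes, trained by behavior cloning on demonstration tuples; the network sizes, the fusion-by-concatenation scheme, and the input dimensions below are assumptions, not the controller actually used in the paper.

import torch
import torch.nn as nn

class MultimodalPolicy(nn.Module):
    # Fuses image, force/torque (wrench), and proprioceptive (joint) inputs into one action.
    def __init__(self, action_dim=6):
        super().__init__()
        self.vision = nn.Sequential(nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                                    nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.fusion = nn.Sequential(nn.Linear(32 + 6 + 7, 128), nn.ReLU(),
                                    nn.Linear(128, action_dim))

    def forward(self, image, wrench, joint_pos):
        return self.fusion(torch.cat([self.vision(image), wrench, joint_pos], dim=-1))

policy = MultimodalPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def bc_step(image, wrench, joint_pos, expert_action):
    # One behavior-cloning update on a batch of (sensor inputs, expert action) demonstrations.
    loss = nn.functional.mse_loss(policy(image, wrench, joint_pos), expert_action)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()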
Humans quickly solve tasks in novel systems with complex dynamics, without requiring much interaction. While deep reinforcement learning algorithms have achieved tremendous success in many complex tasks, these algorithms need a large number of samples to learn meaningful policies. In this paper, we present a task of navigating a marble to the center of a circular maze. While this system is very intuitive and easy for humans to solve, it can be very difficult and inefficient for standard reinforcement learning algorithms to learn meaningful policies. We present a model that learns to move a marble in the complex environment within minutes of interacting with the real system. Learning consists of initializing a physics engine with parameters estimated using data from the real system. The error in the physics engine is then corrected using Gaussian process regression, which is used to model the residual between real observations and physics engine simulations. The physics engine equipped …
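A sketch of the residual-learning idea using a Gaussian process, assuming a placeholder simulator step and scikit-learn's GP regressor; the paper's actual maze dynamics, physics-engine parameters, and GP details are not reproduced here.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def sim_step(x, u):
    # Placeholder physics-engine prediction (x = [pos, vel], u = action), not the real engine.
    return x + 0.05 * np.concatenate([x[2:], u])

def fit_residual_model(states, actions, next_states):
    # Fit a GP on the residual between observed next states and simulator predictions.
    inputs = np.hstack([states, actions])
    residuals = next_states - np.array([sim_step(x, u) for x, u in zip(states, actions)])
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
    return gp.fit(inputs, residuals)

def predict_next(gp, x, u):
    # Corrected model: simulator prediction plus the learned residual.
    return sim_step(x, u) + gp.predict(np.hstack([x, u])[None, :])[0]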
In this paper, we present algorithms for synthesizing controllers to distribute a group (possibly a swarm) of homogeneous robots (agents) over heterogeneous tasks that are operated in parallel. We present algorithms as well as analysis for global and local-feedback-based controllers for the swarm. Using the ergodicity property of irreducible Markov chains, we design a controller for global swarm control. Furthermore, to provide some degree of autonomy to the agents, we augment this global controller with a local feedback-based controller using language measure theory. We provide analysis of the proposed algorithms to show their correctness. Numerical experiments are presented to illustrate the performance of the proposed algorithms.
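One standard way to realize such a global controller is to let every agent hop between tasks according to a fixed Markov chain whose stationary distribution equals the desired swarm distribution. The Metropolis-Hastings construction below is a common illustration and an assumption, not necessarily the paper's construction; the local language-measure feedback is not shown.

import numpy as np

def metropolis_hastings_matrix(adjacency, target):
    # Transition matrix over the task graph whose stationary distribution is `target`.
    n = len(target)
    neighbors = [np.flatnonzero((adjacency[i] > 0) & (np.arange(n) != i)) for i in range(n)]
    P = np.zeros((n, n))
    for i in range(n):
        d_i = len(neighbors[i])
        for j in neighbors[i]:
            d_j = len(neighbors[j])
            P[i, j] = min(1.0, (target[j] * d_i) / (target[i] * d_j)) / d_i
        P[i, i] = 1.0 - P[i].sum()               # self-loop absorbs rejected proposals
    return P

adjacency = np.ones((4, 4))                       # fully connected task graph (illustrative)
target = np.array([0.4, 0.3, 0.2, 0.1])           # desired fraction of the swarm at each task
P = metropolis_hastings_matrix(adjacency, target)
rng = np.random.default_rng(0)
tasks = np.zeros(1000, dtype=int)                 # 1000 agents, all starting at task 0
for _ in range(200):
    tasks = np.array([rng.choice(4, p=P[t]) for t in tasks])
print(np.bincount(tasks, minlength=4) / len(tasks))   # empirical distribution approaches `target`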
Learning tasks from simulated data using reinforcement learning has proven effective. A major advantage of using simulation data for training is that it reduces the burden of acquiring real data. Specifically, when robots are involved, it is important to limit the amount of time a robot is occupied with learning so that it can instead be used for its intended (manufacturing) task. A policy learned on simulation data can be transferred to and refined on real data. In this paper, we propose to learn a robustified policy during reinforcement learning using simulation data. A robustified policy is learned by exploiting the ability to change the simulation parameters (appearance and dynamics) for successive training episodes. We demonstrate that the amount of transfer learning needed for a robustified policy is reduced when transferring from a simulated to a real task. We focus on tasks that involve real-time non-linear dynamics, since non-linear dynamics can only be approximately modeled in physics engines …
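A sketch of the randomization loop that produces such a robustified policy: the simulation parameters are re-sampled every episode so the policy cannot overfit to any single appearance/dynamics configuration. The parameter names, their ranges, and the make_env/agent interfaces are hypothetical and only stand in for the paper's setup.

import numpy as np

def sample_sim_params(rng):
    # Randomize appearance and dynamics parameters each episode (ranges are assumptions).
    return {
        "friction":   rng.uniform(0.5, 1.5),
        "mass_scale": rng.uniform(0.8, 1.2),
        "light_dir":  rng.uniform(-1.0, 1.0, size=3),
    }

def train_robustified_policy(make_env, agent, episodes=1000, seed=0):
    # Generic loop: a freshly randomized simulator per episode reduces the sim-to-real gap.
    rng = np.random.default_rng(seed)
    for _ in range(episodes):
        env = make_env(sample_sim_params(rng))    # hypothetical simulator constructor
        obs, done = env.reset(), False
        while not done:
            action = agent.act(obs)               # hypothetical RL agent interface
            obs, reward, done = env.step(action)
            agent.update(obs, action, reward, done)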
Deep reinforcement learning (RL) algorithms have recently achieved remarkable successes in various sequential decision-making tasks, leveraging advances in methods for training large deep networks. However, these methods usually require large amounts of training data, which is often a big problem for real-world applications. One natural question to ask is whether learning good representations of states and using larger networks helps in learning better policies. In this paper, we study whether increasing input dimensionality helps improve the performance and sample efficiency of model-free deep RL algorithms. To do so, we propose an online feature extractor network (OFENet) that uses neural nets to produce good representations to be used as inputs to deep RL algorithms. Even though high input dimensionality is usually supposed to make learning of RL agents more difficult, we show that the RL agents in fact learn more efficiently with the high-dimensional representation than with …
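A minimal sketch of the online-feature-extractor idea, assuming a plain MLP body, concatenation of the raw observation with the learned features, and a next-observation prediction loss as the online training signal; the published OFENet architecture and auxiliary-task details may differ.

import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    # Lifts the observation to a higher-dimensional representation for the RL agent.
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, hidden), nn.ReLU())
        self.predict_next = nn.Linear(hidden + act_dim, obs_dim)   # auxiliary prediction head

    def features(self, obs):
        # Concatenate raw observation with learned features -> higher-dimensional RL input.
        return torch.cat([obs, self.body(obs)], dim=-1)

    def aux_loss(self, obs, action, next_obs):
        # Train online by predicting the next observation from (features, action).
        pred = self.predict_next(torch.cat([self.body(obs), action], dim=-1))
        return nn.functional.mse_loss(pred, next_obs)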
This paper addresses the problem of learning dynamic models of hybrid systems from demonstrations and then the problem of imitating those demonstrations by using Bayesian filtering. A linear programming-based approach is used to develop a nonparametric kernel-based conditional density estimation technique that infers accurate and concise dynamic models of system evolution from data. The training data for these models were acquired from demonstrations by teleoperation. The trained data-driven models for mode-dependent state evolution and state-dependent mode evolution are then used online for imitation of demonstrated tasks via particle filtering. Results of simulation and experimental validation with a hexapod robot are reported to establish the generalization of the proposed learning and control algorithms.
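A sketch of the filtering side of this pipeline: each particle carries a discrete mode and a continuous state, the mode is sampled from a state-dependent transition model, the state is propagated with the mode-dependent dynamics, and particles are re-weighted by an observation likelihood. The three toy models below are assumptions standing in for the learned kernel density estimates.

import numpy as np

rng = np.random.default_rng(0)

def mode_transition_probs(state, n_modes=3):
    # Toy state-dependent mode distribution (stand-in for the learned model).
    logits = np.array([-(state[0] - m) ** 2 for m in range(n_modes)])
    return np.exp(logits) / np.exp(logits).sum()

def state_dynamics(state, mode):
    # Toy mode-dependent state evolution with process noise.
    return state + 0.1 * (mode - 1) + rng.normal(0, 0.05, size=state.shape)

def observation_likelihood(obs, state):
    return np.exp(-0.5 * np.sum((obs - state) ** 2) / 0.1 ** 2)

def particle_filter_step(states, modes, obs):
    # One predict-update-resample step over hybrid (mode, state) particles.
    for i in range(len(states)):
        probs = mode_transition_probs(states[i])
        modes[i] = rng.choice(len(probs), p=probs)
        states[i] = state_dynamics(states[i], modes[i])
    weights = np.array([observation_likelihood(obs, s) for s in states])
    weights /= weights.sum()
    idx = rng.choice(len(states), size=len(states), p=weights)   # resample
    return states[idx].copy(), modes[idx].copy()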
This paper addresses the problem of target detection in dynamic environments in a semi-supervised, data-driven setting with low-cost passive sensors. A key challenge here is to simultaneously achieve high probabilities of correct detection and low probabilities of false alarm under the constraints of limited computation and communication resources. In general, changes in a dynamic environment may significantly affect the performance of target detection due to limited training scenarios and the assumptions made on signal behavior under a static environment. To this end, an algorithm for binary hypothesis testing is proposed based on clustering of features extracted from multiple sensors that may observe the target. First, features are extracted individually from the time-series signals of different sensors by using a recently reported feature extraction tool, called symbolic dynamic filtering. Then, these features are grouped as clusters in the feature space to evaluate homogeneity …
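A rough illustration of the cluster-then-decide idea only: per-sensor features are extracted from time-series signals and clustered, and a decision is made from the cluster geometry. The symbol-histogram feature is a crude stand-in for symbolic dynamic filtering, and the separation threshold is an assumption, not the paper's hypothesis test.

import numpy as np
from sklearn.cluster import KMeans

def symbol_histogram(signal, n_symbols=8):
    # Partition the signal range into symbols and use the normalized histogram as the feature.
    edges = np.linspace(signal.min(), signal.max(), n_symbols + 1)
    symbols = np.clip(np.digitize(signal, edges) - 1, 0, n_symbols - 1)
    return np.bincount(symbols, minlength=n_symbols) / len(symbols)

def detect_target(sensor_signals, threshold=0.1):
    # Cluster per-sensor features into two groups; declare a detection when they separate.
    features = np.array([symbol_histogram(s) for s in sensor_signals])
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
    separation = np.linalg.norm(km.cluster_centers_[0] - km.cluster_centers_[1])
    return separation > threshold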
We propose a trust region method for policy optimization that employs a Quasi-Newton approximation for the Hessian, called Quasi-Newton Trust Region Policy Optimization (QNTRPO). Gradient descent is the de facto algorithm for reinforcement learning tasks with continuous controls. The algorithm has achieved state-of-the-art performance when used in reinforcement learning across a wide range of tasks. However, the algorithm suffers from a number of drawbacks, including the lack of a step-size selection criterion and slow convergence. We investigate the use of a trust region method with a dogleg step and a Quasi-Newton approximation of the Hessian for policy optimization. We demonstrate through numerical experiments over a wide range of challenging continuous control tasks that our particular choice is efficient in terms of the number of samples and improves performance. Conference on Robot Learning (CoRL).
Robots need to learn skills that can not only generalize across similar problems but also be directed to a specific goal. Previous methods either train a new skill for every different goal or do not infer the specific target in the presence of multiple goals from visual data. We introduce an end-to-end method that represents targetable visuomotor skills as a goal-parameterized neural network policy. By training on an informative subset of available goals with the associated target parameters, we are able to learn a policy that can zero-shot generalize to previously unseen goals. We evaluate our method in a representative 2D simulation of a button grid and on both button-pressing and peg-insertion tasks on two different physical arms. We demonstrate that our model trained on 33% of the possible goals is able to generalize to more than 90% of the targets in the scene for both simulation and robot experiments. We also successfully learn a mapping from target pixel coordinates to a robot …
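A sketch of what a goal-parameterized policy looks like in code: a single network receives both the observation and a goal parameter, so the same weights can be directed to different targets, including goals held out during training. The encoder, the two-dimensional goal parameter, and fusion by concatenation are assumptions for illustration, not the paper's architecture.

import torch
import torch.nn as nn

class GoalParameterizedPolicy(nn.Module):
    # Goal-conditioned visuomotor policy: action = f(image, goal parameter).
    def __init__(self, goal_dim=2, action_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                                     nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Sequential(nn.Linear(32 + goal_dim, 128), nn.ReLU(),
                                  nn.Linear(128, action_dim))

    def forward(self, image, goal):
        return self.head(torch.cat([self.encoder(image), goal], dim=-1))

# Train on a subset of goals; at test time, pass an unseen goal parameter to direct the skill.
policy = GoalParameterizedPolicy()
image, goal = torch.zeros(1, 3, 64, 64), torch.tensor([[0.25, 0.75]])
action = policy(image, goal)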