4 research outputs found
A probabilistic model with parsinomious representation for sensor fusion in recognizing activity in pervasive environment
To tackle the problem of increasing numbers of state transition parameters when the number of sensors increases, we present a probabilistic model together with several parsinomious representations for sensor fusion. These include context specific independence (CSI), mixtures of smaller multinomials and softmax function representations to compactly represent the state transitions of a large number of sensors. The model is evaluated on real-world data acquired through ubiquitous sensors in recognizing daily morning activities. The results show that the combination of CSI and mixtures of smaller multinomials achieves comparable performance with much fewer parameters.<br /
Learning to Represent Haptic Feedback for Partially-Observable Tasks
The sense of touch, being the earliest sensory system to develop in a human
body [1], plays a critical part of our daily interaction with the environment.
In order to successfully complete a task, many manipulation interactions
require incorporating haptic feedback. However, manually designing a feedback
mechanism can be extremely challenging. In this work, we consider manipulation
tasks that need to incorporate tactile sensor feedback in order to modify a
provided nominal plan. To incorporate partial observation, we present a new
framework that models the task as a partially observable Markov decision
process (POMDP) and learns an appropriate representation of haptic feedback
which can serve as the state for a POMDP model. The model, that is parametrized
by deep recurrent neural networks, utilizes variational Bayes methods to
optimize the approximate posterior. Finally, we build on deep Q-learning to be
able to select the optimal action in each state without access to a simulator.
We test our model on a PR2 robot for multiple tasks of turning a knob until it
clicks.Comment: IEEE International Conference on Robotics and Automation (ICRA), 201
Integrating Human-Provided Information Into Belief State Representation Using Dynamic Factorization
In partially observed environments, it can be useful for a human to provide
the robot with declarative information that represents probabilistic relational
constraints on properties of objects in the world, augmenting the robot's
sensory observations. For instance, a robot tasked with a search-and-rescue
mission may be informed by the human that two victims are probably in the same
room. An important question arises: how should we represent the robot's
internal knowledge so that this information is correctly processed and combined
with raw sensory information? In this paper, we provide an efficient belief
state representation that dynamically selects an appropriate factoring,
combining aspects of the belief when they are correlated through information
and separating them when they are not. This strategy works in open domains, in
which the set of possible objects is not known in advance, and provides
significant improvements in inference time over a static factoring, leading to
more efficient planning for complex partially observed tasks. We validate our
approach experimentally in two open-domain planning problems: a 2D discrete
gridworld task and a 3D continuous cooking task. A supplementary video can be
found at http://tinyurl.com/chitnis-iros-18.Comment: IROS 2018 final versio
Learning Factored Representations for Partially Observable Markov Decision Processes
The problem of reinforcement learning in a non-Markov environment is explored using a dynamic Bayesian network, where conditional independence assumptions between random variables are compactly represented by network parameters. The parameters are learned on-line, and approximations are used to perform inference and to compute the optimal value function. The relative effects of inference and value function approximations on the quality of the final policy are investigated, by learning to solve a moderately difficult driving task. The two value function approximations, linear and quadratic, were found to perform similarly, but the quadratic model was more sensitive to initialization. Both performed below the level of human performance on the task. The dynamic Bayesian network performed comparably to a model using a localist hidden state representation, while requiring exponentially fewer parameters. 1 Introduction Reinforcement learning (RL) addresses the problem of learni..