
    Justin Bayer

A standard deep convolutional neural network paired with a suitable loss function learns compact local image descriptors that perform comparably to state-of-the-art approaches.
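The abstract does not name the loss, but a common choice for learning patch descriptors of this kind is a contrastive (siamese) loss over matching and non-matching patch pairs. A minimal sketch, assuming a margin-based formulation and random stand-in descriptors:

```python
# Hypothetical sketch, not the paper's actual loss: a contrastive
# objective that pulls matching descriptors together and pushes
# non-matching ones at least `margin` apart.
import numpy as np

def contrastive_loss(d1, d2, same, margin=1.0):
    """d1, d2: descriptor vectors; same: True if the patches correspond."""
    dist = np.linalg.norm(d1 - d2)
    if same:
        return 0.5 * dist ** 2                    # attract matching pairs
    return 0.5 * max(0.0, margin - dist) ** 2     # repel non-matching pairs

rng = np.random.default_rng(0)
d1, d2 = rng.normal(size=128), rng.normal(size=128)  # stand-in descriptors
print(contrastive_loss(d1, d2, same=False))
```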
Predictive modeling of human or humanoid movement becomes increasingly complex as the dimensionality of those movements grows. Dynamic Movement Primitives (DMP) have been shown to be a powerful method of representing such movements, but do not generalize well when used in configuration or task space. To solve this problem we propose a model called autoencoded dynamic movement primitive (AE-DMP), which uses deep autoencoders to find a representation of movement in a latent feature space in which DMP can optimally generalize. The architecture embeds DMP into such an autoencoder and allows the whole model to be trained as a unit. To further improve the model for multiple movements, a sparsity constraint is added to the feature-layer neurons, so that the various movements can be clearly distinguished in the feature space. After training, this sparsity allows the model to identify a single hidden neuron that can efficiently generate new movements. Our experiments demonstrate efficient missing-data imputation on 50-dimensional human movement data.
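To make the architecture concrete, here is a minimal sketch (not the authors' implementation) of a DMP transformation system integrated per dimension of an autoencoder's latent space; the linear encoder/decoder, the latent dimensionality, and all constants are stand-in assumptions:

```python
# Minimal sketch (not the authors' implementation): a DMP transformation
# system integrated per latent dimension; encoder/decoder are random
# linear stand-ins, constants are the usual textbook DMP defaults.
import numpy as np

def dmp_rollout(z0, g, forcing, tau=1.0, alpha=25.0, beta=6.25, dt=0.01):
    """Integrate one critically damped DMP per latent dimension toward g."""
    z, dz = z0.copy(), np.zeros_like(z0)
    traj = [z.copy()]
    for f in forcing:                       # learned forcing term per step
        ddz = (alpha * (beta * (g - z) - dz) + f) / tau
        dz = dz + ddz * dt
        z = z + dz * dt
        traj.append(z.copy())
    return np.stack(traj)

rng = np.random.default_rng(0)
W_enc = rng.normal(scale=0.1, size=(5, 50))    # 50-D pose -> 5-D latent
W_dec = rng.normal(scale=0.1, size=(50, 5))    # 5-D latent -> 50-D pose
x0 = rng.normal(size=50)                        # initial 50-D pose
z_traj = dmp_rollout(W_enc @ x0, g=np.zeros(5), forcing=np.zeros((100, 5)))
x_traj = z_traj @ W_dec.T                       # decoded movement trajectory
```

In the full model the encoder, decoder and forcing term would be trained jointly, as the abstract describes.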
Recent advances in the estimation of deep directed graphical models and recurrent networks let us contribute to the removal of a blind spot in the area of probabilistic modelling of time series. The proposed methods i) can infer distributed latent state-space trajectories with nonlinear transitions, ii) scale to large data sets thanks to the use of a stochastic objective and fast, approximate inference, iii) enable the design of rich emission models, which iv) naturally lead to structured outputs. Two different paths of introducing latent state sequences are pursued, leading to the variational recurrent autoencoder (VRAE) and the variational one-step predictor (VOSP). The use of independent Wiener processes as priors on the latent state sequence is a viable compromise between efficient computation of the Kullback-Leibler divergence from the variational approximation of the posterior and maintaining a reasonable belief in the dynamics. We verify our methods empirically, obtaining results close to or better than the state of the art. We also show qualitative results for denoising and missing-value imputation.
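The computational convenience of the Wiener-process prior can be made concrete: under that prior the latent increments are independent Gaussians, so the KL term factorises over time. A sketch under assumed shapes and names:

```python
# Sketch of the KL term the Wiener-process prior makes cheap: under the
# prior, increments z_t - z_{t-1} are independent N(0, dt), so the KL
# from a diagonal-Gaussian posterior over increments factorises over
# time steps and dimensions. Shapes and names are assumptions.
import numpy as np

def kl_wiener_increments(mu, logvar, dt=1.0):
    """Sum of KL( N(mu, exp(logvar)) || N(0, dt) ) over a (T, D) array."""
    var = np.exp(logvar)
    return 0.5 * np.sum(np.log(dt) - logvar + (var + mu ** 2) / dt - 1.0)

rng = np.random.default_rng(0)
mu, logvar = 0.1 * rng.normal(size=(100, 16)), 0.1 * rng.normal(size=(100, 16))
print(kl_wiener_increments(mu, logvar))
```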
Unsupervised feature learning has shown impressive results for a wide range of input modalities, in particular for object classification tasks in computer vision. Using a large amount of unlabeled data, unsupervised feature learning methods are utilized to construct high-level representations that are discriminative enough for subsequently trained supervised classification algorithms. However, it has not yet been quantitatively investigated how well unsupervised learning methods can find low-level representations for image patches without any additional supervision. In this paper we examine the performance of pure unsupervised methods on a low-level correspondence task, a problem that is central to many computer vision applications. We find that a special type of Restricted Boltzmann Machine (RBM) performs comparably to hand-crafted descriptors. Additionally, a simple binarization scheme produces compact representations that perform better than several state-of-the-art descriptors.
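As an illustration of the binarization idea (the exact scheme is not specified here), one can threshold an RBM's hidden activation probabilities and compare the resulting codes by Hamming distance; weights and sizes below are random stand-ins:

```python
# Illustrative only: threshold an RBM's hidden activation probabilities
# to get a compact binary descriptor, then compare patches by Hamming
# distance. The weights here are random stand-ins, not a trained RBM.
import numpy as np

def rbm_hidden_probs(v, W, b):
    return 1.0 / (1.0 + np.exp(-(v @ W + b)))    # sigmoid hidden units

def binary_descriptor(v, W, b, thresh=0.5):
    return (rbm_hidden_probs(v, W, b) >= thresh).astype(np.uint8)

rng = np.random.default_rng(0)
W, b = rng.normal(scale=0.01, size=(64, 32)), np.zeros(32)  # 8x8 patch -> 32 bits
p1, p2 = rng.random(64), rng.random(64)          # two flattened image patches
d1, d2 = binary_descriptor(p1, W, b), binary_descriptor(p2, W, b)
hamming = int(np.sum(d1 != d2))                   # descriptor distance
```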
Estimating human fingertip forces is required to understand force distribution in grasping and manipulation. Human grasping behavior can then be used to develop force- and impedance-based grasping and manipulation strategies for robotic hands. However, estimating human grip force naturally is only possible with instrumented objects or unnatural gloves, thus greatly limiting the type of objects used. In this paper we describe an approach which uses images of the human fingertip to reconstruct grip force and torque at the finger. Our approach does not use finger-mounted equipment, but instead a steady camera observing the fingers of the hand from a distance. This allows for finger force estimation without any physical interference with the hand or the object itself, and is therefore universally applicable. We construct a 3-dimensional finger model from 2D images. Convolutional Neural Networks (CNNs) are used to predict the transformation matrix mapping the 2D image to the 3D model. Two CNN designs are presented, producing orientation and position either as separate or as combined outputs. After learning, our system shows an alignment accuracy of over 98% on unknown data. In the final step, a Gaussian process estimates finger force and torque from the aligned images based on color changes and deformations of the nail and its surrounding skin. Experimental results show an accuracy of about 95% for force estimation and 90% for torque estimation.
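The final regression step can be sketched as standard Gaussian-process regression from image-derived features to force; the features, kernel, and hyperparameters below are illustrative assumptions, not the paper's choices:

```python
# Illustrative sketch of the last stage only: Gaussian-process regression
# from aligned-image features to fingertip force with an RBF kernel.
import numpy as np

def rbf(A, B, ell=1.0, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sigma ** 2 * np.exp(-0.5 * d2 / ell ** 2)

def gp_predict(X, y, Xs, noise=1e-2):
    K = rbf(X, X) + noise * np.eye(len(X))       # training covariance
    alpha = np.linalg.solve(K, y)
    return rbf(Xs, X) @ alpha                    # posterior mean force

rng = np.random.default_rng(0)
X = rng.random((20, 8))    # 20 images, 8 colour/deformation features each
y = rng.random(20)         # measured fingertip forces
f_hat = gp_predict(X, y, rng.random((3, 8)))     # forces for 3 new images
```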
Leveraging advances in variational inference, we propose to enhance recurrent neural networks with latent variables, resulting in Stochastic Recurrent Networks (STORNs). The model i) can be trained with stochastic gradient methods, ii) allows structured and multi-modal conditionals at each time step, iii) features a reliable estimator of the marginal likelihood, and iv) is a generalisation of deterministic recurrent neural networks. We evaluate the method on four polyphonic musical data sets and motion capture data.
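A minimal sketch of the STORN recurrence, not the paper's exact parameterisation: at each step the hidden-state update receives a latent sample drawn via the reparameterisation trick; all sizes and weights are stand-ins, and the posterior statistics would come from an inference network:

```python
# Minimal sketch of the STORN recurrence, not the paper's exact model:
# the state update receives a latent sample z_t at every step, drawn via
# the reparameterisation trick. Sizes and weights are stand-ins; mu and
# logvar would come from an inference network.
import numpy as np

rng = np.random.default_rng(0)
D, H, Z = 4, 8, 2                                 # input, hidden, latent sizes
Wx, Wh, Wz = (rng.normal(scale=0.1, size=s) for s in [(H, D), (H, H), (H, Z)])

def storn_step(x_t, h, mu, logvar):
    z_t = mu + np.exp(0.5 * logvar) * rng.standard_normal(Z)  # reparameterised
    return np.tanh(Wx @ x_t + Wh @ h + Wz @ z_t)              # stochastic step

h = np.zeros(H)
for x_t in rng.normal(size=(10, D)):              # a length-10 input sequence
    h = storn_step(x_t, h, mu=np.zeros(Z), logvar=np.zeros(Z))
```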
PyBrain is a versatile machine learning library for Python. Its goal is to provide flexible, easy-to-use yet still powerful algorithms for machine learning tasks, including a variety of predefined environments and benchmarks to test and compare algorithms. Implemented algorithms include Long Short-Term Memory (LSTM), policy gradient methods, (multidimensional) recurrent neural networks and deep belief networks.
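A short usage example in PyBrain's documented tutorial style (the task, an XOR fit, is chosen here purely for illustration):

```python
# Build a feed-forward network, fill a supervised dataset, and train
# with backpropagation, following PyBrain's tutorial workflow.
from pybrain.tools.shortcuts import buildNetwork
from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer

net = buildNetwork(2, 3, 1)              # 2 inputs, 3 hidden units, 1 output
ds = SupervisedDataSet(2, 1)
for inp, target in [((0, 0), (0,)), ((0, 1), (1,)),
                    ((1, 0), (1,)), ((1, 1), (0,))]:
    ds.addSample(inp, target)            # add XOR training pairs
trainer = BackpropTrainer(net, ds)
for _ in range(100):
    trainer.train()                      # one epoch per call, returns error
print(net.activate((1, 0)))              # query the trained network
```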
We introduce Deep Variational Bayes Filters (DVBF), a new method for unsupervised learning of latent Markovian state space models. Leveraging recent advances in Stochastic Gradient Variational Bayes, DVBF can overcome intractable inference distributions by means of variational inference. Thus, it can handle highly nonlinear input data with temporal and spatial dependencies, such as image sequences, without domain knowledge. Our experiments show that enabling backpropagation through transitions enforces state space assumptions and significantly improves the information content of the latent embedding. This also enables realistic long-term prediction.
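The core idea of backpropagating through transitions can be sketched as follows: the latent state is advanced by an explicit transition, and the stochastic part enters only as an innovation, so gradients flow through the dynamics themselves. The locally linear form and all matrices below are illustrative assumptions:

```python
# Sketch of backpropagation through transitions under assumptions: the
# latent state advances through an explicit (here locally linear)
# transition and the stochastic innovation w enters additively, so
# gradients flow through the dynamics. Matrices are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
Z = 3
A = np.eye(Z) + 0.01 * rng.normal(size=(Z, Z))   # transition matrix
B = 0.1 * rng.normal(size=(Z, Z))                # innovation gain

def transition(z, w):
    """One latent step: deterministic dynamics plus sampled innovation."""
    return A @ z + B @ w

z = rng.normal(size=Z)
for _ in range(50):                              # long-term rollout
    w = rng.standard_normal(Z)                   # from the inference network
    z = transition(z, w)
```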