On the Importance of Accurate Geometry Data for Dense 3D Vision Tasks

Authors: HyunJun Jung, Patrick Ruhkamp, Guangyao Zhai, Nikolas Brasch, Yitong Li, Yannick Verdie, Jifei Song, Yiren Zhou, Anil Armagan, Slobodan Ilic, Ales Leonardis, Nassir Navab, Benjamin Busam

Abstract: Learning-based methods to solve dense 3D vision problems typically train on 3D sensor data. The respectively used principle of measuring distances provides advantages and drawbacks. These are typically not compared nor discussed in the literature due to a lack of multi-modal datasets. Texture-less regions are problematic for structure from motion and stereo, reflective material poses issues for active sensing, and distances for translucent objects are intricate to measure with existing hardware. Training on inaccurate or corrupt data induces model bias and hampers generalisation capabilities. These effects remain unnoticed if the sensor measurement is considered as ground truth during the evaluation. This paper investigates the effect of sensor errors for the dense 3D vision tasks of depth estimation and reconstruction. We rigorously show the significant impact of sensor characteristics on the learned predictions and notice generalisation issues arising from various technologies in everyday household environments. For evaluation, we introduce a carefully designed dataset (available at https://github.com/Junggy/HAMMER-dataset) comprising measurements from commodity sensors, namely D-ToF, I-ToF, passive/active stereo, and monocular RGB+P. Our study quantifies the considerable sensor noise impact and paves the way to improved dense vision estimates and targeted data fusion.

Submitted 26 March, 2023; originally announced March 2023.

Comments: Accepted at CVPR 2023, Main Paper + Supp. Mat. arXiv admin note: substantial text overlap with arXiv:2205.04565

arXiv:2205.04565 [pdf, other]

Is my Depth Ground-Truth Good Enough? HAMMER -- Highly Accurate Multi-Modal Dataset for DEnse 3D Scene Regression

Authors: HyunJun Jung, Patrick Ruhkamp, Guangyao Zhai, Nikolas Brasch, Yitong Li, Yannick Verdie, Jifei Song, Yiren Zhou, Anil Armagan, Slobodan Ilic, Ales Leonardis, Benjamin Busam

Abstract: Depth estimation is a core task in 3D computer vision. Recent methods investigate the task of monocular depth trained with various depth sensor modalities. Every sensor has its advantages and drawbacks caused by the nature of estimates. In the literature, mostly mean average error of the depth is investigated and sensor capabilities are typically not discussed. Especially indoor environments, however, pose challenges for some devices. Textureless regions pose challenges for structure from motion, reflective materials are problematic for active sensing, and distances for translucent material are intricate to measure with existing sensors. This paper proposes HAMMER, a dataset comprising depth estimates from multiple commonly used sensors for indoor depth estimation, namely ToF, stereo, structured light together with monocular RGB+P data. We construct highly reliable ground truth depth maps with the help of 3D scanners and aligned renderings. A popular depth estimator is trained on this data and typical depth sensors. The estimates are extensively analyzed on different scene structures. We notice generalization issues arising from various sensor technologies in household environments with challenging but everyday scene content. HAMMER, which we make publicly available, provides a reliable base to pave the way to targeted depth improvements and sensor fusion approaches.

Submitted 9 May, 2022; originally announced May 2022.

arXiv:2003.13764 [pdf, other]

Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction

Authors: Anil Armagan, Guillermo Garcia-Hernando, Seungryul Baek, Shreyas Hampali, Mahdi Rad, Zhaohui Zhang, Shipeng Xie, MingXiu Chen, Boshen Zhang, Fu Xiong, Yang Xiao, Zhiguo Cao, Junsong Yuan, Pengfei Ren, Weiting Huang, Haifeng Sun, Marek Hrúz, Jakub Kanis, Zdeněk Krňoul, Qingfu Wan, Shile Li, Linlin Yang, Dongheui Lee, Angela Yao, Weiguo Zhou, et al. (10 additional authors not shown)

Abstract: We study how well different types of approaches generalise in the task of 3D hand pose estimation under single hand scenarios and hand-object interaction. We show that the accuracy of state-of-the-art methods can drop, and that they fail mostly on poses absent from the training set. Unfortunately, since the space of hand poses is highly dimensional, it is inherently not feasible to cover the whole space densely, despite recent efforts in collecting large-scale training datasets. This sampling problem is even more severe when hands are interacting with objects and/or inputs are RGB rather than depth images, as RGB images also vary with lighting conditions and colors. To address these issues, we designed a public challenge (HANDS'19) to evaluate the abilities of current 3D hand pose estimators (HPEs) to interpolate and extrapolate the poses of a training set. More exactly, HANDS'19 is designed (a) to evaluate the influence of both depth and color modalities on 3D hand pose estimation, under the presence or absence of objects; (b) to assess the generalisation abilities w.r.t. four main axes: shapes, articulations, viewpoints, and objects; (c) to explore the use of a synthetic hand model to fill the gaps of current datasets. Through the challenge, the overall accuracy has dramatically improved over the baseline, especially on extrapolation tasks, from 27mm to 13mm mean joint error. Our analyses highlight the impacts of: Data pre-processing, ensemble approaches, the use of a parametric 3D hand model (MANO), and different HPE methods/backbones.

Submitted 10 September, 2020; v1 submitted 30 March, 2020; originally announced March 2020.

Comments: European Conference on Computer Vision (ECCV), 2020

arXiv:2003.12344 [pdf, other]

Introducing Pose Consistency and Warp-Alignment for Self-Supervised 6D Object Pose Estimation in Color Images

Authors: Juil Sock, Guillermo Garcia-Hernando, Anil Armagan, Tae-Kyun Kim

Abstract: Most successful approaches to estimate the 6D pose of an object typically train a neural network by supervising the learning with annotated poses in real world images. These annotations are generally expensive to obtain and a common workaround is to generate and train on synthetic scenes, with the drawback of limited generalisation when the model is deployed in the real world. In this work, a two-stage 6D object pose estimator framework that can be applied on top of existing neural-network-based approaches and that does not require pose annotations on real images is proposed. The first self-supervised stage enforces the pose consistency between rendered predictions and real input images, narrowing the gap between the two domains. The second stage fine-tunes the previously trained model by enforcing the photometric consistency between pairs of different object views, where one image is warped and aligned to match the view of the other and thus enabling their comparison. In the absence of both real image annotations and depth information, applying the proposed framework on top of two recent approaches results in state-of-the-art performance when compared to methods trained only on synthetic data, domain adaptation baselines and a concurrent self-supervised approach on LINEMOD, LINEMOD OCCLUSION and HomebrewedDB datasets.

Submitted 16 October, 2020; v1 submitted 27 March, 2020; originally announced March 2020.

Comments: Accepted to 3DV'2020 as Oral

arXiv:1910.10653 [pdf, other]

doi 10.1109/ICASSP40776.2020.9053627

Accurate 6D Object Pose Estimation by Pose Conditioned Mesh Reconstruction

Authors: Pedro Castro, Anil Armagan, Tae-Kyun Kim

Abstract: Current 6D object pose methods consist of deep CNN models fully optimized for a single object but with its architecture standardized among objects with different shapes. In contrast to previous works, we explicitly exploit each object's distinct topological information, i.e. 3D dense meshes, in the pose estimation model, with an automated process and prior to any post-processing refinement stage. In order to achieve this, we propose a learning framework in which a Graph Convolutional Neural Network reconstructs a pose conditioned 3D mesh of the object. A robust estimation of the allocentric orientation is recovered by computing, in a differentiable manner, the Procrustes' alignment between the canonical and reconstructed dense 3D meshes. 6D egocentric pose is then lifted using additional mask and 2D centroid projection estimations. Our method is capable of self validating its pose estimation by measuring the quality of the reconstructed mesh, which is invaluable in real life applications. In our experiments on the LINEMOD, OCCLUSION and YCB-Video benchmarks, the proposed method outperforms the state of the art.

Submitted 23 October, 2019; originally announced October 2019.

arXiv:1401.0730 [pdf]

doi 10.1109/CVPRW.2014.123

What is usual in unusual videos? Trajectory snippet histograms for discovering unusualness

Authors: Ahmet Iscen, Anil Armagan, Pinar Duygulu

Abstract: Unusual events are important as being possible indicators of undesired consequences. Moreover, unusualness in everyday life activities may also be amusing to watch as proven by the popularity of such videos shared in social media. Discovery of unusual events in videos is generally attacked as a problem of finding usual patterns, and then separating the ones that do not resemble those. In this study, we address the problem from the other side, and try to answer what type of patterns are shared among unusual videos that make them resemble each other regardless of the ongoing event. With this challenging problem at hand, we propose a novel descriptor to encode the rapid motions in videos utilizing densely extracted trajectories. The proposed descriptor, which is referred to as trajectory snippet histograms, is used to distinguish unusual videos from usual videos, and further exploited to discover snapshots in which unusualness happens. Experiments on domain specific people falling videos and unrestricted funny videos show the effectiveness of our method in capturing unusualness.

Submitted 2 November, 2014; v1 submitted 3 January, 2014; originally announced January 2014.

Journal ref: Computer Vision and Pattern Recognition Workshops (CVPRW), 2014 IEEE Conference on

arXiv:1207.4854 [pdf, ps, other]

Finite sample posterior concentration in high-dimensional regression

Authors: Nate Strawn, Artin Armagan, Rayan Saab, Lawrence Carin, David Dunson

Abstract: We study the behavior of the posterior distribution in high-dimensional Bayesian Gaussian linear regression models having $p\gg n$, with $p$ the number of predictors and $n$ the sample size. Our focus is on obtaining quantitative finite sample bounds ensuring sufficient posterior probability assigned in neighborhoods of the true regression coefficient vector, $β^0$, with high probability. We assume that $β^0$ is approximately $S$-sparse and obtain universal bounds, which provide insight into the role of the prior in controlling concentration of the posterior. Based on these finite sample bounds, we examine the implied asymptotic contraction rates for several examples showing that sparsely-structured and heavy-tail shrinkage priors exhibit rapid contraction rates. We also demonstrate that a stronger result holds for the Uniform-Gaussian prior (a binary vector of indicators $γ$ is drawn from the uniform distribution on the set of binary sequences with exactly $S$ ones, and then each $β_i\sim\mathcal{N}(0,V^2)$ if $γ_i=1$ and $β_i=0$ if $γ_i=0$). These types of finite sample bounds provide guidelines for designing and evaluating priors for high-dimensional problems.

Submitted 3 January, 2014; v1 submitted 20 July, 2012; originally announced July 2012.

arXiv:1201.3528 [pdf, ps, other]

Path Following and Empirical Bayes Model Selection for Sparse Regression

Authors: Hua Zhou, Artin Armagan, David B. Dunson

Abstract: In recent years, a rich variety of regularization procedures have been proposed for high dimensional regression problems. However, tuning parameter choice and computational efficiency in ultra-high dimensional problems remain vexing issues. The routine use of $\ell_1$ regularization is largely attributable to the computational efficiency of the LARS algorithm, but similar efficiency for better behaved penalties has remained elusive. In this article, we propose a highly efficient path following procedure for combination of any convex loss function and a broad class of penalties. From a Bayesian perspective, this algorithm rapidly yields maximum a posteriori estimates at different hyper-parameter values. To bypass the inefficiency and potential instability of cross validation, we propose an empirical Bayes procedure for rapidly choosing the optimal model and corresponding hyper-parameter value. This approach applies to any penalty that corresponds to a proper prior distribution on the regression coefficients. While we mainly focus on sparse estimation of generalized linear models, the method extends to more general regularizations such as polynomial trend filtering after reparameterization. The proposed algorithm scales efficiently to large $p$ and/or $n$. Solution paths of 10,000 dimensional examples are computed within one minute on a laptop for various generalized linear models (GLM). Operating characteristics are assessed through simulation studies and the methods are applied to several real data sets.

Submitted 17 January, 2012; originally announced January 2012.

Comments: 35 pages, 13 figures

arXiv:1107.4976 [pdf, other]

Generalized Beta Mixtures of Gaussians

Authors: Artin Armagan, David B. Dunson, Merlise Clyde

Abstract: In recent years, a rich variety of shrinkage priors have been proposed that have great promise in addressing massive regression problems. In general, these new priors can be expressed as scale mixtures of normals, but have more complex forms and better properties than traditional Cauchy and double exponential priors. We first propose a new class of normal scale mixtures through a novel generalized beta distribution that encompasses many interesting priors as special cases. This encompassing framework should prove useful in comparing competing priors, considering properties and revealing close connections. We then develop a class of variational Bayes approximations through the new hierarchy presented that will scale more efficiently to the types of truly massive data sets that are now encountered routinely.

Submitted 13 March, 2012; v1 submitted 25 July, 2011; originally announced July 2011.

Comments: Advances in Neural Information Processing Systems 24 edited by J. Shawe-Taylor and R.S. Zemel and P. Bartlett and F. Pereira and K.Q. Weinberger (2011)

arXiv:1104.4135 [pdf, ps, other]

doi 10.1093/biomet/ast028

Posterior consistency in linear models under shrinkage priors

Authors: Artin Armagan, David B. Dunson, Jaeyong Lee, Waheed U. Bajwa, Nate Strawn

Abstract: We investigate the asymptotic behavior of posterior distributions of regression coefficients in high-dimensional linear models as the number of dimensions grows with the number of observations. We show that the posterior distribution concentrates in neighborhoods of the true parameter under simple sufficient conditions. These conditions hold under popular shrinkage priors given some sparsity assumptions.

Submitted 19 May, 2013; v1 submitted 20 April, 2011; originally announced April 2011.

Comments: To appear in Biometrika

Journal ref: Biometrika, vol. 100, no. 4, pp. 1011-1018, Dec. 2013

arXiv:1104.0861 [pdf, other]

Generalized double Pareto shrinkage

Authors: Artin Armagan, David Dunson, Jaeyong Lee

Abstract: We propose a generalized double Pareto prior for Bayesian shrinkage estimation and inferences in linear models. The prior can be obtained via a scale mixture of Laplace or normal distributions, forming a bridge between the Laplace and Normal-Jeffreys' priors. While it has a spike at zero like the Laplace density, it also has a Student's $t$-like tail behavior. Bayesian computation is straightforward via a simple Gibbs sampling algorithm. We investigate the properties of the maximum a posteriori estimator, as sparse estimation plays an important role in many problems, reveal connections with some well-established regularization procedures, and show some asymptotic results. The performance of the prior is tested through simulations and an application.

Submitted 26 January, 2013; v1 submitted 5 April, 2011; originally announced April 2011.

Journal ref: Statistica Sinica 23 (2013), 119-143

arXiv:0803.2173 [pdf, ps, other]

Adaptive Ridge Selector (ARiS)

Authors: Artin Armagan, Russell Zaretzki

Abstract: We introduce a new shrinkage variable selection operator for linear models which we term the adaptive ridge selector (ARiS). This approach is inspired by the relevance vector machine (RVM), which uses a Bayesian hierarchical linear setup to do variable selection and model estimation. Extending the RVM algorithm, we include a proper prior distribution for the precisions of the regression coefficients, $v_{j}^{-1} \sim f(v_{j}^{-1}|η)$, where $η$ is a scalar hyperparameter. A novel fitting approach which utilizes the full set of posterior conditional distributions is applied to maximize the joint posterior distribution $p(\boldsymbol{β},σ^{2},\mathbf{v}^{-1}|\mathbf{y},η)$ given the value of the hyper-parameter $η$. An empirical Bayes method is proposed for choosing $η$. This approach is contrasted with other regularized least squares estimators including the lasso, its variants, nonnegative garrote and ordinary ridge regression. Performance differences are explored for various simulated data examples. Results indicate superior prediction and model selection accuracy under sparse setups and drastic improvement in accuracy of model choice with increasing sample size.

Submitted 28 May, 2008; v1 submitted 14 March, 2008; originally announced March 2008.

arXiv:0711.3765 [pdf, other]

MCMC Inference for a Model with Sampling Bias: An Illustration using SAGE data

Authors: Russell Zaretzki, Michael A. Gilchrist, William M. Briggs, Artin Armagan

Abstract: This paper explores Bayesian inference for a biased sampling model in situations where the population of interest cannot be sampled directly, but rather through an indirect and inherently biased method. Observations are viewed as being the result of a multinomial sampling process from a tagged population which is, in turn, a biased sample from the original population of interest. This paper presents several Gibbs Sampling techniques to estimate the joint posterior distribution of the original population based on the observed counts of the tagged population. These algorithms efficiently sample from the joint posterior distribution of a very large multinomial parameter vector. Samples from this method can be used to generate both joint and marginal posterior inferences. We also present an iterative optimization procedure based upon the conditional distributions of the Gibbs Sampler which directly computes the mode of the posterior distribution. To illustrate our approach, we apply it to a tagged population of messenger RNAs (mRNA) generated using a common high-throughput technique, Serial Analysis of Gene Expression (SAGE). Inferences for the mRNA expression levels in the yeast Saccharomyces cerevisiae are reported.

Submitted 23 November, 2007; originally announced November 2007.

arXiv:0711.3657

Bayesian Shrinkage Variable Selection

Authors: Artin Armagan, Russell L. Zaretzki

Abstract: Withdrawn due to extensions and submission as another paper.

Submitted 16 April, 2008; v1 submitted 23 November, 2007; originally announced November 2007.

Showing 1–14 of 14 results for author: Armagan, A