3D hand pose regression with variants of decision forests

Tang, Danhang

File	Description	Size	Format
Tang-D-2016-PhD-Thesis.pdf	Thesis	50.63 MB	Adobe PDF

Title:	3D hand pose regression with variants of decision forests
Authors:	Tang, Danhang
Item Type:	Thesis or dissertation
Abstract:	3D hand pose regression is a fundamental component in many modern human-computer interaction applications such as sign language recognition, virtual object manipulation and game control. This thesis focuses on 3D pose regression of a single hand from depth data. The problem poses many challenges, including high degrees of freedom, severe viewpoint changes, self-occlusion and sensor noise. The main contribution of this work is a series of decision forest-based methods, proposed progressively so that each improves upon the previous one, with state-of-the-art performance achieved in the end. The thesis first introduces a novel algorithm called semi-supervised transductive regression (STR) forest, which combines transductive learning and semi-supervised learning to bridge the gap between synthetically generated, noise-free training data and real noisy data. Moreover, it incorporates a coarse-to-fine training quality function to handle viewpoint changes more efficiently. As a patch-based method, STR forest has high complexity during inference. To address this, the thesis proposes latent regression forest (LRF), a method that models the pose estimation problem as a coarse-to-fine search. This inherently combines the efficiency of a holistic method with the flexibility of a patch-based method, and thus runs at 62.5 FPS without CPU/GPU optimisation. Targeting the drawbacks of LRF, a new algorithm called hierarchical sampling forests is proposed, which models the problem as a progressive search guided by kinematic structure. The intermediate results (partial poses) can therefore be verified by a new efficient energy function, and consequently the method produces more accurate full poses. All these methods are thoroughly described, compared and published. The conclusion discusses and analyses their differences, limitations and usage scenarios, and then proposes a few ideas for future work.
Content Version:	Open Access
Issue Date:	Sep-2015
Date Awarded:	Mar-2016
URI:	http://hdl.handle.net/10044/1/31531
DOI:	https://doi.org/10.25560/31531
Supervisor:	Kim, Tae-Kyun
Department:	Electrical and Electronic Engineering
Publisher:	Imperial College London
Qualification Level:	Doctoral
Qualification Name:	Doctor of Philosophy (PhD)
Appears in Collections:	Electrical and Electronic Engineering PhD theses

Unless otherwise indicated, items in Spiral are protected by copyright and are licensed under a Creative Commons Attribution NonCommercial NoDerivatives License.