Machine learning methods for face modelling and analysis in-the-wild

File                              Description   Size       Format
Kossaifi-J-2019-PhD-Thesis.pdf    Thesis        24.17 MB   Adobe PDF
Title: Machine learning methods for face modelling and analysis in-the-wild
Authors: Kossaifi, Jean
Item Type: Thesis or dissertation
Abstract: Automatic facial analysis lies at the intersection of computer vision and machine learning. It consists of two main steps. The first is facial alignment, which typically consists of detecting a set of fiducial points, or landmarks, on the face. Second, the aligned faces are used, either directly as pixel intensities or after extracting more robust features (hand-crafted ones such as histograms of oriented gradients, or ones learned with a deep neural network), as input to estimate emotional states. In this thesis, we develop a complete pipeline for facial analysis in real-life, naturalistic conditions (in-the-wild), covering both steps. We first explore generative models for facial alignment, namely Active Appearance Models (AAMs). Specifically, we introduce a new second-order method for fitting AAMs. We then introduce a bidirectional method that simultaneously deforms the model and the image, leading to faster convergence. In both cases, we leverage the structure of the problem to obtain exact solutions with better computational complexity. We show that, when trained in-the-wild, these methods achieve state-of-the-art performance while requiring smaller datasets than discriminative methods. We also demonstrate how to leverage the statistical shape model and motion model from AAMs to constrain generative adversarial networks. We then build on the facial alignment framework to estimate dimensional measures of emotion. Specifically, we estimate continuous levels of valence (how positive or negative a state of mind is) and arousal (how exciting or calming the experience is). To do so, we introduce a new database of images collected in-the-wild and annotated per frame in terms of continuous levels of valence and arousal, along with accurate facial landmarks. We then demonstrate the importance of training models on data collected in-the-wild, as opposed to existing databases, which were mainly collected in laboratory or controlled environments.
While developing tools for better facial analysis, it became clear that, although the data we work with has a rich multi-linear structure (e.g. spatial and temporal), this structure is discarded by current methods. We therefore endeavoured to devise new methods able to leverage it. In particular, given the absence of software for tensor methods, we created TensorLy, a high-level API for tensor algebra, decomposition and regression in Python. Its flexible backend system makes it possible to seamlessly run computation on various hardware with several libraries, including deep learning libraries such as PyTorch, TensorFlow or MXNet. This allowed us to introduce new ways of combining tensor methods with deep learning, such as tensor contraction and regression layers. This type of hybrid method combines the power of tensor algebra with the efficiency of deep learning. It makes it possible to devise efficient algorithms that achieve state-of-the-art performance and scale to very large datasets, while enabling large savings in parameter space.
Content Version: Open Access
Issue Date: Oct-2018
Date Awarded: Feb-2019
URI: http://hdl.handle.net/10044/1/67896
DOI: https://doi.org/10.25560/67896
Copyright Statement: Creative Commons Attribution NonCommercial NoDerivatives Licence
Supervisor: Pantic, Maja
Department: Computing
Publisher: Imperial College London
Qualification Level: Doctoral
Qualification Name: Doctor of Philosophy (PhD)
Appears in Collections: Computing PhD theses