Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content
  • Montréal, Québec, Canada
The development of autonomous vehicles provides an opportunity to have a complete set of camera sensors capturing the environment around the car. Thus, it is important for object detection and tracking to address new challenges, such as... more
The development of autonomous vehicles provides an opportunity to have a complete set of camera sensors capturing the environment around the car. Thus, it is important for object detection and tracking to address new challenges, such as achieving consistent results across views of cameras. To address these challenges, this work presents a new Global Association Graph Model with Link Prediction approach to predict existing tracklets location and link detections with tracklets via cross-attention motion modeling and appearance re-identification. This approach aims at solving issues caused by inconsistent 3D object detection. Moreover, our model exploits to improve the detection accuracy of a standard 3D object detector in the nuScenes detection challenge. The experimental results on the nuScenes dataset demonstrate the benefits of the proposed method to produce SOTA performance on the existing vision-based tracking dataset.
Extracting useful information while ignoring others (e.g. noise, occlusion, lighting) is an essential and challenging data analyzing step for many computer vision tasks such as facial recognition, scene reconstruction, event detection,... more
Extracting useful information while ignoring others (e.g. noise, occlusion, lighting) is an essential and challenging data analyzing step for many computer vision tasks such as facial recognition, scene reconstruction, event detection, image restoration, etc. Data analyzing of those tasks can be formulated as a form of matrix decomposition or factorization to separate useful and/or fill in missing information based on sparsity and/or low-rankness of the data. There has been an increasing number of non-convex approaches including conventional matrix norm optimizing and emerging deep learning models. However, it is hard to optimize the ideal l0-norm or learn the deep models directly and efficiently. Motivated from this challenging process, this thesis proposes two sets of approaches: conventional and deep learning based. For conventional approaches, this thesis proposes a novel online non-convex lp-norm based Robust PCA (OLP-RPCA) approach for matrix decomposition, where 0 < p <...
Large-scale face recognition in-the-wild has been recently achieved matured performance in many real work applications. However, such systems are built on GPU platforms and mostly deploy heavy deep network architectures. Given a... more
Large-scale face recognition in-the-wild has been recently achieved matured performance in many real work applications. However, such systems are built on GPU platforms and mostly deploy heavy deep network architectures. Given a high-performance heavy network as a teacher, this work presents a simple and elegant teacher-student learning paradigm, namely ShrinkTeaNet, to train a portable student network that has significantly fewer parameters and competitive accuracy against the teacher network. Far apart from prior teacher-student frameworks mainly focusing on accuracy and compression ratios in closed-set problems, our proposed teacher-student network is proved to be more robust against open-set problem, i.e. large-scale face recognition. In addition, this work introduces a novel Angular Distillation Loss for distilling the feature direction and the sample distributions of the teacher's hypersphere to its student. Then ShrinkTeaNet framework can efficiently guide the student&#39...
Face Aging has raised considerable attentions and interest from the computer vision community in recent years. Numerous approaches ranging from purely image processing techniques to deep learning structures have been proposed in... more
Face Aging has raised considerable attentions and interest from the computer vision community in recent years. Numerous approaches ranging from purely image processing techniques to deep learning structures have been proposed in literature. In this paper, we aim to give a review of recent developments of modern deep learning based approaches, i.e. Deep Generative Models, for Face Aging task. Their structures, formulation, learning algorithms as well as synthesized results are also provided with systematic discussions. Moreover, the aging databases used in most methods to learn the aging process are also reviewed.
Disentangled representations have been commonly adopted to Age-invariant Face Recognition (AiFR) tasks. However, these methods have reached some limitations with (1) the requirement of large-scale face recognition (FR) training data with... more
Disentangled representations have been commonly adopted to Age-invariant Face Recognition (AiFR) tasks. However, these methods have reached some limitations with (1) the requirement of large-scale face recognition (FR) training data with age labels, which is limited in practice; (2) heavy deep network architecture for high performance; and (3) their evaluations are usually taken place on age-related face databases while neglecting the standard large-scale FR databases to guarantee its robustness. This work presents a novel Attentive Angular Distillation (AAD) approach to Large-scale Lightweight AiFR that overcomes these limitations. Given two high-performance heavy networks as teachers with different specialized knowledge, AAD introduces a learning paradigm to efficiently distill the age-invariant attentive and angular knowledge from those teachers to a lightweight student network making it more powerful with higher FR accuracy and robust against age factor. Consequently, AAD approa...
Active Contour (AC)-based segmentation has been widely used to solve many image processing problems, specially image segmentation. While these AC-based methods offer object shape constraints, they typically look for strong edges or... more
Active Contour (AC)-based segmentation has been widely used to solve many image processing problems, specially image segmentation. While these AC-based methods offer object shape constraints, they typically look for strong edges or statistical modeling for successful segmentation. Clearly, AC-based approaches lack a way to work with labeled images in a supervised machine learning framework. Furthermore, they are unsupervised approaches and strongly depend on many parameters which are chosen by empirical results. Recently, Deep Learning (DL) has become the go-to method for solving many problems in various areas. Over the past decade, DL has achieved remarkable success in various artificial intelligence research areas. DL is supervised methods and requires large volume ground-truth. This paper first provides a fundamental of both Active Contour techniques and Deep Learning framework. We then present the state-of-the-art approaches of Active Contour techniques incorporating in Deep Learning framework.
ABSTRACT
Face recognition under various conditions such as illumination, poses, expression, and occlusion has been one of the most challenging problems in computer vision. Over the last few years there has been significant attention paid to the... more
Face recognition under various conditions such as illumination, poses, expression, and occlusion has been one of the most challenging problems in computer vision. Over the last few years there has been significant attention paid to the low-rank approximation (LRA) and sparse representation (SR) techniques. The applications of these techniques have appeared in many different areas ranging from handwritten character recognition to multi-factor face recognition. In this paper, we will review some of the most recent works using LRA and SR in the multi-factor face recognition problem, and present a novel framework to improve their performance in the recognition of faces under various affecting conditions. Our results are comparable to or better than the state-of-the-art in this area.
Group-level emotion recognition (ER) is a growing research area as the demands for assessing crowds of all sizes is becoming an interest in both the security arena and social media. This work investigates group-level expression... more
Group-level emotion recognition (ER) is a growing research area as the demands for assessing crowds of all sizes is becoming an interest in both the security arena and social media. This work investigates group-level expression recognition on crowd videos where information is not only aggregated across a variable length sequence of frames but also over the set of faces within each frame to produce aggregated recognition results. In this paper, we propose an effective deep feature level fusion mechanism to model the spatial-temporal information in the crowd videos. Furthermore, we extend our proposed NVP fusion mechanism to temporal NVP fussion appoarch to learn the temporal information between frames. In order to demonstrate the robustness and effectiveness of each component in the proposed approach, three experiments were conducted: (i) evaluation on the AffectNet database to benchmark the proposed emoNet for recognizing facial expression; (ii) evaluation on EmotiW2018 to benchmark...
Variational Level Set (LS) has been a widely used method in medical segmentation. However, it is limited when dealing with multi-instance objects in the real world. In addition, its segmentation results are quite sensitive to initial... more
Variational Level Set (LS) has been a widely used method in medical segmentation. However, it is limited when dealing with multi-instance objects in the real world. In addition, its segmentation results are quite sensitive to initial settings and highly depend on the number of iterations. To address these issues and boost the classic variational LS methods to a new level of the learnable deep learning approaches, we propose a novel definition of contour evolution named Recurrent Level Set (RLS) 1 to employ Gated Recurrent Unit under the energy minimization of a variational LS functional. The curve deformation process in RLS is formed as a hidden state evolution procedure and updated by minimizing an energy functional composed of fitting forces and contour length. By sharing the convolutional features in a fully end-to-end trainable framework, we extend RLS to Contextual RLS (CRLS) to address semantic segmentation in the wild. The experimental results have shown that our proposed RLS...
There has been considerable research in the last several years based on the principles of Active Appearance Models (AAMs). AAM is a robust methodology for general image (object) descriptions that incorporates shape and texture... more
There has been considerable research in the last several years based on the principles of Active Appearance Models (AAMs). AAM is a robust methodology for general image (object) descriptions that incorporates shape and texture information. In this work, we extend the basic AAMs by developing a new method for texture description for the application of human facial modeling and synthesizing. The premise is to develop a better texture based model for the face that incorporates specific facial information such as wrinkling, micro-features (e.g., moles, scares, freckles, etc.), and aging features (e.g., sagging, hollowing of checks, weight gain, etc.) This paper proposes a new improvement in texture synthesis-Gabor Wavelet-based Appearance Model. The experimental results demonstrate the potential of this approach for face-based applications.