Full Professor, Dept. of Electrical Engineering, Chalmers Univ. of Technology, Gothenburg, Sweden. Current main research: machine learning/deep learning; digital signal/image analysis; MR image analysis for AI-assisted diagnosis; big data analytics with applications; video object tracking, detection and classification; e-healthcare. Spare-time interests: outdoor activities, walking, reading, yoga.
This technical report describes research on automatically recognizing Chinese traffic signs from an implicit public resource, i.e. street views. First, we give a comprehensive survey of Chinese traffic signs and introduce our approaches for collecting street view images that can be used for experimental purposes. Then, we introduce our coarse-to-fine recognition framework consisting of sign detection, sign salient region segmentation, feature extraction (including simple text recognition from signs), and subsequent sign classification. We also propose to incrementally build a sign dataset in a semi-automatic way, aiming at reducing manual effort. Experiments on the collected datasets for both sign detection and classification have validated that the proposed framework is feasible and capable of recognizing multiple categories of Chinese traffic signs in a single input image.
This paper proposes a machine-learning-based framework for voltage quality analytics, where the space phasor model (SPM) of the three-phase voltages before, during, and after the event is applied as input data. The framework proceeds in three main steps: (a) event extraction, (b) event characterization, and (c) additional information extraction. In the first step, it uses a Gaussian-based anomaly detection (GAD) technique to extract the event data from the recording. Principal component analysis (PCA) is adopted in the second step, where it is shown that the principal components correspond to the semi-minor and semi-major axes of the ellipse formed by the SPM. In the third step, these characteristics are interpreted to extract additional information about the underlying cause of the event. The performance of the framework was verified through experiments conducted on datasets containing synthetic and measured power quality events. The results show that the co...
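The link between PCA and the ellipse axes can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the space phasor uses the common (2/3) amplitude-invariant scaling, and the conversion from covariance eigenvalues to semi-axis lengths (the square root of twice each eigenvalue) is an assumption that holds only when the samples cover whole cycles uniformly.

```python
import cmath
import math


def space_phasor(va, vb, vc):
    """Instantaneous space phasor of three phase voltages (amplitude-invariant scaling)."""
    a = cmath.exp(2j * math.pi / 3)
    return (2.0 / 3.0) * (va + a * vb + a * a * vc)


def ellipse_axes_by_pca(points):
    """Return (semi_major, semi_minor) of the ellipse traced by complex samples.

    The 2x2 covariance of (real, imag) is diagonalised in closed form.
    Assumption: samples are spread uniformly over whole cycles, so the
    variance along each principal direction equals (semi_axis ** 2) / 2.
    """
    n = len(points)
    mx = sum(p.real for p in points) / n
    my = sum(p.imag for p in points) / n
    sxx = sum((p.real - mx) ** 2 for p in points) / n
    syy = sum((p.imag - my) ** 2 for p in points) / n
    sxy = sum((p.real - mx) * (p.imag - my) for p in points) / n
    centre = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    lam_max = centre + spread
    lam_min = max(centre - spread, 0.0)  # guard against tiny negative round-off
    return math.sqrt(2.0 * lam_max), math.sqrt(2.0 * lam_min)


def synthetic_dip(n=200, depth_a=0.5):
    """Hypothetical test signal: one cycle with phase A reduced to depth_a per unit."""
    samples = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        samples.append(space_phasor(depth_a * math.cos(t),
                                    math.cos(t - 2.0 * math.pi / 3),
                                    math.cos(t + 2.0 * math.pi / 3)))
    return samples
```

For a balanced system both axes are equal and the phasor traces a circle; a single-phase dip flattens the circle into an ellipse whose minor axis shrinks with the dip depth.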
This paper describes a new and highly efficient measurement method (algorithm) that determines how flicker propagates throughout the network and also traces the dominant flicker source. The fundamental principle of the method is to use the
In the context of diffusion tensor imaging (DTI), the utility of making repeated measurements in each diffusion sensitizing direction has been the subject of numerous studies. One can estimate the true signal value using either the raw complex-valued data or the real-valued magnitude signal. While conventional methods focus on the former strategy, this paper proposes a new framework for acquiring/processing repeated measurements based on the latter strategy. The aim is to enhance the DTI processing pipeline by adding a diffusion signal estimator (DSE). This permits us to exploit the knowledge of the noise distribution to estimate the true signal value in each direction. An extensive study of the proposed framework, including theoretical analysis, experiments with synthetic data, performance evaluation and comparisons, is presented. Our results show that the precision of estimated diffusion parameters is dependent on the number of available samples and the manner in which the DSE acco...
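As a rough illustration of estimating a true signal value from repeated magnitude measurements, the sketch below uses a simple second-moment estimator. This is not the DSE proposed in the paper, whose details are not given here. It assumes the magnitude data follow a Rician distribution with a known noise standard deviation `sigma`, for which the expected squared magnitude equals the squared true signal plus twice the noise variance; the function name is hypothetical.

```python
import math


def second_moment_estimate(magnitudes, sigma):
    """Estimate the true signal amplitude from repeated magnitude samples.

    Assumption: Rician noise with known sigma, so E[M^2] = A^2 + 2 * sigma^2.
    The result is clipped at zero when the noise term exceeds the sample mean square.
    """
    if not magnitudes:
        raise ValueError("at least one measurement is required")
    mean_square = sum(m * m for m in magnitudes) / len(magnitudes)
    return math.sqrt(max(mean_square - 2.0 * sigma * sigma, 0.0))
```

Averaging the magnitudes directly would overestimate a weak signal, because the magnitude operation rectifies the noise; subtracting the known noise contribution from the mean square is one simple way to counter that bias.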
2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2016
Central nervous system dysfunction in infants may be manifested through inconsistent, rigid and abnormal limb movements. Detection of limb movement anomalies associated with such neurological dysfunctions in infants is the first step towards early treatment for improving infant development. This paper addresses the issue of detecting and quantifying limb movement anomalies in infants through non-invasive 3D image analysis methods using videos from multiple camera views. We propose a novel scheme for tracking 3D time trajectories of markers on infants' limbs by video analysis techniques. The proposed scheme employs videos captured from three camera views. This enables us to detect a set of enhanced 3D markers through cross-view matching and to effectively handle marker self-occlusions by other body parts. We track a set of 3D trajectories of limb movements by a set of particle filters in parallel, enabling more robust 3D tracking of markers, and use the 3D model errors for quantifying abrupt limb movements. The proposed work significantly advances the previous work in [1] by tracking in 3D space, and hence overcomes several main barriers that hinder real applications of single-camera-based techniques. To the best of our knowledge, applying such a multi-view video analysis approach for assessing neurological dysfunctions of infants through 3D time trajectories of markers on limbs is novel, and could lead to computer-aided tools for diagnosis of dysfunctions where early treatment may improve infant development. Experiments were conducted on multi-view neonate videos recorded in a clinical setting, and the results have provided further support for the proposed method.
2015 IEEE International Conference on Image Processing (ICIP), 2015
This paper addresses issues in fall detection from videos. The focus is on the analysis of human shapes, which deform drastically in camera views while a person falls onto the ground. A novel approach is proposed that performs fall detection from an arbitrary view angle, via shape analysis on a unified Riemannian manifold for different camera views. The main novelties of this paper include: (a) representing dynamic shapes as points moving on a unit n-sphere, one of the simplest Riemannian manifolds; (b) characterizing the deformation of shapes by computing velocity statistics of their corresponding manifold points, based on geodesic distances on the manifold. Experiments have been conducted on two publicly available video datasets for fall detection. Tests, evaluations and comparisons with 6 existing methods show the effectiveness of our proposed method.
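The idea of points moving on a unit n-sphere can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes each video frame has already been reduced to a non-zero feature vector (how shapes are encoded is not specified here), and the function names and the choice of mean and variance as the velocity statistics are assumptions.

```python
import math


def to_unit_sphere(vec):
    """Scale a non-zero feature vector to unit length, i.e. a point on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("a zero vector has no projection onto the sphere")
    return [x / norm for x in vec]


def geodesic_distance(p, q):
    """Great-circle distance between two unit vectors (angle in radians)."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against round-off


def velocity_statistics(frames, dt=1.0):
    """Mean and variance of the geodesic speed between consecutive frames."""
    points = [to_unit_sphere(f) for f in frames]
    speeds = [geodesic_distance(p, q) / dt for p, q in zip(points, points[1:])]
    if not speeds:
        raise ValueError("at least two frames are required")
    mean = sum(speeds) / len(speeds)
    variance = sum((s - mean) ** 2 for s in speeds) / len(speeds)
    return mean, variance
```

A sequence whose shape vectors change abruptly, as in a fall, produces large geodesic speeds, while slow activities keep consecutive points close together on the sphere.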
2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), 2018
Automatic extraction of features from MRI brain scans and diagnosis of Alzheimer's Disease (AD) remains a challenging task. In this paper, we propose an efficient and simple three-dimensional convolutional network (3D ConvNet) architecture that is able to achieve high performance for detection of AD on a relatively large dataset. The proposed 3D ConvNet consists of five convolutional layers for feature extraction, followed by three fully-connected layers for AD/NC classification. The main contributions of the paper include: (a) proposing a novel and effective 3D ConvNet architecture; (b) studying the impact of hyper-parameter selection on the performance of AD classification; (c) studying the impact of pre-processing; (d) studying the impact of data partitioning; (e) studying the impact of dataset size. Experiments conducted on an ADNI dataset containing 340 subjects and 1198 MRI brain scans have yielded good performance (test accuracy of 98.74%, 100% AD detection rate and 2.4% false alarm rate). Comparisons with 7 existing state-of-the-art methods have provided strong support for the robustness of the proposed method.
This paper addresses issues in fall detection in videos. We propose a novel method to detect human falls from arbitrary view angles, by analyzing the dynamic shape and motion of image regions of human bodies on Riemannian manifolds. The proposed method exploits time-dependent dynamic features on smooth manifolds, based on the observation that human falls often involve drastic shape changes and abrupt motions compared with other activities. The main novelties of this paper include: (a) representing videos of human activities by dynamic shape points and motion points moving on two separate unit n-spheres, i.e. two simple Riemannian manifolds; (b) characterizing the dynamic shape and motion of each video activity by computing the velocity statistics on the two manifolds, based on geodesic distances; (c) combining the statistical features of dynamic shape and motion that are learned from their corresponding manifolds via mutual information. Experiments were conducted on three video datasets, containing 400 videos of 5 activities, 100 videos of 4 activities, and 768 videos of 3 activities, respectively, where videos were captured from cameras at different view angles. Our test results have shown a high detection rate (average 99.38%) and a low false alarm rate (average 1.84%). Comparisons with eight state-of-the-art methods have provided further support for the proposed method.
2015 17th International Conference on E-health Networking, Application & Services (HealthCom), 2015
This paper addresses issues in fall detection from videos. Since it is broadly accepted that a falling person usually undergoes large physical movement and displacement in a short time interval, the study focuses on measuring the intensity and temporal variation of pose change and body motion. The main novelties of this paper include: (a) characterizing pose/motion dynamics based on centroid velocity, head-to-centroid distance, histogram of oriented gradients and optical flow; (b) extracting compact features based on the mean and variance of pose/motion dynamics; (c) detecting humans by combining depth information and background mixture models. Experiments have been conducted on an RGB-D video dataset for fall detection. Tests and evaluations show the effectiveness of the proposed method.
2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), 2015
The icosahedral gradient encoding scheme (GES) is widely used in the diffusion MRI community due to its uniformly distributed orientations and rotationally invariant condition number. The major drawback of this scheme is that it is not available for an arbitrary number of measurements. In this paper (i) we propose an algorithm to find the icosahedral scheme for any number of measurements. The performance of the obtained GES is evaluated and compared with that of Jones and traditional icosahedral schemes in terms of condition number, standard deviation of the estimated fractional anisotropy and distribution of diffusion sensitizing directions; and (ii) we introduce the minimum eigenvalue of the information matrix as a new optimality metric to replace the condition number. Unlike the condition number, it is proportional to the number of measurements and thus in agreement with the intuition that more measurements lead to more robust tensor estimation. Furthermore, it may independently be maximized to design GESs for different diffusion imaging techniques.
This paper proposes a deep-learning-based method for frequency-dependent grid impedance estimation. Through measurement of voltages and currents at a specific system bus, the estimate of the grid impedance was obtained by first extracting sequences of time-dependent features from the measured data using a long short-term memory autoencoder (LSTM-AE), followed by a random forest (RF) regression method to find the nonlinear mapping between the extracted features and the corresponding grid impedance for a wide range of frequencies. The method was trained via simulation by using time-series measurements (i.e., voltage and current) for different system parameters and verified through several case studies. The obtained results show that: (1) extracting the time-dependent features of the voltage/current data improves the performance of the RF regression method; (2) the RF regression method is robust and allows grid impedance estimation within 1.5 grid cycles; (3) the proposed method...
2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), 2017
This paper addresses issues in human fall detection from videos. Instead of using handcrafted features as in conventional machine learning, we extract features from Convolutional Neural Networks (CNNs) for human fall detection. Similar to many existing works using two-stream inputs, we use a spatial CNN stream with raw image differences and a temporal CNN stream with optical flow as the inputs of the CNN. Unlike conventional two-stream action recognition work, we exploit sparse representation with residual-based pooling on the CNN-extracted features to obtain more discriminative feature codes. To characterize the sequential information in video activity, we use the code vector from a long-range dynamic feature representation, formed by concatenating segment-level codes, as the input to an SVM classifier. Experiments have been conducted on two public video databases for fall detection. Comparisons with six existing methods show the effectiveness of the proposed method.
2019 IEEE International Conference on Image Processing (ICIP), 2019
This paper addresses the issues of Alzheimer’s disease (AD) characterization and detection from Magnetic Resonance Images (MRIs). Many existing AD detection methods use single-scale feature learning from brain scans. In this paper, we propose a multiscale deep learning architecture for learning AD features. The main contributions of the paper include: (a) proposing a novel 3D multiscale CNN architecture for the dedicated task of AD detection; (b) proposing a feature fusion and enhancement strategy for multiscale features; (c) an empirical study on the impact of several settings, including two dataset partitioning approaches and the use of multiscale and feature enhancement. Experiments were conducted on an open ADNI dataset (1198 brain scans from 337 subjects). Test results have shown the effectiveness of the proposed method, with test accuracy of 93.53%, 87.24% (best, average) on the subject-separated dataset, and 99.44%, 98.80% (best, average) on the random brain scan-partitioned dataset. Comparison with eight existing methods has provided further support for the proposed method.
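The difference between the two partitioning approaches can be sketched as follows. This is an illustration under stated assumptions, not the paper's protocol: each scan is a hypothetical dictionary with a `subject` key, and the function names, the test fraction and the seeding are all illustrative.

```python
import random


def split_by_scan(scans, test_fraction, seed=0):
    """Random scan-level split: scans of one subject may land on both sides."""
    rng = random.Random(seed)
    shuffled = list(scans)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def split_by_subject(scans, test_fraction, seed=0):
    """Subject-separated split: every scan of a subject stays on one side."""
    rng = random.Random(seed)
    subjects = sorted({s["subject"] for s in scans})
    rng.shuffle(subjects)
    n_test = int(round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [s for s in scans if s["subject"] not in test_subjects]
    test = [s for s in scans if s["subject"] in test_subjects]
    return train, test


def shared_subjects(train, test):
    """Subjects that appear in both partitions (empty for a subject-separated split)."""
    return {s["subject"] for s in train} & {s["subject"] for s in test}
```

With a scan-level split, repeated scans of the same person can appear in both training and test sets, which is one plausible reason the accuracy reported on the random scan-partitioned dataset is higher than on the subject-separated one.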
Papers by Irene Y.H. Gu