Multisensor Data Fusion for Human Activities
Classification and Fall Detection
Haobo Li, Aman Shrestha, Francesco Fioranelli, Julien Le Kernec, Hadi Heidari
School of Engineering, University of Glasgow, Glasgow, United Kingdom

Matteo Pepa, Enea Cippitelli, Ennio Gambi, Susanna Spinsante
Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy
Abstract—Significant research exists on the use of wearable sensors in the context of assisted living for activity recognition and fall detection, whereas radar sensors have been studied only recently in this domain. This paper addresses the performance limitations of individual sensors, especially for the classification of similar activities, by implementing information fusion of features extracted from experimental data collected by different sensors, namely a tri-axial accelerometer, a micro-Doppler radar, and a depth camera. Preliminary results confirm that combining information from heterogeneous sensors improves the overall performance of the system. The classification accuracy attained by means of this fusion approach improves by 11.2% compared to radar-only use, and by 16.9% compared to accelerometer-only use. Furthermore, when features extracted from an RGB-D Kinect sensor are added, the overall classification accuracy increases up to 91.3%.
Keywords—accelerometer, radar sensor, depth camera, human activity classification, fall detection, machine learning, data fusion.
I. INTRODUCTION
With an increasingly aging population worldwide and a rising incidence of multi-morbidity conditions (i.e. the simultaneous presence of two or more chronic health issues), there is a significant need for automatic systems and sensors capable of classifying human activities and promptly detecting critical events such as falls [1]. Falls have obvious physical consequences, and there is a proven correlation between the "long-lie" time spent on the floor after the event and a reduction in life expectancy [2]. Activity classification can help characterise a normal behaviour pattern for the monitored patients, in order to detect anomalies that may be linked to deteriorating physical or cognitive health.
Different sensors have been proposed for the aforementioned applications in the context of Ambient Assisted Living (AAL) [3], namely wearables such as accelerometers, gyroscopes and magnetic sensors [4,5,6], video-camera sensors [5], and depth cameras and radar sensors [7], among others. To select one of the many possible technologies, one has to consider the advantages and disadvantages of each type of sensor in terms of performance (classification accuracy, rejection of false alarms, and percentage of missed detections), as well as aspects such as end-users' acceptance, cost, power consumption, and ease of use and deployment. This leads to the investigation of how information from heterogeneous sensors can be used, leveraging the strengths of each of them through information fusion. The rest of this paper presents preliminary results of this investigation, with experimental data collected using a radar sensor, an RGB-D Kinect, and a tri-axial accelerometer within a smartphone. The main contributions are the initial investigation of fusing information from heterogeneous sensors, including radar, and the use of a rich set of experimental data, both in terms of the activities considered for classification and of the number and age span of the participants (compared with other studies in the literature for radar and wearables [5]).
II. DATA COLLECTION AND FEATURE EXTRACTION
The radar sensor is an off-the-shelf Frequency Modulated Continuous Wave (FMCW) radar operating at 5.8 GHz and capable of recording micro-Doppler signatures of the targets of interest, i.e. Doppler vs time patterns of moving targets [8]. The Microsoft Kinect sensor estimates the coordinates of joints corresponding to different body parts of the monitored subject and records their temporal evolution frame by frame. The tri-axial accelerometer within a commercial smartphone samples and records linear acceleration along the X, Y, and Z axes at approximately 100 Hz (the maximum rate allowed by the smartphone). We recorded the 10 different activities indicated in Table I, with 16 participants aged from 23 to 58 years. The measurements were collected in an office environment at the University of Glasgow. These activities were selected to be similar in pairs (e.g. 1 and 2, or 7 and 8) to pose an additional classification challenge, and to trigger possible false alarms when detecting falls: for example, activities 3, 6, and 10 all present a fast acceleration component directed towards the floor.
TABLE I. LIST OF HUMAN ACTIVITIES

No.  Description
1    Walking back and forth
2    Walking and carrying an object with both hands
3    Sitting down on a chair
4    Standing up from a chair
5    Bending to pick up an object and coming back up
6    Bending and staying down to tie shoelaces
7    Drinking a glass of water while standing
8    Picking up a phone call while standing
9    Simulating tripping and falling down frontally
10   Bending to check under furniture and coming back up
Numerical parameters, referred to as features, were extracted from the data of each sensor. For the tri-axial accelerometer, these features were inspired by previous work in this domain [9-11] and are summarised in Table II, divided into time-domain and frequency-domain features. The frequency-domain features aim to capture the spectral energy distribution and include the amplitude of the Power Spectral Density (PSD) in a selected frequency band, the sum of the Fast Fourier Transform (FFT) coefficients, and the spectral entropy based on the power spectrum. For the radar sensor, the spectrograms (Doppler vs time patterns) were calculated by applying the Short Time Fourier Transform (STFT) to the raw range-time radar data, and features were then extracted from the resulting images; these are summarised in Table III. Entropy and skewness are related to the energy distribution of the pixels in the spectrogram. The centroid estimates the centre of mass of the micro-Doppler signature in the spectrogram, whereas the bandwidth estimates the energy content around it. The Singular Value Decomposition (SVD) of the spectrograms projects the energy content onto vectors in the time and frequency domains. Statistical moments (mean and variance) of the first three left and right singular vectors are considered as features, as well as those of the centroid and bandwidth parameters over time.
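As an illustration of the accelerometer processing chain, the Python sketch below computes most of the Table II features for one tri-axial recording. It is a minimal example under assumed names and sampling settings, not the authors' actual implementation; cross-correlation between axis pairs would be computed analogously, e.g. with np.correlate.

```python
# Minimal sketch (not the authors' code) of per-axis feature extraction
# for a tri-axial accelerometer recording, following Table II.
import numpy as np
from scipy.signal import welch
from scipy.stats import iqr, median_abs_deviation

FS = 100.0  # approximate smartphone sampling rate (Hz)

def axis_features(x, fs=FS):
    """Time- and frequency-domain features for one acceleration axis."""
    f, psd = welch(x, fs=fs)                 # power spectral density
    p = psd / psd.sum()                      # normalised power spectrum
    return {
        "mean": x.mean(),
        "std": x.std(),
        "variance": x.var(),
        "rms": np.sqrt(np.mean(x**2)),
        "mad": median_abs_deviation(x),
        "iqr": iqr(x),
        "range": np.ptp(x),
        "minimum": x.min(),
        "autocorr_lag1": np.corrcoef(x[:-1], x[1:])[0, 1],  # lag-1 autocorrelation
        "spectral_power": psd.sum(),                        # PSD energy
        "fft_coeff_sum": np.abs(np.fft.rfft(x)).sum(),      # FFT coefficients sum
        "spectral_entropy": -np.sum(p * np.log2(p + 1e-12)),
    }

# Example on a synthetic recording: rows are the X, Y and Z axes.
acc = np.random.randn(3, 500)
feature_vector = np.hstack([list(axis_features(a).values()) for a in acc])
```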
TABLE II. TABLE OF FEATURES FOR ACCELEROMETER SENSOR

Time domain                        No.   Frequency domain     No.
Mean                               3     Spectral Power       3
Standard Deviation                 3     Coefficients Sum     3
Autocorrelation                    3     Spectral Entropy     3
Cross Correlation                  3
Variance                           3
RMS (Root Mean Square)             3
MAD (Median Absolute Deviation)    3
Inter-quartile Range               3
Range                              3
Minimum                            3
Number of features                 30    Number of features   9

TABLE III. TABLE OF FEATURES FOR RADAR SENSOR

Radar Features                                                       No.
Entropy of spectrogram                                               1
Skewness of spectrogram                                              1
Centroid of spectrogram (mean & variance)                            2
Bandwidth of spectrogram (mean & variance)                           2
Singular Value Decomposition (mean & variance of right and left
vectors)                                                             13
Number of features                                                   19
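A possible implementation of the Table III radar features is sketched below, assuming the input x is the (here real-valued, for simplicity) radar return at the range bin of interest. The exact STFT parameters and the way the 19 features are counted are not specified in the paper, so this is an approximation rather than the authors' code.

```python
# Sketch (assumptions, not the authors' code) of the radar features of
# Table III, extracted from a micro-Doppler spectrogram.
import numpy as np
from scipy.signal import stft
from scipy.stats import skew

def radar_features(x, fs):
    f, t, Z = stft(x, fs=fs, nperseg=256)        # Doppler vs time pattern
    S = np.abs(Z)                                # spectrogram magnitude
    P = S / S.sum()                              # pixel energy distribution
    entropy = -np.sum(P * np.log2(P + 1e-12))    # energy-distribution entropy
    skewness = skew(S.ravel())
    # Centroid (centre of mass) and bandwidth of the signature per time bin.
    centroid = (f[:, None] * S).sum(axis=0) / S.sum(axis=0)
    bandwidth = np.sqrt(((f[:, None] - centroid)**2 * S).sum(axis=0) / S.sum(axis=0))
    # SVD of the spectrogram: moments of the first three left/right vectors.
    U, _, Vt = np.linalg.svd(S, full_matrices=False)
    feats = [entropy, skewness,
             centroid.mean(), centroid.var(),
             bandwidth.mean(), bandwidth.var()]
    for k in range(3):
        feats += [U[:, k].mean(), U[:, k].var(), Vt[k].mean(), Vt[k].var()]
    return np.array(feats)

fv = radar_features(np.random.randn(10_000), fs=1_000.0)
```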
III. RESULTS ANALYSIS USING RADAR AND ACCELEROMETER
The features listed in Section II are used as input to classifiers based on supervised machine learning. A 16-fold cross-validation approach was applied, whereby data from 15 participants were used for training and data from the 16th remaining participant were used for testing. This was repeated 16 times, once for each participant, and the average classification results are reported in this paper.
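In scikit-learn terms, this leave-one-participant-out scheme could look as follows; X, y, and groups are placeholders for the real feature matrix, activity labels, and participant IDs, and the quadratic kernel is approximated with a degree-2 polynomial SVM.

```python
# Sketch of the 16-fold leave-one-participant-out cross-validation
# (placeholder data; X, y and groups stand for the real feature sets).
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.random.randn(320, 49)                   # 16 participants x 20 samples
y = np.random.randint(1, 11, size=320)         # activity labels 1-10
groups = np.repeat(np.arange(1, 17), 20)       # participant ID per sample

clf = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=2))

accuracies = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf.fit(X[train_idx], y[train_idx])        # train on 15 participants
    accuracies.append(clf.score(X[test_idx], y[test_idx]))  # test on the 16th

print(f"average accuracy over 16 folds: {np.mean(accuracies):.3f}")
```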
Fig. 1(a)-1(c) show the confusion matrices used to validate the classification performance of the proposed approach. The rows show the true classes of the 10 activities under test, whereas the columns show the predicted classes as estimated by the quadratic-kernel SVM classifier. An ideal result would have 100% classification on the diagonal of the confusion matrices, whereas the elements outside the diagonal correspond to misclassification events. A colour code is used to highlight in green the desired elements on the diagonal, and in yellow or red the misclassifications. Activity 9 is highlighted as well, as this is the simulated fall. The radar sensor generates the majority of misclassifications for activities 1 and 2, but activities 4, 5, and 8 also contribute several errors; the average classification accuracy is 68.8%. For the accelerometer, activities 3, 5, 6, and 8 are the most problematic to classify, and the average classification accuracy is 63.1% across the ten activities. For both sensors, the classification accuracy for activity 9 (the fall) is relatively high, but a practically deployable system will need an extremely high rejection rate of false alarms together with very few missed detections. Fig. 1(c) presents the results using information fusion at feature level for radar plus accelerometer, i.e. combining the feature samples extracted from the radar and the accelerometer data into a single feature vector. The overall classification accuracy across the ten activities is in this case 80.6%, and for most activities the classification accuracy has increased compared with the independent use of the radar sensor or the accelerometer.
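Feature-level fusion here amounts to concatenating, for each recorded sample, the per-sensor feature vectors into one longer vector before training. A minimal sketch with placeholder arrays:

```python
# Feature-level fusion: concatenate aligned per-sample feature vectors.
import numpy as np

X_radar = np.random.randn(320, 19)   # placeholder radar features (Table III)
X_acc = np.random.randn(320, 39)     # placeholder accelerometer features (Table II)

X_fused = np.hstack([X_radar, X_acc])   # one fused vector per sample
print(X_fused.shape)                     # -> (320, 58)
```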
IV. RESULTS USING ADDITIONAL SENSORS
RGB-D sensors (e.g. the Kinect) project infrared light into the space and detect the distortion of the projected pattern. Since these sensors produce large feature pools, Principal Component Analysis (PCA), which selects combinations of the features with the largest possible variance, is applied to reduce the computational complexity of feature selection.
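A minimal sketch of this PCA step follows, with a placeholder Kinect feature pool and an assumed 95% retained-variance criterion (the paper does not state the exact number of components kept).

```python
# PCA-based reduction of a large (placeholder) Kinect feature pool.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X_kinect = np.random.randn(320, 400)            # placeholder feature pool
X_std = StandardScaler().fit_transform(X_kinect)

pca = PCA(n_components=0.95)                    # keep 95% of the variance
X_reduced = pca.fit_transform(X_std)
print(X_reduced.shape)
```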
The results presented in Fig. 1(c) show persistent misclassifications for activities 3, 8, 9, and 10, as well as for the pair of activities 1 and 2. The features extracted from the Kinect sensor data are therefore added to those extracted from the accelerometer and the radar, to investigate whether these results can be further improved. The resulting confusion matrix is shown in Fig. 2. Fusing information from the three sensors increases the classification accuracy to 86.9% with the quadratic-kernel SVM classifier, and up to 91.3% using an Ensemble classifier. The confusion matrix in Fig. 2 presents on average fewer misclassifications than the previous cases, although in some particular cases (e.g. activity 6) using a third sensor decreases the accuracy compared with using only two; optimal approaches to fuse information should therefore be sought, to prevent classifiers with low performance from degrading the overall result. Fig. 3 summarises the average classification accuracy (with 16-fold cross-validation on the data from the 16 participants) for four different classifiers and for different combinations of sensors using feature-level information fusion. A clear trend of increasing accuracy when combining different sensors can be seen, and it is also interesting to observe the effect of using different types of classifier on data from the same sensor. Future work will build on these preliminary results and investigate in more detail the best approaches in terms of feature selection for data from each sensor and of information fusion.

Fig. 1. Confusion matrices of radar data with SVM classifier (a), accelerometer data with SVM classifier (b), and fusion of radar plus accelerometer features with SVM classifier (c).

Fig. 2. Confusion matrix of sensor data fusion of radar plus accelerometer and Kinect features with quadratic-kernel SVM classifier.

Fig. 3. Comparison of different sensor combinations (Radar, Wearable, Radar + Wearable, Radar + Wearable + Kinect) with different classifiers (SVM, KNN, Linear Discriminant, Ensemble Subspace Discriminant).
V. CONCLUSION AND FUTURE WORK
Different research directions can build on the preliminary results presented in this paper. For future work, additional data will be collected, involving a 10-DOF (Degrees of Freedom) inertial measurement unit (tri-axial accelerometer, gyroscope and magnetic sensors, plus a pressure sensor or GPS) [12], more participants, more indoor scenarios, and more deployment geometries for the sensors. This includes different aspect angles of the Kinect and radar with respect to the participants' movements and trajectories, as well as different positions of the accelerometer (e.g. on the wrist as in this paper, at the waist, chest, arms or thighs, or inside pockets) and multiple accelerometers. The integration of the gyroscope and magnetic sensors together with the accelerometer and the other sensors will also be considered. On the signal processing side, different approaches to select features for each sensor will be considered (for example, using metrics such as entropy or Fisher scores to rank all the possible features), as well as the effect of different information fusion techniques on data from all the sensors (e.g. fusing at feature level, or at decision level, taking into account the level of confidence of each separate classifier based on data from each individual sensor).
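As an illustration of the decision-level alternative just mentioned, the following sketch trains one SVM per sensor and fuses their class posteriors by averaging; all names and data are placeholders, and averaging the per-classifier confidences is only one of several possible weighting schemes.

```python
# Sketch of decision-level fusion: average the class posteriors of
# per-sensor classifiers, then pick the most confident class.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
y = rng.integers(1, 11, size=300)                       # activity labels
sensors = {"radar": rng.standard_normal((300, 19)),     # placeholder features
           "accelerometer": rng.standard_normal((300, 39))}

posteriors = []
for name, X in sensors.items():
    clf = SVC(kernel="poly", degree=2, probability=True)
    clf.fit(X[:250], y[:250])                           # per-sensor classifier
    posteriors.append(clf.predict_proba(X[250:]))       # per-class confidence

fused = np.mean(posteriors, axis=0)                     # average confidences
y_pred = clf.classes_[fused.argmax(axis=1)]             # fused decision
print(f"fused accuracy: {np.mean(y_pred == y[250:]):.3f}")
```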
ACKNOWLEDGMENT
This work was partly supported by a Short-Term Scientific Mission (STSM) funded by the COST Action IC1303 AAPELE (Architectures, Algorithms and Platforms for Enhanced Living Environments).
REFERENCES
[1] E. Fabbri et al., "New Tasks, Priorities, and Frontiers for Integrated Gerontological and Clinical Research," Journal of the American Medical Directors Association, vol. 16, no. 8, pp. 640-647, 2015.
[2] M. Terroso, N. Rosa, and A. Torres Marques, "Physical consequences of falls in the elderly: a literature review from 1995 to 2010," European Review of Aging and Physical Activity, vol. 11, pp. 51-59, 2014.
[3] K. Chaccour, R. Darazi, A. H. El Hassani, and E. Andrès, "From Fall Detection to Fall Prevention: A Generic Classification of Fall-Related Systems," IEEE Sensors Journal, vol. 17, no. 3, pp. 812-822, 2017.
[4] S. C. Mukhopadhyay, "Wearable Sensors for Human Activity Monitoring: A Review," IEEE Sensors Journal, vol. 15, no. 3, 2015.
[5] I. H. Lopez-Nava and A. Muñoz-Meléndez, "Wearable Inertial Sensors for Human Motion Analysis: A Review," IEEE Sensors Journal, vol. 16, no. 22, 2016.
[6] S. Wen et al., "A Wearable Fabric-Based RFID Skin Temperature Monitoring Patch," in Proc. of IEEE Sensors Conference, pp. 1-3, 2016.
[7] F. Erden et al., "Sensors in Assisted Living: A survey of signal and image processing methods," IEEE Signal Processing Magazine, vol. 33, no. 2, pp. 36-44, 2016.
[8] E. Cippitelli, F. Fioranelli, E. Gambi, and S. Spinsante, "Radar and RGB-Depth sensors for fall detection: a review," IEEE Sensors Journal, vol. 17, no. 12, pp. 3585-3604, 2017.
[9] D. Figo et al., "Preprocessing techniques for context recognition from accelerometer data," Personal and Ubiquitous Computing, vol. 14, no. 7, 2010.
[10] M.-H. Lee et al., "Physical Activity Recognition Using a Single Tri-Axis Accelerometer," in Proc. of the World Congress on Engineering and Computer Science, 2009.
[11] E. M. Tapia, "Using Machine Learning for Real-time Activity Recognition and Estimation of Energy Expenditure," Ph.D. Thesis, MIT, 2008.
[12] H. Heidari et al., "CMOS vertical Hall magnetic sensors on flexible substrate," IEEE Sensors Journal, vol. 16, pp. 8736-8743, 2016.