Margin-Based Deep Learning Networks for Human Activity Recognition
Abstract
1. Introduction
- We add a margin mechanism to the softmax loss to learn more discriminative feature representations that enhance intra-class compactness and inter-class diversity (see the first sketch after this list).
- We use four deep learning models to carry out comparison experiments on three widely used public datasets and show that the margin-based method improves the recognition ability of the deep learning networks. We also compare three machine learning classifiers trained on hand-crafted features and on features extracted from the unmodified and margin-based deep learning networks.
- We show that the margin-based networks outperform the unmodified models across different hyperparameter settings. Additionally, we visualise the cosine similarities and 2-D learned features of the softmax and additive angular margin losses.
- We extend our study to the open-set HAR problem. To the best of our knowledge, our work is the first to treat HAR as an open-set classification problem. We confirm the effectiveness and performance of our method using the PAMAP2 dataset (see the second sketch after this list).
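To make the margin mechanism concrete, below is a minimal PyTorch sketch of an additive angular margin head in the spirit of ArcFace (Deng et al.), which replaces the final fully-connected + softmax layer. The scale `s` and margin `m` defaults are illustrative placeholders, not the settings used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAngularMarginHead(nn.Module):
    """Additive angular margin (ArcFace-style) classification head.

    Both the feature vector x and the class weights W are L2-normalised,
    so the logit for class j is cos(theta_j). The ground-truth angle is
    penalised by an additive margin m before rescaling by s, pulling
    same-class features together and pushing classes apart.
    """
    def __init__(self, feat_dim, num_classes, s=30.0, m=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, feat_dim))
        nn.init.xavier_uniform_(self.weight)
        self.s, self.m = s, m

    def forward(self, x, labels):
        # cosine similarity between normalised features and class weights
        cosine = F.linear(F.normalize(x), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        # add the angular margin m only to the ground-truth class
        one_hot = F.one_hot(labels, cosine.size(1)).to(cosine.dtype)
        logits = self.s * torch.cos(theta + self.m * one_hot)
        return F.cross_entropy(logits, labels)
```

During training the head is called as `loss = head(embeddings, labels)`. Because classification depends only on the angle between features and class weights, adding `m` to the ground-truth angle forces same-class features into a tighter cone, which is exactly the intra-class compactness and inter-class diversity described above.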
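For the open-set extension, one common recipe (not necessarily the exact procedure used in the paper) is to accept the nearest class only when the cosine similarity of a test embedding to the learned class weights, or class centroids, clears a threshold, and to reject the sample as an unseen activity otherwise. A minimal sketch, with a hypothetical threshold `tau`:

```python
import torch
import torch.nn.functional as F

def predict_open_set(features, class_weights, tau=0.4):
    """Open-set prediction by cosine-similarity thresholding.

    features:      (N, D) embeddings from the margin-trained network
    class_weights: (C, D) learned class weights (or class centroids)
    tau:           rejection threshold -- a hypothetical value that
                   would be tuned on a validation split in practice
    Returns class indices in [0, C), or -1 for "unknown activity".
    """
    sims = F.linear(F.normalize(features), F.normalize(class_weights))
    best, idx = sims.max(dim=1)  # highest cosine per sample
    return torch.where(best >= tau, idx, torch.full_like(idx, -1))
```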
2. Related Work
3. Framework
3.1. Deep Learning Models
3.2. Margin-Based Loss Function
3.3. Open-Set Classification Problem for HAR
4. Experiment
4.1. Benchmark Datasets
4.2. Performance Metrics
4.3. Model Settings
4.4. Model Training
5. Results
5.1. Performance Comparison
5.2. Evaluation of Hyperparameters
5.2.1. Length of Sliding Window
5.2.2. Number of Sensor Channels
5.2.3. Margin Value
5.2.4. Comparison with Softmax Loss
5.3. Performance of Open-Set HAR
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Xu, C.; He, J.; Zhang, X.; Yao, C.; Tseng, P.H. Geometrical kinematic modeling on human motion using method of multi-sensor fusion. Inf. Fusion 2018, 41, 243–254.
- Margarito, J.; Helaoui, R.; Bianchi, A.; Sartor, F.; Bonomi, A. User-Independent Recognition of Sports Activities from a Single Wrist-Worn Accelerometer: A Template Matching Based Approach. IEEE Trans. Biomed. Eng. 2015.
- Sagl, G.; Resch, B.; Blaschke, T. Contextual Sensing: Integrating Contextual Information with Human and Technical Geo-Sensor Information for Smart Cities. Sensors 2015, 15, 17013–17035.
- Ordóñez, F.J.; Roggen, D. Deep Convolutional and LSTM Recurrent Neural Networks for Multimodal Wearable Activity Recognition. Sensors 2016, 16, 115.
- Li, F.; Shirahama, K.; Nisar, M.A.; Köping, L.; Grzegorzek, M. Comparison of Feature Learning Methods for Human Activity Recognition Using Wearable Sensors. Sensors 2018, 18, 679.
- Xu, L.; Yang, W.; Cao, Y.; Li, Q. Human activity recognition based on random forests. In Proceedings of the 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Guangxi, China, 29–31 July 2017; pp. 548–553.
- Panwar, M.; Dyuthi, S.R.; Prakash, K.C.; Biswas, D.; Naik, G.R. CNN based approach for activity recognition using a wrist-worn accelerometer. In Proceedings of the 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Jeju Island, Korea, 11–15 July 2017.
- Huang, J.; Lin, S.; Wang, N.; Dai, G.; Xie, Y.; Zhou, J. TSE-CNN: A Two-Stage End-to-End CNN for Human Activity Recognition. IEEE J. Biomed. Health Inform. 2019, 24, 292–299.
- Liu, W.; Wen, Y.; Yu, Z.; Yang, M. Large-Margin Softmax Loss for Convolutional Neural Networks. In Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 19–24 June 2016; pp. 507–516.
- Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A Discriminative Feature Learning Approach for Deep Face Recognition. In Proceedings of the European Conference on Computer Vision (ECCV), 2016.
- Liu, W.; Wen, Y.; Yu, Z.; Li, M.; Raj, B.; Song, L. SphereFace: Deep Hypersphere Embedding for Face Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 22–25 July 2017; pp. 212–220.
- Roggen, D.; Calatroni, A.; Rossi, M.; Holleczek, T.; Förster, K.; Tröster, G.; Lukowicz, P.; Bannach, D.; Pirkl, G.; Ferscha, A. Collecting complex activity datasets in highly rich networked sensor environments. In Proceedings of the Seventh International Conference on Networked Sensing Systems (INSS), Kassel, Germany, 15–18 June 2010; pp. 233–240.
- Micucci, D.; Mobilio, M.; Napoletano, P. UniMiB SHAR: A Dataset for Human Activity Recognition Using Acceleration Data from Smartphones. Appl. Sci. 2017, 7, 1101.
- Reiss, A.; Stricker, D. Introducing a New Benchmarked Dataset for Activity Monitoring. In Proceedings of the 16th International Symposium on Wearable Computers, Newcastle, UK, 18–22 June 2012; pp. 108–109.
- Chen, Z.; Zhu, Q.; Soh, Y.C.; Le, Z. Robust Human Activity Recognition Using Smartphone Sensors via CT-PCA and Online SVM. IEEE Trans. Ind. Inform. 2017, 13, 3070–3080.
- Hossain, T.; Goto, H.; Ahad, M.A.R.; Inoue, S. A Study on Sensor-based Activity Recognition Having Missing Data. In Proceedings of the 7th International Conference on Informatics, Electronics & Vision (ICIEV) and 2018 2nd International Conference on Imaging, Vision & Pattern Recognition (icIVPR), Kitakyushu, Japan, 25–28 June 2018; pp. 556–561.
- Mobark, M.; Chuprat, S.; Mantoro, T. Improving the accuracy of complex activities recognition using accelerometer-embedded mobile phone classifiers. In Proceedings of the Second International Conference on Informatics and Computing (ICIC), Jayapura, Indonesia, 1–3 November 2017; pp. 1–5.
- Vepakomma, P.; De, D.; Das, S.K.; Bhansali, S. A-Wristocracy: Deep Learning on Wrist-worn Sensing for Recognition of User Complex Activities. In Proceedings of the 2015 IEEE 12th International Conference on Wearable and Implantable Body Sensor Networks (BSN), Cambridge, MA, USA, 9–12 June 2015; pp. 1–6.
- Almaslukh, B.; AlMuhtadi, J.; Artoli, A. An effective deep autoencoder approach for online smartphone-based human activity recognition. Int. J. Comput. Sci. Netw. Secur. 2017, 17, 160–165.
- Yu, S. Residual Learning and LSTM Networks for Wearable Human Activity Recognition Problem. In Proceedings of the 37th Chinese Control Conference (CCC), Wuhan, China, 25–27 July 2018; pp. 9440–9447.
- Wang, H.; Wang, Y.; Zhou, Z.; Ji, X.; Li, Z.; Gong, D.; Zhou, J.; Liu, W. CosFace: Large Margin Cosine Loss for Deep Face Recognition. Available online: http://openaccess.thecvf.com/content-cvpr-2018/CameraReady/1797.pdf (accessed on 10 February 2020).
- Deng, J.; Guo, J.; Xue, N.; Zafeiriou, S. ArcFace: Additive Angular Margin Loss for Deep Face Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 4690–4699.
- Wojke, N.; Bewley, A. Deep cosine metric learning for person re-identification. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA, 12–15 March 2018; pp. 748–756.
- Zhai, Y.; Guo, X.; Lu, Y.; Li, H. In Defense of the Classification Loss for Person Re-Identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–20 June 2019.
- Zhang, Y.; Pan, P.; Zheng, Y.; Zhao, K.; Zhang, Y.; Ren, X.; Jin, R. Visual Search at Alibaba. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 993–1001.
- Hu, H.; Wang, Y.; Yang, L.; Komlev, P.; Huang, L.; Chen, X.S.; Huang, J.; Wu, Y.; Merchant, M.; Sacheti, A. Web-Scale Responsive Visual Search at Bing. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 359–367.
- Younes, R.; Jones, M.; Martin, T.L. Classifier for activities with variations. Sensors 2018, 18, 3529.
- Kim, Y.J.; Kang, B.N.; Kim, D. Hidden Markov model ensemble for activity recognition using tri-axis accelerometer. In Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, Kowloon Tong, Hong Kong, 9–12 October 2015; pp. 3036–3041.
- Arthur, D.; Vassilvitskii, S. K-Means++: The Advantages of Careful Seeding. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA, 7–9 January 2007; pp. 1027–1035.
- Zappi, P.; Lombriser, C.; Stiefmeier, T.; Farella, E.; Roggen, D.; Benini, L.; Tröster, G. Activity recognition from on-body sensors: Accuracy-power trade-off by dynamic sensor selection. In Proceedings of the European Conference on Wireless Sensor Networks, Porto, Portugal, 30 January–1 February 2008; pp. 17–33.
- Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv 2015, arXiv:1502.03167.
- Paszke, A.; Gross, S.; Chintala, S.; Chanan, G.; Yang, E.; DeVito, Z.; Lin, Z.; Desmaison, A.; Antiga, L.; Lerer, A. Automatic Differentiation in PyTorch. 2017. Available online: https://openreview.net/forum?id=BJJsrmfCZ (accessed on 10 February 2020).
Model settings for the deep networks. The left parameter/value pair applies to OPPORTUNITY and PAMAP2, the right pair to UniMiB-SHAR; where a value cell lists two pairs, the first pair is for OPPORTUNITY and the second for PAMAP2.

| Model | Parameter (OPPORTUNITY and PAMAP2) | Value | Parameter (UniMiB-SHAR) | Value |
|---|---|---|---|---|
| MLP | Neurons in fully-connected layers 1, 2, and 3 | 2000 | Neurons in fully-connected layers 1, 2, and 3 | 6000 |
| CNN | Convolutional kernel sizes for blocks 1, 2, and 3 | (11,1), (10,1), (6,1) | Convolutional kernel size for block 1 | (32,3) |
| | Convolutional sliding strides for blocks 1, 2, and 3 | (1,1), (1,1), (1,1) | Convolutional sliding stride for block 1 | (1,1) |
| | Convolutional kernels for blocks 1, 2, and 3 | 50, 40, 30 | Convolutional kernels for block 1 | 100 |
| | Pooling sizes for blocks 1, 2, and 3 | (2,1), (3,1), (1,1) | Pooling size for block 1 | (2,1) |
| | Neurons in fully-connected layer | 1000 | Neurons in fully-connected layer | 6000 |
| LSTM | LSTM cells in layers 1 and 2 | 64, 64 / 170, 170 | LSTM cells in layers 1 and 2 | 151, 151 |
| | Output dimensions of LSTM cells in layers 1 and 2 | 600, 600 | Output dimensions of LSTM cells in layers 1 and 2 | 1000, 1000 |
| | Neurons in fully-connected layer | 512 | Neurons in fully-connected layer | 6000 |
| Hybrid | Convolutional kernel size for block 1 | (11,1) | Convolutional kernel size for block 1 | (32,3) |
| | Convolutional sliding stride for block 1 | (1,1) | Convolutional sliding stride for block 1 | (1,1) |
| | Convolutional kernels for block 1 | 50 | Convolutional kernels for block 1 | 100 |
| | Pooling size for block 1 | (2,1) | Pooling size for block 1 | (2,1) |
| | LSTM cells in layers 1 and 2 | 27, 27 / 80, 80 | LSTM cells in layers 1 and 2 | 60, 60 |
| | Output dimensions of LSTM cells in layers 1 and 2 | 600, 600 | Output dimensions of LSTM cells in layers 1 and 2 | 1000, 1000 |
| | Neurons in fully-connected layer | 512 | Neurons in fully-connected layer | 6000 |
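As an illustration of how the CNN settings in the table compose, here is a minimal PyTorch sketch of the OPPORTUNITY configuration (three convolutional blocks followed by a 1000-unit fully-connected layer). The `OpportunityCNN` name, the (batch, 1, time, channels) input layout, ReLU activations, and batch normalisation are assumptions for the sketch; the table specifies only the kernel sizes, strides, kernel counts, pooling sizes, and layer widths.

```python
import torch.nn as nn

def conv_block(in_ch, out_ch, kernel, pool):
    # One convolutional block: conv -> batch norm -> ReLU -> max pooling.
    # Batch norm follows the Ioffe & Szegedy reference; the activation
    # choice is an assumption not stated in the table.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=kernel, stride=(1, 1)),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(pool),
    )

class OpportunityCNN(nn.Module):
    """CNN wired with the OPPORTUNITY values from the table above."""
    def __init__(self, num_classes, feat_dim=1000):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, 50, (11, 1), (2, 1)),   # block 1
            conv_block(50, 40, (10, 1), (3, 1)),  # block 2
            conv_block(40, 30, (6, 1), (1, 1)),   # block 3
        )
        self.fc = nn.LazyLinear(feat_dim)      # 1000-unit FC layer
        self.out = nn.Linear(feat_dim, num_classes)

    def forward(self, x):                      # x: (N, 1, T, channels)
        z = self.features(x).flatten(1)
        return self.out(self.fc(z))
```

With a 64-sample window the three blocks reduce the time axis to length 1 (64 → 54 → 27 → 18 → 6 → 1), so the flattened features feed the fully-connected layer directly; in the margin-based variant, `self.out` would be replaced by a margin head such as the one sketched in the Introduction.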
Performance comparison of the unmodified networks and their margin-based counterparts (-M). For each dataset, the three columns report accuracy, weighted F1-score ($F_w$), and mean F1-score ($F_m$), all in % (definitions follow the table); "-" means not reported.

| Method | OPPORTUNITY Acc. | $F_w$ | $F_m$ | UniMiB-SHAR Acc. | $F_w$ | $F_m$ | PAMAP2 Acc. | $F_w$ | $F_m$ |
|---|---|---|---|---|---|---|---|---|---|
| HC [5] | 89.96 | 89.53 | 63.76 | 32.01 | 22.85 | 13.78 | - | - | - |
| CBH [5] | 89.66 | 88.99 | 62.27 | 75.21 | 74.13 | 60.01 | - | - | - |
| CBS [5] | 90.22 | 89.88 | 67.50 | 77.03 | 75.93 | 63.23 | - | - | - |
| AE [5] | 87.80 | 87.60 | 55.62 | 65.67 | 64.84 | 55.04 | - | - | - |
| MLP [5] | 91.11 | 90.86 | 68.17 | 71.62 | 70.81 | 59.97 | 82.63 | 80.83 | 72.92 |
| CNN [5] | 90.58 | 90.19 | 65.26 | 74.97 | 74.29 | 64.65 | 91.51 | 91.35 | 83.34 |
| LSTM [5] | 91.29 | 91.16 | 69.71 | 71.47 | 70.82 | 59.32 | 84.00 | 82.71 | 74.00 |
| Hybrid [5] | 91.76 | 91.56 | 70.86 | 74.63 | 73.65 | 64.47 | 85.12 | 83.73 | 76.10 |
| MLP-M | 91.28 | 91.03 | 68.09 | 73.94 | 73.55 | 61.59 | 82.47 | 82.09 | 74.43 |
| CNN-M | 90.88 | 90.47 | 66.85 | 74.86 | 74.42 | 63.30 | 93.74 | 93.75 | 92.95 |
| LSTM-M | 92.30 | 91.99 | 70.45 | 74.17 | 72.93 | 59.43 | 86.00 | 84.60 | 83.75 |
| Hybrid-M | 91.92 | 91.87 | 71.08 | 77.88 | 77.29 | 65.31 | 93.52 | 93.52 | 93.09 |
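The $F_w$ and $F_m$ columns are assumed to follow the standard weighted and mean F1-score definitions (the paper's Section 4.2 defines its metrics, but that text is not reproduced in this excerpt):

```latex
% Weighted F1 (F_w) averages per-class F1 by class support;
% mean F1 (F_m) averages it uniformly over the C classes.
\[
F_w = \sum_{c=1}^{C} \frac{n_c}{N}\,\frac{2\,P_c R_c}{P_c + R_c},
\qquad
F_m = \frac{1}{C}\sum_{c=1}^{C} \frac{2\,P_c R_c}{P_c + R_c},
\]
% P_c, R_c: precision and recall of class c; n_c: number of test
% samples in class c; N: total test samples; C: number of classes.
```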
Comparison on OPPORTUNITY of three classifiers trained on hand-crafted features, on deep features from the unmodified network (-DF), and on deep features from the margin-based network (-DF-M), with the end-to-end LSTM-M shown for reference (all values in %).

| Method | Accuracy | $F_w$ | $F_m$ |
|---|---|---|---|
| SVM | 89.96 | 89.53 | 63.76 |
| Random Forest | 89.21 | 87.08 | 52.45 |
| Naive Bayes | 44.79 | 52.61 | 32.81 |
| SVM-DF | 91.81 | 91.62 | 70.24 |
| Random Forest-DF | 91.84 | 91.63 | 70.24 |
| Naive Bayes-DF | 91.15 | 91.29 | 69.03 |
| SVM-DF-M | 91.88 | 91.62 | 70.43 |
| Random Forest-DF-M | 91.93 | 91.64 | 70.42 |
| Naive Bayes-DF-M | 91.68 | 91.62 | 70.08 |
| LSTM-M | 92.30 | 91.99 | 70.45 |
Effect of the sliding-window length T on OPPORTUNITY; for each T, the three columns report accuracy, $F_w$, and $F_m$ in %.

| Method | T = 32 Acc. | $F_w$ | $F_m$ | T = 64 Acc. | $F_w$ | $F_m$ | T = 96 Acc. | $F_w$ | $F_m$ |
|---|---|---|---|---|---|---|---|---|---|
| MLP [5] | 90.79 | 90.40 | 66.33 | 91.11 | 90.86 | 68.17 | 90.94 | 90.65 | 66.37 |
| CNN [5] | 90.34 | 89.71 | 62.10 | 90.58 | 90.19 | 65.26 | 90.38 | 89.98 | 63.38 |
| LSTM [5] | 90.88 | 90.60 | 67.20 | 91.29 | 91.16 | 69.71 | 91.33 | 91.21 | 68.64 |
| Hybrid [5] | 91.10 | 90.75 | 67.31 | 91.76 | 91.56 | 70.86 | 91.44 | 91.25 | 69.04 |
| MLP-M | 91.13 | 90.77 | 66.80 | 91.28 | 91.03 | 68.09 | 91.34 | 91.10 | 67.42 |
| CNN-M | 89.97 | 89.87 | 64.20 | 90.88 | 90.47 | 66.85 | 91.47 | 91.16 | 67.78 |
| LSTM-M | 91.34 | 91.10 | 68.52 | 92.30 | 91.99 | 70.45 | 92.02 | 91.93 | 71.72 |
| Hybrid-M | 92.06 | 91.77 | 71.36 | 91.92 | 91.87 | 71.08 | 92.45 | 92.22 | 71.03 |