
Effective human action recognition by combining manifold regularization and pairwise constraints

Multimedia Tools and Applications

Abstract

The ever-growing popularity of mobile networks and electronics has prompted intensive research on the management of multimedia data (e.g. text, image, video and audio). This has motivated research on semi-supervised learning, which combines a small number of labeled samples with a large number of unlabeled samples by exploiting the local structure of the data distribution. Manifold regularization and pairwise constraints are two representative semi-supervised learning methods. In this paper, we introduce a novel local-structure-preserving approach that considers both manifold regularization and pairwise constraints. Specifically, we construct a new graph Laplacian that, unlike the traditional Laplacian, takes advantage of pairwise constraints. The proposed graph Laplacian better preserves the local geometry of the data distribution and thereby supports effective recognition. On this basis, we build graph-regularized classifiers for action recognition, with support vector machines and kernel least squares as special cases. Experimental results on a multimodal human action database (CAS-YNU-MHAD) show that the proposed algorithms outperform the conventional algorithms.
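The general idea of a constraint-aware graph Laplacian feeding a kernel least squares classifier can be sketched as follows. This is only an illustrative sketch, not the construction used in the paper, which is not given in the abstract. It assumes an RBF kernel, a k-nearest-neighbour graph, and a simple heuristic in which must-link pairs receive edge weight 1 and cannot-link pairs receive edge weight 0. The classifier is the standard Laplacian-regularized least squares closed form from the manifold regularization framework. All function and parameter names (`constrained_laplacian`, `lap_rls_fit`, `gamma_A`, `gamma_I`, and so on) are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Z, sigma=1.0):
    """Gaussian kernel matrix between the rows of X and the rows of Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def constrained_laplacian(X, must_link, cannot_link, k=5, sigma=1.0):
    """Unnormalized Laplacian L = D - W of a kNN graph whose weights are
    overridden by pairwise constraints (assumed heuristic, see lead-in)."""
    n = X.shape[0]
    S = rbf_kernel(X, X, sigma)
    np.fill_diagonal(S, 0.0)                    # no self-loops
    W = np.zeros_like(S)
    idx = np.argsort(-S, axis=1)[:, :k]         # k most similar points per row
    rows = np.repeat(np.arange(n), k)
    W[rows, idx.ravel()] = S[rows, idx.ravel()]
    W = np.maximum(W, W.T)                      # symmetrize the graph
    for i, j in must_link:                      # same class: strongest edge
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:                    # different class: cut the edge
        W[i, j] = W[j, i] = 0.0
    return np.diag(W.sum(axis=1)) - W

def lap_rls_fit(X, y_labeled, L, gamma_A=1e-2, gamma_I=1e-1, sigma=1.0):
    """Laplacian-regularized least squares; the first len(y_labeled)
    rows of X are assumed to be the labeled samples."""
    n, l = X.shape[0], len(y_labeled)
    K = rbf_kernel(X, X, sigma)
    J = np.zeros((n, n))
    J[:l, :l] = np.eye(l)                       # selects the labeled samples
    Y = np.zeros(n)
    Y[:l] = y_labeled                           # labels padded with zeros
    A = J @ K + gamma_A * l * np.eye(n) + (gamma_I * l / n ** 2) * (L @ K)
    return np.linalg.solve(A, Y)                # expansion coefficients alpha

def lap_rls_predict(alpha, X_train, X_new, sigma=1.0):
    """Real-valued decision scores; the sign gives the binary class."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha
```

In this sketch the constraints only change the edge weights, so any classifier that accepts a graph Laplacian (for example a Laplacian SVM) could reuse `constrained_laplacian` unchanged.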



Acknowledgements

This paper is partly supported by the National Natural Science Foundation of China (Grant No. 61671480), the Fundamental Research Funds for the Central Universities, China University of Petroleum (East China) (Grant No. 14CX02203A, YCX2017059).

Author information

Corresponding author

Correspondence to Weifeng Liu.

About this article

Cite this article

Ma, X., Tao, D. & Liu, W. Effective human action recognition by combining manifold regularization and pairwise constraints. Multimed Tools Appl 78, 13313–13329 (2019). https://doi.org/10.1007/s11042-017-5172-1

  • DOI: https://doi.org/10.1007/s11042-017-5172-1