
Local and Global Features Interactive Fusion Network for Macro- and Micro-expression Spotting in Long Videos

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15041)


Abstract

Individual emotions are often manifested through facial expressions, where macro-expressions (MaEs) and micro-expressions (MEs) provide visual cues for different emotion applications. Spotting these intertwined expressions has attracted considerable interest, as it is an indispensable step in expression analysis. However, due to noise, irrelevant movements, and the confusion between MEs and MaEs, it is challenging for deep learning models to learn discriminative intrinsic features. In this paper, we explore an efficient deep neural network to address MaE and ME spotting in long videos. Specifically, this study takes optical flow features as the model input and proposes a deep model, named LGFINet, which concentrates on fusing local and global features to predict per-frame probability scores over an expression interval. To further boost the learning capability for facial expression spotting, LGFINet integrates multi-head self-attention and multi-head cross-attention into the backbone of the spotting network. To validate the superiority of LGFINet, spotting experiments are conducted on two public ME datasets, CAS(ME)2 and SAMM-LV. The proposed approach achieves F1 scores of 0.3710 and 0.4129 on the SAMM-LV and CAS(ME)2 datasets, respectively. Extensive experiments verify the robustness and superiority of LGFINet-based ME spotting over other models. The source code of LGFINet is available on GitHub (https://github.com/XionghuiYe/LGIF_Net).
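The abstract's core idea (refining a local and a global feature stream with multi-head self-attention, then exchanging information between the streams with multi-head cross-attention before scoring each frame) can be sketched in a few lines. This is a minimal, dependency-light illustration under stated assumptions, not the authors' implementation: the head count, the identity projections, the averaging fusion, the toy linear scoring head, and the names `multi_head_attention` and `fuse_local_global` are all assumptions made for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(q_in, kv_in, num_heads=4):
    """Scaled dot-product attention with num_heads heads.

    q_in: (T, d) query sequence; kv_in: (T, d) key/value sequence.
    When q_in is kv_in this is self-attention; otherwise cross-attention.
    Projection weights are identity here to keep the sketch self-contained.
    """
    T, d = q_in.shape
    dh = d // num_heads
    q = q_in.reshape(T, num_heads, dh).transpose(1, 0, 2)    # (H, T, dh)
    k = kv_in.reshape(T, num_heads, dh).transpose(1, 0, 2)
    v = kv_in.reshape(T, num_heads, dh).transpose(1, 0, 2)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)          # (H, T, T)
    out = softmax(scores) @ v                                # (H, T, dh)
    return out.transpose(1, 0, 2).reshape(T, d)              # concat heads

def fuse_local_global(local_feat, global_feat):
    """Hypothetical fusion: self-attention per stream, then bidirectional
    cross-attention, averaged into one (T, d) fused representation."""
    local_ref = multi_head_attention(local_feat, local_feat)
    global_ref = multi_head_attention(global_feat, global_feat)
    l2g = multi_head_attention(local_ref, global_ref)   # local queries global
    g2l = multi_head_attention(global_ref, local_ref)   # global queries local
    return 0.5 * (l2g + g2l)

# Toy per-frame probability scores from the fused features (linear head).
rng = np.random.default_rng(0)
T, d = 32, 16                        # 32 frames, 16-dim optical-flow features
local_feat = rng.standard_normal((T, d))
global_feat = rng.standard_normal((T, d))
fused = fuse_local_global(local_feat, global_feat)            # (32, 16)
probs = 1.0 / (1.0 + np.exp(-fused @ rng.standard_normal(d))) # (32,)
```

In a real spotting pipeline the per-frame probabilities would then be thresholded and merged into candidate MaE/ME intervals; that post-processing is outside the scope of this sketch.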



Acknowledgements

This paper is supported by the National Nature Science Foundation of China (No. 62362037), the Natural Science Foundation of Jiangxi Province of China (No. 20224ACB202011) and the Jiangxi Province Graduate Innovation Special Fund Project (No. YC2023-X17).

Author information

Corresponding author

Correspondence to Zhihua Xie.



Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Xie, Z., Ye, X. (2025). Local and Global Features Interactive Fusion Network for Macro- and Micro-expression Spotting in Long Videos. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15041. Springer, Singapore. https://doi.org/10.1007/978-981-97-8795-1_23


  • DOI: https://doi.org/10.1007/978-981-97-8795-1_23

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-8794-4

  • Online ISBN: 978-981-97-8795-1

  • eBook Packages: Computer Science, Computer Science (R0)
