CAM-RNN: Co-Attention Model Based RNN for Video Captioning

Published: 01 November 2019 in IEEE Transactions on Image Processing, vol. 28, no. 11. Publisher: IEEE Press.

Abstract

Video captioning is a technique that bridges vision and language, for which both visual information and text information are important. Typical approaches are based on the recurrent neural network (RNN), where the video caption is generated word by word and the current word is predicted from the visual content and the previously generated words. However, when predicting the current word, much of the visual content is uncorrelated with it, and some of the previously generated words provide little information, which may interfere with generating a correct caption. Motivated by this observation, we attempt to exploit the visual and text features that are most correlated with the caption. In this paper, a co-attention model based recurrent neural network (CAM-RNN) is proposed, where the CAM encodes the visual and text features and the RNN works as the decoder to generate the video caption. Specifically, the CAM is composed of a visual attention module, a text attention module, and a balancing gate. During generation, the visual attention module adaptively attends to the salient regions in each frame and to the frames most correlated with the caption. The text attention module automatically focuses on the most relevant previously generated words or phrases. Moreover, between the two attention modules, a balancing gate is designed to regulate the influence of the visual and text features when generating the caption. Extensive experiments conducted on four popular datasets, MSVD, Charades, MSR-VTT, and MPII-MD, demonstrate the effectiveness of the proposed approach.
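
To make the described mechanism concrete, the sketch below illustrates one decoding step of a co-attention module with a balancing gate in PyTorch. It is a minimal sketch under stated assumptions, not the paper's implementation: the layer names, dimensions, additive score functions, and the scalar sigmoid gate are illustrative choices, and only frame-level visual attention is shown (the paper additionally attends to salient regions within each frame).

# Minimal sketch of a co-attention step with a balancing gate (illustrative only;
# the paper's actual formulation may differ).
import torch
import torch.nn as nn
import torch.nn.functional as F


class CoAttentionStep(nn.Module):
    """One decoding step: attend over frame features and over previously
    generated word embeddings, then blend the two contexts with a gate."""

    def __init__(self, visual_dim, word_dim, hidden_dim):
        super().__init__()
        self.vis_score = nn.Linear(visual_dim + hidden_dim, 1)        # visual attention scores
        self.txt_score = nn.Linear(word_dim + hidden_dim, 1)          # text attention scores
        self.gate = nn.Linear(visual_dim + word_dim + hidden_dim, 1)  # balancing gate
        self.vis_proj = nn.Linear(visual_dim, hidden_dim)
        self.txt_proj = nn.Linear(word_dim, hidden_dim)

    def forward(self, frame_feats, prev_word_embs, hidden):
        # frame_feats:    (batch, n_frames, visual_dim)  -- per-frame CNN features
        # prev_word_embs: (batch, n_words,  word_dim)    -- embeddings of words generated so far
        # hidden:         (batch, hidden_dim)            -- current decoder RNN state
        b, n_f, _ = frame_feats.shape
        n_w = prev_word_embs.shape[1]

        # Visual attention: score each frame against the decoder state.
        h_v = hidden.unsqueeze(1).expand(b, n_f, -1)
        alpha = F.softmax(self.vis_score(torch.cat([frame_feats, h_v], -1)).squeeze(-1), dim=-1)
        vis_ctx = torch.bmm(alpha.unsqueeze(1), frame_feats).squeeze(1)

        # Text attention: score each previously generated word against the decoder state.
        h_t = hidden.unsqueeze(1).expand(b, n_w, -1)
        beta = F.softmax(self.txt_score(torch.cat([prev_word_embs, h_t], -1)).squeeze(-1), dim=-1)
        txt_ctx = torch.bmm(beta.unsqueeze(1), prev_word_embs).squeeze(1)

        # Balancing gate: a scalar in (0, 1) that weighs visual versus text context.
        g = torch.sigmoid(self.gate(torch.cat([vis_ctx, txt_ctx, hidden], -1)))
        context = g * self.vis_proj(vis_ctx) + (1 - g) * self.txt_proj(txt_ctx)
        return context, alpha, beta


if __name__ == "__main__":
    # Toy usage with random tensors (2 videos, 30 frames, 5 previous words).
    step = CoAttentionStep(visual_dim=2048, word_dim=300, hidden_dim=512)
    ctx, alpha, beta = step(torch.randn(2, 30, 2048), torch.randn(2, 5, 300), torch.randn(2, 512))
    print(ctx.shape)  # torch.Size([2, 512])

In a full decoder, the blended context would be fed, together with the previous word embedding, into the RNN (e.g., an LSTM) to update the hidden state and predict the next word.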
