Multimodal Web Based Video Annotator with Real-Time Human Pose Estimation

Rodrigues, Rui; Madeira, Rui Neves; Correia, Nuno; Fernandes, Carla; Ribeiro, Sara

doi:10.1007/978-3-030-33617-2_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11872)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

This paper presents a multi-platform Web-based video annotator that supports multimodal annotation and can be applied to several working areas, such as dance rehearsals. The CultureMoves’ “Motion-Notes” Annotator was designed to assist the creative and exploratory processes of both professional and amateur users, who work with a digital device to take personal annotations. The prototype is being developed for any device capable of running a modern Web browser. It is a real-time multimodal video annotator based on keyboard, touch and voice inputs. Five ways of adding annotations have already been implemented: voice, drawing, text, web URL, and mark annotations. A pose estimation feature uses machine learning techniques to identify a person's skeleton in the video frames, giving the user another resource for identifying possible annotations.
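To make the abstract's two main ideas concrete, the following is a minimal sketch of how time-anchored annotations of the five kinds and per-frame skeleton keypoints might be represented. It is not the authors' implementation: the class names, fields, the 0.5 confidence threshold, and the sample data are all assumptions made for illustration. The only grounded assumption is that pose estimators commonly return named keypoints with a confidence score.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class AnnotationKind(Enum):
    # The five annotation types listed in the abstract.
    VOICE = "voice"
    DRAW = "draw"
    TEXT = "text"
    URL = "url"
    MARK = "mark"


@dataclass
class Annotation:
    # Hypothetical record: an annotation anchored to a video time span.
    kind: AnnotationKind
    start_s: float      # start time in seconds
    end_s: float        # end time in seconds (equal to start_s for a point mark)
    payload: str = ""   # text, URL, or a reference to audio/drawing data

    def is_active(self, t: float) -> bool:
        return self.start_s <= t <= self.end_s


@dataclass
class Keypoint:
    # Hypothetical skeleton joint with normalized coordinates and a confidence score.
    name: str
    x: float
    y: float
    score: float


def active_annotations(annotations: List[Annotation], t: float) -> List[Annotation]:
    """Return the annotations that should be displayed at playback time t."""
    return [a for a in annotations if a.is_active(t)]


def visible_keypoints(keypoints: List[Keypoint], min_score: float = 0.5) -> List[Keypoint]:
    """Keep only joints the estimator is reasonably confident about (assumed threshold)."""
    return [k for k in keypoints if k.score >= min_score]


# Illustrative data only.
notes = [
    Annotation(AnnotationKind.TEXT, 2.0, 5.0, "arms too low"),
    Annotation(AnnotationKind.MARK, 4.0, 4.0),
    Annotation(AnnotationKind.URL, 10.0, 12.0, "https://example.org/reference"),
]
frame_pose = [
    Keypoint("left_wrist", 0.31, 0.62, 0.91),
    Keypoint("right_wrist", 0.70, 0.60, 0.22),
]
```

Under these assumptions, a player would call `active_annotations(notes, current_time)` on each time update and draw only the joints returned by `visible_keypoints(frame_pose)`, so low-confidence joints do not suggest misleading annotation targets.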

Acknowledgements

This work was supported by the project CultureMoves, Grant Agreement Number: INEA/CEF/ICT/A2017/1568369, Action No: 2017-EU-tA-0171.

Author information

Authors and Affiliations

Sustain.RD Center, ESTSetúbal, Polytechnic Institute, Setúbal, Portugal
Rui Rodrigues & Rui Neves Madeira
NOVA LINCS, DI, FCT, NOVA University of Lisboa, Lisbon, Portugal
Rui Rodrigues, Rui Neves Madeira & Nuno Correia
ICNOVA, FCSH, NOVA University of Lisboa, Lisbon, Portugal
Carla Fernandes & Sara Ribeiro

Corresponding author

Correspondence to Rui Rodrigues.

Editor information

Editors and Affiliations

University of Manchester, Manchester, UK
Hujun Yin
Technical University of Madrid, Madrid, Spain
David Camacho
University of Birmingham, Birmingham, UK
Peter Tino
University of Huelva, Huelva, Spain
Antonio J. Tallón-Ballesteros
University of Exeter, Exeter, UK
Ronaldo Menezes
University of Manchester, Manchester, UK
Richard Allmendinger

About this paper

Cite this paper

Rodrigues, R., Madeira, R.N., Correia, N., Fernandes, C., Ribeiro, S. (2019). Multimodal Web Based Video Annotator with Real-Time Human Pose Estimation. In: Yin, H., Camacho, D., Tino, P., Tallón-Ballesteros, A., Menezes, R., Allmendinger, R. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2019. Lecture Notes in Computer Science, vol 11872. Springer, Cham. https://doi.org/10.1007/978-3-030-33617-2_3

DOI: https://doi.org/10.1007/978-3-030-33617-2_3
Published: 18 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33616-5
Online ISBN: 978-3-030-33617-2
eBook Packages: Computer Science, Computer Science (R0)
