research-article

Spatial-Temporal Tag Mining for Automatic Geospatial Video Annotation

Authors:

Roger Zimmermann

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 11, Issue 2

Article No.: 29, Pages 1 - 21

https://doi.org/10.1145/2658981

Published: 07 January 2015

Abstract

Videos are increasingly geotagged and used in practical and powerful GIS applications. However, video search and management typically rely on manual textual annotations, which are subjective and laborious to produce. Research has therefore sought to automate or semi-automate this process. Because a diverse annotation vocabulary is essential for good search results, this article proposes leveraging crowdsourced data from social multimedia applications, which host tags of diverse semantics, to build a spatio-temporal tag repository that serves as input to our auto-annotation approach. To build the tag store, we retrieve the necessary data from several social multimedia applications, mine both the spatial and temporal features of the tags, and then refine and index them accordingly. To better integrate the tag repository, we extend our previous approach to also leverage the temporal characteristics of videos. Moreover, we introduce additional ranking criteria based on tag similarity, popularity, and location bias. Experimental results demonstrate that, with such a tag repository, the generated tags cover a wide range of semantics and the resulting rankings are more consistent with human perception.

Index Terms

Spatial-Temporal Tag Mining for Automatic Geospatial Video Annotation
1. Information systems
  1. Information retrieval

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 11, Issue 2

December 2014

197 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/2716635

Editor:
Ralf Steinmetz
Technische Universität Darmstadt, Germany

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 January 2015

Accepted: 01 June 2014

Revised: 01 December 2013

Received: 01 August 2013

Published in TOMM Volume 11, Issue 2

Qualifiers

Research-article
Research
Refereed

Funding Sources

Singapore National Research Foundation under its International Research Centre @ Singapore Funding Initiative
IDM Programme Office through the Centre of Social Media Innovations for Communities (COSMIC)
