DOI: 10.1145/1101149.1101341

A web-based system for collaborative annotation of large image and video collections: an evaluation and user study

Published: 06 November 2005

Abstract

Annotated collections of images and videos are a necessary basis for the successful development of multimedia retrieval systems. The underlying models of such systems rely heavily on the quality and availability of large training collections. Annotating large collections, however, is a time-consuming and error-prone task because it must be performed by human annotators. In this paper we present the IBM Efficient Video Annotation (EVA) system, a server-based tool for semantic concept annotation of large video and image collections. It is optimised for collaborative annotation and includes features such as workload sharing and support for conducting inter-annotator analysis. We discuss initial results of an ongoing user evaluation of this system. The results are based on data collected during the 2005 TRECVID Annotation Forum, where more than 100 annotators used the system.

Published In

MULTIMEDIA '05: Proceedings of the 13th annual ACM international conference on Multimedia
November 2005
1110 pages
ISBN: 1595930442
DOI: 10.1145/1101149

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

MULTIMEDIA '05 paper acceptance rate: 49 of 312 submissions (16%)
Overall acceptance rate: 2,145 of 8,556 submissions (25%)
