DOI: 10.1145/2502081.2508118
research-article

Structured exploration of who, what, when, and where in heterogeneous multimedia news sources

Published: 21 October 2013

Abstract

We present a fully automatic system that spans raw data gathering through navigation over heterogeneous news sources, including over 18k hours of broadcast video news, 3.58M online articles, and 430M public Twitter messages. Our system addresses the challenge of extracting "who," "what," "when," and "where" from a truly multimodal perspective, leveraging audiovisual information both in broadcast news and embedded in articles, as well as textual cues in closed captions and in the raw document content of articles and social media. By running this extraction over time, we are able to study topic trends in the news and detect interesting peaks in news coverage over the life of a topic. We visualize these peaks in trending news topics using automatically extracted keywords and iconic images, and introduce a novel multimodal algorithm for naming speakers in the news. We also present several intuitive navigation interfaces for interacting with these complex topic structures over different news sources.
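
The abstract does not say how peaks in news coverage are detected. As a purely illustrative sketch, and not the authors' method, one simple approach flags a day as a peak when its item count for a topic exceeds a trailing baseline by a few standard deviations. The function name, the `window` and `k` parameters, and the sample counts below are all hypothetical.

```python
from statistics import mean, pstdev


def find_coverage_peaks(counts, window=7, k=2.0):
    """Return indices of days whose coverage spikes above a trailing baseline.

    counts: per-day number of items (videos, articles, tweets) linked to one topic.
    window: number of preceding days used as the baseline (assumed default).
    k: number of standard deviations above the baseline mean that counts as a peak.
    """
    peaks = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        threshold = mean(baseline) + k * pstdev(baseline)
        # Require a local maximum so one burst is reported once, at its top.
        is_local_max = counts[i] >= counts[i - 1] and (
            i + 1 == len(counts) or counts[i] >= counts[i + 1]
        )
        if counts[i] > threshold and is_local_max:
            peaks.append(i)
    return peaks


# Made-up daily counts for a single topic, with two bursts of coverage.
daily_counts = [3, 4, 2, 3, 5, 4, 3, 4, 21, 9, 4, 3, 5, 4, 3, 30, 12, 5]
print(find_coverage_peaks(daily_counts))  # [8, 15]
```

A trailing baseline is used here so that each day is compared only with earlier coverage, which matches processing news as it arrives. A real system would likely need per-source normalization, since tweet volumes dwarf article and broadcast volumes.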


Published In

MM '13: Proceedings of the 21st ACM international conference on Multimedia
October 2013
1166 pages
ISBN: 9781450324045
DOI: 10.1145/2502081
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multimedia analysis
  2. speaker diarization
  3. topic linking

Qualifiers

  • Research-article

Conference

MM '13: ACM Multimedia Conference
October 21 - 25, 2013
Barcelona, Spain

Acceptance Rates

MM '13 Paper Acceptance Rate: 47 of 235 submissions, 20%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%

Article Metrics

  • Downloads (Last 12 months): 8
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 12 Sep 2024

Cited By

  • (2020) Person Browser System Based on Named Entity Recognition for Broadcast News Interview Videos. International Journal of Control, Automation and Systems 19:1 (186-199). DOI: 10.1007/s12555-019-0391-z. Online publication date: 5-Aug-2020
  • (2019) VLX-Stories: Building an Online Event Knowledge Base with Emerging Entity Detection. The Semantic Web – ISWC 2019 (382-399). DOI: 10.1007/978-3-030-30796-7_24. Online publication date: 26-Oct-2019
  • (2018) An intelligent multimedia information system for multimodal content extraction and querying. Multimedia Tools and Applications 77:2 (2225-2260). DOI: 10.1007/s11042-017-4378-6. Online publication date: 1-Jan-2018
  • (2017) Joint Image-Text News Topic Detection and Tracking by Multimodal Topic And-Or Graph. IEEE Transactions on Multimedia 19:2 (367-381). DOI: 10.1109/TMM.2016.2616279. Online publication date: 1-Feb-2017
  • (2017) Automatic person information extraction using overlay text in television news interview videos. 2017 IEEE 15th International Conference on Industrial Informatics (INDIN) (583-588). DOI: 10.1109/INDIN.2017.8104837. Online publication date: Jul-2017
  • (2017) Strategy for automatic person indexing and retrieval system in news interview video sequences. 2017 10th International Conference on Human System Interactions (HSI) (212-215). DOI: 10.1109/HSI.2017.8005031. Online publication date: Jul-2017
  • (2016) News Event Understanding by Mining Latent Factors From Multimodal Tensors. Proceedings of the 2016 ACM workshop on Vision and Language Integration Meets Multimedia Fusion (9-16). DOI: 10.1145/2983563.2983564. Online publication date: 16-Oct-2016
  • (2016) Placing Broadcast News Videos in their Social Media Context Using Hashtags. Proceedings of the 24th ACM international conference on Multimedia (684-688). DOI: 10.1145/2964284.2970929. Online publication date: 1-Oct-2016
  • (2016) Learning Multimodal Temporal Representation for Dubbing Detection in Broadcast Media. Proceedings of the 24th ACM international conference on Multimedia (202-206). DOI: 10.1145/2964284.2967211. Online publication date: 1-Oct-2016
  • (2016) Watching What and How Politicians Discuss Various Topics. Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval (401-404). DOI: 10.1145/2911996.2912025. Online publication date: 6-Jun-2016
