Enhancing multi-lingual information extraction via cross-media inference and fusion

Authors:

Marissa Passantino,

Thomas Huang

COLING '10: Proceedings of the 23rd International Conference on Computational Linguistics: Posters

Pages 630 - 638

Published: 23 August 2010

Abstract

We describe a new information fusion approach to integrate facts extracted from cross-media objects (videos and texts) into a coherent common representation including multi-level knowledge (concepts, relations and events). Beyond standard information fusion, we exploited video extraction results and significantly improved text Information Extraction. We further extended our methods to a multi-lingual environment (English, Arabic and Chinese) by presenting a case study on cross-lingual comparable corpora acquisition based on video comparison.

Index Terms

1. Applied computing
  1. Arts and humanities
    1. Language translation
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published In

COLING '10: Proceedings of the 23rd International Conference on Computational Linguistics: Posters

August 2010

1588 pages

General Chair: Aravind K. Joshi (University of Pennsylvania)

Program Chairs: Chu-Ren Huang (The Hong Kong Polytechnic University), Dan Jurafsky (Stanford University)

Publisher

Association for Computational Linguistics

United States

Qualifiers

Research-article

Acceptance Rates

Overall Acceptance Rate 1,537 of 1,537 submissions, 100%
