DOI: 10.1145/2911451.2914674
short-paper

The BOLT IR Test Collections of Multilingual Passage Retrieval from Discussion Forums

Published: 07 July 2016 Publication History

Abstract

This paper describes a new test collection for passage retrieval from multilingual, informal text. The task being modeled is that of a monolingual English-speaking user who wishes to search discussion forum text in a foreign language. The system retrieves relevant short passages of text and presents them to the user, translated into English. The test collection contains more than 2 billion words of discussion thread text, 250 queries representing complex informational search needs, and manual relevance judgments of forum post passages, pooled from real systems. This information retrieval test collection is the first to combine multilingual search, passage retrieval, and informal online genre text.

Published In

SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
July 2016
1296 pages
ISBN:9781450340694
DOI:10.1145/2911451
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. forum search
  2. machine translation
  3. passage retrieval

Cited By

  • (2024) A Workbench for Autograding Retrieve/Generate Systems. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1963-1972. DOI: 10.1145/3626772.3657871. Online publication date: 10-Jul-2024.
  • (2021) REGIS: A Test Collection for Geoscientific Documents in Portuguese. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2363-2368. DOI: 10.1145/3404835.3463256. Online publication date: 11-Jul-2021.
  • (2019) Improved Cross-Lingual Question Retrieval for Community Question Answering. The World Wide Web Conference, pp. 3179-3186. DOI: 10.1145/3308558.3313502. Online publication date: 13-May-2019.
  • (2019) XCMRC: Evaluating Cross-Lingual Machine Reading Comprehension. Natural Language Processing and Chinese Computing, pp. 552-564. DOI: 10.1007/978-3-030-32233-5_43. Online publication date: 9-Oct-2019.
  • (2019) A Test Collection for Passage Retrieval Evaluation of Spanish Health-Related Resources. Advances in Information Retrieval, pp. 148-154. DOI: 10.1007/978-3-030-15719-7_19. Online publication date: 14-Apr-2019.
