DOI: 10.5555/1738467.1738707
Article

Language models for web object retrieval

Published: 24 September 2009

Abstract

Document-level information retrieval can lead to highly inaccurate relevance ranking when answering object-oriented queries. We propose a paradigm that enables searching at the object level. However, the assumption that content is reliable is no longer valid in the object retrieval context, where multiple copies of information about the same object typically exist. To resolve the inconsistency among these copies, we propose several language models for Web object retrieval: an unstructured object retrieval model, a structured object retrieval model, and a hybrid model with both structured and unstructured retrieval features. We test these models on a paper search engine and compare their performance. We conclude that the hybrid model is superior because it takes extraction errors at varying levels into account.

    Published In

    WiCOM'09: Proceedings of the 5th International Conference on Wireless Communications, Networking and Mobile Computing
    September 2009
    5557 pages
    ISBN: 9781424436927

    Sponsors

    • Tsinghua University
    • IEEE Communications Society
    • Beijing University of Posts and Telecommunications

    Publisher

    IEEE Press

    Author Tags

    1. information extraction
    2. information retrieval
    3. language model
    4. web objects

    Qualifiers

    • Article
