DOI: 10.1145/2484028.2484056
research-article

Leveraging conceptual lexicon: query disambiguation using proximity information for patent retrieval

Published: 28 July 2013

Abstract

Patent prior art search is a patent retrieval task in which the goal is to rank documents that describe prior art related to a patent application. A defining property of patent retrieval is that the query topic is a full patent application and does not represent a focused information need. This query-by-document nature introduces new challenges and calls for investigations specific to the problem. Researchers have addressed it by considering different information resources for query reduction and query disambiguation. However, previous work has not fully studied the effect of using proximity information and exploiting domain-specific resources for query disambiguation.
In this paper, we first reduce the query document by taking its first claim. We then build a query-specific patent lexicon based on definitions of the International Patent Classification (IPC). We study how to expand queries by selecting expansion terms from the lexicon that are focused on the query topic. The key problem is how to determine whether an expansion term is focused on the query topic. We address this problem by exploiting proximity information: we assign higher weights to expansion terms that appear closer to query terms, on the intuition that such terms are more likely to be related to the query topic.
Experimental results on two patent retrieval datasets show that the proposed method is effective and robust for query expansion, significantly outperforming standard pseudo-relevance feedback (PRF) and existing baselines in patent retrieval.
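The proximity intuition can be illustrated with a minimal sketch. It is not the method as implemented in the paper: the Gaussian kernel, the `sigma` bandwidth, the function name `proximity_weights`, and the toy definition text are all assumptions chosen for illustration. Each candidate term in a lexicon text collects kernel mass from every query-term occurrence, so candidates near query terms end up with larger normalised weights.

```python
import math
from collections import defaultdict


def proximity_weights(tokens, query_terms, candidates, sigma=5.0):
    """Weight candidate expansion terms by closeness to query terms.

    Hypothetical helper: each candidate occurrence adds a Gaussian kernel
    value for its token distance to every query-term occurrence.
    """
    query_positions = [i for i, tok in enumerate(tokens) if tok in query_terms]
    scores = defaultdict(float)
    for i, tok in enumerate(tokens):
        # Only score candidates that are not already query terms.
        if tok not in candidates or tok in query_terms:
            continue
        for q in query_positions:
            scores[tok] += math.exp(-((i - q) ** 2) / (2 * sigma ** 2))
    total = sum(scores.values())
    # Normalise so the weights sum to 1; empty dict if nothing was scored.
    return {t: s / total for t, s in scores.items()} if total > 0 else {}


# Invented text standing in for an IPC class definition (assumption).
definition = (
    "engine fuel injection valve controlling fuel supply to combustion chamber "
    "general accessories packaging labels"
).split()
query = {"fuel", "injection"}            # terms from the reduced query
candidates = {"valve", "combustion", "packaging"}

weights = proximity_weights(definition, query, candidates, sigma=2.0)
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.4f}")
```

In this toy run, "valve" sits next to the query terms and receives most of the weight, "combustion" receives less, and the distant "packaging" receives almost none. A different kernel or window size would change the values but not the basic ordering here.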

      Published In

      SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
      July 2013
      1188 pages
      ISBN: 9781450320344
      DOI: 10.1145/2484028
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. patent search
      2. proximity information
      3. query expansion

      Qualifiers

      • Research-article

      Conference

      SIGIR '13

      Acceptance Rates

      SIGIR '13 Paper Acceptance Rate 73 of 366 submissions, 20%;
      Overall Acceptance Rate 792 of 3,983 submissions, 20%

      Cited By

      • (2023) Heterogeneous graph attention networks for passage retrieval. Information Retrieval, 26:1-2. DOI: 10.1007/s10791-023-09424-3. Online publication date: 16-Nov-2023
      • (2022) End to End Neural Retrieval for Patent Prior Art Search. Advances in Information Retrieval, pages 537-544. DOI: 10.1007/978-3-030-99739-7_66. Online publication date: 5-Apr-2022
      • (2022) Passage Retrieval on Structured Documents Using Graph Attention Networks. Advances in Information Retrieval, pages 13-21. DOI: 10.1007/978-3-030-99739-7_2. Online publication date: 10-Apr-2022
      • (2021) Does More Context Help? Effects of Context Window and Application Source on Retrieval Performance. ACM Transactions on Information Systems, 40:2, pages 1-40. DOI: 10.1145/3474055. Online publication date: 27-Sep-2021
      • (2019) Patent expanded retrieval via word embedding under composite-domain perspectives. Frontiers of Computer Science: Selected Publications from Chinese Universities, 13:5, pages 1048-1061. DOI: 10.1007/s11704-018-7056-6. Online publication date: 1-Oct-2019
      • (2019) Patent retrieval: a literature review. Knowledge and Information Systems. DOI: 10.1007/s10115-018-1322-7. Online publication date: 14-Jan-2019
      • (2019) Enhancing the Healthcare Retrieval with a Self-adaptive Saturated Density Function. Advances in Knowledge Discovery and Data Mining, pages 501-513. DOI: 10.1007/978-3-030-16148-4_39. Online publication date: 22-Mar-2019
      • (2018) Toward an Interactive Patent Retrieval Framework based on Distributed Representations. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pages 957-960. DOI: 10.1145/3209978.3210106. Online publication date: 27-Jun-2018
      • (2018) PatSearch. Knowledge and Information Systems, 57:1, pages 135-158. DOI: 10.1007/s10115-017-1127-0. Online publication date: 1-Oct-2018
      • (2017) Exploiting semantic knowledge base for patent retrieval. 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), pages 2195-2200. DOI: 10.1109/FSKD.2017.8393111. Online publication date: Jul-2017
