DOI: 10.1145/1458082.1458113

Comparing citation contexts for information retrieval

Published: 26 October 2008


In previous work, we have shown that using terms from around citations in citing papers to index the cited paper, in addition to the cited paper's own terms, can improve retrieval effectiveness. Now, we investigate how to select text from around the citations in order to extract good index terms. We compare the retrieval effectiveness that results from a range of contexts around the citations, including no context, the entire citing paper, some fixed windows and several variations with linguistic motivations. We conclude with an analysis of the benefits of more complex, linguistically motivated methods for extracting citation index terms, over using a fixed window of terms. We speculate that there might be some advantage to using computational linguistic techniques for this task.
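The fixed-window idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the bracketed numeric citation marker, the simple word tokenizer, the window size, and the function names `window_terms` and `index_terms` are all assumptions made for illustration.

```python
import re

# Assumption: citations appear as numeric brackets such as [12].
CITATION = re.compile(r"\[(\d+)\]")
# Assumption: a token is either a citation marker or a run of word characters.
TOKEN = re.compile(r"\[\d+\]|\w+")


def window_terms(citing_text, window=3):
    """Map each cited reference id to the terms found within
    `window` tokens on either side of its citation marker."""
    tokens = TOKEN.findall(citing_text)
    terms = {}
    for i, tok in enumerate(tokens):
        match = CITATION.fullmatch(tok)
        if not match:
            continue
        start = max(0, i - window)
        context = tokens[start:i] + tokens[i + 1:i + 1 + window]
        # Skip other citation markers that fall inside the window.
        words = [t.lower() for t in context if not CITATION.fullmatch(t)]
        terms.setdefault(match.group(1), []).extend(words)
    return terms


def index_terms(own_text, citation_terms):
    """Combine the cited paper's own terms with terms from citing papers."""
    own = [t.lower() for t in re.findall(r"\w+", own_text)]
    return own + list(citation_terms)


if __name__ == "__main__":
    citing = ("Statistical parsing [7] improves tagging accuracy, "
              "unlike PageRank [2] for web search.")
    extracted = window_terms(citing, window=2)
    print(extracted)
    print(index_terms("Robust annotation of text", extracted["7"]))
```

Setting `window=0` corresponds to the "no context" condition, and a very large window approaches using the entire citing paper; the linguistically motivated variants the abstract mentions (for example, sentence-based contexts) would replace the token-count boundary with a boundary chosen by some linguistic analysis.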








Published In

CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management
October 2008
1562 pages



Publisher: Association for Computing Machinery, New York, NY, United States


Author Tags

  1. citation context analysis
  2. IR evaluation


  • Research-article


CIKM08: Conference on Information and Knowledge Management
October 26 - 30, 2008
Napa Valley, California, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

