research-article

Discriminative models of integrating document evidence and document-candidate associations for expert search

Authors:

Aditya P. Mathur

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

Pages 683 - 690

https://doi.org/10.1145/1835449.1835563

Published: 19 July 2010

Abstract

Generative models such as statistical language models have been widely studied for expert search, where they model the relationship between experts and the expertise indicated in supporting documents. Discriminative models, by contrast, have received little attention in expert search research, although they have been shown to outperform generative models in many other information retrieval and machine learning applications. In this paper, we propose a principled relevance-based discriminative learning framework for expert search and derive specific discriminative models from the framework. Compared with state-of-the-art language models for expert search, the proposed framework can naturally integrate various kinds of document evidence and document-candidate associations into a single model without extra modeling assumptions or effort. An extensive set of experiments has been conducted on two TREC Enterprise track corpora (W3C and CERC) to demonstrate the effectiveness and robustness of the proposed framework.
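
As a rough illustration of what integrating document evidence and document-candidate associations in a single discriminative model can look like, the sketch below scores a candidate by averaging, over supporting documents, the product of two logistic functions: one over query-document features and one over document-candidate features. This is a minimal sketch under stated assumptions, not the model defined in the paper: the function names, the feature vectors, and the weight vectors `alpha` and `beta` are hypothetical, and in practice the weights would be learned from relevance judgments rather than set by hand.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(weights, features):
    """Inner product of two equal-length sequences."""
    return sum(w * x for w, x in zip(weights, features))


def expert_score(docs, alpha, beta):
    """Score one candidate for one query (illustrative assumption only).

    docs  -- list of (doc_features, assoc_features) pairs, one per
             supporting document; both feature vectors are hypothetical.
    alpha -- weights for query-document evidence features.
    beta  -- weights for document-candidate association features.
    """
    if not docs:
        return 0.0
    total = 0.0
    for doc_features, assoc_features in docs:
        p_doc = sigmoid(dot(alpha, doc_features))      # document relevance
        p_assoc = sigmoid(dot(beta, assoc_features))   # candidate association
        total += p_doc * p_assoc
    return total / len(docs)  # uniform average over supporting documents


# Hypothetical feature values for two candidates and hand-set weights.
alpha = [1.0, 0.5]
beta = [2.0]
strong = [([2.0, 1.0], [1.0]), ([1.5, 0.0], [0.5])]
weak = [([-1.0, 0.0], [-0.5])]
ranking = sorted(
    {"strong": strong, "weak": weak}.items(),
    key=lambda item: expert_score(item[1], alpha, beta),
    reverse=True,
)
```

Because both components are plain feature-based functions, adding a new kind of document evidence or association amounts to appending a feature and a weight, which is the kind of flexibility the abstract attributes to the discriminative approach.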



Index Terms

1. Information systems
  1. Information retrieval
  2. Information storage systems


Published In


SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

July 2010

944 pages

ISBN: 9781450301534

DOI: 10.1145/1835449

General Chairs: Fabio Crestani (University of Lugano, CH), Stéphane Marchand-Maillet (University of Geneva, CH)

Program Chairs: Hsin-Hsi Chen (National Taiwan University, TW), Efthimis N. Efthimiadis (University of Washington, USA), Jacques Savoy (University of Neuchatel, CH)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

SIGIR '10

Sponsor:

SIGIR

SIGIR '10: The 33rd International ACM SIGIR conference on research and development in Information Retrieval

July 19 - 23, 2010

Geneva, Switzerland

Acceptance Rates

SIGIR '10 Paper Acceptance Rate 87 of 520 submissions, 17%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%
