DOI: 10.1145/62437.62459

Optimum probability estimation based on expectations

Published: 01 May 1988

Abstract

Probability estimation is important for the application of probabilistic models as well as for any evaluation in IR. We discuss the interdependencies between parameter estimation and other properties of probabilistic models. Then we define an optimum estimate which can be applied to various typical estimation problems in IR. A method for the computation of this estimate is described which uses expectations from empirical distributions. Some experiments show the applicability of our method, whereas comparable approaches are partially based on false assumptions or yield estimates with systematic errors.
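
To make the idea of estimating from expectations concrete, here is a minimal sketch. It is not the paper's own algorithm. It assumes an event observed h times in g trials (symbols introduced here for illustration), a hypothetical discrete distribution `prior` over the unknown underlying probabilities, and the standard identity that the posterior mean of p given (h, g) can be written purely in terms of the expected frequencies of h and h+1 successes in g+1 trials. Whether this coincides with the optimum estimate defined in the paper is an assumption.

```python
from math import comb

def marginal(n, k, prior):
    """Probability of k successes in n trials when p follows a discrete prior {p: weight}."""
    return sum(w * comb(n, k) * p**k * (1 - p)**(n - k) for p, w in prior.items())

def estimate_from_expectations(h, g, expected):
    """Estimate p from expected frequencies only.

    expected[k] is the expected share (or count) of events showing
    k successes in g + 1 trials; no knowledge of the prior is needed.
    """
    num = (h + 1) * expected[h + 1]
    den = num + (g + 1 - h) * expected[h]
    return num / den

def posterior_mean(h, g, prior):
    """Reference value: E[p | h successes in g trials], computed directly from the prior."""
    num = sum(w * p**(h + 1) * (1 - p)**(g - h) for p, w in prior.items())
    den = sum(w * p**h * (1 - p)**(g - h) for p, w in prior.items())
    return num / den

# Hypothetical skewed distribution of true probabilities (illustration only).
prior = {0.01: 0.7, 0.1: 0.2, 0.5: 0.1}
g = 5
expected = [marginal(g + 1, k, prior) for k in range(g + 2)]

for h in range(g + 1):
    ml = h / g  # maximum-likelihood estimate
    est = estimate_from_expectations(h, g, expected)
    print(f"h={h}  ML={ml:.3f}  expectation-based={est:.3f}  "
          f"posterior mean={posterior_mean(h, g, prior):.3f}")
```

In this toy setting the maximum-likelihood value h/g is 0 for h = 0 and 1 for h = g, while the expectation-based value stays strictly between 0 and 1 and agrees with the directly computed posterior mean. In practice the entries of `expected` would have to come from observed frequency counts rather than from a known prior.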

Cited By

  • (1989) Optimum polynomial retrieval functions. ACM SIGIR Forum 23:SI (69-76). DOI: 10.1145/75335.75343. Online publication date: 1-May-1989
  • (1989) Optimum polynomial retrieval functions. Proceedings of the 12th annual international ACM SIGIR conference on Research and development in information retrieval (69-76). DOI: 10.1145/75334.75343. Online publication date: 1-May-1989
  • (1989) Überlegungen zum Aufbau eines Diagnose- und Dokumentationssystems für die Uni.-Frauenklinik. Archives of Gynecology and Obstetrics 245:1-4 (1122-1123). DOI: 10.1007/BF02417715. Online publication date: Jul-1989

    Published In

    SIGIR '88: Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval
    May 1988
    677 pages
    ISBN:2706103094
    DOI:10.1145/62437
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Article

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
