
Estimation of the Collection Parameter of Information Models for IR

  • Conference paper
Advances in Information Retrieval (ECIR 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7814)

Abstract

In this paper we explore various methods for estimating the collection parameter of information-based models for ad hoc information retrieval. In previous studies, this parameter was set to the average number of documents in which the word under consideration appears. We introduce a fully formalized estimation method for both the log-logistic and the smoothed power law models that leads to improved versions of these models in IR. Furthermore, we show that the previous setting of the collection parameter of the log-logistic model is a special case of the estimated value proposed here.
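
To make the earlier setting concrete, the following Python sketch computes a per-word collection parameter as the fraction of documents containing the word and plugs it into a log-logistic term weight. It reads "average number of documents" as N_w / N, which is an assumption, as are the function names, the toy corpus, and the length-normalisation constant `c`. It illustrates only the previous setting, not the estimation method proposed in the paper.

```python
import math
from collections import Counter


def doc_freq_parameter(docs):
    """Assumed previous setting: lambda_w = N_w / N, the fraction of
    documents in which word w occurs at least once."""
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each word once per document
    return {w: n / n_docs for w, n in df.items()}


def log_logistic_score(query, doc, lam, avg_len, c=1.0):
    """Sum of log((lambda_w + t) / lambda_w) over query words, where t is
    the term frequency normalised by document length (c is hypothetical)."""
    tf = Counter(doc)
    norm = math.log(1.0 + c * avg_len / len(doc))
    score = 0.0
    for w in query:
        if tf[w] > 0 and w in lam:
            t = tf[w] * norm
            score += math.log((lam[w] + t) / lam[w])
    return score


# Toy corpus, purely illustrative.
docs = [["a", "b", "a"], ["b", "c"], ["c", "c", "d"]]
lam = doc_freq_parameter(docs)
avg_len = sum(len(d) for d in docs) / len(docs)
scores = [log_logistic_score(["a"], d, lam, avg_len) for d in docs]
```

With this setting, a word that occurs in fewer documents gets a smaller parameter and therefore a larger weight for the same normalised frequency.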




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Goswami, P., Gaussier, E. (2013). Estimation of the Collection Parameter of Information Models for IR. In: Serdyukov, P., et al. Advances in Information Retrieval. ECIR 2013. Lecture Notes in Computer Science, vol 7814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36973-5_39

  • DOI: https://doi.org/10.1007/978-3-642-36973-5_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36972-8

  • Online ISBN: 978-3-642-36973-5

  • eBook Packages: Computer Science, Computer Science (R0)
