Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
Skip to main content

Improved Second-Order Bounds for Prediction with Expert Advice

  • Conference paper
Learning Theory (COLT 2005)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 3559))

Included in the following conference series:

Abstract

This work studies external regret in sequential prediction games with arbitrary payoffs (nonnegative or non-positive). External regret measures the difference between the payoff obtained by the forecasting strategy and the payoff of the best action. We focus on two important parameters: M, the largest absolute value of any payoff, and Q *, the sum of squared payoffs of the best action. Given these parameters we derive first a simple and new forecasting strategy with regret at most order of \(\sqrt{Q^{*}({\rm ln}N)}+M {\rm ln} N\), where N is the number of actions. We extend the results to the case where the parameters are unknown and derive similar bounds. We then devise a refined analysis of the weighted majority forecaster, which yields bounds of the same flavour. The proof techniques we develop are finally applied to the adversarial multi-armed bandit setting, and we prove bounds on the performance of an online algorithm in the case where there is no lower bound on the probability of each action.

The work of all authors was supported in part by the IST Programme of the European Community, under the PASCAL Network of Excellence, IST-2002-506778.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Allenberg-Neeman, C., Auer, P.: Personal communication

    Google Scholar 

  2. Allenberg-Neeman, C., Neeman, B.: Full information game with gains and losses. In: Ben-David, S., Case, J., Maruoka, A. (eds.) ALT 2004. LNCS (LNAI), vol. 3244, pp. 264–278. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  3. Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32, 48–77 (2002)

    Article  MATH  MathSciNet  Google Scholar 

  4. Auer, P., Cesa-Bianchi, N., Gentile, C.: Adaptive and self-confident on-line learning algorithms. Journal of Computer and System Sciences 64, 48–75 (2002)

    Article  MATH  MathSciNet  Google Scholar 

  5. Cesa-Bianchi, N., Freund, Y., Helmbold, D.P., Haussler, D., Schapire, R., Warmuth, M.K.: How to use expert advice. Journal of the ACM 3, 427–485 (1997)

    Article  MathSciNet  Google Scholar 

  6. Cesa-Bianchi, N., Lugosi, G., Stoltz, G.: Minimizing regret with label efficient prediction. IEEE Transactions on Information Theory (to appear)

    Google Scholar 

  7. Cesa-Bianchi, N., Lugosi, G., Stoltz, G.: Regret minimization under partial monitoring. Submitted for journal publication (2004)

    Google Scholar 

  8. Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, Cambridge (to appear)

    Google Scholar 

  9. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119–139 (1997)

    Article  MATH  MathSciNet  Google Scholar 

  10. Hart, S., Mas-Colell, A.: A Reinforcement Procedure Leading to Correlated Equilibrium. In: Neuefeind, W., Trockel, W. (eds.) Economic Essays, Gerard Debreu, pp. 181–200. Springer, Heidelberg (2001)

    Google Scholar 

  11. Littlestone, N., Warmuth, M.K.: The weighted majority algorithm. Information and Computation 108, 212–261 (1994)

    Article  MATH  MathSciNet  Google Scholar 

  12. Piccolboni, A., Schindelhauer, C.: Discrete prediction games with arbitrary feedback and loss. In: Proceedings of the 14th Annual Conference on Computational Learning Theory, pp. 208–223 (2001)

    Google Scholar 

  13. Vovk, V.G.: A Game of Prediction with Expert Advice. Journal of Computer and System Sciences 56(2), 153–173 (1998)

    Article  MATH  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cesa-Bianchi, N., Mansour, Y., Stoltz, G. (2005). Improved Second-Order Bounds for Prediction with Expert Advice. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science(), vol 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_15

Download citation

  • DOI: https://doi.org/10.1007/11503415_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26556-6

  • Online ISBN: 978-3-540-31892-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics