DOI: 10.1145/2330163.2330343

Analysis of a natural gradient algorithm on monotonic convex-quadratic-composite functions

Published: 07 July 2012

Abstract

In this paper, we investigate the convergence properties of a variant of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). Our study builds on the recent theoretical result that the pure rank-μ update CMA-ES performs natural gradient descent on the parameter space of Gaussian distributions. We derive a novel variant of the natural gradient method in which the parameters of the Gaussian distribution are updated along the natural gradient of a newly defined function on the parameter space. We study this algorithm on composites of a monotone function with a convex quadratic function. We prove that our algorithm adapts the covariance matrix so that it becomes proportional to the inverse of the Hessian of the original objective function, and we quantify both the speed of covariance matrix adaptation and the speed of convergence of the parameters. Finally, we introduce a stochastic algorithm that approximates the natural gradient with finite samples and present simulation results evaluating how precisely the stochastic algorithm approximates the deterministic, ideal one and how similarly our algorithm and the CMA-ES perform.
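The kind of update the abstract describes can be sketched as a rank-μ-style natural-gradient step on the mean and covariance of a Gaussian search distribution. The sketch below is a minimal illustration, not the paper's exact algorithm: the objective `f` (a monotone square root of a convex quadratic), the recombination weights, and the learning rates `eta_m`, `eta_C` are all illustrative assumptions.

```python
import numpy as np

def f(x):
    # Monotone transform of a convex quadratic with Hessian diag(1, 10);
    # sqrt is monotone, so the ranking of samples is unchanged.
    H = np.diag([1.0, 10.0])
    return np.sqrt(x @ H @ x)

rng = np.random.default_rng(0)
dim, lam = 2, 50                 # dimension, sample size (lambda)
mu = lam // 2                    # number of selected samples
w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
w /= w.sum()                     # positive, decreasing recombination weights

m = np.ones(dim)                 # mean of the search distribution
C = np.eye(dim)                  # covariance of the search distribution
eta_m, eta_C = 1.0, 0.2          # learning rates (illustrative choices)

for _ in range(300):
    A = np.linalg.cholesky(C)
    Z = rng.standard_normal((lam, dim))
    X = m + Z @ A.T                          # candidates ~ N(m, C)
    order = np.argsort([f(x) for x in X])[:mu]
    Y = X[order] - m                         # steps of the mu best samples
    # natural-gradient (rank-mu-style) updates of mean and covariance
    m = m + eta_m * w @ Y
    C = (1 - eta_C) * C + eta_C * (Y.T * w) @ Y

# After adaptation, C should be roughly proportional to the inverse
# Hessian diag(1, 0.1), i.e. C[0, 0] / C[1, 1] on the order of 10.
```

The key property being illustrated is invariance to the monotone transform: only ranks enter the update, so replacing `f` with the raw quadratic leaves the trajectory unchanged, while the covariance shape aligns with the inverse Hessian as the paper's analysis predicts for the ideal, deterministic update.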



Published In

GECCO '12: Proceedings of the 14th annual conference on Genetic and evolutionary computation
July 2012
1396 pages
ISBN:9781450311779
DOI:10.1145/2330163
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery, New York, NY, United States



Author Tags

  1. covariance matrix adaptation
  2. Hessian matrix
  3. information geometric optimization
  4. natural gradient
  5. theory

Qualifiers

  • Research-article

Conference

GECCO '12
Sponsor:
GECCO '12: Genetic and Evolutionary Computation Conference
July 7 - 11, 2012
Philadelphia, Pennsylvania, USA

Acceptance Rates

Overall acceptance rate: 1,669 of 4,410 submissions (38%)

Article Metrics

  • Downloads (Last 12 months)15
  • Downloads (Last 6 weeks)4
Reflects downloads up to 12 Jan 2025


Cited By

  • (2023) Surrogate-Assisted (1+1)-CMA-ES with Switching Mechanism of Utility Functions. Applications of Evolutionary Computation, pages 798-814. DOI: 10.1007/978-3-031-30229-9_51. Online publication date: 9 Apr 2023.
  • (2022) Monotone improvement of information-geometric optimization algorithms with a surrogate function. Proceedings of the Genetic and Evolutionary Computation Conference, pages 1354-1362. DOI: 10.1145/3512290.3528690. Online publication date: 8 Jul 2022.
  • (2022) Global Linear Convergence of Evolution Strategies on More than Smooth Strongly Convex Functions. SIAM Journal on Optimization, 32(2):1402-1429. DOI: 10.1137/20M1373815. Online publication date: 1 Jan 2022.
  • (2022) Analysis of Surrogate-Assisted Information-Geometric Optimization Algorithms. Algorithmica, 86(1):33-63. DOI: 10.1007/s00453-022-01087-8. Online publication date: 22 Dec 2022.
  • (2021) Risk-Aware Model-Based Control. Frontiers in Robotics and AI, 8. DOI: 10.3389/frobt.2021.617839. Online publication date: 11 Mar 2021.
  • (2020) On broaching-to prevention using optimal control theory with evolution strategy (CMA-ES). Journal of Marine Science and Technology. DOI: 10.1007/s00773-020-00722-9. Online publication date: 19 Apr 2020.
  • (2019) Generalized drift analysis in continuous domain. Proceedings of the 15th ACM/SIGEVO Conference on Foundations of Genetic Algorithms, pages 13-24. DOI: 10.1145/3299904.3340303. Online publication date: 27 Aug 2019.
  • (2018) Analysis of information geometric optimization with isotropic gaussian distribution under finite samples. Proceedings of the Genetic and Evolutionary Computation Conference, pages 897-904. DOI: 10.1145/3205455.3205487. Online publication date: 2 Jul 2018.
  • (2017) On the Statistical Learning Ability of Evolution Strategies. Proceedings of the 14th ACM/SIGEVO Conference on Foundations of Genetic Algorithms, pages 127-138. DOI: 10.1145/3040718.3040722. Online publication date: 12 Jan 2017.
  • (2016) Online Model Selection for Restricted Covariance Matrix Adaptation. Parallel Problem Solving from Nature - PPSN XIV, pages 3-13. DOI: 10.1007/978-3-319-45823-6_1. Online publication date: 31 Aug 2016.
