
Novel Decomposition-Based Multi-objective Evolutionary Algorithm Using Reinforcement Learning Adaptive Operator Selection (MOEA/D-QL)

Chapter in New Horizons for Fuzzy Logic, Neural Networks and Metaheuristics

Abstract

It is known that applying adaptive operator selection (AOS) techniques can improve the search process in the space of candidate solutions of a multi-objective evolutionary algorithm. AOS consists of two main tasks: the first assigns weights or credits to each available operator, and the second selects the best operator. In this chapter, for the first time, a reinforcement learning technique (Q-learning) is used to perform adaptive operator selection in the MOEA/D algorithm; we call the resulting algorithm MOEA/D-QL. The objective of Q-learning is to learn a set of rules that tell an agent which action to take under given circumstances; that is, the agent seeks to execute the actions that yield the greatest accumulated reward. In this case, an action corresponds to a variation operator, and four variants of the Differential Evolution operator are used. Two states are defined: the first, \(S_{0}\), corresponds to a child solution that enters front 0, and the second, \(S_{1}\), corresponds to solutions that do not enter front 0. The MOEA/D-QL algorithm has been validated by comparing it with two state-of-the-art multi-objective algorithms: MOEA/D and a version of MOEA/D that uses an AOS based on dynamic Thompson sampling, called MOEA/D-DYTS. Fifteen multi-objective benchmark problems with two and three objectives were used as test instances, and three metrics were applied: hypervolume, generalized spread, and inverted generational distance. The non-parametric Wilcoxon signed-rank and Friedman tests were applied at a 5% significance level; the results show that MOEA/D-QL is superior on the hypervolume and inverted generational distance metrics.
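To make the mechanism concrete, here is a minimal sketch of Q-learning-based adaptive operator selection as described above. It is not the authors' implementation: the epsilon-greedy selection rule, the hyperparameter values, the specific DE variant names, and the 0/1 reward scheme are illustrative assumptions; only the two states (\(S_{0}\), \(S_{1}\)) and the use of four Differential Evolution variants follow the chapter's description.

```python
import random

# Sketch of Q-learning-based AOS: two states (did the child enter front 0
# or not) and four DE variants as actions. The variant names, reward values,
# and hyperparameters below are assumptions for illustration only.
STATES = ["S0", "S1"]  # S0: child enters front 0; S1: it does not
OPERATORS = ["DE/rand/1", "DE/rand/2",
             "DE/current-to-rand/1", "DE/current-to-rand/2"]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # assumed learning rate, discount, exploration

# Q-table: one row per state, one entry per variation operator.
Q = {s: {op: 0.0 for op in OPERATORS} for s in STATES}

def select_operator(state: str) -> str:
    """Epsilon-greedy: mostly exploit the best-valued operator, sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(OPERATORS)
    return max(Q[state], key=Q[state].get)

def update(state: str, op: str, reward: float, next_state: str) -> None:
    """Standard Q-learning update: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[next_state].values())
    Q[state][op] += ALPHA * (reward + GAMMA * best_next - Q[state][op])

# Illustrative generation loop: apply the chosen operator, observe whether
# the child enters front 0, and reinforce accordingly.
state = "S1"
for _ in range(1000):
    op = select_operator(state)
    entered_front_0 = random.random() < 0.3   # stand-in for the real evaluation
    reward = 1.0 if entered_front_0 else 0.0  # assumed reward scheme
    next_state = "S0" if entered_front_0 else "S1"
    update(state, op, reward, next_state)
    state = next_state
```

A consequence of the two-state formulation is that the Q-table stays tiny (2 states by 4 operators), so the per-generation overhead of operator selection is negligible compared with evaluating the objective functions.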



Author information

Correspondence to Miguel Ángel García-Morales.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Brambila-Hernández, J.A., García-Morales, M.Á., Fraire-Huacuja, H.J., Cruz-Reyes, L., Frausto-Solís, J. (2024). Novel Decomposition-Based Multi-objective Evolutionary Algorithm Using Reinforcement Learning Adaptive Operator Selection (MOEA/D-QL). In: Castillo, O., Melin, P. (eds) New Horizons for Fuzzy Logic, Neural Networks and Metaheuristics. Studies in Computational Intelligence, vol 1149. Springer, Cham. https://doi.org/10.1007/978-3-031-55684-5_11
