Abstract
The main objective of this paper is to outline a theoretical framework for analysing how human decision-making strategies under uncertainty manage the trade-off between information gathering (exploration) and reward seeking (exploitation). A key observation motivating this line of research is that human learners are remarkably fast and effective at adapting to unfamiliar environments and incorporating new knowledge: this is an intriguing behaviour for the cognitive sciences as well as an important challenge for Machine Learning. The target problem is active learning in a black-box optimization task, and more specifically how the exploration/exploitation dilemma can be modelled within the Gaussian Process based Bayesian Optimization framework, which in turn relies on uncertainty quantification. The main contribution is an analysis of human decisions with respect to Pareto rationality, where the two objectives are expected improvement and uncertainty quantification. According to this Pareto rationality model, if a decision set contains a Pareto efficient (dominant) strategy, a rational decision maker should always select it over its dominated alternatives. The distance from the Pareto frontier determines whether a choice is (Pareto) rational (i.e., lies on the frontier) or is associated with excessive ("exasperated") exploration. Since uncertainty is one of the two objectives defining the Pareto frontier, we investigated three different uncertainty quantification measures and selected the one most compliant with the proposed Pareto rationality model. The key result is an analytical framework characterizing how deviations from "rationality" depend on the uncertainty quantification measure and on the evolution of the reward-seeking process.
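To make the two objectives concrete, the following minimal sketch (our own illustration, not the authors' implementation) shows how expected improvement and a Gaussian Process based uncertainty measure can be computed over a set of candidate decisions; the toy black-box function, the one-dimensional search space and the use of scikit-learn are assumptions made only for this example.

```python
# Minimal sketch of the two objectives used in the Pareto-rationality analysis:
# expected improvement and the posterior uncertainty of a Gaussian Process
# surrogate. Toy function, grid and library choices are illustrative.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Toy black-box function (assumed, for illustration only).
def black_box(x):
    return np.sin(3 * x) + 0.5 * x

# Observations already collected by a participant.
X_obs = rng.uniform(0, 3, size=(6, 1))
y_obs = black_box(X_obs).ravel()

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
gp.fit(X_obs, y_obs)

# Candidate decisions (here a dense grid over the search space).
X_cand = np.linspace(0, 3, 200).reshape(-1, 1)
mu, sigma = gp.predict(X_cand, return_std=True)

# Objective 1: expected improvement over the best observed value (maximization).
y_best = y_obs.max()
z = (mu - y_best) / np.maximum(sigma, 1e-12)
ei = (mu - y_best) * norm.cdf(z) + sigma * norm.pdf(z)

# Objective 2: an uncertainty quantification measure; the posterior standard
# deviation is used here purely for illustration, while the paper compares
# three different measures.
uncertainty = sigma
```

Given these two values for every candidate, the Pareto frontier and the distance of an observed human choice from it can be computed as in the sketch at the end of Appendix A.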
Availability of data and material
Both the data and the code for reproducing the analysis and results of this paper are available at the following link: https://github.com/acandelieri/humans_strategies_analysis.
Acknowledgements
We gratefully acknowledge the DEMS Data Science Lab, Department of Economics, Management and Statistics (DEMS), for supporting this work by providing computational resources.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest or competing interests.
Ethics approval
Informed consent was obtained in accordance with the university's procedures and the Declaration of Helsinki.
Appendix A
1.1 The ten test problems
The ten global optimization test functions used in this study, including their analytical formulations, search spaces and information about their optima and optimizers, can be found at the following link: https://www.sfu.ca/~ssurjano/optimization.html. Since they are minimization test functions, we return \(-f(x)\) as the score in order to translate them into the maximization problems depicted in Fig. 14.
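As a concrete example, the sketch below (ours, for illustration only) applies this sign flip to the Branin function, one of the functions listed on the page above; the constants follow its standard formulation.

```python
# Minimal example of the sign flip used to turn a minimization test function
# into a maximization score.
import numpy as np

def branin(x1, x2, a=1.0, b=5.1 / (4 * np.pi**2), c=5 / np.pi,
           r=6.0, s=10.0, t=1 / (8 * np.pi)):
    # Standard Branin formulation; global minimum is about 0.397887.
    return a * (x2 - b * x1**2 + c * x1 - r)**2 + s * (1 - t) * np.cos(x1) + s

def score(x1, x2):
    # Players maximize the score, so we return -f(x).
    return -branin(x1, x2)

# At a global minimizer of Branin the score is maximal (about -0.398).
print(score(np.pi, 2.275))
```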
1.2 Distances from Pareto frontiers for each player, by test function
The following 10 figures, one for each test function, report the distance of each decision from the Pareto frontiers, for each player.
1.3 Distances from Pareto frontiers for each test function, by player
The following 14 figures, one for each player, report the distance of each decision from the Pareto frontiers, for each test function.
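For reference, the following sketch (our own illustration, not the code released with the paper) shows one way to compute the quantity reported in these figures: the Euclidean distance, in the (expected improvement, uncertainty) plane, of a decision from the Pareto frontier of the candidate set. The numerical values are purely illustrative and the exact distance definition used in the paper may differ.

```python
# Sketch of the distance of a decision from the Pareto frontier, with both
# objectives (expected improvement, uncertainty) to be maximized.
import numpy as np

def pareto_mask(points):
    """Boolean mask of non-dominated points (both objectives maximized)."""
    n = points.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        dominated = np.any(np.all(points >= points[i], axis=1) &
                           np.any(points > points[i], axis=1))
        mask[i] = not dominated
    return mask

def distance_from_frontier(points, decision):
    """Minimum Euclidean distance of a decision (ei, uncertainty) from the frontier."""
    frontier = points[pareto_mask(points)]
    return np.min(np.linalg.norm(frontier - decision, axis=1))

# Illustrative candidate set and a (dominated) human decision.
cands = np.array([[0.9, 0.1], [0.6, 0.5], [0.2, 0.8], [0.4, 0.4], [0.1, 0.2]])
print(distance_from_frontier(cands, np.array([0.4, 0.4])))  # about 0.224
```

A decision lying on the frontier gets distance zero (Pareto rational), while larger distances correspond to increasingly dominated choices.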