Nature Human Behaviour
Article
https://doi.org/10.1038/s41562-024-01894-9

Comparing experience- and description-based economic preferences across 11 countries

Received: 23 February 2023
Accepted: 19 April 2024
Published online: xx xx xxxx

Hernán Anlló1,2,3, Sophie Bavard1,3,4, FatimaEzzahra Benmarrakchi3,5, Darla Bonagura3,6, Fabien Cerrotti1,3, Mirona Cicue7, Maelle Gueguen3,6, Eugenio José Guzmán8, Dzerassa Kadieva9, Maiko Kobayashi2, Gafari Lukumon5, Marco Sartorio10, Jiong Yang11, Oksana Zinchenko3,12, Bahador Bahrami3,13, Jaime Silva Concha3,8, Uri Hertz3,7, Anna B. Konova3,6, Jian Li3,11,14, Cathal O’Madagain3,5, Joaquin Navajas3,10,15,16, Gabriel Reyes3,8, Atiye Sarabi-Jamab3,17, Anna Shestakova3,12, Bhasi Sukumaran3,18, Katsumi Watanabe2,3 & Stefano Palminteri1,3,19

Recent evidence indicates that reward value encoding in humans is highly context dependent, leading to suboptimal decisions in some cases, but whether this computational constraint on valuation is a shared feature of human cognition remains unknown. Here we studied the behaviour of n = 561 individuals from 11 countries of markedly different socioeconomic and cultural makeup. Our findings show that context sensitivity was present in all 11 countries. Suboptimal decisions generated by context manipulation were not explained by risk aversion, as estimated through a separate description-based choice task (that is, lotteries) consisting of matched decision offers. Conversely, risk aversion significantly differed across countries. Overall, our findings suggest that context-dependent reward value encoding is a feature of human cognition that remains consistently present across different countries, as opposed to description-based decision-making, which is more permeable to cultural factors.

Cross-cultural differences in economic decision-making processes have been investigated in several domains, such as risk preference and behavioural game theory.
Although several qualitative features seem to be preserved (such as prospect theory-like preferences and delay discounting1,2), evidence has repeatedly shown culturally driven differences in many decision-making traits3–5. To date, efforts to assess the cross-cultural stability of decision-making processes have mainly (if not only) focused on what can be defined as description-based paradigms (that is, using tasks where all of the decision-relevant information, such as prospective outcomes and their costs, can be inferred from explicit cues or instructions6–8). However, little is known concerning the cross-cultural stability (or lack thereof) of experience-based decisions, which encompass all situations where the decision-making variables have to be inferred from past experience9,10.

One prominent conceptual framework with which to investigate experience-based decision processes is reinforcement learning (RL). The RL computational framework encompasses the ensemble of cognitive mechanisms and behaviours involved in the acquisition of knowledge through trial and error. More specifically, models of RL propose computational solutions to a broad range of value maximization problems (such as foraging, navigation and economic decision-making) by decomposing these problems into their elementary building blocks (action, state and rewards). The empirical and experimental foundations of this formal understanding of the learning process span multiple disciplines, from neuroscience to artificial intelligence11. The lack of cross-cultural investigation of human RL processes is particularly problematic, given that RL is a pervasive cognitive process, with many important implications for mental health, education and economics12–15.

A full list of affiliations appears at the end of the paper. e-mail: hernan.anllo@cri-paris.org; stefano.palminteri@ens.fr
Despite its general adaptive value (seek rewards and avoid punishments), laboratory-based research has illustrated that RL processes in many circumstances deviate from statistical and normative standpoints16,17. Determining whether such RL biases are cultural artefacts, or rather stable components of human decision processes, can provide additional fundamental hints to enable understanding of the computational constraints of bounded rationality18,19. Among several features characterizing human RL, the notion of outcome (or reward) context dependence has recently risen to prominence16. More specifically, a series of studies conducted mostly with Western, educated, industrialized, rich and democratic (WEIRD) populations20 have shown that in many RL tasks participants encode outcomes (that is, rewards and punishments) in a context-dependent manner21–24. While there may not be a consensus yet concerning the exact functional form of such context dependency, the available findings seem to favour the idea that subjective outcomes are calculated relatively, following some form of range normalization25–27. Such context dependence-induced rescaling of subjective outcomes is often interpreted as a consequence of efficient information coding in the human brain28,29. According to this hypothesis, this feature can be understood as the result of fundamental neurocomputational constraints akin to those observable in perceptual decision-making30–32. In accordance with this proposal of outcome context dependence in RL as a form of efficient coding, multiple studies using similar tasks in different species have consistently found evidence of range value adaptation, which suggests we may be looking at a general principle of brain functioning33,34. One well-known consequence of context dependence in RL is that, in some cases, it can induce suboptimal decisions25–27. 
In particular learning contexts, individuals mistakenly attribute higher subjective values to objectively worse options because of how these options are appraised in relation to the local reward distribution, resulting in choices that fail to maximize reward. If indeed there exists such a fundamental computational constraint in the human brain, the behavioural signatures of context dependence should be a stable feature of decision-making, and thus persist across different populations and cultures. In the present work, we set out to test this hypothesis by leveraging a task capable of eliciting context-dependent RL behaviours and deploying it across 11 countries of remarkably different socioeconomic and cultural makeup (Argentina, Iran, Russia, Japan, China, India, Israel, Chile, Morocco, France and the United States). This allowed us to test the cross-cultural stability of context-dependent value encoding in human RL, and thus assess its putative role as a core computational process of experience-based decision-making. In addition, we also administered to our participants a description-based decision-making task that included decision contexts overlapping with those presented in the RL task. The rationale behind this second task was twofold. First, it allowed us to determine the extent to which choice behaviour measured in the RL task can be explained by risk aversion, estimated using standard procedures in behavioural economics (that is, using lottery tasks). Second, it gave us the opportunity to compare the variability of experience- and description-based decision-making processes across countries. Our results indicate a remarkable similarity in how context dependence affects decisions from experience and generates suboptimal choice across countries, consistent with the idea that it may derive from deep and conserved constraints on cognition.
Our results also showed that risk aversion inferred from the description-based lottery task could not account for these effects. Interestingly, description-based decisions were also found to be highly variable across countries, further confirming the functional dissociation between the behaviour elicited by the two modalities6,7,35. Exploratory analyses using independent socioeconomic, cultural and cognitive measures taken from our samples further showed that the origin of cross-country differences in description-based decisions is multifactorial, as was previously found for risk and other cognitive domains5,36,37. Overall, our results suggest that reinforcement (experience-based) decision processes are much more culturally stable than description-based ones and have important implications for theories of bounded rationality18,19. We conclude this work by discussing the possible implications of these results for the current implementation of policies and interventions aimed at countering the economic and social burden of biased decision-making worldwide.

Results

Behavioural protocol

Our behavioural protocol consisted of an RL (that is, experience-based) task, in the form of a previously validated two-armed bandit task26, followed by a description-based decision-making task consisting of choices between lotteries (Fig. 1a). Both decision-making tasks were preceded by dedicated instructions and a short training session and followed by a series of questionnaires directed at obtaining information on participants’ socioeconomic, cultural and cognitive features, as well as general demographics (Supplementary Fig. 1). In a two-armed bandit task, participants make trial-by-trial choices between two possible options (which would be conceptualized as lotteries, following the traditional nomenclature in economics).
Each option has a given probability of providing a certain reward, and participants’ choices can affect reward maximization. Crucially, they initially ignore the value of each option, but, as trials advance, participants progressively accumulate feedback information and can learn an experiential notion of the value of each option. In the present work, our bandit task design and implementation reproduced that of Bavard et al.26. Thus, the RL task consisted of two phases: a learning phase and a transfer phase. During the learning phase, participants were presented with eight abstract icon cues, each representing a lottery of non-disclosed expected value, paired in four stable decision contexts. In the learning phase, each decision context featured only two possible outcomes: either 10/0 points or 1/0 points. The outcomes were probabilistic (75 or 25%). For convenience, contexts were labelled by taking into account the difference in expected value between the most and least rewarding options (that is, the expected value-maximizing (correct) and value-minimizing (incorrect) options) (Fig. 1b). In the ensuing transfer phase, these same eight lotteries were rearranged into new decision contexts (as was previously done in similar designs for humans and birds22,26,33,34,38). In addition to the change in decision contexts, the key difference between the learning and transfer phases was that during the learning phase participants were presented with complete feedback whereas in the transfer phase no feedback was provided, so that choices could only be based on values learned during the learning phase (Fig. 1b). Finally, we conducted an additional task, which we identified as the lottery task (Fig. 1c). There, the values of the options were explicitly disclosed, as the abstract cues were replaced by cue cards informing reward magnitudes and probabilities in an explicit numerical manner. 
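The difference in expected value (∆EV) used to label each decision context follows directly from the magnitudes and probabilities of the paired options. The short Python sketch below recomputes these labels; the function and variable names are illustrative assumptions, and the transfer-phase pairings are taken from the context labels shown in Fig. 2 rather than from the task code itself.

```python
# Illustrative sketch: recompute the Delta-EV labels of the decision contexts.
# Names are hypothetical; magnitudes are in points, probabilities in [0, 1].

def expected_value(magnitude, probability):
    """Expected value of a lottery paying `magnitude` with `probability`, else 0."""
    return magnitude * probability

def delta_ev(option_a, option_b):
    """Absolute difference in expected value between two (magnitude, probability) options."""
    return abs(expected_value(*option_a) - expected_value(*option_b))

# Learning phase: same magnitude within a context, 75% versus 25%.
LEARNING_CONTEXTS = {
    "10 vs 10": ((10, 0.75), (10, 0.25)),
    "1 vs 1": ((1, 0.75), (1, 0.25)),
}

# Transfer phase: the same lotteries rearranged into new pairs (as labelled in Fig. 2).
TRANSFER_CONTEXTS = {
    "10 (75%) vs 1 (25%)": ((10, 0.75), (1, 0.25)),
    "10 (75%) vs 1 (75%)": ((10, 0.75), (1, 0.75)),
    "10 (25%) vs 1 (25%)": ((10, 0.25), (1, 0.25)),
    "10 (25%) vs 1 (75%)": ((10, 0.25), (1, 0.75)),
}

if __name__ == "__main__":
    for phase in (LEARNING_CONTEXTS, TRANSFER_CONTEXTS):
        for label, (a, b) in phase.items():
            print(f"{label}: delta EV = {delta_ev(a, b)}")
```

Note that in the last transfer pairing the 10-point option has the higher expected value (2.5 versus 0.75) even though it was the worse option in its original learning context.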
The lottery task featured the same decision contexts as those used in the transfer phase plus four additional contexts designed to better assess risk preferences. These last contexts consisted of choices comparing varying probabilities of winning 10 points (100, 75, 50 and 25%) against the certainty of winning 1 point.

The present work consists of a direct cross-cultural extension of the hypotheses and analytical pipeline already presented in Bavard et al.26 (tightly linked to previous studies from our laboratory and collaborating teams22). In that previous study, the authors used the same outcome measures and a computational approach largely overlapping with the present one. Thus, while we understand the rationale behind preregistration, we posit that in this particular case its absence is counterbalanced by the coherence between the existing published analytical pipelines and the present one. Of note, analysis of preferences and choices in the lottery task (a novelty of this study) followed the same logic as that of the RL task. Finally, analyses on the possible correlations between socioeconomic/cultural factors and outcome measures were explicitly defined as exploratory (as no specific hypothesis was proposed).

Fig. 1 | Behavioural protocol and sample. a, Outline of the experimental design, including training, the RL task, the lottery task and questionnaires. b, Probabilities (P) and magnitudes (M) of each of the lotteries for the learning and transfer phases of the RL task, together with the differences in expected values (∆EV) between options for each local decision context. Complete feedback was provided during the learning phase (factual and counterfactual feedback), whereas no feedback was provided during the transfer phase. c, Probabilities and magnitudes of each of the lotteries for the lottery task, together with the differences in expected values between options for each local decision context. No feedback was provided. d, Geographical locations of the participating countries. Dots represent the cities where data collection was conducted (that is, New Jersey, Haifa, Tokyo, Paris, Santiago de Chile, Buenos Aires, Moscow, Tehran, Beijing, Rabat and Chennai), colour coded as a function of each country’s HDI score (see left panel of e). e, Country macrometric characteristics, including HDI scores (left) and cultural distance between each country, India and the United States (right).

Population demographics

Our main goal was to test the replicability of context dependence in RL across countries (while disentangling it from risk aversion, as it is standardly assessed in behavioural economics using lottery-based tasks). Thus, our final sample included 11 countries (United States, Israel, Japan, France, Chile, Argentina, Russia, Iran, China, Morocco and India), covering a total of five continents and ten languages (Fig. 1d). Country selection was aimed at portraying a gradual spread across the United Nations’ Human Development Index (HDI)39. This coefficient is built with many metrics, such as gross domestic product, industrialization, mean education level, income inequality and liberty indexes (Fig. 1e, left). To assess the cultural spread of the selected countries, we used the 1981–2014 dataset of Muthukrishna and colleagues’ cultural distance metric40 to estimate the cultural difference between each of the selected countries with respect to the United States and India, which represented the highest and lowest HDI values in our sample (Fig. 1e, right).

To ensure that our samples would adequately represent the culture of the country to which they belonged, inclusion criteria required that participants: (1) had the target country nationality; (2) resided in the target country; (3) had completed at least the full basic education cycle in the target country; and (4) spoke the country’s official language as their native language. These criteria were assessed for each participant during a video meeting before launching the experiment. The meeting, task instructions and questionnaires were delivered in each country’s official language by local researchers. Additionally, to confirm the diversity of the sample beyond country macrometrics, participants completed individual questionnaires on socioeconomic status41, individualistic/collectivistic tendencies42 and centrality of religiosity in their social environment43, as well as a cognitive reflection test44 (see Methods for a detailed description of each metric).

Table 1 | Demographic and sociocultural metrics and sample sizes

| Metric | United States | Israel | Japan | France | Chile | Argentina | Russia | Iran | China | Morocco | India | All | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| n (initial) | 51 | 58 | 55 | 58 | 59 | 51 | 58 | 60 | 53 | 56 | 64 | 623 | – |
| Exclusions: completion issues | 0 | 7 | 3 | 3 | 5 | 1 | 7 | 6 | 1 | 2 | 8 | 43 | – |
| Exclusions: rollout issues | 1 | 1 | 2 | 1 | 0 | 0 | 1 | 5 | 3 | 3 | 2 | 19 | – |
| n (final) | 50 | 50 | 50 | 54 | 54 | 50 | 50 | 49 | 49 | 51 | 54 | 561 | – |
| Mean (s.d.) age (years) | 26.5 (4.2) | 26 (2.9) | 20.6 (1.7) | 28.9 (5.7) | 22.5 (2.2) | 22.5 (3.6) | 26.3 (4.1) | 27 (5.4) | 23.4 (2.8) | 21.8 (2.9) | 23.1 (4.9) | 24.4 (4.6) | <0.0001 |
| Gender (% female) | 74 | 70 | 58 | 67 | 65 | 72 | 50 | 65 | 49 | 47 | 53 | 60.9 | 0.99 |
| University education (%) | 95a | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | – | |
| HDI (2019) | 0.926 | 0.919 | 0.919 | 0.901 | 0.851 | 0.845 | 0.824 | 0.783 | 0.761 | 0.686 | 0.645 | – | |
| Cultural distance: from United States | – | 0.1060 | 0.1222 | 0.1195 | 0.0627 | 0.0638 | 0.1369 | 0.0959 | 0.1618 | 0.1573 | 0.0845 | – | – |
| Cultural distance: from India | 0.0845 | 0.1454 | 0.1200 | 0.2811 | 0.0491 | 0.0525 | 0.0814 | 0.0669 | 0.1474 | 0.0975 | – | – | – |
| Socioeconomic status (mean (s.d.)): childhood | 3.9 (0.3) | 4.8 (0.3) | 6.1 (0.2) | 4.8 (0.2) | 5.9 (0.3) | 6.1 (0.2) | 4.3 (0.3) | 5.1 (0.3) | 4.2 (0.3) | 4.6 (0.3) | 5.2 (0.3) | – | <0.0001 |
| Socioeconomic status (mean (s.d.)): adulthood | 3.9 (0.3) | 3.5 (0.2) | 5.7 (0.3) | 3.9 (0.3) | 4.0 (0.2) | 4.9 (0.2) | 4.2 (0.2) | 5.2 (0.3) | 4.8 (0.3) | 3.8 (0.3) | 5.1 (0.3) | – | <0.0001 |
| Socioeconomic status (mean (s.d.)): social hierarchy | 5.4 (0.3) | 6.1 (0.2) | 7.0 (0.2) | 5.9 (0.2) | 6.7 (0.2) | 6.6 (0.2) | 5.5 (0.2) | 6.8 (0.2) | 5.2 (0.3) | 6.1 (0.3) | 6.0 (0.3) | – | <0.0001 |
| Individualistic and collectivistic tendencies (mean (s.d.)): vertical individualistic | 18 (0.9) | 22 (0.8) | 23 (0.8) | 18 (1.0) | 17 (1.0) | 18 (1.0) | 21 (0.7) | 23 (0.9) | 26 (0.8) | 25 (0.9) | 24 (0.7) | – | <0.0001 |
| Individualistic and collectivistic tendencies (mean (s.d.)): horizontal individualistic | 29 (0.6) | 28 (0.7) | 25 (0.8) | 28 (0.6) | 29 (0.6) | 27 (0.7) | 26 (0.7) | 31 (0.6) | 28 (0.8) | 31 (0.5) | 28 (0.8) | – | <0.0001 |
| Individualistic and collectivistic tendencies (mean (s.d.)): vertical collectivistic | 24 (1.0) | 26 (0.7) | 21 (0.9) | 24 (0.7) | 25 (0.9) | 19 (0.7) | 19 (0.7) | 21 (1.0) | 27 (0.7) | 30 (0.8) | 30 (0.9) | – | <0.0001 |
| Individualistic and collectivistic tendencies (mean (s.d.)): horizontal collectivistic | 28 (0.8) | 28 (0.8) | 26 (0.9) | 27 (0.6) | 31 (0.6) | 31 (0.5) | 25 (0.7) | 25 (0.7) | 26 (0.7) | 30 (0.7) | 28 (0.8) | – | <0.0001 |
| Centrality of religiosity in social environment (mean (s.d.)): experiences | 8.0 (0.6) | 6.8 (0.5) | 5.8 (0.4) | 6.8 (0.5) | 7.5 (0.5) | 5.7 (0.4) | 6.4 (0.4) | 9.1 (0.5) | 4.0 (0.3) | 13.0 (0.4) | 11.0 (0.5) | – | <0.0001 |
| Centrality of religiosity in social environment (mean (s.d.)): role in ideology | 9.9 (0.6) | 9.0 (0.6) | 8.0 (0.4) | 8.9 (0.6) | 10.5 (0.4) | 7.1 (0.5) | 8.3 (0.6) | 11.0 (0.6) | 5.3 (0.4) | 14.0 (0.3) | 11.0 (0.5) | – | <0.0001 |
| Centrality of religiosity in social environment (mean (s.d.)): religious thought | 7.6 (0.4) | 6.4 (0.4) | 7.7 (0.3) | 8.2 (0.5) | 6.6 (0.4) | 7.5 (0.4) | 7.3 (0.4) | 7.8 (0.4) | 5.8 (0.4) | 11.0 (0.4) | 9.1 (0.5) | – | <0.0001 |
| Centrality of religiosity in social environment (mean (s.d.)): private life | 7.8 (0.4) | 6.0 (0.5) | 7.3 (0.4) | 6.9 (0.5) | 7.6 (0.5) | 5.9 (0.4) | 6.1 (0.4) | 7.7 (0.6) | 5.4 (0.4) | 12.0 (0.5) | 10.0 (0.5) | – | <0.0001 |
| Centrality of religiosity in social environment (mean (s.d.)): public life | 5.6 (0.5) | 6.2 (0.5) | 5.7 (0.3) | 5.9 (0.4) | 5.0 (0.4) | 4.7 (0.4) | 4.4 (0.3) | 5.4 (0.4) | 4.1 (0.3) | 9.2 (0.5) | 8.6 (0.5) | – | <0.0001 |

a Of the 78% of US participants who chose to disclose their education level. P values were Bonferroni corrected for the comparisons presented in this table. P values were calculated by conducting separate linear and mixed-effects linear regressions, where the country variable was used as a predictor.

Sample sizes for each country were set using a power analysis based on the online results of Bavard et al.26 (n = 46 per country; see Methods). After exclusions (failure to complete the task, n = 43; troubleshooting/translation issues during task rollout, n = 19), the final sample comprised the remaining n = 561 participants (342 female; mean (s.d.) age = 24.4 (4.6) years; n = 51 on average per country). Separate linear regressions, using each of the demographic and sociocultural indexes as predictors of nationality, confirmed that country samples were significantly different in many respects.
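The sample-size bookkeeping reported in Table 1 (initial samples, the two kinds of exclusions and final samples) can be cross-checked with a few lines of Python. This is only a consistency check on the published counts; the variable names are illustrative assumptions.

```python
# Consistency check on the per-country counts from Table 1 (order: United States,
# Israel, Japan, France, Chile, Argentina, Russia, Iran, China, Morocco, India).
N_INITIAL = [51, 58, 55, 58, 59, 51, 58, 60, 53, 56, 64]
COMPLETION_ISSUES = [0, 7, 3, 3, 5, 1, 7, 6, 1, 2, 8]
ROLLOUT_ISSUES = [1, 1, 2, 1, 0, 0, 1, 5, 3, 3, 2]
N_FINAL = [50, 50, 50, 54, 54, 50, 50, 49, 49, 51, 54]

def final_sample(n_initial, completion, rollout):
    """Final n per country after removing both kinds of exclusions."""
    return [n - c - r for n, c, r in zip(n_initial, completion, rollout)]

if __name__ == "__main__":
    computed = final_sample(N_INITIAL, COMPLETION_ISSUES, ROLLOUT_ISSUES)
    print("per-country final n:", computed)
    print("total final n:", sum(computed))
    print("mean n per country:", sum(computed) / len(computed))
```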
A summary of these differences, demographic information, sample sizes and exclusions can be found in Table 1. Detailed results of the regressions can be found in Supplementary Table 1.

Experience-based RL task

First, we looked at performance in the RL task. We focused on correct responses (that is, the probability of picking the expected value-maximizing choice) as the behavioural dependent variable. The correct response rate was analysed separately in each RL phase (that is, learning and transfer), as a function of decision context (a within-participants variable) and country (a between-participants variable). We also compared the correct response rate against chance level (or indifference; P = 0.5) to assess learning and preferences. As in previous studies using the same or similar designs22,26, of particular relevance for the demonstration of outcome context dependence were: (1) the comparison of accuracies between the ∆EV = 5.0 and the ∆EV = 0.5 decision contexts in the learning phase (where the absence of a difference, that is, of a magnitude effect, is taken as a sign of relative value learning); and (2) the preference expressed in the ∆EV = 1.75 decision context of the transfer phase (where below-chance accuracy is taken as an indicator of context-dependent outcome encoding). The results showed that the average correct response rate for the learning phase was significantly different from the chance level of 0.5 for all countries and decision contexts (Fig. 2a), which confirmed that learning had occurred (pooled sample at ∆EV = 5.0: 0.8 ± 0.2; t(560) = 42; P < 0.001; d (95% confidence interval (CI)) = 1.8 (1.66, 1.92); for ∆EV = 0.5: 0.8 ± 0.2; t(560) = 38; P < 0.001; d = 1.6 (1.49, 1.74); see Supplementary Table 3 for model selection and Supplementary Table 4 for full regression results).
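For a one-sample test against chance, the t statistic and Cohen's d are linked by d = t/√n, which gives a quick way to read the effect sizes reported above. The sketch below assumes that d is computed as the mean deviation from chance divided by the sample standard deviation (the usual one-sample definition); the function names and the toy data are hypothetical.

```python
import math

def one_sample_t_and_d(values, mu=0.5):
    """One-sample t statistic and Cohen's d of `values` against the reference `mu`."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    t_stat = (mean - mu) / (sd / math.sqrt(n))
    d = (mean - mu) / sd
    return t_stat, d

def cohens_d_from_t(t_stat, n):
    """Recover the one-sample Cohen's d from a t statistic and the sample size."""
    return t_stat / math.sqrt(n)

if __name__ == "__main__":
    # Toy accuracies for three hypothetical participants (not data from the study).
    t_stat, d = one_sample_t_and_d([0.6, 0.8, 1.0])
    print(f"toy data: t = {t_stat:.3f}, d = {d:.3f}")
    # Reported learning-phase t statistics with n = 561 participants.
    for t_reported in (42, 38):
        print(f"t = {t_reported}: d = {cohens_d_from_t(t_reported, 561):.2f}")
```

With n = 561, the reported t values of 42 and 38 correspond to d of roughly 1.8 and 1.6, in line with the effect sizes given in the text.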
Although we found significant differences in aggregate performance between countries (country main effect: χ2(10) = 58; P < 0.001), learning and above-chance performance levels were observable in all samples and contexts (Supplementary Fig. 2). Importantly, we did not find statistical evidence for magnitude effects in any of our country samples, and learning performance remained consistently above chance for the ∆EV = 5.0 and ∆EV = 0.5 conditions in all samples (decision context main effect: χ2(1) = 2; P = 0.142; decision context × country interaction: χ2(10) = 12; P = 0.289). Furthermore, a corrected Akaike information criterion (AICc) weight ratio analysis of regression models fitted to our data also pointed to a lack of magnitude effect in our sample (that is, a model including decision context as a regressor was 0.01 times as likely to predict correct choices as the same model without it). As an additional index of relative evidence of one model over the other, Bayes factor computation strongly favoured the null model (BF < 0.001).

We then turned to analysis of the transfer phase (Fig. 2b). In this case, correct choice rates were strongly modulated across decision contexts (decision context main effect: χ2(3) = 326; P < 0.001). When assessing the statistical evidence in favour of a country effect, we only found marginal results (country main effect: χ2(10) = 18; P = 0.049; decision context × country interaction: χ2(30) = 41; P = 0.093). Upon further inspection, an AICc weight ratio analysis of regression models fitted to our data pointed to a lack of country effect within our sample (that is, a model including country as a regressor was 0.0003 times as likely to predict correct choices as the same model without it). As an additional index of relative evidence of one model over the other, a Bayes factor computation strongly favoured the null model (BF < 0.001; see also the Supplementary Information).
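The AICc weight ratios quoted above can be read through the standard Akaike-weight formula, in which each model's weight is proportional to exp(−∆AICc/2). The sketch below uses that textbook formula with made-up AICc values; it is not the authors' analysis code, and the function names are hypothetical.

```python
import math

def akaike_weights(aicc_values):
    """Akaike weights for a set of models, given their AICc values."""
    best = min(aicc_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(relative)
    return [r / total for r in relative]

def weight_ratio(aicc_full, aicc_reduced):
    """How many times as likely the full model is compared with the reduced one."""
    w_full, w_reduced = akaike_weights([aicc_full, aicc_reduced])
    return w_full / w_reduced

if __name__ == "__main__":
    # Hypothetical AICc values: the full model is worse by about 9.21 units.
    aicc_reduced = 1000.0
    aicc_full = aicc_reduced + 2 * math.log(100)
    print(f"weight ratio (full / reduced) = {weight_ratio(aicc_full, aicc_reduced):.4f}")
```

Under this formula, a ratio of 0.01 corresponds to the fuller model having an AICc about 9.2 units higher than the reduced model, and a ratio of 0.0003 to a gap of about 16.2 units.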
Thus, we concluded that this marginal result was not indicative of a significant country effect. Replicating previous findings, correct choice rates in the ∆EV = 7.25 and ∆EV = 6.75 decision contexts were well above the chance level, indicating that participants could successfully retrieve and generalize the values learned during the learning phase (for ∆EV = 7.25: 0.7 ± 0.3; t(560) = 15; P < 0.001; d = 0.6 (0.55, 0.73); for ∆EV = 6.75: 0.56 ± 0.4; t(560) = 3.5; P < 0.001; d = 0.15 (0.07, 0.23)). Crucially, however, accuracy in the ∆EV = 1.75 context was consistently below the chance level, indicative of suboptimal preferences induced by context dependence (pooled sample: 0.33 ± 0.3; t(560) = −12; P < 0.001; d = −0.5 (−0.6, −0.4)). This suboptimal behaviour was observable in every country (see individual per-country t-tests in Supplementary Table 5), with no significant differences between countries (Fig. 2e, left; see Supplementary Table 6 for post-hoc pairwise contrasts). These results replicated the same suboptimal response patterns for the ∆EV = 1.75 decision context already seen in a previous publication26, and were consistent with other previous findings showing evidence of context dependence13,22–24.

Chiefly, the observed behavioural signatures of outcome context dependence were cross-culturally stable in the RL task. Contrary to what a model encoding values on an absolute scale would have predicted, performance was not affected by the outcome magnitude during the learning phase: this constitutes a positive manifestation of context-dependent adaptive coding28. Additionally, preferences were globally below chance in the ∆EV = 1.75 condition.
Namely, a previously optimal option (EV = 0.75) was preferred to a previously suboptimal option (EV = 2.5), even though the latter had the higher expected value in the new decision context. This illustrated the already known negative side of outcome context dependence in the context of RL: suboptimal decisions may arise when options are extrapolated from their original context.

Description-based lottery task

We then analysed participants’ preferences in the description-based lottery task (Fig. 2c,d). We first considered choices in the decision contexts aimed at benchmarking risk preferences, where a sure small payoff (1 point) was presented against risky options with varying probabilities of delivering a bigger payoff (10 points). These four decision contexts allowed us to estimate risk preference, quantified as the decrease in expected value-maximizing choice rates as the probability for obtaining the larger payoff decreased (that is, the propensity to choose the objectively higher value option as the levels of risk for that option increased). The results showed a coherent effect of decision context on choice behaviour: as the involved risk increased, choice ratios for the objectively higher value offers decreased for all countries (pooled sample; for ∆EV = 9: 0.94 ± 0.1; t(560) = 60; P < 0.001; d = 2.6; for ∆EV = 6.5: 0.79 ± 0.2; t(560) = 23; P < 0.001; d = 1; for ∆EV = 4: 0.72 ± 0.3; t(560) = 16; P < 0.001; d = 1; for ∆EV = 1.5: 0.53 ± 0.4; t(560) = 2; P = 0.088; d = 0; decision context main effect: χ2(3) = 326; P < 0.001; see Supplementary Table 3 for model selection and Supplementary Table 4 for full regression results). Interestingly, although risk affected expected value maximization in all country samples, it did so differently across countries (country main effect: χ2(10) = 57; P < 0.001; country × decision context interaction: χ2(30) = 100; P < 0.001; see Supplementary Table 5 for per-country t-test analyses).
This indicated that preferences expressed in the description-based task were not cross-culturally stable, unlike behaviour observed in the RL task.

After verifying the presence of risk aversion in the benchmark decision contexts of the lottery task, we analysed preferences in the decision contexts homologous to those of the transfer phase in RL (Fig. 2d). This allowed us to directly compare experience- and description-based preferences. We focused mainly on the behaviour expressed for the ∆EV = 1.75 decision context, where a significant tendency towards suboptimal choices can be interpreted as a sign of context dependence in the RL task. Crucially, and contrary to RL behaviour, the results showed that in all countries the correct choice rate was significantly above chance for this decision context in the description-based task (pooled sample; for ∆EV = 7.25: 0.9 ± 0.1; t(560) = 58; P < 0.001; d = 2.4; for ∆EV = 6.75: 0.9 ± 0.1; t(560) = 51; P < 0.001; d = 2; for ∆EV = 2.25: 0.9 ± 0.1; t(560) = 47; P < 0.001; d = 2; for ∆EV = 1.75: 0.6 ± 0.4; t(560) = 9; P < 0.001; d = 0.4). Additionally, the ∆EV = 1.75 lottery context presented evidence of significant between-country differences that were absent in RL (Fig. 2e, right; country × decision context interaction: χ2(30) = 68; P < 0.001; see Supplementary Table 6 for post-hoc pairwise contrasts). To directly compare descriptive and experiential choices for the ∆EV = 1.75 context, we modelled preferences in this decision context by including an additional regressor (decision type; levels: RL and lottery). The results indicated a significant decision modality effect (χ2(1) = 216; P < 0.001) that confirmed the difference between the two tasks. Overall, the results from the lottery task illustrated two important points.
First, we were able to detect significant across-country behavioural differences in our sample. This excludes that absence of an effect in the RL task may be due to a general inability of our protocol to detect behavioural differences. Second, these findings showed that risk aversion, as inferred from preferences expressed in the lottery task, could not account for preferences in the RL task. This was specifically true for the key ∆EV = 1.75 decision context, where we observed a clear case of preference reversal when comparing the two decision modalities45.

[Fig. 2 image. Panels a–d plot P (EV maximizing) by decision context for each country (United States, Israel, Japan, France, Chile, Argentina, Russia, Iran, China, Morocco, India) and for all countries pooled. a, RL task (learning phase): 1 P (75%) vs 1 P (25%), ∆EV = 0.5; 10 P (75%) vs 10 P (25%), ∆EV = 5.0; plus the difference between the two contexts. b, RL task (transfer phase) and d, Lottery task (transfer contexts): 10 P (25%) vs 1 P (75%), ∆EV = 1.75; 10 P (25%) vs 1 P (25%), ∆EV = 2.25; 10 P (75%) vs 1 P (75%), ∆EV = 6.75; 10 P (75%) vs 1 P (25%), ∆EV = 7.25. c, Lottery task (risk benchmark): 10 P (100%) vs 1 P (100%), ∆EV = 9.00; 10 P (75%) vs 1 P (100%), ∆EV = 6.5; 10 P (50%) vs 1 P (100%), ∆EV = 4.00; 10 P (25%) vs 1 P (100%), ∆EV = 1.5. e, Country pairwise contrast (∆EV 1.75) for the RL task and lottery task, colour-coded by distance, with P < 0.05 contrasts marked.]

Fig. 2 | Behavioural results. a–d, Proportion of correct answers (that is, choices that maximize expected value) in the RL task (learning phase (a) and transfer phase (b)) and lottery task (benchmark of risk preferences (c) and transfer decision contexts (d)) for each individual country (dots) and the average of all countries (boxes) for each of the two (a) and four (b–d) task decision contexts. In a, the difference between the big (∆EV = 5.0) and the small (∆EV = 0.5) magnitude context is shown to the right. In c, the decision contexts were presented to estimate risk aversion. In d, the decision contexts were homologous to those of the transfer phase. e, Country pairwise contrasts for the ∆EV = 1.75 decision context. Shown are the Euclidean distances between the mean proportion of correct answers of each country during the RL task (left) and lottery task (right). The bars represent s.e.m. The midline of each box represents the mean of all countries. Bounds of boxes represent 95% confidence intervals of the mean. Red boxes represent a significant pairwise contrast. In a–d, correct choice rates were analysed independently for samples of the United States (n = 50), Israel (n = 50), Japan (n = 50), France (n = 54), Chile (n = 54), Argentina (n = 50), Russia (n = 50), Iran (n = 49), China (n = 49), Morocco (n = 51) and India (n = 54).

Computational results
To quantify the observed decision-making strategies in a systematic manner that encompassed all decision contexts across all tasks, we formalized choice behaviour using simple models built around the notion of subjective outcome scaling. This choice was motivated by the fact that this outcome-scaling process, described below, could satisfactorily and parsimoniously capture the behavioural consequences of both context-dependent outcome encoding (in the RL task) and decreasing marginal utility (in the lottery task). In both tasks, the subjective value of a given outcome or payoff was adjusted through the implementation of a free parameter (0 ≤ ν ≤ 1) as follows:

$$R_{\mathrm{scaled},t} = \begin{cases} 10\mathrm{p} \times \nu, & \text{if } R_{\mathrm{obj},t} = 10\mathrm{p} \\ R_{\mathrm{obj},t}, & \text{otherwise} \end{cases}$$

where Rscaled,t represents the scaled subjective outcome and Robj,t the objective unscaled outcome at trial t. For RL trials, we embedded the scaling process within a fully parameterized version of the standard Q-learning algorithm, where option-dependent Q values were learnt from the range-adapted reward term Rscaled. The algorithm also included free inverse temperature (β), forgetfulness (φ) and learning rate (α) parameters, inasmuch as the RL process consists of acquiring value from experience and subsequently storing that value in memory for value actualization and learning11. 
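As a rough illustration of how the scaling rule could sit inside a Q-learning update, the following Python sketch may help. It is not the authors' implementation: the delta-rule form, the update of both options under complete feedback, and the two-option softmax are assumptions based on standard Q-learning, all function names are hypothetical, and the forgetfulness parameter (φ) is omitted because its functional form is not given in this section.

```python
import math


def scale_outcome(r_obj, nu):
    """Apply the outcome-scaling rule: a 10-point outcome becomes 10 * nu,
    any other outcome (1 or 0 points) is left unchanged."""
    if not 0.0 <= nu <= 1.0:
        raise ValueError("nu must lie in [0, 1]")
    return 10.0 * nu if r_obj == 10 else float(r_obj)


def q_update(q, chosen, unchosen, r_chosen, r_unchosen, alpha, nu):
    """Assumed delta-rule update on scaled outcomes.

    Both options are updated because the learning phase displayed factual
    and counterfactual feedback. Returns a new dict; the input is not mutated.
    """
    new_q = dict(q)
    new_q[chosen] += alpha * (scale_outcome(r_chosen, nu) - q[chosen])
    new_q[unchosen] += alpha * (scale_outcome(r_unchosen, nu) - q[unchosen])
    return new_q


def softmax_choice_prob(q_first, q_second, beta):
    """Probability of picking the first option under a two-option softmax
    with inverse temperature beta (assumed choice rule)."""
    return 1.0 / (1.0 + math.exp(-beta * (q_first - q_second)))


if __name__ == "__main__":
    q = {"A": 0.0, "B": 0.0}
    # One trial: option A pays 10 points, option B pays 0.
    q = q_update(q, "A", "B", r_chosen=10, r_unchosen=0, alpha=0.5, nu=0.5)
    print(q, softmax_choice_prob(q["A"], q["B"], beta=2.0))
```

With ν = 0.1, a 10-point outcome is encoded as 1, so the large- and small-magnitude learning contexts yield the same learnt values, which is one way to read magnitude-invariant performance.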
For the lottery task trials, we formalized choice behaviour based on the subjective expected value that participants attributed to each choice as a function of its inherent risk, by multiplying Rscaled,t by reward probability (as is customarily done in standard linear utility models46). While we retained choice inverse temperature (β) for this instance of the model, no memory actualization or learning processes were expected to take place during lottery, which rendered φ and α unnecessary. We differentiated between scaling and inverse temperature in RL and lottery decision contexts by fitting specific parameters as νRL and βRL and νLOT and βLOT, respectively. We made sure that our fitting procedure allowed us to correctly recover the parameters in simulated datasets, as well as produce simulations that would closely replicate the observed behavioural data (see Supplementary Information for the procedure and results of the simulations and parameter recovery). Utilizing the same scaling parameter (ν) in both models was a crucial step in the formalization, as it allowed us to compare experiential and descriptive adaptation mechanisms in the same terms while integrating all of the possible decision contexts. We expected νRL to reflect context-dependent range value adaptation in the RL task and νLOT to capture marginally decreasing utility (and therefore risk aversion) in the lottery task. It follows that νRL was expected to remain invariant across country samples, confirming that relative value encoding occurred universally and independent of risk preferences. Conversely, we expected νLOT to differ significantly between countries, in line with the observed risk aversion behaviours for each country sample, and to be decorrelated from νRL. As shown in Fig. 3a, scaling patterns conformed to these hypotheses. First, we found minimal to no evidence for differences between countries in νRL (νRL ~ country; sum of squares (SS) = 0.98; degrees of freedom (d.f.) 
= 10; P = 0.066). We confirmed this lack of effect through AICc weight ratio analysis: we considered a full model including country as a predictor, and as null an identical model not including it. The results strongly disfavoured country as a relevant predictor of νRL in terms of information loss (that is, the full model having 0.23 times the strength of the null model). As an additional index of relative evidence of one model over the other, Bayes factor computation strongly favoured the null model (BF < 0.001). Second, evidence showed that νLOT differed significantly across country samples (νLOT ~ country; SS = 3; d.f. = 10; P = 0.004). Here, the AICc weight ratio strongly favoured the country effect model (the full model being 16.65 times stronger than the null model). Finally, as seen in Fig. 3b, between-country pairwise contrasts revealed significant differences in νLOT (see Supplementary Table 9 for post-hoc pairwise contrasts). Indeed, νLOT differed substantially across countries, from quite substantial risk aversion (median νLOT = 0.28 in the Chilean sample) to moderate to high risk aversion (median νLOT = 0.62 in the Israeli sample). Crucially, νLOT values were highly correlated with the risk aversion behavioural patterns previously observed in the ∆EV = 1.5 (R = 0.84 (95% CI = 0.81, 0.86) and P < 0.001) and ∆EV = 1.75 lottery trials (R = 0.64 (95% CI = 0.59, 0.69) and P < 0.001) and decorrelated from νRL (R = 0.08 (95% CI = 0, 0.16) and P = 0.235) (Supplementary Fig. 4 and Supplementary Table 7). In summary, our computational approach confirmed strong evidence for stable cross-country outcome context dependence in the RL task using a compact computational measure. A similar analysis performed in the lottery task confirmed that the preferences in the RL task could not be accounted for by risk aversion inferred from the lottery task. 
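To make the link between νLOT and lottery choices concrete, the sketch below computes the subjective expected value of each option as reward probability multiplied by the scaled outcome, as described for the lottery model. It is a simplified illustration only: it compares values deterministically and ignores the inverse temperature (βLOT), and the function names are hypothetical. The two ν values are the sample medians quoted above.

```python
def subjective_ev(points, prob, nu):
    """Subjective expected value: probability times the scaled outcome
    (10-point outcomes are scaled by nu, others are unchanged)."""
    scaled = 10.0 * nu if points == 10 else float(points)
    return prob * scaled


def prefers_ten_point_option(p_ten, p_one, nu):
    """True if the 10-point lottery has the higher subjective expected value."""
    return subjective_ev(10, p_ten, nu) > subjective_ev(1, p_one, nu)


def indifference_nu(p_ten, p_one):
    """Value of nu at which both options have equal subjective expected value."""
    return p_one / (10.0 * p_ten)


if __name__ == "__main__":
    # Delta-EV = 1.75 context: 10 points at 25% vs 1 point at 75%.
    print(indifference_nu(0.25, 0.75))
    # Delta-EV = 1.5 context: 10 points at 25% vs 1 point for certain.
    print(indifference_nu(0.25, 1.0))
    for nu in (0.28, 0.62):  # median nu_LOT values quoted in the text
        print(nu, prefers_ten_point_option(0.25, 0.75, nu))
```

Under these assumptions the indifference point is ν = 0.3 for the ∆EV = 1.75 context and ν = 0.4 for the ∆EV = 1.5 context, so a median of 0.28 falls on the side of preferring the 1-point option while 0.62 falls on the side of preferring the 10-point option.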
Crucially, these results also confirmed a difference in the stability of experience- and description-based processes across countries. To discard that the differences found in scaling between phases could be confounded by differences in task performance (that is, lack of learning or inattention), we reanalysed and refitted the data after excluding all participants who had less than 100% accuracy in choices involving fully dominated options in the lottery task (as seen in previous studies on economic preferences47,48). In such contexts (that is, ∆EV = 7.25 and ∆EV = 9), suboptimal choices can be ascribed to general inattention, or the use of task-irrelevant heuristics (for example, basing choices on a cue’s visual features and so on). These analyses, available in the Supplementary Information, confirmed that this strict elimination criterion improved overall performance (and resulted in less stochastic choices, as proxied by the increase of both βRL and βLOT). However, even after exclusion of these participants (n = 124; total remaining, n = 437), we were still able to replicate all of the behavioural and computational patterns of the results presented thus far (Supplementary Figs. 5–8).

Fig. 3 | Computational results. a, Values of the scaling free parameter estimated during the RL task (νRL) and lottery task (νLOT). b, Country pairwise contrasts for the scaling parameters. Shown are the Euclidean distances between the means of the scaling parameters of each country during the RL task (left) and lottery task (right). Translucent dots represent individual participants’ values. Dots with a bold outline represent the mean. Bars represent s.e.m. Red boxes represent a significant pairwise contrast. 
Drivers of risk aversion differences
Our main goal was to test whether the behavioural and computational signatures of context-dependent outcome encoding in RL would replicate across samples from different countries and cultural backgrounds, and whether or not said preferences would differ from those of a description-based task. Indeed, we found positive evidence showing that context dependence as captured in experience-based decision-making tasks is stable across the included countries and distinct from risk aversion in tasks from description. Importantly, we did not have any specific directional prediction on which cultural or socioeconomic factors would influence preferences in general (and more specifically, risk aversion in the lottery task). However, in an exploratory manner, we evaluated whether the cultural and socioeconomic metrics we had obtained characterized the differences in risk aversion between samples. We did so by producing separate linear regressions of the scaling (νRL and νLOT) and inverse temperature (βRL and βLOT) parameters against our country- and participant-level cultural, economic and cognitive metrics. The results of these exploratory analyses (Supplementary Table 12) showed that single-dimension subjective metrics did not significantly predict the values of the outcome-scaling parameters for either task. In contrast, country-level macrometrics composed of multiple dimensions (that is, HDI and cultural distance) did improve the models. This fell in line with previous findings on intercultural risk preferences, which show that individual differences rarely inform risk preferences, but country-level macrometric indexes are marginally better5,36,37. It should be noted, however, that even when significant the correlation magnitudes were considerably small. 
Nonetheless, it should be noted that cultural metrics generally predicted changes in νLOT, but not νRL, which was consistent with the robustness of RL biases to cultural factors, as well as the gap between experiential and descriptive choices found in our main results. Discussion As a phenomenon, culture has been defined as the ensemble of transmissible social and cognitive features that determine the common identity and way of life of a group of people49. Cross-cultural research usually focuses on identifying how said features can be organized in larger coherent constructs that act as cultural vectors, shaping preferences and behaviour50. Perhaps the most researched among these constructs is the collectivism versus individualism spectrum50, which scores tendencies to act at the behest of oneself versus the interests of the collective42. Other well-researched cultural constructs include the analytic versus holistic thought spectrum51 (object-focused reasoning versus context-focused reasoning) and tight versus loose normativity spectrum52 (strong versus lax enforcement of social norms). When it comes to studying decisions across different cultures, these broad indexes can be difficult to unify under a common theoretical and methodological framework, which leads to results not always being consistent51. However, despite some notable exceptions, evidence from the past two decades has shown that WEIRD countries broadly lean towards individualistic behaviour and analytic thought, while Eastern countries exhibit behaviours consistent with collectivism and holistic thought50. These cultural determinants have in turn been shown to shape several aspects of decision and choice behaviour, including risk preferences (for example, individualism positively correlates with loss aversion53), heuristics (for example, analytic populations are more thorough54) and reference point adaptation (for example, holistic populations adjust reference points more often55). 
In the present work, we sought to assess the cross-cultural stability of another recently discovered but well-documented feature of human behaviour: context-dependent RL. It is important to underscore that however robust, the vast majority of the results concerning context effects in human RL to date come from WEIRD samples16,21–26,56. This severely limited the interpretation of context-dependent outcome encoding as a fundamental building block of human RL. In this Article, we aimed to address this issue by showing evidence of the generalizability of outcome context dependence in samples from 11 countries of different sociocultural makeup. Outcome context dependence was evident both from behavioural signatures (that is, magnitude invariant performance in the learning phase and persistent suboptimal preferences in the transfer phase) and from analysis of the key parameter of our computational model (that is, νRL). In addition to our RL task, we also administered a description-based task featuring an overlapping set of decision contexts. This allowed us to demonstrate that risk aversion (as is standardly inferred in behavioural economics from lottery tasks) could not account for behavioural signatures of context dependence in the RL task (especially suboptimal preferences in the transfer phase). Furthermore, we have also shown that while experience-based processes and preferences were remarkably stable across the included countries, description-based processes were not. By replicating the finding of context-dependent RL outside the WEIRD space, our work shows that this cognitive feature is not likely to be a simple cultural artefact57,58. Of course, we acknowledge that our current sample is not diverse enough to argue for a definitive universality of contextual value encoding in RL. 
We also acknowledge that our samples may be neglecting within-country variations (some of the included countries contain within themselves very different ethnic and linguistic communities that we did not cover). However, the fact that our results would show this bias consistently throughout samples constitutes strong evidence in that direction, particularly since our samples were distinct enough to elicit between-country differences in description-based choices. Future research efforts seeking to extend the present findings should consider testing in rural versus urban population settings59 and across different social layers within the same societies2. Additionally, further replications should consider larger sample sizes, both to study more complex interactions and to disambiguate the status of near-significant effects present in the current study. The presence of context-dependent RL across such a diverse sample falls in line with numerous previous findings pointing to the reliability of the phenomenon. Multiple studies have shown the flexibility of context dependence38, its validity for non-binary outcomes24 and non-binary decision spaces60 and different temporal learning dynamics61. Furthermore, instances of context-dependent value learning have also been observed reliably in a wide range of non-human animals, as diverse as mammals, birds and insects34,62. The coincidence between our present cross-cultural results and the ample array of cross-species previous findings, reinforces the notion that RL processes may be largely hard coded and evolutionarily stable63. Indeed, despite the incidental generation of suboptimal preferences (for example, in the transfer phase), context-dependent value learning probably presents an overall adaptive value. Theoretical propositions suggest that the normativity of context-dependent value learning can be traced to at least two, not mutually exclusive sources. 
First, it is possible that outcome context dependence in RL may constitute just another manifestation of the adaptive coding phenomenon28,29. In adaptive coding theory, the neural representations of objective variables are transformed as a function of their underlying distribution as a means to adjust to neural constraints in information processing30,64,65. Second, it is also possible that context-dependent value learning serves the purpose of maximizing performance (that is, fitness) in many ecological foraging situations66. Namely, encoding the convenience of a choice with respect to its alternatives in context (that is, storing the result of a computation rather than all of its components) would be much less resource intensive and ecological than committing to memory large repertoires of absolute values dissociated from their contexts67. A crucial contribution of the present work is the analysis of behavioural performance in a description-based decision-making task featuring the same decision problems as in the transfer phase (in addition to other benchmark decision problems). This allowed us, first and foremost, to rule out the possibility that an absence of cross-cultural variation in context-dependent value learning could be merely due to our inability to detect any cross-cultural differences in choice behaviour in our sample. This was not the case, as we observed that behavioural risk preferences elicited during the lottery task were significantly different across countries. As with previous cross-cultural studies on decision-making, differences in lottery-elicited risk preferences were found to be multicausal5,36,37. Possible causes for this lack of clarity in the aetiology of risk preferences can be traced to the diversity of methods used to quantify risk aversion across studies and to the fact that most of the tested predictors evaluated so far have been shown to account for only small fractions of the total variance37. 
As stated, pinpointing the cultural drivers of differences in risk preferences across countries was beyond the scope of the present work. Given their effect size and exploratory nature, these results cannot be interpreted at the moment as anything more than avenues for future research. Still, our findings highlight the necessity of developing a unified strategy for quantifying risk preferences that may take into account the socioeconomic, demographic and cognitive characteristics of intercultural samples68. Importantly, the addition of a lottery task featuring decision contexts homologous to those of the RL task allowed us to directly compare experience- and description-based choice behaviour. This led us to show that in otherwise comparable decision contexts risk aversion as inferred from a standard lottery task does not explain preferences in the transfer phase of an RL task. This was particularly noteworthy for the ∆EV = 1.75 decision context, in which suboptimal choice preferences are customarily considered a hallmark of context dependence in value learning23,26,34. Indeed, in the present work, preference reversal in this context was observable for all countries during RL, and shown to be different from risk-driven choice behaviour, thus calling for an alternative explanation. These differences between the RL and lottery tasks, concerning both subjective outcome encoding and cross-cultural stability, were well recapitulated by our modelling approach. We devised a simple parsimonious outcome-scaling process that, fitted to both experiential and described versions of our decision problems, led to the emergence of two clearly distinguishable sets of values for the scaling parameter. It is important to underscore that, while for parsimony and commensurability purposes we modelled preferences in RL and lottery tasks with the same outcome-scaling model, this does not imply the assumption that both tasks share similar computational processes. 
Indeed, based on the present and other behavioural findings13,21,26 it is likely that these different value-scaling schemes arise from different underlying computations altogether: respectively, outcome range adaptation in RL and diminishing marginal utility in lottery (see the Supplementary Information for further consideration). It is nonetheless important to note that here we are not claiming that context-dependent valuation is exclusive to choices based on experience (or reinforcement). In fact, many contextual effects have been documented in descriptive choices (such as the decoy effect). Further studies should determine whether such effects of description-based choices are cross-culturally stable. The present results broadly fit within the larger framework of the experience–description gap by showing that preferences for the same decision contexts are strongly affected by the modality in which the problems are presented6,7,69. This begs the question of whether or not differences in probability weighting, which are robustly reported between experience- and description-based decisions, could explain the observed discrepancy, and more specifically the preference reversal in the ∆EV = 1.75 decision context8. Prima facie, the fact that the 1 point with 75% chance option would be preferred to the 10 points at 25% chance option is compatible with the traditional experienced-based pattern of underweighting rare events7,70. However, it should be noted that for the preference reversal to derive solely from different probability weightings it would require a probability distortion much larger than what has commonly been observed in experiments and meta-analyses to date8,71. 
Furthermore, the learning phase of our experience-based task featured complete feedback, a manipulation that makes feedback information independent from choice and thus decreases or even eliminates insufficient probability sampling (which is the traditional explanation for the classical probability weighting of experience-based choices). Finally, the underweighting of rare events would not explain the absence of a magnitude effect during the RL learning phase. Conversely, outcome context dependence does provide a satisfactory and parsimonious explanation for the observed choice patterns in both the learning and transfer phases. Finally, we offer some reflection on the implications of our findings for behavioural science-inspired interventions in policy-making. In recent years, the idea that descriptive models of behavioural decision-making should be used to inform better policies (top down) or for designing better decision architectures (bottom up) has gained traction72–74. In the long term, this approach may help to improve both individual and collective decision-making in domains where biases and suboptimal decision-making represent key bottlenecks (for example, issues such as choice of vaccination or behaviours favouring environmental protection). Historically, decision models in (behavioural) economics, nudging and behaviourally inspired policies have been based on description-based choice behaviour. Our results show that, compared with description-based processes, experience-based decision models are much more stable on a cross-cultural level, possibly capturing deep and preserved features of human cognition. We therefore believe that, especially if this pattern is confirmed and generalized to other tasks and processes, the present work calls for a better consideration of experience-based decision models in designing behavioural science-informed public policies in general. 
Methods

Participants
Recruitment was conducted locally, through the standard channels of each participating institution (for example, dedicated mailing lists, flyers and online advertisements). Sample size was determined through a power analysis based on the behavioural results of an online experiment by Bavard et al.26. For the ∆EV = 1.75 context of said experiment (blocked trials and complete feedback version), online participants reached a difference between choice rate and chance (0.5) of 0.27 ± 0.30 (mean ± s.d.). To obtain the same difference with a power of 0.95, the MATLAB function sampsizepwr indicated that 46 participants per country were needed. Samples were allowed to exceed this limit by up to 20%, to ensure that the desired power would be achieved regardless of potential participant exclusions. Exclusion criteria consisted of failure to complete the task (n = 43) and troubleshooting/translation issues during the online task rollout (n = 19). A remainder of n = 561 participants (342 female; mean (s.d.) age = 24.4 (4.6) years) comprised the final sample.

Ethics
The research was carried out following the principles and guidelines for human experimentation provided in the Declaration of Helsinki (1964; revised in 2013). This study belongs to a series of experiments approved by the INSERM Ethics Evaluation Committee (IRB00003888). 
Wherever needed, this ethical authorization was seconded with further authorizations at the local level at the behest of each participating institution (for Japan, the Waseda University Ethics Committee (2019357(1)); for the United States, Rutgers Institutional Review Board (IRB) (Pro2019000049); for Israel, the University of Haifa IRB (psychology ethics committee 038/20); for Russia, the Higher School of Economics Committee on Interuniversity Surveys and Ethical Assessment of Empirical Research; for India, the Memorandum of Understanding, with SRM University granting validity to French approval IRB00003888; and for China, the School of Psychological and Social Sciences at Peking University (approval number 2018-03-01)). All of the remaining collaborators did not need unique ethics approval because of compatibility between local requirements and the existing standards in France and other countries. All participants provided standardized written informed consent before inclusion.

Payment
To sustain motivation throughout the experiment, participants were given a bonus depending on the number of points won in each task. To ensure motivation would be even across countries, each participating institution calculated the average cost of a local university lunch (an inter-country average cost of €5.8 ± 2.82) and divided it by the total number of points to be potentially won throughout the experiment (that is, 1,275 points for a perfect run, with an average value of points of €0.0045 ± 0.002 and an average bonus reward of €5.4 ± 1.53). In addition to the bonus accrued through point accumulation, all participants received a flat participation rate equivalent to an additional student lunch (see Supplementary Table 2 for average bonuses in local currencies).

Behavioural task
There were two behavioural tasks: the RL task and the lottery task (Fig. 1a). 
The RL task was a direct reproduction of the probabilistic instrumental learning task performed in experiment 7 of Bavard et al.26. Participants were asked to choose on a trial basis between the undisclosed lotteries of different two-armed bandit problems, with the goal of maximizing overall reward. The lottery task consisted of a standard economic decision-making task, where participants had to choose on a trial basis between two lotteries of known expected value, again with the intention of maximizing overall reward. In the RL task, the lotteries for each decision context were represented by abstract stimuli (cues) taken from randomly generated identicons. Identicons were generated so that hue and saturation had similar values within the HSLUV colour scheme (www.hsluv.org). In the lottery task, cue cards displaying the reward and probability values for each option were used instead. For all tasks, each decision context was formed by two cues, one at each side of the screen, equidistant from the screen centre. Each trial consisted of a single decision context. Stimulus location was pseudo-randomized, so that every cue would appear an equal number of times on each side of the screen. In the RL task, participants had to complete a learning phase and then a transfer phase16,21–26,56. In the learning phase (Fig. 1b, top) cues appeared in four different fixed pairs (that is, decision contexts). Within pairs, each cue would lead to possible zero and non-zero outcomes with reciprocal probabilities (0.75/0.25 and 0.25/0.75). Each decision context featured only two possible outcomes: either 10/0 points or 1/0 points. Contexts were labelled by taking into account the difference in expected value between options (that is, two ∆EV = 5.0 and two ∆EV = 0.5 decision contexts). Once a choice was made by clicking on a cue, a fixed 500 ms delay ensued, after which factual and counterfactual choice feedback was displayed for 1,000 ms in the form of 10, 1 or 0 points cue cards. 
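The ∆EV labels follow directly from the outcome magnitudes and probabilities described above. The short Python sketch below recomputes them for the two learning-phase pair types and for the four recombined transfer-phase pairs, using the pairings shown in the Fig. 2 labels. The option names and helper functions are hypothetical and serve only as an arithmetic check.

```python
# Each option: (non-zero outcome in points, probability of obtaining it).
OPTIONS = {
    "10p@75%": (10, 0.75),
    "10p@25%": (10, 0.25),
    "1p@75%": (1, 0.75),
    "1p@25%": (1, 0.25),
}

# Learning phase: cues paired within the same magnitude.
LEARNING_PAIRS = [("10p@75%", "10p@25%"), ("1p@75%", "1p@25%")]

# Transfer phase: cues recombined across magnitudes (pairings as in Fig. 2).
TRANSFER_PAIRS = [
    ("10p@75%", "1p@25%"),
    ("10p@75%", "1p@75%"),
    ("10p@25%", "1p@25%"),
    ("10p@25%", "1p@75%"),
]


def expected_value(option):
    """Expected value in points: magnitude times probability."""
    magnitude, prob = OPTIONS[option]
    return magnitude * prob


def delta_ev(first, second):
    """Absolute difference in expected value between two options."""
    return abs(expected_value(first) - expected_value(second))


if __name__ == "__main__":
    for pair in LEARNING_PAIRS + TRANSFER_PAIRS:
        print(pair, delta_ev(*pair))
```

In the ∆EV = 1.75 pair, the 10-point option has the higher expected value (2.5 vs 0.75 points) even though it was the worse option in its learning context, which is why this pair is the key test of context dependence.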
After learning phase completion, the subtotal of points earned was displayed, together with its monetary equivalent in the local currency. In the transfer phase, cues were rearranged into four new pairs (∆EV = 7.25, ∆EV = 6.75, ∆EV = 2.25 and ∆EV = 1.75). Crucially, the probability of obtaining a specific outcome from each cue remained the same as in the learning phase (Fig. 1b, bottom). In the lottery task (Fig. 1c), participants had to choose between explicit cue cards, which were paired reproducing the four decision contexts of the transfer phase, and another four decision contexts comparing varying probabilities of winning 10 points (100, 75, 50 and 25%) versus the certainty of winning 1 point (∆EV = 9, ∆EV = 6.5, ∆EV = 4 and ∆EV = 1.5). Neither the transfer phase nor the lottery task presented any post-choice feedback: choices were followed by a fixed 500 ms delay interval, after which ‘???’ cue cards were displayed for 1,000 ms. Each decision context of the RL task (four in the learning phase and four in the transfer phase) was presented 30 times, for a total of 240 trials. Decision contexts of the lottery task (four reproducing transfer and four benchmarking risk aversion) were presented four times each, for a total of 32 trials. The presentation order of decision contexts was pseudo-randomized within each phase so that all trials of a given decision context would be clustered (that is, blocked stimuli presentation).

Questionnaires
After completing the behavioural experiment, participants were required to complete several psychometric and socioeconomic questionnaires. Socioeconomic questionnaires included the individualistic and collectivistic tendencies inventory42, the perceived socioeconomic status in childhood, adulthood and social hierarchy questionnaires41 and the centrality of religiosity questionnaire43. 
The sole goal of these questionnaires was to confirm that samples were socioculturally different from each other, as simply belonging to different countries may not have ensured a difference. Psychometric questionnaires were incorporated for purely exploratory purposes, including the Ten-Item Personality Inventory75 and the extended version of the Cognitive Reflection Test44. The order of the questionnaires, as well as the questions within each questionnaire, was randomized (see Supplementary Information for a technical description of each questionnaire and exploratory analyses). Country metrics Questionnaires gave us the opportunity to assess different dimensions of the socioeconomic and cultural makeup of each country sample from participants’ own subjective answers. To quantify the socioeconomic and cultural profile of each country sample in a macrometric way, we also incorporated into the analysis each country’s HDI score39, as well as the cultural distance between countries40. Both of these coefficients were computed by combining large numbers of economic, educational, political and psychosocial markers. Under the same rationale as the questionnaires, inclusion of these metrics was not hypothesis driven, but rather served to establish the differences between country samples and enable exploratory analyses (see Supplementary Information for details on metrics). Procedure Testing was conducted in a hybrid face-to-face/online format, where participants met a local experimenter for an online live debrief held in their local language to verify identity and cultural affiliation. After the interview, participants received a personalized link to a Gorilla server (www.gorilla.sc) where the experiment was hosted. After clicking on the link, participants were sent to a consent form, which they had to complete in order to access the experiment. The experiment started by providing written instructions on how to perform the task. 
It was explained to participants that they would have to choose between two different options over several trials, with the goal of maximizing overall point reward. They were told that they would have to make this decision without necessarily knowing the probability and magnitude of rewards for each option at first. Finally, it was explained at length that their final payoff would be affected by their choices, as rewards were convertible to actual currency. The possible outcomes in points (0, 1 and 10 points) were explicitly shown, as well as the points-to-currency conversion rate for their country (for example, 1 point = €0.005 in France; see Supplementary Table 2). Instructions were followed by a short training session of 12 trials, designed to familiarize participants with response modality. Participants could decide to repeat the training session up to twice before starting the experiment. After finishing the training session, participants had to complete the RL task (learning and transfer), lottery task and sociocultural questionnaires, in that order. Presenting the tasks in this particular order, rather than balancing task presentation order, was deemed preferable to prevent participants from entering the RL task with previous reward distribution knowledge that could affect performance76–78. Namely, the lottery task: (1) provided participants with the exact value of all choices and their range; and (2) revealed the configuration of all decision contexts in the transfer phase. Such information could push participants to implement rogue policies that would turn the RL task into a matching task (for example, actively searching for which implicit cue corresponds to which explicit cue). As an additional measure to prevent the use of alternative strategies, the existence of the transfer phase was not disclosed until the end of the learning phase. 
Crucially, before starting the transfer phase, participants were made explicitly aware of the fact that they would be presented with the same cues they had seen during the learning phase, but combined in different pairs. Before starting the lottery task, participants were shown an example of a cue card with its explicit reward probability and magnitude written on it and were again instructed to choose the option that they thought would maximize overall point reward. Following completion of the lottery task, participants had to answer all sociocultural and psychometric questionnaires. The order of the questionnaires, as well as the order of each item within the questionnaires, was randomized. Completing the full experiment, including consent and questionnaires, took approximately 25 min (average response time per trial: 1.46 ± 6.7 s; median: 0.96 s). Once finished with the experiment, participants were given a personalized completion code and were tasked with sending this code to the experimenter by email to signal completion and trigger payment. The online debrief, task instructions and questionnaires were all delivered in each country’s official language by local researchers.

Statistical analyses
All of the statistical analyses were performed and visualized using R79–81. The main dependent variable was the correct choice rate (that is, choices directed towards the option with the highest expected value). Statistical effects were assessed by phase, using generalized linear mixed-effect models with a random intercept per participant79, with decision context and country of sample as categorical predictors (that is, P(correct) ~ decision context × country + ε; see Supplementary Information for model selection). P values were computed through analysis of deviance (type II Wald χ2 test) and we report χ2, degrees of freedom and P values.
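To make the dependent variable concrete, the sketch below computes a correct choice rate (the share of choices directed towards the option with the higher expected value) for each country and decision context from trial-level records. The record layout, field names, country codes and toy data are assumptions made purely for illustration; the mixed-effects models themselves were fitted in R and are not reproduced here.

```python
from collections import defaultdict

def correct_choice_rates(trials):
    """Map (country, context) -> proportion of expected-value-maximizing choices."""
    counts = defaultdict(lambda: [0, 0])  # [n_correct, n_total] per cell
    for t in trials:
        # The correct option is the one with the higher expected value
        # (ties would default to "b"; the toy data below contain none).
        correct_option = "a" if t["ev_a"] > t["ev_b"] else "b"
        cell = counts[(t["country"], t["context"])]
        cell[0] += int(t["choice"] == correct_option)
        cell[1] += 1
    return {key: n_correct / n_total for key, (n_correct, n_total) in counts.items()}

def make_trials(country, context, ev_a, ev_b, choices):
    """Build hypothetical trial records from a string of choices such as 'abba'."""
    return [
        {"country": country, "context": context, "ev_a": ev_a, "ev_b": ev_b, "choice": c}
        for c in choices
    ]

# Toy records (hypothetical values, not study data).
toy_trials = (
    make_trials("FR", "dEV=1.75", 2.5, 0.75, "abba")
    + make_trials("JP", "dEV=7.25", 7.5, 0.25, "aaab")
)
rates = correct_choice_rates(toy_trials)
print(rates)
```

Aggregating by cell in this way mirrors the two categorical predictors of the model formula (decision context and country), although a mixed model works on the trial-level binary outcomes rather than on these averages.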
The proportion of variance explained per predictor was not reported because of how variance is partitioned in mixed models82. In cases where only one data point per participant was available (for example, differences in parameter values across countries), statistical significance was evaluated through standard linear models using country as a categorical predictor (for example, νRL ~ country). For those analyses, we report the F statistic, sum of squares, P value and Cohen’s f. Post-hoc contrasts were calculated with their respective confidence intervals, through estimated marginal means analysis, and P values were Benjamini–Hochberg corrected. In particular, whenever we had to assess whether choice rate performances were significantly different from chance, we performed additional t-tests against the chance level (0.5). In those cases, we report the t-statistic, P value and Cohen’s d to estimate effect size. The significance of associations between continuous quantities (for example, between parameter value and performance for a given decision context) was tested through correlation analysis, for which we report the t-statistic, degrees of freedom, P values and the R coefficient as the effect size. To assess evidence for the absence of an effect, we conducted AICc weight ratio analyses83,84 using a model containing the tested predictor (full) and its equivalent minus said predictor (null).

Computational analyses
The SCALING model was built around the notion of value scaling. Value scaling for both the RL and lottery tasks was arbitrated by the free parameter (ν) designed to capture value adaptation as follows:

$$R_{\mathrm{scaled},t} = \begin{cases} 10\mathrm{p} \times \nu, & \text{if } R_{\mathrm{obj},t} = 10\mathrm{p} \\ R_{\mathrm{obj},t}, & \text{otherwise} \end{cases}$$

where Rscaled,t represents the scaled objective reward Robj,t at trial t and 0 ≤ ν ≤ 1. For RL task trials, we used a simple Q-learning model11 to estimate for each choice context (or state) the expected reward (Q) of each option and pick the one that maximized this expected reward Q. At trial t, option values (for example, of the chosen option c) were updated according to the delta rule:

$$Q(c)_{t+1} = Q(c)_t + \alpha_c \times \left(R(c)_{\mathrm{scaled},t} - Q(c)_t\right)$$

where αc is the learning rate for the chosen option and the difference between Rscaled,t and Qt is the prediction error term. We then modelled participants’ choice behaviour using a softmax decision rule that yielded the probability that for a state s a participant would choose, say, option a over option b according to:

$$P(a)_t = \frac{1}{1 + e^{\beta \times \left(Q(b)_t - Q(a)_t\right)}}$$

where β is the inverse temperature parameter. Low inverse temperatures (β → 0) cause the action to be stochastically equiprobable. High inverse temperatures (β → +∞) result in choices deterministically determined by the difference between the Q values11. Our algorithm also included a forgetfulness parameter ϕ (0 ≤ ϕ ≤ 1) that allowed us to account for the possibility of forgetting the option values when moving from the learning to the transfer phases of the RL task. The Q values used to fit (and simulate) the transfer phase choices (Q(:)TRA) were calculated from the Q values of the learning phase (Q(:)LEA) as follows:

$$Q(:)_{\mathrm{TRA}} = Q(:)_{\mathrm{LEA}} \times \phi$$

For the lottery task, expected utilities (EU) of individual lotteries were calculated based on the described probability (P) of its non-zero outcome and the subjective rescaled rewards (Rscaled,t, calculated as for the learning task). For example, the expected utility of lottery a was calculated as follows:

$$EU(a) = R(a)_{\mathrm{scaled},t} \times P(a)$$

Choice probabilities were also instantiated through a softmax rule, as follows (probability of choosing lottery a over lottery b):

$$P(a)_t = \frac{1}{1 + e^{\beta \times \left(EU(b) - EU(a)\right)}}$$

Since the lottery task does not involve learning or memory processes, its model lacked any notion of learning rate and forgetting parameter. Both the RL and lottery models included a scaling parameter and an inverse temperature, which were fitted separately for each task (νRL and νLOT, and βRL and βLOT). Model parameters were fitted using maximum likelihood estimation with gradient descent, as implemented in MATLAB. Finally, in the ‘Alternative models’ section in Supplementary Information, we compare SCALING with three alternative computational models to rule out other possible interpretations of our data. These include the ABSOLUTE model, which encodes outcomes on an absolute scale independent of the decision context in which they were presented; the ABSOLUTE-RISK model, which rescales rewards for the RL task trials using the νLOT parameter fitted on lottery task trials, to evaluate whether risk aversion predicts preference reversal; and the NEGLECT model, which assumes that participants only learned the probabilities behind each choice, but ignored reward magnitude.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
Data for the present study are available for free (for non-commercial use only) from our OSF.io repository (https://osf.io/yebm9/?view_only=). Source data are provided with this paper.

Code availability
Main analysis scripts are available (for non-commercial use only) from the Human Reinforcement Learning Team GitHub repository (https://github.com/hrl-team/WEIRDbandit).

References
1. Ruggeri, K. et al. Replicating patterns of prospect theory for decision under risk. Nat. Hum. Behav. 4, 622–633 (2020).
2. Ruggeri, K. et al. The globalizability of temporal discounting. Nat. Hum. Behav. 6, 1386–1397 (2022).
3. Hallsson, B. G., Siebner, H. R. & Hulme, O. J. Fairness, fast and slow: a review of dual process models of fairness. Neurosci. Biobehav. Rev. 89, 49–60 (2018).
4. Kim, B., Sung, Y. S. & McClure, S. M. The neural basis of cultural differences in delay discounting. Phil. Trans. R. Soc. B 367, 650–656 (2012).
5. Rieger, M. O., Wang, M. & Hens, T. Risk preferences around the world. Manag. Sci. 61, 637–648 (2013).
6. Garcia, B., Cerrotti, F. & Palminteri, S. The description–experience gap: a challenge for the neuroeconomics of decision-making under uncertainty. Phil. Trans. R. Soc. B 376, 20190665 (2021).
7. Hertwig, R. & Erev, I. The description–experience gap in risky choice. Trends Cogn. Sci. 13, 517–523 (2009).
8. Wulff, D. U., Mergenthaler-Canseco, M. & Hertwig, R. A meta-analytic review of two modes of learning and the description–experience gap. Psychol. Bull. 144, 140–176 (2018).
9. Niv, Y. Reinforcement learning in the brain. J. Math. Psychol. 53, 139–154 (2009).
10. Wimmer, G. E., Daw, N. D. & Shohamy, D. Generalization of value in reinforcement learning by humans. Eur. J. Neurosci. 35, 1092–1104 (2012).
11. Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction 2nd edn (MIT Press, 2018).
12. Frank, M. J., Seeberger, L. C. & O’Reilly, R. C. By carrot or by stick: cognitive reinforcement learning in parkinsonism. Science 306, 1940–1943 (2004).
13. Vandendriessche, H. et al. Contextual influence of reinforcement learning performance of depression: evidence for a negativity bias? Psychol. Med. 53, 4696–4706 (2022).
14. Plonsky, O., Roth, Y. & Erev, I. Underweighting of rare events in social interactions and its implications to the design of voluntary health applications. Judgm. Decis. Mak. 16, 267–289 (2021).
15. Ho, T. H., Camerer, C. F. & Chong, J.-K. Self-tuning experience weighted attraction learning in games. J. Econ. Theory 133, 177–198 (2007).
16. Palminteri, S. & Lebreton, M. Context-dependent outcome encoding in human reinforcement learning. Curr. Opin. Behav. Sci. 41, 144–151 (2021).
17. Palminteri, S. & Lebreton, M. The computational roots of positivity and confirmation biases in reinforcement learning. Trends Cogn. Sci. 26, 607–621 (2022).
18. Kahneman, D. Maps of bounded rationality: psychology for behavioural economics. Am. Econ. Rev. 93, 1449–1475 (2003).
19. Todd, P. M. & Gigerenzer, G. Bounding rationality to the world. J. Econ. Psychol. 24, 143–165 (2003).
20. Henrich, J., Heine, S. & Norenzayan, A. The weirdest people in the world? Behav. Brain Sci. 33, 61–83 (2010).
21. Palminteri, S. et al. Contextual modulation of value signals in reward and punishment learning. Nat. Commun. 6, 8096 (2015).
22. Bavard, S. et al. Reference-point centering and range-adaptation enhance human reinforcement learning at the cost of irrational preferences. Nat. Commun. 9, 4503 (2018).
23. Klein, T., Ullsperger, M. & Jocham, G. Learning relative values in the striatum induces violations of normative decision making. Nat. Commun. 8, 16033 (2017).
24. Hayes, W. M. & Wedell, D. H. Reinforcement learning in and out of context: the effects of attentional focus. J. Exp. Psychol. Learn. Mem. Cogn. 49, 1193–1217 (2023).
25. Juechems, K. & Summerfield, C. Where does value come from? Trends Cogn. Sci. 23, 836–850 (2019).
26. Bavard, S., Rustichini, A. & Palminteri, S. Two sides of the same coin: beneficial and detrimental consequences of range adaptation in human reinforcement learning. Sci. Adv. 7, eabe0340 (2021).
27. Hayes, W. M. & Wedell, D. H. Testing models of context-dependent outcome encoding in reinforcement learning. Cognition 230, 105280 (2023).
28. Rustichini, A., Soukupova, M. & Palminteri, S. Adaptive coding is optimal in reinforcement learning. SSRN https://doi.org/10.2139/ssrn.4320894 (2023).
29. Padoa-Schioppa, C. & Rustichini, A. Rational attention and adaptive coding: a puzzle and a solution. Am. Econ. Rev. 104, 507–513 (2014).
30. Fairhall, A. et al. Efficiency and ambiguity in an adaptive neural code. Nature 412, 787–792 (2001).
31. Sato, T. et al. An excitatory basis for divisive normalization in visual cortex. Nat. Neurosci. 19, 568–570 (2016).
32. Carandini, M. & Heeger, D. J. Summation and division by neurons in primate visual cortex. Science 264, 1333–1336 (1994).
33. Freidin, E. & Kacelnik, A. Rational choice, context dependence, and the value of information in European starlings (Sturnus vulgaris). Science 334, 1000–1002 (2011).
34. Pompilio, L. & Kacelnik, A. Context-dependent utility overrides absolute memory as a determinant of choice. Proc. Natl Acad. Sci. USA 107, 508–512 (2010).
35. Garcia, B. Experiential values are underweighted in decisions involving symbolic options. Nat. Hum. Behav. 7, 611–626 (2023).
36. Gandelman, N. & Hernández-Murillo, R. Risk aversion at the country level. Fed. Res. Bank St. Louis Rev. 97, 53–66 (2015).
37. L’Haridon, O. & Vieider, F. All over the map: a worldwide comparison of risk preferences. Quant. Econ. 10, 185–215 (2019).
38. Juechems, K., Altun, T., Hira, R. & Jarvstad, A. Human value learning and representation reflect rational adaptation to task demands. Nat. Hum. Behav. 6, 1268–1279 (2022).
39. Human Development Report 2020: The Next Frontier: Human Development and the Anthropocene (United Nations Development Programme, 2020).
40. Muthukrishna, M. et al. Beyond Western, educated, industrial, rich, and democratic (WEIRD) psychology: measuring and mapping scales of cultural and psychological distance. Psychol. Sci. 31, 678–701 (2020).
41. Griskevicius, V. et al. When the economy falters, do people spend or save? Responses to resource scarcity depend on childhood environments. Psychol. Sci. 24, 197–205 (2013).
42. Triandis, H. C. & Gelfand, M. J. Converging measurement of horizontal and vertical individualism and collectivism. J. Pers. Soc. Psychol. 74, 118–128 (1998).
43. Huber, S. & Huber, O. The centrality of religiosity scale (CRS). Religions 3, 710–724 (2012).
44. Toplak, M. E., West, R. F. & Stanovich, K. E. Assessing miserly information processing: an expansion of the cognitive reflection test. Think. Reason. 20, 147–168 (2014).
45. Lichtenstein, S. & Slovic, P. The Construction of Preference (Cambridge Univ. Press, 2006).
46. Cartwright, E. Behavioural Economics (Routledge, 2018).
47. Alós-Ferrer, C. et al. Preference reversals: time and again. J. Risk Uncertain. 52, 65–97 (2016).
48. Alós-Ferrer, C. & Granic, G. D. Does choice change preferences? An incentivized test of the mere choice effect. Exp. Econ. 26, 499–521 (2023).
49. Smith, S. Cultural Anthropology (Allyn and Bacon, 1997).
50. Yates, F. & de Oliveira, S. Culture and decision making. Organ. Behav. Hum. Decis. Process. 136, 106–118 (2016).
51. Choi, I., Choi, J. A. & Norenzayan, A. in Blackwell Handbook of Judgment and Decision Making (eds Koehler, D. J. & Harvey, N.) 504–524 (Blackwell Publishing, 2004).
52. Gelfand, M. J. et al. Differences between tight and loose cultures: a 33-nation study. Science 332, 1100–1104 (2011).
53. Kitayama, S. & Cohen, D. Handbook of Cultural Psychology 2nd edn (Guilford Press, 2018).
54. Yates, J. F. et al. Indecisiveness and culture: Incidence, values, and thoroughness. J. Cross Cult. Psychol. 41, 428–444 (2010).
55. Arkes, H. R., Hirshleifer, D., Jiang, D. & Lim, S. S. A cross-cultural study of reference point adaptation: evidence from China, Korea, and the US. Organ. Behav. Hum. Decis. Process. 112, 99–111 (2010).
56. Spektor, M. & Seidler, H. Violations of economic rationality due to irrelevant information during learning in decision from experience. Judgm. Decis. Mak. 17, 425–448 (2022).
57. Barrett, H. C. Towards a cognitive science of the human: cross-cultural approaches and their urgency. Trends Cogn. Sci. 24, 620–638 (2020).
58. Nielsen, M., Haun, D., Kartner, J. & Legare, C. H. The persistent sampling bias in developmental psychology: a call to action. J. Exp. Child Psychol. 162, 31–38 (2017).
59. Linnell, K. J. & Caparos, S. Urbanisation, the arousal system, and covert and overt attentional selection. Curr. Opin. Psychol. 32, 100–104 (2020).
60. Bavard, S. & Palminteri, S. The functional form of value normalization in human reinforcement learning. eLife 12, e83891 (2023).
61. Hayes, W. M. & Wedell, D. Effects of blocked versus interleaved training on relative value learning. Psychon. Bull. Rev. 30, 1895–1907 (2023).
62. Solvi, C. et al. Bumblebees retrieve only the ordinal ranking of foraging options when comparing memories obtained in distinct settings. eLife 11, e78525 (2022).
63. Kacelnik, A., Vasconcelos, M. & Monteiro, T. Testing cognitive models of decision-making: selected studies with starlings. Anim. Cogn. 26, 117–127 (2023).
64. Rangel, A. & Clithero, J. A. Value normalization in decision making: theory and evidence. Curr. Opin. Neurobiol. 22, 970–981 (2012).
65. Louie, K. & Glimcher, P. W. Efficient coding and the neural representation of value. Ann. NY Acad. Sci. 1251, 13–32 (2012).
66. McNamara, J. M., Trimmer, P. C. & Houston, A. I. The ecological rationality of state-dependent valuation. Psychol. Rev. 119, 114–119 (2012).
67. Hunter, L. E. & Daw, N. D. Context-sensitive valuation and learning. Curr. Opin. Behav. Sci. 41, 122–127 (2021).
68. Frey, R., Pedroni, A., Mata, R., Rieskamp, J. & Hertwig, R. Risk preference shares the psychometric structure of major psychological traits. Sci. Adv. 3, e1701381 (2017).
69. Madan, C. R., Ludvig, E. A. & Spetch, M. L. Comparative inspiration: from puzzles with pigeons to novel discoveries with humans in risky choice. Behav. Process. 160, 10–19 (2019).
70. Zilker, V. & Pachur, T. Nonlinear probability weighting can reflect attentional biases in sequential sampling. Psychol. Rev. 129, 949–975 (2022).
71. Erev, I. et al. Choice prediction competition: choices from experience and from description. J. Behav. Decis. Mak. 23, 15–47 (2010).
72. Thaler, R. H. & Sunstein, C. R. Libertarian Paternalism Is Not an Oxymoron Public Law and Legal Theory Working Paper No. 43 (Univ. Chicago, 2003).
73. Grüne-Yanoff, T., Marchionni, C. & Feufel, M. Toward a framework for selecting behavioural policies: how to choose between boosts and nudges. Econ. Philos. 34, 243–266 (2018).
74. Brown, P., Cameron, L., Wilkinson, M. & Taylor, D. in The Handbook of Behaviour Change (eds Hagger, M. et al.) 617–631 (Cambridge Univ. Press, 2020).
75. Gosling, S. D., Rentfrow, P. J. & Swann, W. B. Jr. A very brief measure of the big five personality domains. J. Res. Pers. 37, 504–528 (2003).
76. Doll, B. B., Jacobs, W. J., Sanfey, A. G. & Frank, M. J. Instructional control of reinforcement learning: a behavioral and neurocomputational investigation. Brain Res. 1299, 74–94 (2009).
77. Li, J., Delgado, M. & Phelps, E. How instructed knowledge modulates the neural systems of reward learning. Proc. Natl Acad. Sci. USA 108, 55–60 (2010).
78. Wang, Z. & Taylor, M. E. Interactive reinforcement learning with dynamic reuse of prior knowledge from human and agent demonstrations. In Proc. 28th International Joint Conference on Artificial Intelligence (IJCAI'19) 3820–3827 (AAAI Press, 2019).
79. Bates, D., Maechler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).
80. R Core Development Team R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2014).
81. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2009).
82. Rights, J. D. & Sterba, S. K. Quantifying explained variance in multilevel models: an integrative framework for defining R-squared measures. Psychol. Methods 24, 309–338 (2019).
83. Burnham, K. P., Anderson, D. R. & Huyvaert, K. P. AIC model selection and multimodel inference in behavioral ecology: some background, observations, and comparisons. Behav. Ecol. Sociobiol. 65, 23–35 (2011).
84. Wagenmakers, E. J. & Farrell, S. AIC model selection using Akaike weights. Psychon. Bull. Rev. 11, 192–196 (2004).

Acknowledgements
We thank a number of colleagues and peers, including the members of the Human Reinforcement Learning laboratory and all of the senior researchers who provided feedback during the multiple conference presentations in which this work was featured. We also thank Waseda University and the École Normale Supérieure Department of Cognitive Studies for aiding us with the many logistical obstacles that we had to overcome in order to kickstart this study during the thick of the COVID-19 pandemic. We especially thank all of the participants who kindly contributed their time to make this study a reality. S.P. is supported by the European Research Council under the European Union’s Horizon 2020 research and innovation programme (RaReMem: 101043804), the Agence Nationale de la Recherche (CogFinAgent: ANR-21-CE23-0002-02; RELATIVE: ANR-21-CE37-0008-01; RANGE: ANR-21-CE28-0024-01) and the Alexander von Humboldt-Stiftung. O.Z., D.K. and A.S. were supported by the Basic Research Program at the National Research University Higher School of Economics (HSE University). U.H. and M.C. were supported by the Israel Science Foundation (1532/20). K.W. was supported by JSPS KAKENHI (22H00090) and JST Moonshot Research and Development (JPMJMS2012). A.B.K., M.G. and D.B. were supported by the National Institute on Drug Abuse (R01DA053282 and R01DA054201 to A.B.K.). J.N. was supported by the James S. McDonnell Foundation 21st Century Science Initiative in Understanding Human Cognition—Scholar Award (#220020334) and by a Sponsored Research Agreement between Meta and Fundación Universidad Torcuato Di Tella (#INB2376941). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

Author contributions
H.A. is the lead author and researcher responsible for study design, coordination and management between teams, data management and collection and analysis, visualization and writing of the paper. S.P. was the main senior supervisor, who worked hand in hand with H.A. on every aspect of this work, including collaboration management, design, hypothesis development, supervision of the analysis, interpretation of the results, visualization and writing. S.B. was the main author behind the original design that this study replicated and contributed greatly to ensuring that our design indeed reproduced theirs. F.B., D.B., F.C., M.C., M.G., E.J.G., D.K., M.K., G.L., M.S., J.Y. and O.Z. reviewed and supported the design of the experiment and its hypotheses. They also took charge of translation and deployment of the experiment in each of their countries, collected data locally and revised the paper. B.B., J.S.C., U.H., A.B.K., J.L., C.O., J.N., G.R., A.S.-J., A.S., B.S. and K.W. are senior supervisors who monitored the study locally, providing insight on the experimental design and commentary on the final version of the paper. In addition, K.W. provided essential scientific and logistical support in deploying the experiment worldwide.

Competing interests
The authors declare no competing interests.

Correspondence and requests for materials should be addressed to Hernán Anlló or Stefano Palminteri.
Peer review information Nature Human Behaviour thanks Thomas J. Faulkenberry and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Springer Nature or its licensor (e.g.
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Additional information
Supplementary information The online version contains supplementary material available at https://doi.org/10.1038/s41562-024-01894-9.

© The Author(s), under exclusive licence to Springer Nature Limited 2024, corrected publication 2024

1Human Reinforcement Learning Team, Laboratory of Cognitive and Computational Neuroscience, Paris, France.
2Faculty of Science and Engineering, Waseda University, Tokyo, Japan.
3Intercultural Cognitive Network, Paris, France.
4General Psychology Lab, Hamburg University, Hamburg, Germany.
5School of Collective Intelligence, Université Mohammed VI Polytechnique, Rabat, Morocco.
6Department of Psychiatry, University Behavioral Health Care and Brain Health Institute, Rutgers University—New Brunswick, Piscataway, NJ, USA.
7Department of Cognitive Sciences, University of Haifa, Haifa, Israel.
8Facultad de Psicología, Universidad del Desarrollo, Santiago de Chile, Chile.
9International Laboratory for Social Neurobiology, Institute for Cognitive Neuroscience, HSE University, Moscow, Russia.
10Laboratorio de Neurociencia, Universidad Torcuato Di Tella, Buenos Aires, Argentina.
11School of Psychological and Cognitive Sciences and Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, China.
12Centre for Cognition and Decision Making, Institute for Cognitive Neuroscience, HSE University, Moscow, Russia.
13Department of Psychology, Ludwig Maximilian University, Munich, Germany.
14IDG/McGovern Institute for Brain Research, Peking University, Beijing, China.
15Escuela de Negocios, Universidad Torcuato Di Tella, Buenos Aires, Argentina.
16Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina.
17School of Cognitive Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran.
18Department of Clinical Psychology, SRM Medical College Hospital and Research Centre, Chennai, India.
19Departement d’études cognitives, Ecole normale supérieure, PSL Research University, Paris, France.
e-mail: hernan.anllo@cri-paris.org; stefano.palminteri@ens.fr

nature portfolio | reporting summary
Corresponding author(s): Stefano Palminteri, Hernán Anlló
Last updated by author(s): Hernán Anlló (last edit 19/02/2024)

Reporting Summary
Nature Portfolio wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Portfolio policies, see our Editorial Policies and the Editorial Policy Checklist.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.
n/a Confirmed
- The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
- A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
- The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
- A description of all covariates tested
- A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
- A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
- For null hypothesis testing, the test statistic (e.g.
F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted Give P values as exact values whenever suitable. For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated Our web collection on statistics for biologists contains articles on many of the points above. Software and code Policy information about availability of computer code Data collection Data was collected through the Gorilla.sc platform for online experiments (BUILD 20201002 ) Data analysis Statistical analyses and figure production was handled with R (ver 4.2). Additional instances of model fitting and sample power estimations were handled with Matlab (ver 2022b) For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Portfolio guidelines for submitting code & software for further information. Data Policy information about availability of data All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: Data for the present study is available for free (for non-commercial use only), in our OSF.io repository (https://osf.io/yebm9/?view_only=). April 2023 - Accession codes, unique identifiers, or web links for publicly available datasets - A description of any restrictions on data availability - For clinical datasets or third party data, please ensure that the statement adheres to our policy 1 Policy information about studies with human participants or human data. 
See also policy information about sex, gender (identity/presentation), and sexual orientation and race, ethnicity and racism. Reporting on sex and gender Sex data was collected as part of a standard demographic survey, with the sole purpose of describing the tested population and ensuring our sample was balanced. No hypotheses concerning predictions based on sex or gender were made. Reporting on race, ethnicity, or Samples came from nationals of 11 different countries: USA, Iran, Israel, Chile, Argentina, Russia, Morocco, Japan, China, India, France. other socially relevant groupings Population characteristics Age (mean(SD), in years) = 24.4(4.6). Gender (% of females) = 60.9. Recruitment Recruitment was open, conducted through the standard university channels of each of the participating institutions (flyers, mailing lists). We are aware that this mode of recruitment implies an overrepresentation of university students in our sample to which we admit. This implies that generalizing the results obtained in this study to a broader population (e.g. rural populations, non-educated populations, elderly groups) cannot be done directly and requires further research targeting these groups exactly, Ethics oversight This protocol was approved by the INSERM Ethical Review Committee/IRB00003888 on 13 November 2018 nature portfolio | reporting summary Research involving human participants, their data, or biological material Note that full information on the approval of the study protocol must also be provided in the manuscript. Field-specific reporting Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 
Life sciences
Behavioural & social sciences
Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Behavioural & social sciences study design
All studies must disclose on these points even when the disclosure is negative.
Study description: This is an online behavioral experiment in which participants complete a reinforcement learning task and an economic decision-making task, and fill in questionnaires concerning their cultural traits (e.g. collectivistic vs. individualistic tendencies). Thus, this experiment gathers both quantitative and qualitative data.
Research sample: A full description of the research sample is available in Table 1 of the manuscript. Samples are representative of their country's cultural traits (but see limitations outlined in Recruitment, above). To ensure that our samples would adequately represent the culture of the country to which they belonged, inclusion criteria required that participants: (1) had the target country nationality, (2) resided in the target country, (3) had completed at least the full basic education cycle in the target country, and (4) spoke the country's official language as their native language. These criteria were assessed for each participant during a video meeting prior to launching the experiment. The meeting, task instructions, and questionnaires were delivered in each country's official language, by local researchers. The rationale for this sample was to seek individuals both within and outside the WEIRD sphere. The number of countries included was ultimately limited by our resources.
Sampling strategy: Sample size was determined through a power analysis based on the behavioral results of Bavard et al., 2021. In the critical decision context of said experiment (expected value difference between options = 1.75), online participants reached a difference between choice rate and chance (0.5) of 0.27 +/- 0.30 (mean +/- SD). To obtain the same difference with a power of 0.95, the MATLAB function "sampsizepwr.m" indicated that 46 participants per country were needed. Samples were allowed to exceed this number by up to 20%, to ensure the desired power would be achieved regardless of potential exclusions. These numbers were also verified through simulation-based power analyses. Sampling strategy was quota sampling (where each stratum was one of the selected countries). Quotas were determined by sample size.
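The general shape of the sample-size calculation described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: it assumes a normal-approximation formula for a two-sided one-sample test at alpha = 0.05, and the function names `required_n` and `quota_cap` are hypothetical. Only the mean difference (0.27), the SD (0.30), the power (0.95), the per-country requirement (46) and the 20% allowance come from the text.

```python
import math
from statistics import NormalDist


def required_n(mean_diff, sd, alpha=0.05, power=0.95):
    """Participants needed to detect mean_diff against zero.

    Normal-approximation formula for a two-sided one-sample test
    (an assumption; the summary does not state the test used):
    n = ((z_{1-alpha/2} + z_{power}) / d)^2, with d = mean_diff / sd.
    """
    z = NormalDist()                    # standard normal distribution
    d = mean_diff / sd                  # standardized effect size
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the target power
    return math.ceil(((z_alpha + z_power) / d) ** 2)


def quota_cap(n_required, allowance_pct=20):
    """Largest sample allowed when exceeding n_required by allowance_pct percent."""
    return n_required * (100 + allowance_pct) // 100


if __name__ == "__main__":
    # 0.27 and 0.30 are the mean and SD quoted in the text; alpha is assumed.
    print(required_n(0.27, 0.30))
    # 46 is the per-country requirement reported in the text.
    print(quota_cap(46))
```

Because the test type, significance level and number of tails passed to the MATLAB function are not stated here, this sketch should not be expected to reproduce the reported figure of 46. It only shows how the effect size and target power enter such a calculation, and how the 20% allowance translates into a per-country cap.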
Data collection: Testing was conducted in a hybrid face-to-face/online format, in which participants met a local experimenter for a live online debrief held in their local language to verify identity and cultural affiliation. After the interview, participants received a personalized link to a Gorilla server (www.gorilla.sc) where the experiment was hosted. After clicking on the link, participants were sent to a consent form, which they had to complete in order to access the experiment. Once finished with the experiment, participants were given a personalized completion code, and were asked to send this code to the experimenter by email to signal completion and trigger payment. The online debrief, task instructions and questionnaires were all delivered in each country's official language, by local researchers. Experimenters in charge of providing the links and conducting the testing were not blinded to condition, since the main condition was belonging to a given country, which made blinding impossible in the context of our task. Said experimenters were also aware that the main hypothesis consisted of comparing decision strategies across countries (without further detail). They did not mention this to participants, who were unaware that the same experiment was being conducted in several countries simultaneously.
Timing: Data collection started in 06/2020 and finished in 04/2022.
Data exclusions: The only exclusion criterion established was task completion. Across all countries, 43 participants failed to complete the task and were excluded. An additional 19 participants were excluded because of troubleshooting/translation issues during the online task rollout.
Non-participation: Reasons for not finishing the online task were not provided by the participants (who either left their sessions open and never finished the task, or closed it midway). This is a common occurrence in online experiments. A total of 43 participants out of 623 across all countries dropped out.
An additional
Randomization: Participants were allocated to groups as a function of the country they belonged to. This was done naturally, as separate recruitments took place locally in each of the 11 countries involved.