My research uses computational approaches to reinforcement learning and decision making to understand the function of cortical and subcortical circuits in emotion, cognition, and disease. To do this I take a multidisciplinary approach, combining computational modeling of behavior, neurophysiology, neuroimaging, and chemogenetic methodologies in rhesus macaques and humans. In my doctoral work at the University of Florida I used psychophysiology and neuroimaging in humans to discover how imagining or perceiving emotional scenarios engages motivational circuits. During my postdoctoral training at the National Institute of Mental Health, I shifted the focus of my work from human to animal models to understand in greater detail how reward circuitry contributes to reinforcement learning and decision making in macaques. This led to my discovery that the amygdala plays an active role in reinforcement learning beyond signaling of Pavlovian value or the coordination of emotional responses. This work has relied on my development of a novel Bayesian framework for investigating neural mechanisms of reversal learning. In my laboratory at Emory University and the Emory National Primate Center, I am harnessing my unique set of skills in neurophysiology, neuropsychology, and computational modeling to probe how the frontoamygdalar and frontostriatal circuits act together to facilitate explore-exploit decision making.
Beliefs – attitudes toward some state of the environment – guide action selection and should be robust to variability but sensitive to meaningful change. Beliefs about volatility (expectation of change) are associated with paranoia in humans, yet the brain regions responsible for volatility beliefs remain unknown. Orbitofrontal cortex (OFC) is central to adaptive behavior whereas magnocellular mediodorsal thalamus (MDmc) is essential for arbitrating between perceptions and action policies. We assessed belief updating in a three-choice probabilistic reversal-learning task following excitotoxic lesions of MDmc (n=3) or OFC (n=3) and compared performance with that of unoperated rhesus macaques (n=14). Computational analyses indicated that lesions of the MDmc, but not OFC, were associated with erratic switching behavior and heightened volatility belief (as in paranoia in humans). In contrast, OFC lesions were associated with increased lose-stay behavior and reward learning rates. Given t...
Policy search lets you discover rules and adapt behavior. In this issue of Neuron, Cohen et al. (2021) demonstrate that the dynamics of neurons in primate anterior cingulate cortex and putamen indicate when a correct policy is discovered and confidence in executing decisions under that policy.
Understanding the unique functions of different subregions of primate prefrontal cortex has been a longstanding goal in cognitive neuroscience. Yet, the anatomy and function of one of its largest subregions (the frontopolar cortex) remain enigmatic and underspecified. Our Society for Neuroscience minisymposium "Primate Frontopolar Cortex: From Circuits to Complex Behaviors" will comprise a range of new anatomic and functional approaches that have helped to clarify the basic circuit anatomy of the frontal pole, its functional involvement during performance of cognitively demanding behavioral paradigms in monkeys and humans, and its clinical potential as a target for noninvasive brain stimulation in patients with brain disorders. This review consolidates knowledge about the anatomy and connectivity of frontopolar cortex and provides an integrative summary of its function in primates. We aim to answer the question: what, if anything, does frontopolar cortex contribute to goal-directed cogn...
Flexible decision-making requires animals to forgo immediate rewards (exploitation) and try novel choice options (exploration) to discover if they are preferable to familiar alternatives. Using the same task and a partially observable Markov decision process (POMDP) model to quantify the value of choices, we first determined that the computational basis for managing explore-exploit tradeoffs is conserved across monkeys and humans. We then used fMRI to identify where in the human brain the immediate value of exploitative choices and relative uncertainty about the value of exploratory choices were encoded. Consistent with prior neurophysiological evidence in monkeys, we observed divergent encoding of reward value and uncertainty in prefrontal and parietal regions, including frontopolar cortex, and parallel encoding of these computations in motivational regions including the amygdala, ventral striatum, and orbitofrontal cortex. These results clarify the interplay between prefrontal and motivational circuits that supports adaptive explore-exploit decisions in humans and nonhuman primates.
Aberrant decision-making characterizes various pediatric psychopathologies; however, deliberative choice strategies have not been investigated. A transdiagnostic sample of 95 youths completed a child-friendly sequential sampling paradigm. Participants searched for the best offer by sampling a finite list of offers. Participants’ willingness to explore was measured as the number of offers sampled, and ideal task performance was modeled using a Markov decision-process model. As in previous findings in adults, youths explored more offers when lists were long compared with short, yet participants generally sampled fewer offers relative to model-estimated ideal performance. Searching deeper into the list was associated with choosing better price options. Analyses examining the main and interactive effects of transdiagnostic anxiety and irritability symptoms indicated a negative correlation between anxiety and task performance (p = .01, ηp² = .08). Findings suggest the need for more res...
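The idea of an ideal observer for a sequential sampling task can be illustrated with a small backward-induction sketch. This is not the model used in the study: it assumes, purely for illustration, that offers are independent draws from a uniform distribution on 0 to 1 (higher is better), that a rejected offer cannot be recalled, and that the last offer must be accepted. The function name `acceptance_thresholds` is hypothetical.

```python
def acceptance_thresholds(list_length):
    """Minimum offer worth accepting at each list position (illustrative only).

    Assumptions (not taken from the study): offers are i.i.d. Uniform(0, 1),
    rejected offers cannot be recalled, and the final offer must be accepted.
    """
    # Expected value of being forced to take the final offer.
    continuation = 0.5
    # Built from the last position backward; the last offer is always accepted.
    thresholds = [0.0]
    for _ in range(list_length - 1):
        # Accept an earlier offer only if it beats the value of continuing.
        thresholds.append(continuation)
        # E[max(x, c)] for x ~ Uniform(0, 1) equals (1 + c^2) / 2.
        continuation = (1.0 + continuation ** 2) / 2.0
    thresholds.reverse()
    return thresholds


if __name__ == "__main__":
    print(acceptance_thresholds(3))
    print(acceptance_thresholds(8))
```

Under these assumptions the threshold at the first position rises with list length, so an ideal agent rejects more early offers and samples deeper when lists are long, which is the qualitative pattern the abstract describes.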
Goal-directed behavior requires identifying objects in the environment that can satisfy internal needs and executing actions to obtain those objects. The current study examines ventral and dorsal corticostriatal circuits that support complementary aspects of goal-directed behavior. We analyze activity from the amygdala, ventral striatum, orbitofrontal cortex, and lateral prefrontal cortex (LPFC) while monkeys perform a three-armed bandit task. Information about chosen stimuli and their value is primarily encoded in the amygdala, ventral striatum, and orbitofrontal cortex, while the spatial information is primarily encoded in the LPFC. Before the options are presented, information about the to-be-chosen stimulus is represented in the amygdala, ventral striatum, and orbitofrontal cortex; at the time of choice, the information is passed to the LPFC to direct a saccade. Thus, learned value information specifying behavioral goals is maintained throughout the ventral corticostriatal circuit, and it is routed through the dorsal circuit at the time actions are selected.
Explore-exploit decisions require us to trade off the benefits of exploring unknown options to learn more about them, with exploiting known options, for immediate reward. Such decisions are ubiquitous in nature, but from a computational perspective, they are notoriously hard. There is therefore much interest in how humans and animals make these decisions and recently there has been an explosion of research in this area. Here we provide a biased and incomplete snapshot of this field focusing on the major finding that many organisms use two distinct strategies to solve the explore-exploit dilemma: a bias for information ('directed exploration') and the randomization of choice ('random exploration'). We review evidence for the existence of these strategies, their computational properties, their neural implementations, as well as how directed and random exploration vary over the lifespan. We conclude by highlighting open questions in this field that are ripe to both explore and ...
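The distinction between directed and random exploration can be made concrete with a minimal sketch. It is one common way to express the two strategies, not the formulation of any particular study reviewed here: an information bonus that shrinks with sampling stands in for directed exploration, and a softmax temperature stands in for random exploration. The function name `choice_probabilities` and the parameters `info_bonus` and `temperature` are hypothetical.

```python
import math


def choice_probabilities(means, counts, info_bonus=1.0, temperature=0.5):
    """Choice probabilities for a bandit with two exploration knobs (illustrative).

    means:  estimated reward of each option
    counts: number of times each option has been sampled
    """
    # Directed exploration: add a bonus that is largest for rarely sampled options.
    utilities = [m + info_bonus / math.sqrt(n + 1) for m, n in zip(means, counts)]
    # Random exploration: a higher temperature flattens the softmax,
    # making choices noisier regardless of what is known about each option.
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp((u - top) / temperature) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Two options with equal estimated reward; the first has never been sampled.
    print(choice_probabilities([0.5, 0.5], [0, 10]))
    # Same situation with the information bonus switched off.
    print(choice_probabilities([0.5, 0.5], [0, 10], info_bonus=0.0))
```

With equal estimated rewards, the information bonus shifts choice toward the unsampled option, while setting `info_bonus` to zero leaves the two options equally likely; raising `temperature` moves any set of probabilities toward uniform.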
Orbitofrontal cortex (OFC) predicts the consequences of our actions and updates our expectations based on experienced outcomes. In this issue of Neuron, Groman et al. (2019) precisely ablate pathways between the OFC, amygdala, and nucleus accumbens to reveal their separable contributions to reinforcement learning.
Few studies have used matched affective paradigms to compare humans and non-human primates. In monkeys with amygdala lesions and youth with anxiety disorders, we examined cross-species pupillary responses during a saccade-based, affective attentional capture task. Given evidence of enhanced amygdala function in anxiety, we hypothesized that opposite patterns would emerge in lesioned monkeys and anxious participants. A total of 53 unmedicated youths (27 anxious, 26 healthy) and 8 adult male rhesus monkeys (Macaca mulatta) completed matched behavioral paradigms. Four monkeys received bilateral excitotoxic amygdala lesions and four served as unoperated controls. Compared to healthy youth, anxious youth exhibited increased pupillary constriction in response to emotional and non-emotional distractors (F(1,48) = 6.28, P = 0.02, ηp² = 0.12). Pupillary response was significantly associated with anxiety symptom severity (F(1,48) = 5.59, P = 0.02, ηp² = 0.10). As hypothesized, lesioned monke...
Papers by Vincent D. Costa