Abstract
This chapter provides an introduction to Bayesian models and their application in cognitive neuroscience. The central feature of Bayesian models, as opposed to other classes of models, is that they represent the beliefs of an observer as probability distributions, allowing them to integrate information while taking its uncertainty into account. In this chapter, we will consider how the probabilistic nature of Bayesian models makes them particularly useful in cognitive neuroscience. We will consider two types of tasks in which we believe a Bayesian approach is useful: optimal integration of evidence from different sources, and the development of beliefs about the environment given limited information (such as during learning). We will develop some detailed examples of Bayesian models to give the reader a taste of how the models are constructed and what insights they may offer about participants’ behavior and brain activity.
Notes
1. In fact, the probability of each location given hearing and vision can only be obtained by multiplication if the noise underlying the two probability density functions is independent. In this case, we are talking about uncertainty that arises from noise in the sensory systems, which we can safely assume is independent between vision and hearing.
2. In all the examples and exercises given here, we obtain an approximate solution by evaluating p(x) for discrete values of \((\mu, \sigma^2)\). In the continuous case, Eq. 9.3 would become:
\(p(x) = \int d\mu \int d\sigma^2 \; p(x \mid x \sim N(\mu, \sigma^2)) \times p(x \sim N(\mu, \sigma^2) \mid x_1 \ldots x_i)\)
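This discrete approximation can be sketched in a few lines of code. The following is a minimal illustration in Python (the chapter's exercises use MATLAB); the grid ranges and the stand-in uniform posterior are invented for demonstration, since in practice the posterior over \((\mu, \sigma^2)\) would come from applying Bayes' rule to the observations:

```python
import numpy as np

# Discrete grids over the parameters (illustrative ranges and resolutions)
mu = np.linspace(0, 200, 201)
sig2 = np.linspace(10, 1000, 100)
MU, S2 = np.meshgrid(mu, sig2, indexing="ij")

def normal_pdf(x, m, s2):
    """Probability density of x under N(m, s2), evaluated elementwise."""
    return np.exp(-(x - m) ** 2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)

# Stand-in posterior over (mu, sigma^2); here simply uniform over the grid
posterior = np.ones_like(MU) / MU.size

# p(x) approximated by a double sum over the grid, in place of the double integral
x_new = 100.0
p_x = np.sum(normal_pdf(x_new, MU, S2) * posterior)
```

The double sum over the grid plays the role of the two integrals: each grid cell contributes the likelihood of the new observation under that parameter pair, weighted by the posterior probability of that pair.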
References
Bayes T (1763) An essay towards solving a problem in the doctrine of chances. Phil Trans 53:370–418
Behrens TE, Woolrich MW, Walton ME, Rushworth MF (2007) Learning the value of information in an uncertain world. Nat Neurosci 10:1214–1221
Chater N, Oaksford M (eds) (2008) The probabilistic mind: Prospects for Bayesian cognitive science. Oxford University Press, Oxford
Courville AC, Daw ND, Touretzky DS (2006) Bayesian theories of conditioning in a changing world. Trends Cogn Sci 10:294–300
Cox RT (1946) Probability, frequency and reasonable expectation. Am J Phys 14:1–13
Dayan P, Kakade S, Montague PR (2000) Learning and selective attention. Nat Neurosci 3:1218–1223
Ernst MO, Banks MS (2002) Humans integrate visual and haptic information in a statistically optimal fashion. Nature 415:429–433
Gregory R (1966) Eye and brain. Princeton University Press, Princeton
Jacobs RA (1999) Optimal integration of texture and motion cues to depth. Vis Res 39:3621–3629
Knight FH (1921) Risk, uncertainty and profit. Hart, Schaffner and Marx, Boston
Körding KP, Wolpert DM (2006) Bayesian decision theory in sensorimotor control. Trends Cogn Sci 10:319–326
MacKay DJC (2003) Information theory, inference, and learning algorithms. Cambridge University Press, Cambridge
Mars RB et al (2008) Trial-by-trial fluctuations in the event-related electroencephalogram reflect dynamic changes in the degree of surprise. J Neurosci 28:12539–12545
McGrayne SB (2011) The theory that would not die: How Bayes’ rule cracked the enigma code, hunted down Russian submarines, and emerged triumphant from two centuries of controversy. Yale University Press, New Haven
O’Reilly JX (2013) Making predictions in a changing world-inference, uncertainty, and learning. Front Neurosci 7:105
O’Reilly JX, Mars RB (2011) Computational neuroimaging: Localising Greek letters? Trends Cogn Sci 15:450
O’Reilly JX, Jbabdi S, Behrens TE (2012) How can a Bayesian approach inform neuroscience? Eur J Neurosci 35:1169–1179
Payzan-LeNestour E, Bossaerts P (2011) Risk, unexpected uncertainty, and estimation uncertainty: Bayesian learning in unstable settings. PLoS Comp Biol 7:e1001048
Posner MI, Snyder CRR, Davidson BJ (1980) Attention and the detection of signals. J Exp Psychol Gen 109:160–174
Real LA (1991) Animal choice behavior and the evolution of cognitive architecture. Science 253:980–986
Robbins H (1952) Some aspects of the sequential design of experiments. Bull Amer Math Soc 58:527–535
Segall MH, Campbell DT, Herskovits MJ (1963) Cultural differences in the perception of geometric illusions. Science 139:769–771
Silver N (2012) The signal and the noise: Why most predictions fail but some don’t. Penguin, New York
Appendices
Appendix A: One-Armed Bandit Model
We can write down the generative model by which the rewarded action (A or B) is selected as follows:
\(y_i \sim \mathrm{Bernoulli}(q_i)\), where \(y_i = 1\) if A is rewarded on trial i
\(q_i = \begin{cases} q_{i-1} & \text{if } J = 0 \\ q_{\text{new}} \sim U(0, 1) & \text{if } J = 1 \end{cases}\)
… where J is a binary variable which determines whether there was a jump in the value of q between trial i−1 and trial i; J itself is determined by
\(J \sim \mathrm{Bernoulli}(v)\)
… where v is the probability of a jump, e.g. if a jump occurs on average every 15 trials, \(v = 1/15\).
Then we can construct a Bayesian computer participant which infers the values of q and v on trial i as follows:
\(p(q_i, v \mid y_1 \ldots y_i) \propto p(y_i \mid q_i) \times p(q_i, v \mid y_1 \ldots y_{i-1})\)
where the prior at trial i, \(p(q_i, v \mid y_1 \ldots y_{i-1})\), is given by
\(p(q_i, v \mid y_1 \ldots y_{i-1}) = \int p(q_i \mid q_{i-1}, v)\, p(q_{i-1}, v \mid y_1 \ldots y_{i-1})\, dq_{i-1}\)
and the transition function \(p(q_i \mid q_{i-1}, v)\) is given by
\(p(q_i \mid q_{i-1}, v) = (1 - v)\,\delta(q_i - q_{i-1}) + v\,U(q_i; 0, 1)\)
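The inference scheme above can be sketched by evaluating the posterior on a discrete grid, in the spirit of the approximation used throughout the chapter. The following is a minimal illustration in Python (not the chapter's own code); the grid sizes and ranges and the function name `bandit_learner` are our own choices, and the transition step assumes that after a jump q is redrawn uniformly:

```python
import numpy as np

def bandit_learner(outcomes, n_q=50, n_v=20):
    """Grid-based Bayesian observer for the jumping Bernoulli bandit.

    outcomes: sequence of 1 (A rewarded) / 0 (B rewarded).
    Returns the q grid, v grid, and joint posterior over (q, v)
    after the final trial.
    """
    q = np.linspace(0.01, 0.99, n_q)          # reward-probability grid
    v = np.linspace(0.01, 0.5, n_v)           # jump-probability grid
    post = np.ones((n_q, n_v)) / (n_q * n_v)  # flat prior over (q, v)
    for y in outcomes:
        # Transition: with prob (1 - v) q stays put; with prob v it is
        # redrawn uniformly, spreading that mass evenly over the q grid
        stay = (1 - v)[None, :] * post
        jump = v[None, :] * post.sum(axis=0)[None, :] / n_q
        prior = stay + jump
        # Likelihood of this trial's outcome under each candidate q
        lik = np.where(y == 1, q, 1 - q)[:, None]
        post = prior * lik
        post /= post.sum()  # normalize (Bayes' rule up to a constant)
    return q, v, post
```

After a run of trials in which A is consistently rewarded, the posterior mass over q shifts toward high values, while the marginal over v reflects how stable the environment appears to the observer.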
Exercises
Exercise 1. Look at Fig. 9.5. How do you interpret the shadows on the surface shapes? Most people see the left-hand bumps as convex and the right-hand bumps as concave. Can you explain why that might be, using your Bayesian perspective? Hint: think of the use of priors.
Exercise 2. In Fig. 9.4 we saw some interesting behavior by a Bayesian learner. For instance, at point c the model very quickly changed its belief from an environment where left was rewarded to one where right was rewarded. One important goal of model-based cognitive neuroscience is to link these types of changes in probability distributions to observed neural phenomena. Can you come up with some phenomena that can be linked with changes in the model’s parameters?
Exercise 3. In this final exercise we will ask you to construct a simple Bayesian model. The solutions include example MATLAB code, although the approach itself is platform independent. Consider the following set of observations of apple positions x, which Isaac made in his garden:
i | x_i |
1 | 63 |
2 | 121 |
3 | 148 |
4 | 114 |
5 | 131 |
6 | 121 |
7 | 90 |
8 | 108 |
9 | 76 |
10 | 126 |
1. Find the mean, \(E(x)\), and variance, \(E(x^2) - E(x)^2\), of this set of observations using the formulae:
\(E(x) = \frac{1}{n}\sum_i x_i \qquad E(x^2) = \frac{1}{n}\sum_i x_i^2\)
2. If I tell you that these samples were drawn from a normal distribution, \(x \sim N(\mu, \sigma^2)\), how could you use Bayes’ theorem to find the mean and variance of x? Or, more precisely, how could you use Bayes’ theorem to estimate the parameters, μ and σ², of the normal distribution from which the samples are drawn?
Hint: remember from the text that we can write
\(p(x \sim N(\mu, \sigma^2) \mid x_1 \ldots x_n) \propto \prod_i p(x_i \mid x \sim N(\mu, \sigma^2)) \times p(x \sim N(\mu, \sigma^2))\)
… where the likelihood function, \(p(x_i \mid x \sim N(\mu, \sigma^2))\), is given by the standard probability density function for a normal distribution:
\(p(x_i \mid x \sim N(\mu, \sigma^2)) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)\)
… and you can assume:
1. The prior probability \(p(x \sim N(\mu, \sigma^2))\) is equal for all possible values of μ and σ²; and
2. The observations are independent samples such that \(p(x_i \cap x_j) = p(x_i)p(x_j)\) for all pairs of samples \(\{x_i, x_j\}\).
Now use MATLAB to work out the posterior probability for a range of pairs of parameter values μ and σ², and find the pair with the highest joint posterior probability. Because the prior is flat, this maximum a posteriori estimate is also the maximum likelihood estimate of μ and σ².
3. Can you adapt this model to process each data point sequentially, so that the posterior after observation i becomes the prior for observation i + 1?
Hint: remember from the text that (assuming the underlying values of μ and σ² cannot change between observations) we can write:
\(p(x \sim N(\mu, \sigma^2) \mid x_1 \ldots x_i) \propto p(x_i \mid x \sim N(\mu, \sigma^2)) \times p(x \sim N(\mu, \sigma^2) \mid x_1 \ldots x_{i-1})\)
… where the prior at trial i, \(p(x \sim N(\mu, \sigma^2) \mid x_1 \ldots x_{i-1})\), is the posterior from trial i−1.
4. If you have done parts 2 and 3 correctly, the final estimates of \(\{\mu, \sigma^2\}\) should be the same whether you process the data points sequentially or all at once. Why is this?
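One possible sketch of parts 2–4 follows, in Python rather than the MATLAB mentioned in the exercise. The grid ranges and resolutions are illustrative choices (picked by eye to bracket the data); the flat prior and independence of observations are the assumptions stated in part 2:

```python
import numpy as np

# Isaac's observations of apple positions
x = np.array([63, 121, 148, 114, 131, 121, 90, 108, 76, 126])

# Parameter grids over (mu, sigma^2); ranges are illustrative
mu = np.linspace(50, 160, 111)
sig2 = np.linspace(50, 2000, 196)
MU, S2 = np.meshgrid(mu, sig2, indexing="ij")

def loglik(xi):
    """Log normal pdf of one observation at every (mu, sigma^2) grid point."""
    return -0.5 * np.log(2 * np.pi * S2) - (xi - MU) ** 2 / (2 * S2)

# Part 2, batch: flat prior times the product of all likelihoods
# (sum of log-likelihoods, exponentiated stably, then normalized)
log_post = sum(loglik(xi) for xi in x)
batch = np.exp(log_post - log_post.max())
batch /= batch.sum()

# Part 3, sequential: the posterior after observation i
# becomes the prior for observation i + 1
seq = np.ones_like(MU) / MU.size  # flat prior
for xi in x:
    seq = seq * np.exp(loglik(xi))
    seq /= seq.sum()

# MAP estimates from the batch posterior
i, j = np.unravel_index(batch.argmax(), batch.shape)
mu_map, sig2_map = mu[i], sig2[j]
```

Because the likelihoods are simply multiplied together and normalization is applied at the end, the order of multiplication cannot matter, so the batch and sequential posteriors coincide, which is the answer to part 4.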
Further Reading
1. McGrayne [14] provides an historical overview of the development of Bayes’ theorem, its applications, and its gradual acceptance in the scientific community;
2. Daniel Wolpert’s TED talk (available at http://www.ted.com/talks/daniel_wolpert_the_real_reason_for_brains.html) provides a nice introduction to the consequences of noise in neural systems and the Bayesian way of dealing with it;
3. O’Reilly [15] discusses Bayesian approaches to dealing with changes in the environment and how different types of uncertainty are incorporated into Bayesian models and dealt with in the brain;
4. Nate Silver’s book The signal and the noise [23] contains some nice examples of how humans make predictions and establish beliefs. Silver advocates a Bayesian approach to dealing with uncertainty, which served him well in the 2012 US presidential election, when he correctly predicted for each of the 50 states whether it would be carried by Obama or Romney;
5. David MacKay’s book Information theory, inference, and learning algorithms [12] is a much more advanced treatment of many of the principles of Bayesian thinking. It is available for free at http://www.inference.phy.cam.ac.uk/itprnn/book.html.
Copyright information
© 2015 Springer Science+Business Media, LLC
O’Reilly, J., Mars, R. (2015). Bayesian Models in Cognitive Neuroscience: A Tutorial. In: Forstmann, B., Wagenmakers, EJ. (eds) An Introduction to Model-Based Cognitive Neuroscience. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2236-9_9
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2235-2
Online ISBN: 978-1-4939-2236-9