Search | arXiv e-print repository

On the Minimal Theory of Consciousness Implicit in Active Inference

Authors: Christopher J. Whyte, Andrew W. Corcoran, Jonathan Robinson, Ryan Smith, Rosalyn J. Moran, Thomas Parr, Karl J. Friston, Anil K. Seth, Jakob Hohwy

Abstract: The multifaceted nature of experience poses a challenge to the study of consciousness. Traditional neuroscientific approaches often concentrate on isolated facets, such as perceptual awareness or the global state of consciousness and construct a theory around the relevant empirical paradigms and findings. Theories of consciousness are, therefore, often difficult to compare; indeed, there might be little overlap in the phenomena such theories aim to explain. Here, we take a different approach: starting with active inference, a first principles framework for modelling behaviour as (approximate) Bayesian inference, and building up to a minimal theory of consciousness, which emerges from the shared features of computational models derived under active inference. We review a body of work applying active inference models to the study of consciousness and argue that there is implicit in all these models a small set of theoretical commitments that point to a minimal (and testable) theory of consciousness.

Submitted 9 October, 2024; originally announced October 2024.

arXiv:2409.20318 [pdf, other]

A Mathematical Perspective on Neurophenomenology

Authors: Lancelot Da Costa, Lars Sandved-Smith, Karl Friston, Maxwell J. D. Ramstead, Anil K. Seth

Abstract: In the context of consciousness studies, a key challenge is how to rigorously conceptualise first-person phenomenological descriptions of lived experience and their relation to third-person empirical measurements of the activity or dynamics of the brain and body. Since the 1990s, there has been a coordinated effort to explicitly combine first-person phenomenological methods, generating qualitative data, with neuroscientific techniques used to describe and quantify brain activity under the banner of "neurophenomenology". Here, we take on this challenge and develop an approach to neurophenomenology from a mathematical perspective. We harness recent advances in theoretical neuroscience and the physics of cognitive systems to mathematically conceptualise first-person experience and its correspondence with neural and behavioural dynamics. Throughout, we make the operating assumption that the content of first-person experience can be formalised as (or related to) a belief (i.e. a probability distribution) that encodes an organism's best guesses about the state of its external and internal world (e.g. body or brain) as well as its uncertainty. We mathematically characterise phenomenology, bringing to light a tool-set to quantify individual phenomenological differences and develop several hypotheses including on the metabolic cost of phenomenology and on the subjective experience of time.
We conceptualise the form of the generative passages between first- and third-person descriptions, and the mathematical apparatus that mutually constrains them, as well as future research directions. In summary, we formalise and characterise first-person subjective experience and its correspondence with third-person empirical measurements of brain and body, offering hypotheses for quantifying various aspects of phenomenology to be tested in future work.

Submitted 30 September, 2024; originally announced September 2024.

Comments: 15 pages, 4 figures

arXiv:2406.19201 [pdf, other]

Evolving reservoir computers reveals bidirectional coupling between predictive power and emergent dynamics

Authors: Hanna M. Tolle, Andrea I Luppi, Anil K. Seth, Pedro A. M. Mediano

Abstract: Biological neural networks can perform complex computations to predict their environment, far above the limited predictive capabilities of individual neurons. While conventional approaches to understanding these computations often focus on isolating the contributions of single neurons, here we argue that a deeper understanding requires considering emergent dynamics - dynamics that make the whole system "more than the sum of its parts". Specifically, we examine the relationship between prediction performance and emergence by leveraging recent quantitative metrics of emergence, derived from Partial Information Decomposition, and by modelling the prediction of environmental dynamics in a bio-inspired computational framework known as reservoir computing. Notably, we reveal a bidirectional coupling between prediction performance and emergence, which generalises across task environments and reservoir network topologies, and is recapitulated by three key results: 1) Optimising hyperparameters for performance enhances emergent dynamics, and vice versa; 2) Emergent dynamics represent a near sufficient criterion for prediction success in all task environments, and an almost necessary criterion in most environments; 3) Training reservoir computers on larger datasets results in stronger emergent dynamics, which contain task-relevant information crucial for performance. Overall, our study points to a pivotal role of emergence in facilitating environmental predictions in a bio-inspired computational architecture.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2204.02169 [pdf, other]

Hybrid Predictive Coding: Inferring, Fast and Slow

Authors: Alexander Tschantz, Beren Millidge, Anil K Seth, Christopher L Buckley

Abstract: Predictive coding is an influential model of cortical neural activity. It proposes that perceptual beliefs are furnished by sequentially minimising "prediction errors" - the differences between predicted and observed data. Implicit in this proposal is the idea that perception requires multiple cycles of neural activity. This is at odds with evidence that several aspects of visual perception - including complex forms of object recognition - arise from an initial "feedforward sweep" that occurs on fast timescales which preclude substantial recurrent activity. Here, we propose that the feedforward sweep can be understood as performing amortized inference and recurrent processing can be understood as performing iterative inference. We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner by describing both in terms of a dual optimization of a single objective function. We show that the resulting scheme can be implemented in a biologically plausible neural architecture that approximates Bayesian inference utilising local Hebbian update rules. We demonstrate that our hybrid predictive coding model combines the benefits of both amortized and iterative inference -- obtaining rapid and computationally cheap perceptual inference for familiar data while maintaining the context-sensitivity, precision, and sample efficiency of iterative inference schemes.
Moreover, we show how our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs using minimum computational expense. Hybrid predictive coding offers a new perspective on the functional relevance of the feedforward and recurrent activity observed during visual perception and offers novel insights into distinct aspects of visual phenomenology.

Submitted 6 April, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: 05/04/22 initial upload. 06/04/22 added acknowledgements section

arXiv:2111.06518 [pdf, other]

Greater than the parts: A review of the information decomposition approach to causal emergence

Authors: Pedro A. M. Mediano, Fernando E. Rosas, Andrea I. Luppi, Henrik J. Jensen, Anil K. Seth, Adam B. Barrett, Robin L. Carhart-Harris, Daniel Bor

Abstract: Emergence is a profound subject that straddles many scientific disciplines, including the formation of galaxies and how consciousness arises from the collective activity of neurons. Despite the broad interest that exists on this concept, the study of emergence has suffered from a lack of formalisms that could be used to guide discussions and advance theories. Here we summarise, elaborate on, and extend a recent formal theory of causal emergence based on information decomposition, which is quantifiable and amenable to empirical testing. This theory relates emergence with information about a system's temporal evolution that cannot be obtained from the parts of the system separately. This article provides an accessible but rigorous introduction to the framework, discussing the merits of the approach in various scenarios of interest. We also discuss several interpretation issues and potential misunderstandings, while highlighting the distinctive benefits of this formalism.

Submitted 11 November, 2021; originally announced November 2021.

Comments: 8 pages, 2 figures

arXiv:2109.13186 [pdf, other]

Towards an extended taxonomy of information dynamics via Integrated Information Decomposition

Authors: Pedro A. M. Mediano, Fernando E. Rosas, Andrea I Luppi, Robin L. Carhart-Harris, Daniel Bor, Anil K. Seth, Adam B. Barrett

Abstract: Complex systems, from the human brain to the global economy, are made of multiple elements that interact in such ways that the behaviour of the 'whole' often seems to be more than what is readily explainable in terms of the 'sum of the parts.' Our ability to understand and control these systems remains limited, one reason being that we still don't know how best to describe -- and quantify -- the higher-order dynamical interactions that characterise their complexity. To address this limitation, we combine principles from the theories of Information Decomposition and Integrated Information into what we call Integrated Information Decomposition, or $Φ$ID. $Φ$ID provides a comprehensive framework to reason about, evaluate, and understand the information dynamics of complex multivariate systems. $Φ$ID reveals the existence of previously unreported modes of collective information flow, providing tools to express well-known measures of information transfer and dynamical complexity as aggregates of these modes. Via computational and empirical examples, we demonstrate that $Φ$ID extends our explanatory power beyond traditional causal discovery methods -- with profound implications for the study of complex systems across disciplines.

Submitted 27 September, 2021; originally announced September 2021.

Comments: arXiv admin note: text overlap with arXiv:1909.02297

arXiv:2009.05359 [pdf, other]

Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain

Authors: Beren Millidge, Alexander Tschantz, Anil K Seth, Christopher L Buckley

Abstract: The backpropagation of error algorithm (backprop) has been instrumental in the recent success of deep learning. However, a key question remains as to whether backprop can be formulated in a manner suitable for implementation in neural circuitry. The primary challenge is to ensure that any candidate formulation uses only local information, rather than relying on global signals as in standard backprop. Recently several algorithms for approximating backprop using only local signals have been proposed. However, these algorithms typically impose other requirements which challenge biological plausibility: for example, requiring complex and precise connectivity schemes, or multiple sequential backwards phases with information being stored across phases. Here, we propose a novel algorithm, Activation Relaxation (AR), which is motivated by constructing the backpropagation gradient as the equilibrium point of a dynamical system. Our algorithm converges rapidly and robustly to the correct backpropagation gradients, requires only a single type of computational unit, utilises only a single parallel backwards relaxation phase, and can operate on arbitrary computation graphs. We illustrate these properties by training deep neural networks on visual classification tasks, and describe simplifications to the algorithm which remove further obstacles to neurobiological implementation (for example, the weight-transport problem, and the use of nonlinear derivatives), while preserving performance.

Submitted 10 October, 2020; v1 submitted 11 September, 2020; originally announced September 2020.

Comments: initial upload; revised version (updated abstract, related work) 28-09-20; 05/10/20: revised for ICLR submission; 10/10/20: minor revisions

arXiv:2004.08220 [pdf, other]

doi 10.1371/journal.pcbi.1008289

Reconciling emergences: An information-theoretic approach to identify causal emergence in multivariate data

Authors: Fernando E. Rosas, Pedro A. M. Mediano, Henrik J. Jensen, Anil K. Seth, Adam B. Barrett, Robin L. Carhart-Harris, Daniel Bor

Abstract: The broad concept of emergence is instrumental in various of the most challenging open scientific questions -- yet, few quantitative theories of what constitutes emergent phenomena have been proposed. This article introduces a formal theory of causal emergence in multivariate systems, which studies the relationship between the dynamics of parts of a system and macroscopic features of interest. Our theory provides a quantitative definition of downward causation, and introduces a complementary modality of emergent behaviour -- which we refer to as causal decoupling. Moreover, the theory allows practical criteria that can be efficiently calculated in large systems, making our framework applicable in a range of scenarios of practical interest. We illustrate our findings in a number of case studies, including Conway's Game of Life, Reynolds' flocking model, and neural activity as measured by electrocorticography.

Submitted 17 April, 2020; originally announced April 2020.

Comments: 18 pages, 7 figures

arXiv:1909.02297 [pdf, other]

Beyond integrated information: A taxonomy of information dynamics phenomena

Authors: Pedro A. M. Mediano, Fernando Rosas, Robin L. Carhart-Harris, Anil K. Seth, Adam B. Barrett

Abstract: Most information dynamics and statistical causal analysis frameworks rely on the common intuition that causal interactions are intrinsically pairwise -- every 'cause' variable has an associated 'effect' variable, so that a 'causal arrow' can be drawn between them. However, analyses that depict interdependencies as directed graphs fail to discriminate the rich variety of modes of information flow that can coexist within a system. This, in turn, creates problems with attempts to operationalise the concepts of 'dynamical complexity' or 'integrated information.' To address this shortcoming, we combine concepts of partial information decomposition and integrated information, and obtain what we call Integrated Information Decomposition, or $Φ$ID. We show how $Φ$ID paves the way for more detailed analyses of interdependencies in multivariate time series, and sheds light on collective modes of information dynamics that have not been reported before. Additionally, $Φ$ID reveals that what is typically referred to as 'integration' is actually an aggregate of several heterogeneous phenomena. Furthermore, $Φ$ID can be used to formulate new, tailored measures of integrated information, as well as to understand and alleviate the limitations of existing measures.

Submitted 5 September, 2019; originally announced September 2019.

arXiv:1806.09373 [pdf, other]

doi 10.3390/e21010017

Measuring Integrated Information: Comparison of Candidate Measures in Theory and Simulation

Authors: Pedro A. M. Mediano, Anil K. Seth, Adam B. Barrett

Abstract: Integrated Information Theory (IIT) is a prominent theory of consciousness that has at its centre measures that quantify the extent to which a system generates more information than the sum of its parts. While several candidate measures of integrated information ('$Φ$') now exist, little is known about how they compare, especially in terms of their behaviour on non-trivial network models. In this article we provide clear and intuitive descriptions of six distinct candidate measures. We then explore the properties of each of these measures in simulation on networks consisting of eight interacting nodes, animated with Gaussian linear autoregressive dynamics. We find a striking diversity in the behaviour of these measures -- no two measures show consistent agreement across all analyses. Further, only a subset of the measures appear to genuinely reflect some form of dynamical complexity, in the sense of simultaneous segregation and integration between system components. Our results help guide the operationalisation of IIT and advance the development of measures of integrated information that may have more general applicability.

Submitted 25 June, 2018; originally announced June 2018.

arXiv:1705.09156 [pdf, other]

The free energy principle for action and perception: A mathematical review

Authors: Christopher L. Buckley, Chang Sub Kim, Simon McGregor, Anil K. Seth

Abstract: The 'free energy principle' (FEP) has been suggested to provide a unified theory of the brain, integrating data and theory relating to action, perception, and learning. The theory and implementation of the FEP combines insights from Helmholtzian 'perception as inference', machine learning theory, and statistical thermodynamics. Here, we provide a detailed mathematical evaluation of a suggested biologically plausible implementation of the FEP that has been widely used to develop the theory. Our objectives are (i) to describe within a single article the mathematical structure of this implementation of the FEP; (ii) provide a simple but complete agent-based model utilising the FEP; (iii) disclose the assumption structure of this implementation of the FEP to help elucidate its significance for the brain sciences.

Submitted 24 May, 2017; originally announced May 2017.

Comments: 77 pages, 2 figures

arXiv:1002.0299 [pdf, ps, other]

doi 10.1103/PhysRevE.81.041907

Multivariate Granger Causality and Generalized Variance

Authors: Adam B. Barrett, Lionel Barnett, Anil K. Seth

Abstract: Granger causality analysis is a popular method for inference on directed interactions in complex systems of many variables. A shortcoming of the standard framework for Granger causality is that it only allows for examination of interactions between single (univariate) variables within a system, perhaps conditioned on other variables. However, interactions do not necessarily take place between single variables, but may occur among groups, or "ensembles", of variables. In this study we establish a principled framework for Granger causality in the context of causal interactions among two or more multivariate sets of variables. Building on Geweke's seminal 1982 work, we offer new justifications for one particular form of multivariate Granger causality based on the generalized variances of residual errors. Taken together, our results support a comprehensive and theoretically consistent extension of Granger causality to the multivariate case. Treated individually, they highlight several specific advantages of the generalized variance measure, which we illustrate using applications in neuroscience as an example. We further show how the measure can be used to define "partial" Granger causality in the multivariate context and we also motivate reformulations of "causal density" and "Granger autonomy". Our results are directly applicable to experimental data and promise to reveal new types of functional relations in complex systems, neural and otherwise.

Submitted 13 April, 2010; v1 submitted 1 February, 2010; originally announced February 2010.

Comments: added 1 reference, minor change to discussion, typos corrected; 28 pages, 3 figures, 1 table, LaTeX

Journal ref: Physical Rev E, Vol 81, 041907 (2010)

Showing 1–12 of 12 results for author: Seth, A K