
data processing inequality
Recently Published Documents


Total documents: 40 (five years: 14)
H-index: 10 (five years: 1)

Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Ina Maria Deutschmann ◽  
Gipsi Lima-Mendez ◽  
Anders K. Krabberød ◽  
Jeroen Raes ◽  
Sergio M. Vallina ◽  
...  

Background: Ecological interactions among microorganisms are fundamental for ecosystem function, yet they are mostly unknown or poorly understood. High-throughput-omics can indicate microbial interactions through associations across time and space, which can be represented as association networks. Associations could result from either ecological interactions between microorganisms, or from environmental selection, where the association is environmentally driven. Therefore, before downstream analysis and interpretation, we need to distinguish the nature of the association, particularly if it is due to environmental selection or not. Results: We present EnDED (environmentally driven edge detection), an implementation of four approaches as well as their combination to predict which links between microorganisms in an association network are environmentally driven. The four approaches are sign pattern, overlap, interaction information, and data processing inequality. We tested EnDED on networks from simulated data of 50 microorganisms. The networks contained on average 50 nodes and 1087 edges, of which 60 were true interactions but 1026 false associations (i.e., environmentally driven or due to chance). Applying each method individually, we detected a moderate to high number of environmentally driven edges: 87% (sign pattern and overlap), 67% (interaction information), and 44% (data processing inequality). Combining these methods in an intersection approach resulted in retaining more interactions, both true and false (32% of environmentally driven associations). After validation with the simulated datasets, we applied EnDED on a marine microbial network inferred from 10 years of monthly observations of microbial-plankton abundance. The intersection combination predicted that 8.3% of the associations were environmentally driven, while individual methods predicted 24.8% (data processing inequality), 25.7% (interaction information), and up to 84.6% (sign pattern as well as overlap). The fraction of environmentally driven edges among negative microbial associations in the real network increased rapidly with the number of environmental factors. Conclusions: To reach accurate hypotheses about ecological interactions, it is important to determine, quantify, and remove environmentally driven associations in marine microbial association networks. For that, EnDED offers up to four individual methods as well as their combination. However, especially for the intersection combination, we suggest using EnDED with other strategies to reduce the number of false associations and consequently the number of potential interaction hypotheses.
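
The abstract names data processing inequality as one of the four edge-detection approaches but does not spell out the rule. A minimal sketch, assuming an ARACNE-style reading in which a microbe-microbe edge is flagged when it carries less mutual information than both of its links to a shared environmental factor, might look like the following. The function names, the plug-in estimator, and the toy data are illustrative assumptions, not EnDED's actual interface or thresholds.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def is_environmentally_driven(a, b, env):
    """Assumed DPI rule: flag edge a-b if it is the weakest link of the triplet."""
    i_ab = mutual_information(a, b)
    return i_ab < min(mutual_information(a, env), mutual_information(b, env))


# Toy data (hypothetical): both microbes track the environment, each with one
# independent deviation, so their mutual association is weaker than either
# microbe-environment association.
env = [0, 0, 0, 0, 1, 1, 1, 1]
microbe_a = [0, 0, 0, 1, 1, 1, 1, 1]
microbe_b = [1, 0, 0, 0, 1, 1, 1, 1]
flagged = is_environmentally_driven(microbe_a, microbe_b, env)

# Two microbes that track each other but not the environment are not flagged.
coupled = [0, 1, 0, 1, 0, 1, 0, 1]
not_flagged = is_environmentally_driven(coupled, coupled, env)
```

The reasoning behind the rule is that if the two microbes are related only through the environment (microbe A, environment, microbe B forming a Markov chain), the data processing inequality bounds their mutual information by each microbe-environment term.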


Entropy ◽  
2021 ◽  
Vol 23 (9) ◽  
pp. 1148
Author(s):  
Łukasz Dębowski

We present a hypothetical argument against finite-state processes in statistical language modeling that is based on semantics rather than syntax. In this theoretical model, we suppose that the semantic properties of texts in a natural language could be approximately captured by a recently introduced concept of a perigraphic process. Perigraphic processes are a class of stochastic processes that satisfy a Zipf-law accumulation of a subset of factual knowledge, which is time-independent, compressed, and effectively inferrable from the process. We show that the classes of finite-state processes and of perigraphic processes are disjoint, and we present a new simple example of perigraphic processes over a finite alphabet called Oracle processes. The disjointness result makes use of the Hilberg condition, i.e., the almost sure power-law growth of algorithmic mutual information. Using a strongly consistent estimator of the number of hidden states, we show that finite-state processes do not satisfy the Hilberg condition whereas Oracle processes satisfy the Hilberg condition via the data-processing inequality. We discuss the relevance of these mathematical results for theoretical and computational linguistics.


2021 ◽  
Author(s):  
Ina Maria Deutschmann ◽  
Gipsi Lima-Mendez ◽  
Anders K. Krabberød ◽  
Jeroen Raes ◽  
Sergio M. Vallina ◽  
...  

Background: Ecological interactions among microorganisms are fundamental for ecosystem function, yet they are mostly unknown or poorly understood. High-throughput-omics can indicate microbial interactions through associations across time and space, which can be represented as association networks. Associations could result from either ecological interactions between microorganisms, or from environmental selection, where the associations are environmentally-driven. Therefore, before downstream analysis and interpretation, we need to distinguish the nature of the association, particularly if it is due to environmental selection or not. Results: We present EnDED (Environmentally-Driven Edge Detection), an implementation of four approaches as well as their combination to predict which links between microorganisms in an association network are environmentally-driven. The four approaches are Sign Pattern, Overlap, Interaction Information, and Data Processing Inequality. We tested EnDED on networks from simulated data of 50 microorganisms. The networks contained on average 50 nodes and 1087 edges, of which 60 were true interactions but 1026 false associations (i.e. environmentally-driven or due to chance). Applying each method individually, we detected a moderate to high number of environmentally-driven edges: 87% (Sign Pattern and Overlap), 67% (Interaction Information), and 44% (Data Processing Inequality). Combining these methods in an intersection approach resulted in retaining more interactions, both true and false (32% of environmentally-driven associations). After validation with the simulated datasets, we applied EnDED on a marine microbial network inferred from 10 years of monthly observations of microbial-plankton abundance. The intersection combination predicted that 8.3% of the associations were environmentally-driven, while individual methods predicted 24.8% (Data Processing Inequality), 25.7% (Interaction Information), and up to 84.6% (Sign Pattern as well as Overlap). The fraction of environmentally-driven edges among negative microbial associations in the real network increased rapidly with the number of environmental factors. Conclusions: To reach accurate hypotheses about ecological interactions, it is important to determine, quantify, and remove environmentally-driven associations in marine microbial association networks. For that, EnDED offers up to four individual methods as well as their combination. However, especially for the intersection combination, we suggest using EnDED with other strategies to reduce the number of false associations and consequently the number of potential interaction hypotheses.


2021 ◽  
Author(s):  
Chuteng Zhou ◽  
Quntao Zhuang ◽  
Matthew Mattina ◽  
Paul N. Whatmough

Econometrica ◽  
2021 ◽  
Vol 89 (1) ◽  
pp. 475-506
Author(s):  
Xiaosheng Mu ◽  
Luciano Pomatto ◽  
Philipp Strack ◽  
Omer Tamuz

We study repeated independent Blackwell experiments; standard examples include drawing multiple samples from a population, or performing a measurement in different locations. In the baseline setting of a binary state of nature, we compare experiments in terms of their informativeness in large samples. Addressing a question due to Blackwell (1951), we show that generically an experiment is more informative than another in large samples if and only if it has higher Rényi divergences. We apply our analysis to the problem of measuring the degree of dissimilarity between distributions by means of divergences. A useful property of Rényi divergences is their additivity with respect to product distributions. Our characterization of Blackwell dominance in large samples implies that every additive divergence that satisfies the data processing inequality is an integral of Rényi divergences.
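
The two properties of Rényi divergences mentioned in this abstract, additivity over product distributions and the data processing inequality, can be checked numerically on small examples. The sketch below is only an illustration under assumed toy distributions; the helper names are hypothetical, and the divergence is computed in nats for orders alpha > 0 with alpha != 1, assuming strictly positive probabilities.

```python
from itertools import product
from math import log


def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) in nats, for alpha > 0 and alpha != 1."""
    return log(sum(pi**alpha * qi ** (1 - alpha) for pi, qi in zip(p, q))) / (alpha - 1)


def product_distribution(p, r):
    """Joint distribution of two independent draws, in a fixed (i, j) order."""
    return [pi * rj for pi, rj in product(p, r)]


def coarse_grain(p, groups):
    """Deterministic processing: merge the outcomes listed in each group."""
    return [sum(p[i] for i in group) for group in groups]


# Toy distributions (assumed for illustration only).
p1, q1 = [0.7, 0.2, 0.1], [0.3, 0.3, 0.4]
p2, q2 = [0.6, 0.4], [0.5, 0.5]
groups = [[0, 1], [2]]

additive = {}
monotone = {}
for alpha in (0.5, 2.0):
    joint = renyi_divergence(
        product_distribution(p1, p2), product_distribution(q1, q2), alpha
    )
    parts = renyi_divergence(p1, q1, alpha) + renyi_divergence(p2, q2, alpha)
    additive[alpha] = abs(joint - parts) < 1e-9  # additivity over products

    processed = renyi_divergence(
        coarse_grain(p1, groups), coarse_grain(q1, groups), alpha
    )
    monotone[alpha] = processed <= renyi_divergence(p1, q1, alpha)  # DPI
```

Both dictionaries hold `True` for each order tested here. Merging outcomes is just one example of processing; the inequality is stated for general stochastic maps.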


2020 ◽  
Author(s):  
Ina Maria Deutschmann ◽  
Gipsi Lima-Mendez ◽  
Anders K. Krabberød ◽  
Jeroen Raes ◽  
Sergio M. Vallina ◽  
...  

Background: Ecological interactions among microorganisms are fundamental for ecosystem function, yet they are mostly unknown or poorly understood. High-throughput-omics can indicate microbial interactions by associations across time and space, which can be represented as association networks. Links in these networks could result from either ecological interactions between microorganisms, or from environmental selection, where the association is environmentally-driven. Therefore, before downstream analysis and interpretation, we need to distinguish the nature of the association, particularly if it is due to environmental selection or not. Results: We present EnDED (Environmentally-Driven Edge Detection), an implementation of four approaches as well as their combination to predict which links between microorganisms in an association network are environmentally-driven. The four approaches are Sign Pattern, Overlap, Interaction Information, and Data Processing Inequality. We tested EnDED on networks from simulated data of 50 microorganisms. The networks contained on average 50 nodes and 1,087 edges, of which 60 were true interactions but 1,026 false associations (i.e. environmentally-driven or due to chance). Applying each method individually, we detected a moderate to high number of environmentally-driven edges: 87% (Sign Pattern and Overlap), 67% (Interaction Information), and 44% (Data Processing Inequality). Combining these methods in an intersection approach resulted in retaining more interactions, both true and false (32% of environmentally-driven associations). The addition of noise to the simulated datasets did not qualitatively alter these results. After validation with the simulated datasets, we applied EnDED on a marine microbial network inferred from 10 years of monthly observations of microbial-plankton abundance. The intersection combination predicted that 14.2% of the associations were environmentally-driven, while individual methods predicted 31.4% (Data Processing Inequality), 38.3% (Interaction Information), and up to 83.4% (Sign Pattern as well as Overlap). Conclusions: To reach accurate hypotheses about ecological interactions, it is important to determine, quantify, and remove environmentally-driven associations in marine microbial association networks. For that, EnDED offers up to four individual methods as well as their combination. However, especially for the intersection combination, we suggest using EnDED with other strategies to reduce the number of false associations and consequently the number of potential interaction hypotheses.


Entropy ◽  
2020 ◽  
Vol 22 (4) ◽  
pp. 426 ◽  
Author(s):  
David Sigtermans

We propose a tensor-based approach to infer causal structures from time series. An information-theoretic analysis of transfer entropy (TE) shows that TE results from transmission of information over a set of communication channels. Tensors are the mathematical equivalents of these multichannel causal channels. The total effect of subsequent transmissions, i.e., the total effect of a cascade, can now be expressed in terms of the tensors of these subsequent transmissions using tensor multiplication. With this formalism, differences in the underlying structures can be detected that are otherwise undetectable using TE or mutual information. Additionally, using a system comprising three variables, we prove that bivariate analysis suffices to infer the structure, that is, bivariate analysis suffices to differentiate between direct and indirect associations. Some results translate to TE. For example, a Data Processing Inequality (DPI) is proven to exist for transfer entropy.
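
The idea that a cascade of transmissions is expressed by multiplying the objects that describe each transmission can be illustrated in the simplest setting. The sketch below is a simplification and not the paper's formalism: it uses ordinary row-stochastic matrices instead of causal tensors and mutual information instead of transfer entropy. The channel values and function names are assumptions chosen for illustration.

```python
from math import log2


def compose(first, second):
    """Cascade two channels: result[x][z] = sum over y of first[x][y] * second[y][z]."""
    return [
        [sum(row[y] * second[y][z] for y in range(len(second)))
         for z in range(len(second[0]))]
        for row in first
    ]


def mutual_information(p_x, channel):
    """I(X;Y) in bits for input distribution p_x and channel[x][y] = P(y | x)."""
    n_in, n_out = len(p_x), len(channel[0])
    p_y = [sum(p_x[x] * channel[x][y] for x in range(n_in)) for y in range(n_out)]
    return sum(
        p_x[x] * channel[x][y] * log2(channel[x][y] / p_y[y])
        for x in range(n_in)
        for y in range(n_out)
        if p_x[x] * channel[x][y] > 0
    )


# Two binary symmetric channels (hypothetical flip probabilities 0.1 and 0.2).
x_to_y = [[0.9, 0.1], [0.1, 0.9]]
y_to_z = [[0.8, 0.2], [0.2, 0.8]]
x_to_z = compose(x_to_y, y_to_z)  # indirect association X -> Y -> Z

p_x = [0.5, 0.5]
i_xy = mutual_information(p_x, x_to_y)
i_xz = mutual_information(p_x, x_to_z)
dpi_holds = i_xz <= i_xy  # data processing inequality for the cascade
```

In this toy cascade the indirect channel has an effective flip probability of 0.26, and the indirect association carries less information than the direct one, as the data processing inequality requires.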


Author(s):  
David Sigtermans

We propose a partial information decomposition based on the newly introduced framework of causal tensors, i.e., multilinear stochastic maps that transform source data into destination data. The innovation that causal tensors introduce is that the framework allows for an exact expression of an indirect association in terms of the constituent direct associations. This is not possible when expressing associations only in measures like mutual information or transfer entropy. Instead of a priori expressing associations in terms of mutual information or transfer entropy, the a posteriori expression of associations in these terms results in an intuitive definition of a nonnegative and left monotonic redundancy, which also meets the identity property. Our proposed redundancy satisfies the three axioms introduced by Williams and Beer. Symmetry and self-redundancy axioms follow directly from our definition. The data processing inequality ensures that the monotonicity axiom is satisfied. Because causal tensors can describe both mutual information and transfer entropy, the partial information decomposition applies to both measures. Results show that the decomposition closely resembles the decomposition of another approach that expresses associations in terms of mutual information a posteriori. A negative synergistic term could indicate that there is an unobserved common cause.



