Dyncomp Preprint
Abstract
As non-linear time series analysis becomes increasingly widespread, measures that can
be applied to short time series with relatively low temporal resolution are in demand. The author
introduces a complexity parameter for time series based on fluctuation and distribution of values, as
well as its R implementation. This parameter is validated with a known chaotic dynamic system. It is
shown that the parameter’s validity approaches or even surpasses that of most similar measures. In
another step of validation, data from time series of daily ratings of anxiety and depression symptoms
is used to show the utility of the proposed measure.
Introduction
The study of complex systems is a relatively new scientific field that seeks to understand the interaction
of a system’s components and the interaction of a system with its environment. According to Foote
(2007), the term "complex system" is used to describe
Because it offers a theoretical framework that cuts across all scientific disciplines, and because
many phenomena approach the behavior of a complex dynamic system, the concept has been applied
to various fields of study. It has proved useful e.g. in understanding economic processes (Misra
et al., 2011) or social networks (Butts, 2001). The application of dynamic systems theory to the human
brain and human behavior also created a promising body of research with strong implications for
neuroscience (Orsucci, 2006) or psychotherapy (Hayes et al., 2007). As implied by the definition,
complex systems are subject to change over time, but the changes occur in a non-stationary, non-linear
fashion. However, non-linear change processes are notoriously hard to predict, so various methods
have been developed to capture indicators of the change of a complex system (Scheffer et al., 2009).
Their main use lies in detecting early signs of so-called "critical transitions", at which a system will
rapidly reorganize into a different state. Every phenomenon that is studied as a complex system is
prone to this behavior (see Scheffer et al., 2012, for an overview). Detecting critical instabilities in
complex systems is a worthwhile endeavor, as it facilitates taking certain actions for all possible forms
of catastrophic change. Depending on the desired outcome, actions can be taken to either promote
change processes, reduce their probability or make preparations for when the anticipated change
occurs. This is useful on many levels of analysis. Examples include earthquake warnings (Ramos,
2010) or interactions between ecological conditions and societal changes (Caticha et al., 2016). In
medicine, early signs of epileptic seizures (demonstrated by Martinerie et al., 1998) can enable medical
practitioners or patients to take precautions. Early warning signals have also been studied in bipolar
disorder (Glenn et al., 2006), where they preceded an outbreak of mania or depression by 60 days.
While the diversity of fields that profit from this theoretical approach is high, a major weakness
of most of the proposed methods is their need for relatively large amounts of data and their lack of
validity, especially when only short time series are available. For example, the study of human change
processes in psychology mainly relies on measuring constructs with methods at a low sample rate,
like questionnaires. Additionally, the widespread use of Likert scales that have a limited range of
values (like 1 to 10 or even 1 to 5) produces time series with a very limited distribution. This would
not be a problem if only post-hoc analyses of complexity were conducted, but the monitoring of states
and transitions of complex systems depends on valid methods that can be used in real time. Methods
that allow for the real-time analysis of short time series with low sample rate are scarce. Bandt and
Pompe (2002) were the first to propose a measure that reached acceptable validity for these scenarios:
Permutation Entropy (PE). An R function based on their approach was provided by Sippel et al. (2016).
The goal of the proposed package is to provide researchers with measures that support such real-time
analysis of short time series in a reliable and valid manner.
Figure 1: Moving window of analysis. The window, as indicated by solid blue lines, moves to the
right by one value.
The x argument can be any numeric vector, but the function will only provide meaningful results
when values are ordered chronologically from oldest to newest. The function will fail if x is not
numeric.
The scaleMax and scaleMin arguments determine the theoretical maximum and minimum of the
vector. If they are not provided, the function will take the observed maximum and minimum.
The width argument determines the size of the moving window as described in Figure 1.
The measure argument determines if the compound measure of complexity should be returned (the
default), or if one of its components (either "fluctuation" or "distribution") should be returned.
All measures are described in greater detail in the following section.
The rescale argument determines if the returned vector of complexity estimates should be rescaled
using the scale minimum and maximum. This is especially useful if the resulting values are to be
plotted in the same graph.
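The moving-window logic and the argument checks described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package's implementation: the names rolling_measure and rescale_to_scale are hypothetical, the window statistic is passed in as an arbitrary function, only windows that fit completely into the series are evaluated, and rescaling is assumed to map estimates from [0, 1] linearly onto the scale range.

```r
# Hypothetical helper: apply a statistic to a moving window that advances
# by one value at a time (assumption: only complete windows are evaluated).
rolling_measure <- function(x, width, stat) {
  if (!is.numeric(x)) stop("x must be a numeric vector")
  if (width < 2 || width > length(x)) stop("width must be between 2 and length(x)")
  starts <- seq_len(length(x) - width + 1)
  vapply(starts, function(i) stat(x[i:(i + width - 1)]), numeric(1))
}

# Hypothetical helper: map estimates in [0, 1] onto the scale range so that
# they can be drawn in the same graph as values measured on that scale.
rescale_to_scale <- function(est, scale_min, scale_max) {
  scale_min + est * (scale_max - scale_min)
}

# Illustrative data (not from the paper): ratings on a 1-to-10 scale,
# ordered chronologically from oldest to newest.
x <- c(3, 4, 3, 5, 9, 2, 8, 1, 10, 4, 5, 5)

# Stand-in statistic: observed range in the window relative to the scale range.
rel_range <- function(w) diff(range(w)) / (10 - 1)

est <- rolling_measure(x, width = 5, stat = rel_range)
rescaled <- rescale_to_scale(est, scale_min = 1, scale_max = 10)
```

With 12 values and a window width of 5, this sketch returns 8 estimates, one per complete window.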
Fluctuation
The fluctuation measure used in this function is based upon a well-known measure for fluctuation in
time series: the mean square successive difference (MSSD) (Von Neumann et al., 1941). It is defined
as follows:
$$\mathrm{MSSD} = \frac{\sum_{i=1}^{n-1} (x_{i+1} - x_i)^2}{n - 1}$$
For each time window, the MSSD is calculated and divided by the MSSD for a hypothetical
distribution with "perfect" fluctuation between the scale minimum and maximum value. As a simple
example, let us assume the following vector:
The maximum possible MSSD for a vector with the length 10, a minimum value of 1 and a
maximum value of 10 is calculated. The MSSD of the observed values is then divided by this
theoretical maximum.
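The calculation for a single window can be sketched as follows. This is a minimal sketch, not the package code: the function names and the example vectors are hypothetical, and it assumes that "perfect" fluctuation means a sequence that alternates between the scale minimum and the scale maximum.

```r
# Mean square successive difference of a numeric vector.
mssd <- function(x) sum(diff(x)^2) / (length(x) - 1)

# Fluctuation of one window relative to "perfect" fluctuation, i.e. a
# sequence alternating between scale minimum and maximum
# (ASSUMPTION: the theoretical maximum is constructed this way).
fluctuation <- function(x, scale_min, scale_max) {
  perfect <- rep_len(c(scale_min, scale_max), length(x))
  mssd(x) / mssd(perfect)
}

# Illustrative vectors of length 10 on a 1-to-10 scale (not from the paper).
alternating <- rep_len(c(1, 10), 10)
constant <- rep(5, 10)
mixed <- c(1, 3, 2, 5, 4, 7, 6, 9, 8, 10)

mssd(alternating)                # 81, the maximum for this scale
fluctuation(alternating, 1, 10)  # 1
fluctuation(constant, 1, 10)     # 0
fluctuation(mixed, 1, 10)        # (39 / 9) / 81, about 0.053
```

Under this assumption, the theoretical maximum equals the squared scale range, so the coefficient always lies between 0 and 1.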
Distribution
While a time series that fluctuates between its extremes would result in a high fluctuation coefficient,
one could hardly argue that chaos can be identified by fluctuation alone, because a perfect fluctuation
between two values is predictable and orderly. If a system destabilizes, it should become open to a
wide variety of possible system states. These states are represented in time series by different values.
Thus, a coefficient for measuring the degree of dispersion is proposed. Its intent is to capture the
irregularity of a time series by comparing the distribution of values in the moving windows with a
distribution that would be observed if all values in the window were uniformly distributed. The
calculation of the distribution parameter will be demonstrated using the same test vector. First, we
generate a hypothetical uniform distribution by building a sequence that runs from the smallest to the
highest value and has as many elements as the analysis window is wide. Then, we calculate the
differences between successive values.
Next, we order the observed values from the smallest to the highest value and calculate the differences
between successive values as well.
> mean(div.diff)
[1] 0.2222222
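The steps described in this section can be sketched for a single window as follows. This is only an illustration under loudly stated assumptions: the function name is hypothetical, the uniform sequence is assumed to span the scale minimum and maximum, and the way the two sets of differences are combined (the shortfall of each observed gap relative to the uniform spacing, averaged and subtracted from 1) is an assumption that may differ from the package's actual calculation.

```r
# Sketch of a distribution coefficient for a single window.
# ASSUMPTION: the combination step below is illustrative and may differ
# from the calculation used in dyncomp.
distribution_sketch <- function(x, scale_min, scale_max) {
  # Hypothetical uniform distribution across the scale, one value per observation.
  uniform <- seq(scale_min, scale_max, length.out = length(x))
  uni_diff <- diff(uniform)
  # Observed values ordered from smallest to largest, and their spacing.
  obs_diff <- diff(sort(x))
  # Keep only the amount by which observed gaps fall short of the uniform spacing.
  shortfall <- pmax(uni_diff - obs_diff, 0)
  1 - mean(shortfall / uni_diff)
}

# Illustrative vectors on a 1-to-10 scale (not from the paper).
distribution_sketch(1:10, 1, 10)        # 1: values spread evenly over the scale
distribution_sketch(rep(5, 10), 1, 10)  # 0: all values identical
```

In this sketch, a window whose values cover the scale evenly scores 1, and a window that repeats a single value scores 0.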
Table 1: Results of the validation of dyncomp f and d. Correlations of both coefficients with PE
with a word length of 6 and a window size of 100 (h_{6,100}) and with the positive Lyapunov exponent (λ)
for different window widths are shown. For comparison, correlations of PE (h_6) with the Lyapunov
exponent were included in the rightmost column. All correlations are statistically significant, p < 10^{-5}.

          Correlation with h_{6,100}    Correlation with λ
Window    d        f                    d        f        h_6
100       .931     -.955                .932     -.869    .932
80        .932     -.955                .930     -.868    .928
60        .935     -.954                .928     -.867    .920
40        .936     -.952                .915     -.864    .901
20        .940     -.943                .886     -.853    .842
10        .925     -.910                .817     -.811    .661
7         .898     -.879                .769     -.768    n/a
5         .840     -.857                .688     -.748    n/a
Compound Measure
Fluctuation and distribution values can be combined into a compound measure of overall complexity
by multiplying them. This way, information on both aspects of chaos is contained in one measure.
For our test vector, this would be
Method
In order to generate data comparable with other measures of complexity that were already published,
the validation approach follows the one used by Bandt and Pompe (2002) as well as Schiepek and
Strunk (2010). First, a data set was simulated using the iterative logistic map $x_{n+1} = r x_n (1 - x_n)$
and increasing r continuously by steps of .001 from 3.1 to 4.00 (see footnote 1). This resulted in 901 sequences of
100 data points. For each of these sequences, different, well-established, complexity measures were
calculated along with the newly proposed measures:
• The Lyapunov exponent (λ) (Osedelec, 1968; Wolf et al., 1985) serves as a "gold standard" in
this validation approach because it was calculated directly from the logistic map equations.
Correlations with this measure will be interpreted as validity coefficients.
• Fluctuation (f ) and distribution (d) coefficients proposed in this paper were calculated for the
window sizes 5, 7, 10, 20, 40, 60, 80 and 100. While smaller window sizes are important in
scenarios with short time series, larger window sizes show the potential accuracy and scalability
of the algorithm.
• Permutation Entropy (PE) (Bandt and Pompe, 2002) (h) was calculated using the statcomp R
package (Sippel et al., 2016). This is a well validated and robust measure. PE combines the
available data to so-called words of a given maximum length. PE is then derived from the
frequency distribution of these words. PE values with a word length of 6 were calculated for the
same window sizes as f and d. However, window sizes of 5 or 7 are too small for calculating
PE with a word length of 6, so they were omitted.
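The simulation and the "gold standard" can be sketched in base R as follows. This is a minimal sketch rather than the supplementary code: the function names, the start value x0 and the discarded burn-in are assumptions, and each sequence is restarted from the same start value, which the text does not specify. The Lyapunov exponent is taken as the orbit average of the logarithm of the absolute derivative of the map, $\lambda = \frac{1}{n} \sum_{i=1}^{n} \ln |r (1 - 2 x_i)|$.

```r
# Simulate the logistic map x[n + 1] = r * x[n] * (1 - x[n]).
# ASSUMPTIONS: the start value x0 and the discarded burn-in are illustrative.
logistic_series <- function(r, n = 100, x0 = 0.3, burn_in = 100) {
  x <- numeric(n + burn_in)
  x[1] <- x0
  for (i in seq_len(n + burn_in - 1)) {
    x[i + 1] <- r * x[i] * (1 - x[i])
  }
  x[(burn_in + 1):(burn_in + n)]
}

# Lyapunov exponent from the derivative of the map, r * (1 - 2 * x),
# averaged over the simulated orbit.
lyapunov_logistic <- function(x, r) mean(log(abs(r * (1 - 2 * x))))

# 901 values of r from 3.1 to 4.0, i.e. steps of .001.
r_values <- seq(3.1, 4, length.out = 901)
series <- lapply(r_values, logistic_series)
lambda <- mapply(lyapunov_logistic, series, r_values)
```

In this sketch, negative values of lambda indicate periodic regimes (for example around r = 3.2) and positive values indicate chaotic regimes (for example at r = 4).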
Results
Correlations between the Lyapunov exponent, PE and fluctuation and distribution generated by
dyncomp were calculated and interpreted as validity coefficients. They are shown in Table 1.
Fluctuation correlated negatively with the other two measures of complexity. This behavior is
expected due to the relatively large fluctuation intensity for small r values in the logistic map, while
extreme fluctuation diminishes for large r values. The opposite was true for the distribution of possible
values, indicating that the d coefficient indeed validly measures the available range of values that
increases with r. In Figure 2, this phenomenon can be visually identified by noticing the increased
density of the plot for larger r values. All validity coefficients were high (i.e. larger than .60) even for
window sizes as small as 5. For smaller window sizes, validity coefficients were substantially higher
than those of PE.
1 R code is provided as an online supplement that includes all calculations presented here, including simulation
Figure 2: Comparing different complexity estimates. a: simulated logistic map. b: positive Lyapunov
exponents, calculated from the logistic map equation. c to f: dyncomp distribution and fluctuation
coefficients with window sizes 20 and 7. g: PE with a word length of 6 and a window size of 100.
Figure 3: Daily means of affect scores. The gray area marks the "transition phase" identified by Wichers
et al. (2016). Different colours indicate different study phases.
• The transition between the second and third study phase is marked by spikes in complexity on
3 Item text: "I feel. . . " + "satisfied", "enthusiastic", "cheerful", "strong" for positive affect,
Summary
A new measure for the complexity of time series was introduced and validated. It was demonstrated
that this measure has a high validity even when analyzing relatively short time series. The proposed
measures performed well compared to PE, especially in small window sizes. For larger window
sizes, validity coefficients were comparable. The fluctuation coefficient proposed by Schiepek and
Strunk (2010) was more robust against the reduction of window size. Its validity surpassed that of the
fluctuation measure used for dyncomp in most cases. However, the distribution coefficient proposed
reached substantially higher validity for small windows and all measures maintained satisfactory
reliability even for analysis window sizes as small as 5. Using a publicly available data set from a
well-documented case study, it could be shown that the compound measure of dynamic complexity
was able to detect critical transitions that lead to successive symptom change. All in all, the measures
introduced in this publication can be considered suitable for the study of complex dynamical systems,
especially when observations have to be made with low temporal resolution and only a small amount
of data is available. Possible fields of application for this newly introduced measure are manifold,
because the analysis of non-linear time series has spread to various disciplines. Thus, future studies
should focus on studying the utility of the proposed measures in every field that relies on complexity
estimates. For example, the proposed measure of complexity is used in an open-source software
platform for real-time monitoring of psychotherapeutic processes that enables both researchers and
clinical practitioners to predict and study critical transitions in human change processes (Kaiser and
Laireiter, 2017). The author hopes that the tools provided will advance the field of complex systems
research.
Figure 4: Complexity measure for daily means of affect scores and weekly depression symptom
ratings. The gray area marks the "transition phase" identified by Wichers et al. (2016). The black
horizontal line marks a critical threshold of complexity.
Bibliography
C. Bandt and B. Pompe. Permutation entropy: A natural complexity measure for time series. Physical
Review Letters, 88(17):174102, Apr 2002. doi: 10.1103/PhysRevLett.88.174102.
C. T. Butts. The complexity of social networks: theoretical and empirical findings. Social Networks,
23(1):31–72, 2001.
N. Caticha, R. Calsaverini, and R. Vicente. Phase transition from egalitarian to hierarchical societies
driven by competition between cognitive and social constraints. arXiv:1608.03637 [physics], Aug
2016. URL http://arxiv.org/abs/1608.03637.
R. Foote. Mathematics and complex systems. Science, 318(5849):410–412, Oct 2007. ISSN 0036-8075,
1095-9203. doi: 10.1126/science.1141754.
T. Glenn, P. C. Whybrow, N. Rasgon, P. Grof, M. Alda, C. Baethge, and M. Bauer. Approximate entropy
of self-reported mood prior to episodes in bipolar disorder. Bipolar Disorders, 8(5p1):424–429, Oct
2006. ISSN 1399-5618. doi: 10.1111/j.1399-5618.2006.00373.x.
A. M. Hayes, J.-P. Laurenceau, G. Feldman, J. L. Strauss, and L. Cardaciotto. Change is not always
linear: The study of nonlinear and discontinuous patterns of change in psychotherapy. Clinical
Psychology Review, 27(6):715–723, Jul 2007. ISSN 02727358. doi: 10.1016/j.cpr.2007.01.008.
T. Kaiser and A. R. Laireiter. Dynamo: A modular platform for monitoring process, outcome, and
algorithm-based treatment planning in psychotherapy. JMIR Medical Informatics, 5(3):e20, Jul 2017.
ISSN 2291-9694. doi: 10.2196/medinform.6808.
J. Kossakowski, P. Groot, J. Haslbeck, D. Borsboom, and M. Wichers. Data from “critical slowing down
as a personalized early warning signal for depression”. Journal of Open Psychology Data, 5(1), 2017.
ISSN 2050-9863. doi: 10.5334/jopd.29.
URL http://openpsychologydata.metajnl.com/articles/10.5334/jopd.29/.
V. Misra, M. Lagi, and Y. Bar-Yam. Evidence of market manipulation in the financial crisis. arXiv
preprint arXiv:1112.3095, 2011.
F. F. Orsucci. The paradigm of complexity in clinical neurocognitive science. The Neuroscientist,
12(5):390–397, Oct 2006. ISSN 1073-8584. doi: 10.1177/1073858406290266.
V. Osedelec. Multiplicative ergodic theorem: Lyapunov characteristic exponent for dynamical systems.
In Moscow Math. Soc., volume 19, pages 539–575, 1968.
G. Schiepek and G. Strunk. The identification of critical fluctuations and phase transitions in short term
and coarse-grained time series—a method for the real-time monitoring of human change processes.
Biological Cybernetics, 102(3):197–207, Mar 2010. ISSN 0340-1200, 1432-0770.
doi: 10.1007/s00422-009-0362-1.
S. Sippel, H. Lange, and F. Gans. statcomp: Statistical Complexity and Information Measures for Time
Series Analysis, 2016. URL https://CRAN.R-project.org/package=statcomp. R package version
0.0.1.1000.
J. Von Neumann, R. Kent, H. Bellinson, and B. I. Hart. The mean square successive difference. The
Annals of Mathematical Statistics, 12(2):153–162, 1941.
M. Wichers, P. C. Groot, and Psychosystems, ESM Group. Critical slowing down as a personalized
early warning signal for depression. Psychotherapy and Psychosomatics, 85(2):114–116, 2016. ISSN
0033-3190, 1423-0348. doi: 10.1159/000441458.
A. Wolf, J. B. Swift, H. L. Swinney, and J. A. Vastano. Determining Lyapunov exponents from a time
series. Physica D: Nonlinear Phenomena, 16(3):285–317, Jul 1985. ISSN 0167-2789.
doi: 10.1016/0167-2789(85)90011-9.
Tim Kaiser
University of Salzburg
Department of Psychology
Hellbrunnerstrasse 34
5020 Salzburg
Austria
Tim.Kaiser@sbg.ac.at