1. Introduction
Time series data, characterized by its sequential nature, underpins a vast array of applications, from economics [1] and finance [2] to climate modeling [3] and signal processing [4]. Thus, understanding and extracting meaningful insights from the data hinges on our ability to grapple with its inherent temporal dependencies. The Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF) emerge as central pillars in this endeavor [5,6,7].
The power of the ACF and PACF does not rest solely in their application; their theoretical and empirical distributions form the cornerstone of statistical inference in time series analysis. Therefore, understanding the expected distribution of ACF and PACF values under varying conditions serves as a compass guiding analysts through hypothesis testing, confidence interval estimation, model diagnostics, and forecasting [8].
In time series analysis, a fundamental principle has long been established: the theoretical sum of ACF values is invariably zero for any white noise process characterized by a mean of zero and an arbitrary variance, denoted as $\sigma^2$. However, a compelling and intriguing discovery emerges when we delve into empirical observations: the sum of the sample ACF consistently converges to a constant, specifically $-\frac{1}{2}$ (Hassani's Theorem) [6,7]. This intriguing constancy persists regardless of diverse time series attributes, such as length, mean, variance, dependencies, or any other unique characteristics they may possess. Numerous research studies have underscored the significance of Hassani's Theorem for practical applications and its integration into time series analysis and model development (see, for example, [9,10,11,12]).
The implications of this remarkable consistency are profound, particularly for the fields of time series model building and analysis [13,14,15]. For selected recent work highlighting the importance of considering the sample ACF in analysis, especially in the context of Hassani's Theorem, see, for instance, [16,17,18].
However, the sum of the sample PACF has received relatively little attention in comparison to its counterpart, the sum of the sample ACF. Therefore, this paper primarily focuses on exploring the PACF. It is essential to note that there exists a robust relationship between the ACF and PACF in time series analysis and modeling.
This paper goes beyond theoretical conjectures, illustrating our findings with a rich array of practical examples. We employ both simulated data and real-world time series to validate and emphasize the practical relevance of the results obtained in our study. Through this interdisciplinary journey, we aim to contribute valuable insights that enhance the understanding and application of PACF in the field of time series analysis.
In Section 2, we provide a foundational understanding of the PACF, its distribution, and the sum of the sample ACF as the backbone of this research. Within this section, the definitions and mathematical derivations of these concepts are elucidated, offering readers a view into the mechanics of the PACF. The subsection on the distribution of the sample PACF advances the discourse by focusing on the distributional aspects of the sample PACF. This subsection paints a detailed portrait of the PACF's behavior, showcasing its inherent properties and tendencies.
In Section 3, the paper shifts its focus to assess the behavior of the sample PACF and the normality of the sum of the sample PACF. Through simulation studies, this section juxtaposes the theoretical PACF with the sample PACF, highlighting potential disparities and deviations. Building on prior findings, the section offers a fresh perspective on the distribution of the sample PACF, underscored by empirical evidence. Additionally, this section evaluates the application of observations from the simulations using real data, specifically 185 years of monthly Bank of England Rate data.
Section 4 summarizes the main themes and findings of the paper, bringing the discussion to a close. By synthesizing the concepts and reiterating the key discoveries, this section provides a cohesive wrap-up of the paper’s central arguments and insights. Additionally, it offers several ideas for further research based on the findings of this paper.
2. Partial Autocorrelation Function
In order to establish the definition of the PACF, it is necessary first to examine the notion of a linear regression in which a random variable $X_t$ is predicted using its own previous values up to a certain lag $h$ [5].
The partial autocorrelation coefficient at lag $h$, denoted as $\phi_{hh}$, represents the correlation between the time series variable $X_t$ and its lagged value $X_{t-h}$, while controlling for the influence of all other preceding lags. This coefficient is obtained by solving the Yule–Walker equations for an autoregressive process of order $h$. By utilizing empirical data, it is possible to derive an estimate of the PACF, conventionally denoted as $\hat{\phi}_{hh}$. For a given realization of length $T$, $x_1, \dots, x_T$, the estimate $\hat{\phi}_{hh}$ is obtained as the coefficient of $x_{t-h}$ in the linear regression of $x_t$ on $x_{t-1}, \dots, x_{t-h}$.
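To make the regression definition concrete, the following is a minimal sketch (our own Python/NumPy illustration, not taken from the paper; the function name is hypothetical) that estimates $\hat{\phi}_{hh}$ as the last coefficient in a least-squares regression of $x_t$ on $x_{t-1}, \dots, x_{t-h}$.

```python
import numpy as np

def pacf_by_regression(x, h):
    """Estimate phi_hat_{hh} as the coefficient of x_{t-h} in the OLS
    regression of x_t on x_{t-1}, ..., x_{t-h} (mean-centred series)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    T = len(x)
    y = x[h:]                                                       # responses x_t, t = h, ..., T-1
    X = np.column_stack([x[h - j:T - j] for j in range(1, h + 1)])  # columns: x_{t-1}, ..., x_{t-h}
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[-1]                                                 # coefficient of x_{t-h}
```

This corresponds to the OLS flavour of PACF estimation; in finite samples it may differ slightly from Yule–Walker-based estimates such as the Durbin–Levinson route discussed next.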
The estimation of these coefficients can be achieved in several ways, one of which is the Durbin–Levinson algorithm [19]. This algorithm is a recursive technique employed to solve the Yule–Walker equations specifically for autoregressive (AR) models; a sketch of the recursion is given below. It is important to acknowledge that, in order to obtain a precise estimate of the PACF, it is necessary to estimate the ACF for the given data. The PACF estimates thus obtained, generated from the sample PACF, are commonly employed to ascertain the appropriate number of lags to include in an autoregressive (AR) model [5,20].
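The following is a minimal sketch of the Durbin–Levinson route (again our own Python/NumPy illustration, with hypothetical function names): the sample ACF is computed first, and the recursion then yields $\hat{\phi}_{hh}$ for successive lags.

```python
import numpy as np

def sample_acf(x, nlags):
    """Biased sample autocorrelations rho_hat(1), ..., rho_hat(nlags)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    T, denom = len(x), np.sum(x * x)
    return np.array([np.sum(x[h:] * x[:T - h]) / denom for h in range(1, nlags + 1)])

def pacf_durbin_levinson(x, nlags):
    """Sample PACF phi_hat_{hh}, h = 1, ..., nlags, obtained from the sample ACF
    via the Durbin-Levinson recursion for the Yule-Walker equations."""
    rho = np.concatenate(([1.0], sample_acf(x, nlags)))   # rho[0] = 1
    phi = np.zeros((nlags + 1, nlags + 1))                # phi[h, j] holds phi_{h,j}
    pacf = np.zeros(nlags)
    phi[1, 1] = pacf[0] = rho[1]
    for h in range(2, nlags + 1):
        num = rho[h] - np.dot(phi[h - 1, 1:h], rho[h - 1:0:-1])
        den = 1.0 - np.dot(phi[h - 1, 1:h], rho[1:h])
        phi[h, h] = num / den
        # update the intermediate coefficients phi_{h,j}, j < h
        phi[h, 1:h] = phi[h - 1, 1:h] - phi[h, h] * phi[h - 1, h - 1:0:-1]
        pacf[h - 1] = phi[h, h]
    return pacf
```

In practice, a library routine such as statsmodels.tsa.stattools.pacf provides the same quantity; the hand-rolled version above is only meant to expose the mechanics of the recursion.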
2.1. Distribution of PACF
Let us now start this subsection by evaluating the distribution of the PACF. Understanding the distribution of the PACF is vital for time series modeling and forecasting, as it provides crucial insights into the underlying dependencies within the data. By examining the distribution, we can better assess the behavior of the time series, identify patterns, and improve the accuracy of our models. This foundational knowledge is essential for developing robust forecasting techniques and ensuring the reliability of predictions. Let us first concentrate on the AR($p$) process.
Remark 1. For a causal AR($p$) process, asymptotically $\hat{\phi}_{hh} \sim N(0, 1/T)$ for $h > p$ [19].

The above remark implies that, for large sample sizes, the estimated partial autocorrelation coefficients beyond the order $p$, once scaled by $\sqrt{T}$, follow a standard normal distribution, providing a basis for statistical inference in time series analysis. Consequently, the asymptotic normality of the sample PACF suggests that the sum of the sample PACF would also be asymptotically normal. This result is derived by applying the delta method to the sample PACF at different lags (see [19] for details on the delta method). Please note that the AR($p$) process, when $p = 0$, is a white noise series. Thus, this remark also applies to a white noise series. Furthermore, under the null hypothesis of standard statistical tests, we assume that we have a white noise time series. Thus, the normality assumption plays a vital role.
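As a quick numerical illustration of Remark 1, one can check that $\sqrt{T}\,\hat{\phi}_{hh}$ at a fixed lag behaves approximately like a standard normal draw when the data are white noise (i.e., $p = 0$). The sketch below uses statsmodels for the sample PACF; the simulation settings are arbitrary illustrative choices, not the paper's.

```python
import numpy as np
from statsmodels.tsa.stattools import pacf

rng = np.random.default_rng(0)
T, n_paths, lag = 500, 2000, 3          # illustrative choices only

# sqrt(T) * phi_hat_{hh} at a fixed lag h > p (white noise, so p = 0)
scaled = np.array([
    np.sqrt(T) * pacf(rng.standard_normal(T), nlags=lag)[lag]
    for _ in range(n_paths)
])

print(round(scaled.mean(), 3), round(scaled.std(), 3))   # close to 0 and 1, per Remark 1
```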
The above remark indicates that, for an AR($p$) process, the sample PACF values $\hat{\phi}_{hh}$ for $h > p$ follow a normal distribution, and hence so does their sum. Our aim here is to study this sum for various lags, even for $h \le p$. While there is not enough research on the sum of the sample PACF, recent studies on the sum of the sample ACF reveal several important observations for time series analysis and forecasting [6,7,21,22,23,24,25].
As mentioned in Remark 1, the sample PACF of any white noise process (as an AR(0) process) is asymptotically normal at all lags. In testing the null hypothesis of no autocorrelation, this characteristic of the sample PACF of white noise comes in handy, since under this null hypothesis the time series is a white noise process.
As mentioned above, in practical applications, the true values of the PACF, $\phi_{hh}$, are often unknown and are approximated using their sample counterparts, $\hat{\phi}_{hh}$. Thus, in practical terms, our main interest lies with the sample data at hand. It is critical to underscore that the theoretical notions, inclusive of all peripheral theory and approximations, should ideally be observable when these methods are applied to real-world data. This practical application is pivotal, as it allows us to validate the theoretical models and ensure their relevance and utility in empirical analysis.
2.2. Sum of the Sample ACF
Let us now examine the sum of the sample ACF (SACF). We will explore various properties of the SACF, largely building upon previous research, as exemplified in [6,7,21,22,23,24,25].
Hassani Theorem ([6]). The sum of the sample ACF, $\hat{\rho}_h$, over lags $h = 1, \dots, T-1$ is always $-\frac{1}{2}$ for any stationary time series with arbitrary length $T$:
$$\sum_{h=1}^{T-1} \hat{\rho}_h = -\frac{1}{2}.$$
The sum of the sample ACF has the following properties:

It remains independent of the time series length, denoted as $T$; specifically, $\sum_{h=1}^{T-1} \hat{\rho}_h$ equals $-\frac{1}{2}$ for any $T$.

This property is intriguing, because it implies that the total level of autocorrelation in a stationary time series, quantified by the sum of the sample ACF values, remains unaffected by the time series' length.

The value of $\sum_{h=1}^{T-1} \hat{\rho}_h$ remains constant at $-\frac{1}{2}$ for any stationary time series. Consequently, for instance, the sum of the sample ACF for an AR process of any order is equivalent to that of a Gaussian white noise process, with both being equal to $-\frac{1}{2}$.

The second property of the theorem asserts that, for any stationary time series, the value of $\sum_{h=1}^{T-1} \hat{\rho}_h$ is consistently $-\frac{1}{2}$. In practical terms, this signifies that the sum of the sample ACF remains constant, irrespective of the time series' length. For instance, whether it is the sample ACF of an AR process of any order or of a Gaussian white noise process, both yield an unchanging sum of $-\frac{1}{2}$.
This result holds significant implications for the process of constructing ARMA models and conducting forecasting: if this constraint on the sample ACF is overlooked, improper order detection may occur, leading to potential modeling inaccuracies.
The values of $\hat{\rho}_h$ are linearly dependent:
$$\hat{\rho}_j = -\frac{1}{2} - \sum_{\substack{h=1 \\ h \ne j}}^{T-1} \hat{\rho}_h .$$
This equation demonstrates that the value of $\hat{\rho}_j$ can be expressed as a linear combination of the other sample ACF values, with a constant term of $-\frac{1}{2}$. In essence, it reveals that the sample ACF values are not independent of one another; instead, they are systematically interconnected.
There is at least one negative $\hat{\rho}_h$ for any stationary time series, even for an AR process with a positive theoretical ACF [6].

This property asserts that, in the context of any stationary time series, there is guaranteed to be at least one negative sample ACF value, even when dealing with autoregressive (AR) models characterized by positive ACF values.
The property of $\sum_{h=1}^{T-1} \hat{\rho}_h$ being constant and equal to $-\frac{1}{2}$ for any stationary time series has important implications for time series analysis and modeling (see, for example, [24,25]).
In conclusion, while the theoretical sum of the ACF is always zero for any white noise process with a mean of zero and arbitrary variance $\sigma^2$, it is important to note that the sum of the sample ACF consistently remains at $-\frac{1}{2}$. This holds true regardless of the time series length, mean, variance, dependencies, or any other time series characteristics.
Although the patterns of the sum of the sample ACF vary across models, the sum always converges to $-\frac{1}{2}$ as the sample size expands. This indicates that the sum of the sample ACF may not be a reliable metric for discerning long memory, especially when compared to the theoretical definition rooted in the ACF [25]. This insight is crucial, highlighting that exclusive reliance on the sample ACF for identifying long-memory processes might lead to erroneous conclusions.
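This constancy is straightforward to confirm numerically. Below is a small sketch (our own illustration, assuming the standard mean-centred, biased sample ACF summed over all lags $1, \dots, T-1$).

```python
import numpy as np

def sum_sample_acf(x):
    """Sum of the sample ACF over lags 1, ..., T-1; Hassani's Theorem gives -1/2."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    T, denom = len(x), np.sum(x * x)
    return sum(np.sum(x[h:] * x[:T - h]) / denom for h in range(1, T))

rng = np.random.default_rng(1)
noise = rng.standard_normal(300)                    # Gaussian white noise
ar1 = np.zeros(300)
for t in range(1, 300):                             # AR(1) with a positive coefficient
    ar1[t] = 0.7 * ar1[t - 1] + rng.standard_normal()

print(sum_sample_acf(noise), sum_sample_acf(ar1))   # both equal -0.5 up to rounding error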
3. Assessing Normality of Sum of Sample PACF
Let us now consider the behavior of the sample PACF, examining it both through simulation studies and by using real data sets. By leveraging simulations, we can explore the theoretical properties and potential anomalies of the sample PACF in a controlled environment. Additionally, applying our findings to real-world data, such as the 185 years of monthly Bank of England Rate, allows us to validate our theoretical insights and assess their practical implications. This dual approach not only enhances our understanding of the sample PACF but also strengthens the robustness of our conclusions in the context of time series modeling and forecasting.
3.1. Simulation
In order to investigate the distribution of the sample PACF, a simulation study is conducted. The simulation consists of 5000 sample paths for each of the considered time series lengths, the largest being $T = 1000$. Each sample path is generated from Gaussian white noise, and the sample PACF is calculated for all possible lags (lags 1 to $T-1$ for a generated sample path of length $T$). Once the sample PACFs are generated for the simulated sample paths, the Shapiro–Wilk test [26] is employed to test the normality of the cumulative sum of the sample PACF at each lag. The null hypothesis of the normality test is as follows:
$$H_0:\ \sum_{j=1}^{h} \hat{\phi}_{jj} \sim N(\mu_h, \sigma_h^2),$$
where $\mu_h$ and $\sigma_h$ are the mean and standard deviation of $\sum_{j=1}^{h} \hat{\phi}_{jj}$ in the simulated data. The $p$-values of each test and the first lag at which the normality assumption is rejected are recorded. In order to account for the stochastic nature of the simulation study, the above-described simulation is repeated 100 times, and the median, 2.5, and 97.5 percentiles of the $p$-values and of the first lag at which the normality assumption is rejected are calculated. Furthermore, the density of the sum of the sample PACF is estimated using the kernel method, and the median and the 2.5 and 97.5 percentiles of the fitted densities are calculated as well. Estimated percentiles for the first lags at which the normality of the sum of the sample PACF is rejected are presented in Table 1; a condensed sketch of one repetition of this design is given below.
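The sketch below condenses a single repetition of the simulation (using statsmodels and SciPy). The maximum lag is capped at half the sample size because recent statsmodels versions restrict the pacf routine to that range, whereas the paper's simulation uses all lags up to $T-1$; the specific length shown is only illustrative.

```python
import numpy as np
from scipy.stats import shapiro
from statsmodels.tsa.stattools import pacf

rng = np.random.default_rng(42)
T, n_paths = 200, 5000                  # one illustrative length; 5000 paths as in the text
max_lag = T // 2 - 1                    # capped for the library; the paper goes up to T - 1

# cumulative sums of the sample PACF, one row per simulated Gaussian white noise path
cum_sums = np.empty((n_paths, max_lag))
for i in range(n_paths):
    p = pacf(rng.standard_normal(T), nlags=max_lag)[1:]   # drop the lag-0 entry (always 1)
    cum_sums[i] = np.cumsum(p)

# Shapiro-Wilk p-value of the cumulative sum at each lag, and the first rejection at the 5% level
p_values = np.array([shapiro(cum_sums[:, h]).pvalue for h in range(max_lag)])
rejected = p_values < 0.05
first_rejected_lag = int(np.argmax(rejected)) + 1 if rejected.any() else None
print(first_rejected_lag)
```

Repeating this block 100 times and collecting first_rejected_lag and p_values yields quantities analogous to the medians and percentile bands reported in Table 1 and Figures 1 and 2.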
Figure 1 also shows the distribution of the first lags where the sum of the sample PACF’s distribution significantly departs from a normal distribution. As can be seen, after a certain number of lags, in all the simulations, the distribution of the sum of the sample PACF departs from a normal distribution.
Figure 2 represents the trend of the Shapiro–Wilk $p$-value for testing the normality of the cumulative sum of sample PACFs at different lags. The solid line shows the median of the $p$-values from the different simulations, and the band shows the area between the 2.5 and 97.5 percentiles of the $p$-values from the different simulations (representing a 95% confidence interval).
To better visualize the departure of the cumulative sum of the sample PACF from a normal distribution, the density of the cumulative sum of the sample PACF is estimated for the lags in Table 1. Please note that the keyword "cumulative" here indicates that the sum is taken from lag 1 to lag $h$; the meaning remains the same without the term "cumulative", which is used only for a more precise presentation of the results.
Since, in practical applications of the sample PACF (e.g., finding the order of the AR component in a model), it is common to use the sample PACF only up to a lag determined by the series length $T$ (where $T$ is the length of the observed time series), the density is also estimated for that lag (for further discussion of this selection, see, for instance, [23]). The median and 95% confidence intervals of the estimated densities are presented in Figure 3, Figure 4, Figure 5 and Figure 6.
As evident in Figure 3, Figure 4, Figure 5 and Figure 6, as the number of lags in the sum of the sample PACF increases, the distribution departs from normality. However, the rate of this departure depends on the length of the time series.
As discussed by Fuller [27], when the coefficients of a stationary AR model are close to the unit root boundaries, the asymptotic normality of the estimated parameters converges at a slower rate, necessitating a very large number of observations for accurate inference. Fuller's argument is particularly relevant to the sample PACF distribution, especially in the AR(1) case, since the first-lag sample PACF, the first-lag sample ACF, and the AR(1) coefficient estimate are equal.
As shown in Figure 3, Figure 4, Figure 5 and Figure 6, the departure of the sample PACF distribution from normality continues at later lags, even in white noise.
Figure 7 also illustrates the cumulative sum of sample PACFs and their density across different lags, for 1000 simulated sample paths from the following time series:
As can be seen, the distribution is skewed from the first lag. Furthermore, it is evident that the cumulative sum of the sample PACF follows a stochastically decreasing pattern.
3.2. Monthly Bank of England Rate for 185 Years
In our investigation of time series analysis, particularly focusing on the PACF, it was imperative to apply our theoretical insights to a real-world context. For this purpose, we selected a comprehensive and historically significant dataset: 185 years of the monthly Bank of England Rate (Figure 8, top). This dataset not only offered a vast span of data, but also presented a unique opportunity to examine economic trends over a long period, providing practical ground for testing our theoretical propositions. A full description and comprehensive analysis of the data are presented in [28].
Our approach involved decomposing the time series data to isolate the random noise component, thereby enabling a clearer analysis of the PACF. This was essential for understanding how theoretical models of the PACF hold up against actual, observed data. The residual series, after being confirmed to be white noise, is then simulated 1000 times in order to generate the distribution of the sample PACF. We applied the Shapiro–Wilk test to evaluate the normality of the distribution of the sum of the sample PACF. This analysis was pivotal in confirming or challenging our earlier simulations and theoretical deductions regarding the behavior of the sample PACF in real-world data; an outline of this procedure is sketched below.
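In outline, the real-data analysis can be sketched as follows. The file name, column name, decomposition choice, lag cap, and the use of Gaussian simulation matched to the residual variance are all placeholder assumptions of ours, not the paper's exact pipeline.

```python
import numpy as np
import pandas as pd
from scipy.stats import shapiro
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import pacf

# hypothetical file/column names; the data set itself is described in [28]
rate = pd.read_csv("boe_rate_monthly.csv", index_col=0, parse_dates=True)["rate"]

# isolate the random noise component of the series
resid = seasonal_decompose(rate, model="additive", period=12).resid.dropna()

# confirm that the residuals are compatible with white noise before simulating
print(acorr_ljungbox(resid, lags=[24]))

# simulate 1000 white noise series matching the residuals, and collect the sum of the sample PACF
rng = np.random.default_rng(0)
n_rep, max_lag = 1000, 48               # illustrative lag cap, not the paper's choice
sums = np.empty(n_rep)
for i in range(n_rep):
    sim = rng.normal(0.0, resid.std(), size=len(resid))
    sums[i] = pacf(sim, nlags=max_lag)[1:].sum()

# Shapiro-Wilk test on the distribution of the sum of the sample PACF
print(shapiro(sums))
```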
The analysis of this dataset highlighted the deviations of the empirical PACF from its theoretical expectations, especially beyond certain lags (see Figure 8, bottom). Thus, this real-world application of the PACF not only validated our previous findings, but also provided a deeper understanding of its practical implications. It underscored the necessity for analysts and economists to consider these deviations in their models and forecasts. Moreover, it opened up avenues for future research, particularly in refining PACF models to better account for the complexities and idiosyncrasies of real-world economic data.
4. Conclusions
Our analysis has demonstrated that while the PACF is a fundamental tool in uncovering the intricacies of time series data, its behavior in empirical scenarios can deviate significantly from theoretical expectations. Notably, we observed that the sum of the sample PACF and, consequently, its components, diverges from the expected normal distribution beyond a certain lag.
This deviation becomes more pronounced at larger lags, thus posing challenges to the underlying assumptions of normality in time series modeling.
The implications of these findings are profound. They compel us to reconsider and possibly revise standard practices in time series analysis, especially in the context of model selection and forecasting. The recognition that sample PACF can deviate from theoretical predictions underscores the need for a more nuanced approach in analyzing time series data, one that accounts for potential deviations and adapts to the specific characteristics of the dataset.
Furthermore, our study has highlighted the importance of a comprehensive approach that integrates both theoretical understanding and empirical analysis. By applying our findings to a real-world dataset of such historical significance, we have managed to bridge the gap between theory and practice, providing insights that are not only academically intriguing but also practically relevant.
In conclusion, our journey through the complex world of time series analysis, guided by the lens of the PACF, has revealed new challenges and opportunities. It underscores the need for continuous exploration and adaptation in the field, encouraging future research to delve deeper into these findings and further refine our understanding of time series analysis.