Abstract. To predict time series of counts with small values and remarkable fluctuations, one available model is the r states random environment process based on the negative binomial thinning operator and the geometric marginal. However, we argue that this model may suffer from the following two drawbacks. First, in the absence of prior information, the overdispersion of the geometric distribution may cause the predictions to fluctuate greatly. Second, because of the constraints on the model parameters, some estimated parameters are close to zero in real-data examples, which may not objectively reveal the correlation structure. To address the first drawback, an r states random environment process based on the binomial thinning operator and the Poisson marginal is introduced. To address the second drawback, we propose a generalized r states random environment integer-valued autoregressive model based on the binomial thinning operator to model the fluctuations of the data. Yule–Walker and conditional maximum likelihood estimates are considered, and their performances are assessed via simulation studies. Two real-data examples illustrate the better performance of the proposed models compared with some existing models.
1 Introduction
Integer-valued autoregressive (INAR) time series models have played an important role in theoretical research and real-life applications over the last few years. These models are usually constructed based on the binomial thinning operator introduced by Steutel and van Harn (1979), which is defined as follows:

$$\alpha \circ X := \sum_{i=1}^{X} \xi_i, \qquad X > 0, \tag{1.1}$$

where $\{\xi_i\}$ is a sequence of i.i.d. Bernoulli($\alpha$) random variables independent of $X$.
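Since the $\xi_i$ are i.i.d. Bernoulli($\alpha$), the thinning $\alpha \circ X$ is simply a Binomial($X, \alpha$) draw given $X$. A minimal simulation sketch (our own illustration in Python, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(2018)

def binomial_thinning(alpha: float, x: int) -> int:
    """alpha ∘ x: the sum of x i.i.d. Bernoulli(alpha) variables,
    i.e. a Binomial(x, alpha) draw (and 0 when x = 0)."""
    return int(rng.binomial(x, alpha)) if x > 0 else 0

# sanity check: E(alpha ∘ X) = alpha * E(X)
x = rng.poisson(5.0, size=10_000)
print(np.mean([binomial_thinning(0.3, xi) for xi in x]))  # ≈ 1.5
```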
Key words and phrases. Binomial thinning, INAR, random environment, time series of counts.
Received July 2018; accepted October 2018.
2 Definition

First, we introduce the model setting. The INAR process at time n, denoted by $\{X_n\}$, is obtained under a certain combination of environment conditions. We assume that there is a finite number of such combinations, denoted by $r \in \{1, 2, \ldots\}$, which also represents the number of different parameter values of the marginal distribution, and the possible sets of environment factors are represented by $E_r = \{1, 2, \ldots, r\}$. A sequence of random variables $\{Z_n\}_{n\in\mathbb{N}_0}$, where $\mathbb{N}_0 \equiv \mathbb{N} \cup \{0\}$ and $\mathbb{N} \equiv \{1, 2, \ldots\}$, is called the r states random environment process if it is a Markov chain taking values in $E_r$.

A sequence of non-negative integer-valued random variables $\{X_n(Z_n)\}_{n\in\mathbb{N}_0}$, where $X_n(Z_n)$ is defined as $\sum_{z_n=1}^{r} X_n(z_n) I_{\{Z_n=z_n\}}$, is called the r states random environment
INAR(1) process (see Definition 2 in Nastić, Laketa and Ristić (2016)) if it has the following expression:

$$X_n(Z_n) = \sum_{i=1}^{X_{n-1}(Z_{n-1})} U_i + \varepsilon_n(Z_{n-1}, Z_n), \qquad n \in \mathbb{N},$$

where

$$\varepsilon_n(Z_{n-1}, Z_n) = \sum_{z_{n-1}=1}^{r}\sum_{z_n=1}^{r} \varepsilon_n(z_{n-1}, z_n)\, I_{\{Z_{n-1}=z_{n-1},\, Z_n=z_n\}},$$

$\{U_i\}$ is a counting sequence of i.i.d. random variables, $\{Z_n\}$ is an r states random environment process, $z_n$ is the realization of the random environment state at time n, $I$ is the indicator function, and $\{\varepsilon_n(i,j)\}$, $n \in \mathbb{N}_0$, $i, j \in E_r$, are sequences of i.i.d. random variables which meet the following assumptions:

(A1) $\{Z_n\}$, $\{\varepsilon_n(1,1)\}, \{\varepsilon_n(1,2)\}, \ldots, \{\varepsilon_n(r,r)\}$ are mutually independent for all $n \in \mathbb{N}_0$;
(A2) $\{Z_m\}$ and $\varepsilon_n(i,j)$ are independent of $X_n(l)$ for $n < m$ and any $i, j, l \in E_r$.
The inspiration for our new INAR processes comes from Tang and Wang (2014) and Nastić, Laketa and Ristić (2016): the former discussed a first-order random coefficient INAR model under a random environment by introducing a Markov chain with finite state space, while the latter introduced a first-order random coefficient INAR model under a random environment with geometric distribution. An r states random environment INAR(1) process based on the binomial thinning operator with Poisson or geometric marginal distribution is given by the following definition.
Definition 1. A sequence $\{X_n(z_n)\}_{n\in\mathbb{N}_0}$, where $z_n$ is the realization of the r states random environment process $\{Z_n\}$ in time n, is the r states random environment INAR(1) process based on the binomial thinning operator (RrINAR(1)) if $X_n(z_n)$ is defined as

$$X_n(z_n) = \alpha \circ X_{n-1}(z_{n-1}) + \varepsilon_n(z_{n-1}, z_n), \qquad n \in \mathbb{N}, \tag{2.1}$$

where the counting sequence incorporated in $\alpha\circ$ consists of i.i.d. Bernoulli($\alpha$) random variables, and $X_n(z_n)$ has one of the following two marginal distributions:

Case 1.1
$$P\bigl(X_n(z_n) = x\bigr) = \frac{\mu_{z_n}^{x}}{x!}e^{-\mu_{z_n}}, \qquad x \in \mathbb{N}_0.$$

Case 1.2
$$P\bigl(X_n(z_n) = x\bigr) = \frac{\mu_{z_n}^{x}}{(1+\mu_{z_n})^{x+1}}, \qquad x \in \mathbb{N}_0.$$
Proposition 1. The bivariate time series {Xn (zn ), zn } given by (2.1) is a first-order Markov
time series.
Proposition 1 can be proved with arguments similar to those in Nastić, Laketa and Ristić (2016); similar techniques can also be found in Tang and Wang (2014).
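To make the definition concrete, the following sketch simulates a path of the Poisson case of (2.1) (Case 1.1), drawing the innovations from the distribution derived later in Theorem 1. The set-up mirrors Scenario (a) of Section 5; the function and variable names are ours:

```python
import numpy as np

rng = np.random.default_rng(421)

def simulate_p_rrinar1(n, mu, P, p0, alpha):
    """Simulate the Poisson RrINAR(1) process (2.1), Case 1.1:
    X_n(z_n) = alpha ∘ X_{n-1}(z_{n-1}) + eps_n(z_{n-1}, z_n),
    with eps_n(i, j) ~ P(mu_j - alpha * mu_i) (Theorem 1, Case 1.1)."""
    mu = np.asarray(mu, dtype=float)
    assert 0.0 <= alpha <= mu.min() / mu.max()   # innovations well defined
    r = len(mu)
    z = np.empty(n, dtype=int)
    x = np.empty(n, dtype=int)
    z[0] = rng.choice(r, p=p0)
    x[0] = rng.poisson(mu[z[0]])                 # Poisson(mu_{z_0}) marginal
    for t in range(1, n):
        z[t] = rng.choice(r, p=P[z[t - 1]])      # environment Markov chain
        thinned = rng.binomial(x[t - 1], alpha)  # alpha ∘ X_{t-1}(z_{t-1})
        x[t] = thinned + rng.poisson(mu[z[t]] - alpha * mu[z[t - 1]])
    return x, z

# Scenario (a) of Section 5:
P = np.array([[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]])
x, z = simulate_p_rrinar1(200, mu=[1, 2, 3], P=P, p0=[0.33, 0.34, 0.33], alpha=0.2)
```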
Definition 2. A sequence $\{X_n\}_{n\in\mathbb{N}_0}$ is the generalized r states random environment INAR(1) process based on the binomial thinning operator (GRrINAR(1)) if $X_n$ is defined as

$$X_n = \alpha_{z_{n-1}} \circ X_{n-1} + \varepsilon_n(z_{n-1}), \qquad n \in \mathbb{N}, \tag{2.2}$$

where $\alpha_{z_{n-1}} \in (0,1)$, $\alpha_{z_{n-1}} \circ X_{n-1} = \sum_{i=1}^{X_{n-1}} U_{i,z_{n-1}}$, the counting sequence $\{U_{i,z_{n-1}}\}_{i\in\mathbb{N}}$ incorporated in $\alpha\circ$ forms a sequence of i.i.d. random variables with pmf given as $P(U_{i,z_{n-1}} = 1) = \alpha_{z_{n-1}}$, $P(U_{i,z_{n-1}} = 0) = 1 - \alpha_{z_{n-1}}$, and $X_n$ has one of the following two marginal distributions:

Case 2.1
$$P(X_n = x) = \frac{\mu^{x}}{x!}e^{-\mu}, \qquad x \in \mathbb{N}_0.$$

Case 2.2
$$P(X_n = x) = \frac{\mu^{x}}{(1+\mu)^{x+1}}, \qquad x \in \mathbb{N}_0.$$
Note that Definition 1 considers the variation of the marginal distribution, while Defini-
tion 2 considers the variation of the thinning operator. Both models are dynamic, but they
have different emphasize: one focuses on marginal distribution, the other concentrates on the
fluctuation, which behave differently in prediction, see Figure 7 in Section 6 for an intuitive
explanation.
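For contrast with the sketch above, the following simulates a Poisson GRrINAR(1) path of (2.2) (Case 2.1), where the environment drives the thinning parameter rather than the marginal mean; the innovation law $P(\mu - \alpha_i\mu)$ is taken from Theorem 4 below (names are ours):

```python
import numpy as np

rng = np.random.default_rng(421)

def simulate_p_grrinar1(n, mu, alphas, P, p0):
    """Simulate the Poisson GRrINAR(1) process (2.2), Case 2.1:
    X_n = alpha_{z_{n-1}} ∘ X_{n-1} + eps_n(z_{n-1}),
    with eps_n(i) ~ P(mu - alpha_i * mu) (Theorem 4, Case 2.1)."""
    r = len(alphas)
    z = np.empty(n, dtype=int)
    x = np.empty(n, dtype=int)
    z[0] = rng.choice(r, p=p0)
    x[0] = rng.poisson(mu)                        # Poisson(mu) marginal
    for t in range(1, n):
        z[t] = rng.choice(r, p=P[z[t - 1]])
        a = alphas[z[t - 1]]                      # state-dependent thinning
        x[t] = rng.binomial(x[t - 1], a) + rng.poisson(mu * (1.0 - a))
    return x, z
```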
3 Properties
In this section, we derive some properties of models (2.1) and (2.2), such as the distribution of the innovations, the correlation structure and the conditional variance of the processes. Because of the structural similarity between the two kinds of models, these properties are similar. The proofs of all theorems and corollaries are given in the Appendix.
Theorem 1. Let $\{X_n(z_n)\}_{n\in\mathbb{N}_0}$ be the process given by (2.1), $\mu_1 > 0, \mu_2 > 0, \ldots, \mu_r > 0$. If $z_{n-1} = i$, $z_n = j$, $i, j \in E_r$, and $0 \le \alpha \le \min\{\mu_s/\mu_t;\, s, t \in E_r\}$, then the random variable $\varepsilon_n(i,j)$ has the following distribution.

Case 1.1
$$\varepsilon_n(i,j) \stackrel{d}{=} P(\mu_j - \alpha\mu_i).$$

Case 1.2
$$\varepsilon_n(i,j) \stackrel{d}{=} \begin{cases} \mathrm{Geom}\Bigl(\dfrac{\mu_j}{1+\mu_j}\Bigr), & \text{w.p. } 1 - \dfrac{\alpha\mu_i}{\mu_j}, \\[6pt] 0, & \text{w.p. } \dfrac{\alpha\mu_i}{\mu_j}. \end{cases}$$
Theorem 2. Let $\{X_n(z_n)\}_{n\in\mathbb{N}_0}$ be the process given by (2.1), $\mu_1 > 0, \mu_2 > 0, \ldots, \mu_r > 0$. Then:

(1) The covariance function of the random variables $X_n(z_n)$ and $X_{n-k}(z_{n-k})$, $k \in \{0, 1, \ldots, n\}$, is positive and is given as:

Case 1.1
$$\gamma_n(k) \equiv \mathrm{Cov}\bigl(X_n(z_n), X_{n-k}(z_{n-k})\bigr) = \alpha^k \mu_{z_{n-k}}.$$

Case 1.2
$$\gamma_n(k) \equiv \mathrm{Cov}\bigl(X_n(z_n), X_{n-k}(z_{n-k})\bigr) = \alpha^k \mu_{z_{n-k}}(1+\mu_{z_{n-k}}).$$

(2) The correlation function of the random variables $X_n(z_n)$ and $X_{n-k}(z_{n-k})$, $k \in \{0, 1, \ldots, n\}$, is positive, less than 1 and is given as:

Case 1.1
$$\rho_n(k) \equiv \mathrm{Corr}\bigl(X_n(z_n), X_{n-k}(z_{n-k})\bigr) = \alpha^k \sqrt{\frac{\mu_{z_{n-k}}}{\mu_{z_n}}}.$$

Case 1.2
$$\rho_n(k) \equiv \mathrm{Corr}\bigl(X_n(z_n), X_{n-k}(z_{n-k})\bigr) = \alpha^k \sqrt{\frac{\mu_{z_{n-k}}(1+\mu_{z_{n-k}})}{\mu_{z_n}(1+\mu_{z_n})}}.$$
The proof of Theorem 2 is similar to that of Theorem 3 in Nastić, Laketa and Ristić (2016), so the details are omitted here. The following theorem gives some regression properties of the RrINAR(1) process.
Theorem 3. Let $\{X_n(z_n)\}_{n\in\mathbb{N}_0}$ be the process given by (2.1), $\mu_1 > 0, \mu_2 > 0, \ldots, \mu_r > 0$. Then the k-step conditional mean and variance of $X_{n+k}(z_{n+k})$ on $X_n(z_n)$ are given by the following cases.

Case 1.1
$$E\bigl(X_{n+k}(z_{n+k})\,|\,X_n(z_n)\bigr) = \alpha^k X_n(z_n) + \mu_{z_{n+k}} - \alpha^k \mu_{z_n}, \qquad k \in \mathbb{N}_0,$$
$$\mathrm{Var}\bigl(X_{n+k}(z_{n+k})\,|\,X_n(z_n)\bigr) = \alpha^k\bigl(1-\alpha^k\bigr)\bigl(X_n(z_n) - \mu_{z_n}\bigr) + \mu_{z_{n+k}} - \alpha^{2k}\mu_{z_n}, \qquad k \in \mathbb{N}_0.$$

Case 1.2
$$E\bigl(X_{n+k}(z_{n+k})\,|\,X_n(z_n)\bigr) = \alpha^k X_n(z_n) + \mu_{z_{n+k}} - \alpha^k \mu_{z_n}, \qquad k \in \mathbb{N}_0,$$
$$\mathrm{Var}\bigl(X_{n+k}(z_{n+k})\,|\,X_n(z_n)\bigr) = \alpha^k\bigl(1-\alpha^k\bigr)\bigl(X_n(z_n) - \mu_{z_n}\bigr) + \mu_{z_{n+k}}(1+\mu_{z_{n+k}}) - \alpha^{2k}\mu_{z_n}(1+\mu_{z_n}), \qquad k \in \mathbb{N}_0.$$
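Theorem 3 translates directly into point forecasts with attached uncertainty. A sketch for Case 1.1, assuming the future state path $z_{n+1}, \ldots, z_{n+k}$ is known or has been predicted (names are ours):

```python
def forecast_p_rrinar1(x_n, z_n, z_future, mu, alpha):
    """k-step conditional mean and variance of Theorem 3 (Case 1.1).
    z_future holds the predicted states z_{n+1}, ..., z_{n+k};
    states are coded 0, ..., r-1 so that mu[z] is the state mean."""
    k = len(z_future)
    mean = alpha**k * x_n + mu[z_future[-1]] - alpha**k * mu[z_n]
    var = (alpha**k * (1 - alpha**k) * (x_n - mu[z_n])
           + mu[z_future[-1]] - alpha**(2 * k) * mu[z_n])
    return mean, var

# e.g. a two-step forecast from X_n = 4 observed in state 0:
mean, var = forecast_p_rrinar1(4, 0, [1, 2], mu=[1.0, 2.0, 3.0], alpha=0.2)
```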
Theorem 4. Let $\{X_n\}_{n\in\mathbb{N}_0}$ be the process given by (2.2), $\alpha_i \in (0,1)$, $i \in E_r$. If $z_{n-1} = i$, $i \in E_r$, then the random variable $\varepsilon_n(i)$ has the following distribution:

Case 2.1
$$\varepsilon_n(i) \stackrel{d}{=} P(\mu - \alpha_i\mu).$$

Case 2.2
$$\varepsilon_n(i) \stackrel{d}{=} \begin{cases} \mathrm{Geom}\Bigl(\dfrac{\mu}{1+\mu}\Bigr), & \text{w.p. } 1-\alpha_i, \\[6pt] 0, & \text{w.p. } \alpha_i. \end{cases}$$
Corollary 2. Let $z_{n-1} = i$, $i \in E_r$. The expectation and variance of the random variable $\varepsilon_n(i)$ in Definition 2 are given as:

Case 2.1 $E(\varepsilon_n(i)) = \mu - \alpha_i\mu$ and $\mathrm{Var}(\varepsilon_n(i)) = \mu - \alpha_i\mu$, respectively.
Case 2.2 $E(\varepsilon_n(i)) = \mu - \alpha_i\mu$ and $\mathrm{Var}(\varepsilon_n(i)) = \mu(1+\mu) - \alpha_i\mu(1+\alpha_i\mu)$, respectively.
Theorem 5. Let $\{X_n\}_{n\in\mathbb{N}_0}$ be the process given by Definition 2, $\alpha_i \in (0,1)$, $i \in E_r$. The one-step conditional variance of $X_{n+1}$ on $X_n$ is:

Case 2.1
$$\mathrm{Var}(X_{n+1}\,|\,X_n) = \alpha_{z_n}(1-\alpha_{z_n})X_n + (1-\alpha_{z_n})\mu.$$

Case 2.2
$$\mathrm{Var}(X_{n+1}\,|\,X_n) = \alpha_{z_n}(1-\alpha_{z_n})X_n + \mu(1+\mu) - \alpha_{z_n}\mu(1+\alpha_{z_n}\mu).$$
4 Estimation
The RrINAR(1) and GRrINAR(1) models are dynamic: they can adjust their marginal distributions or their innovation distributions to varying circumstances through time. Suppose we have data, a set of realizations $\{x_1, x_2, \ldots, x_N\}$, and we wish to estimate the parameters that fit the data. For RrINAR(1), we can use K-means clustering to partition the N observations into r clusters, in which each observation belongs to the cluster with the nearest mean; we use the statistical software R to obtain the sequence $z_1, z_2, \ldots, z_N$. For GRrINAR(1), we can use the relationship between $\alpha_{z_i}$ and the one-step conditional variance. Take r = 2 as an example and consider the absolute difference between $x_i$ and $x_{i+1}$: if $|x_{i+1} - x_i| \le \sigma$, where $\sigma$ is the standard deviation of the marginal distribution, then $z_i = 1$; otherwise $z_i = 2$. Similarly, for r = 3: if $|x_{i+1} - x_i| \le \sigma$, then $z_i = 1$; if $\sigma < |x_{i+1} - x_i| \le 2\sigma$, then $z_i = 2$; otherwise $z_i = 3$. Using this criterion, we can derive the sequence $z_1, z_2, \ldots, z_{N-1}$.
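A sketch of this jump-size rule, using the sample standard deviation as a stand-in for the marginal standard deviation $\sigma$ (the helper name is ours):

```python
import numpy as np

def assign_states(x, r):
    """Assign states z_1, ..., z_{N-1} from jump sizes: z_i = j when
    (j-1)*sigma < |x_{i+1} - x_i| <= j*sigma, capped at r.  The sample
    standard deviation stands in for the marginal standard deviation."""
    x = np.asarray(x, dtype=float)
    sigma = x.std(ddof=1)
    jumps = np.abs(np.diff(x))
    z = np.minimum(np.ceil(jumps / sigma), r).astype(int)
    return np.maximum(z, 1)        # zero jumps also belong to state 1
```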
We consider Yule–Walker (YW) estimation for the RrINAR(1) model, and conditional maximum likelihood (CML) estimation for both the RrINAR(1) and GRrINAR(1) models. For the GRrINAR(1) model, which contains several different thinning operators, YW estimation is not considered, since moment estimation usually shows low efficiency in small samples.
4.1 Yule–Walker estimation

For the subsample corresponding to circumstance k, the lag-one moment estimator computed on the part $R_{k,l}$ is

$$\hat{\gamma}_{1,l}^{(k)} = \frac{1}{c_{k,l}} \sum_{\{i,i+1\}\subseteq R_{k,l}} \bigl(X_{i+1}(k) - \hat{\mu}_{k,l}\bigr)\bigl(X_i(k) - \hat{\mu}_{k,l}\bigr).$$

Definition 3. Under circumstance k, the estimators obtained from the subsample $U_k$ are defined as

$$\hat{\mu}_k = \frac{1}{n_k}\sum_{i\in I_k} X_i(k), \qquad \hat{\gamma}_0^{(k)} = \frac{1}{n_k}\sum_{i\in I_k} \bigl(X_i(k) - \hat{\mu}_k\bigr)^2,$$

$$\hat{\gamma}_1^{(k)} = \frac{1}{s_k}\sum_{\{i,i+1\}\subseteq I_k} \bigl(X_{i+1}(k) - \hat{\mu}_k\bigr)\bigl(X_i(k) - \hat{\mu}_k\bigr),$$

where $s_k = \sum_{l\in I,\; n_{k,l}>1} c_{k,l}$.
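Under the natural reading that $I_k$ collects the time indices observed in circumstance k and that $\hat{\gamma}_1^{(k)}$ averages over consecutive pairs staying in that state, the estimators of Definition 3 can be sketched as follows; the last line uses $\gamma_1^{(k)} = \alpha\,\gamma_0^{(k)}$, which holds within a state in both marginal cases by Theorem 2:

```python
import numpy as np

def yw_subsample_estimates(x, z, k):
    """Moment estimates of Definition 3 for circumstance k, reading I_k as
    the set of time indices i with z_i = k and using the consecutive pairs
    {i, i+1} that stay inside I_k for the lag-one covariance."""
    x, z = np.asarray(x, dtype=float), np.asarray(z)
    idx = np.flatnonzero(z == k)
    mu_k = x[idx].mean()
    gamma0_k = np.mean((x[idx] - mu_k) ** 2)
    pairs = idx[np.isin(idx + 1, idx)]       # i such that i and i+1 are in I_k
    gamma1_k = (np.mean((x[pairs + 1] - mu_k) * (x[pairs] - mu_k))
                if len(pairs) else np.nan)
    alpha_k = gamma1_k / gamma0_k            # from gamma_1 = alpha * gamma_0
    return mu_k, gamma0_k, gamma1_k, alpha_k
```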
4.2 Conditional maximum likelihood estimation

We omit the factor $p_{i-1,i} = P(Z_i = z_i \mid Z_{i-1} = z_{i-1})$ and derive the joint log-likelihood function as follows:

$$\log L = \log L(x_1, z_1, \ldots, x_N, z_N \mid \mu_1, \mu_2, \ldots, \mu_r, \alpha) = \sum_{i=2}^{N} \log P\bigl(X_i(z_i) = x_i \mid X_{i-1}(z_{i-1}) = x_{i-1}\bigr).$$

Case 2.1

$$\log L = \sum_{i=2}^{N}\log\Biggl[\sum_{k=0}^{\min\{x_{i-1},x_i\}} \binom{x_{i-1}}{k}\, \alpha_{z_{i-1}}^{k}\bigl(1-\alpha_{z_{i-1}}\bigr)^{x_{i-1}-k}\, e^{-\mu(1-\alpha_{z_{i-1}})}\,\frac{\bigl(\mu(1-\alpha_{z_{i-1}})\bigr)^{x_i-k}}{(x_i-k)!}\Biggr],$$

where the innovation term follows from Theorem 4, Case 2.1.
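A sketch of the CML computation for Case 2.1: the log-likelihood above is evaluated directly and maximized numerically (a generic optimizer stands in for whatever routine the paper used; states are coded 1, ..., r, and all names are ours):

```python
import numpy as np
from math import comb, exp, factorial, log
from scipy.optimize import minimize

def neg_loglik_case21(theta, x, z, r):
    """Negative conditional log-likelihood of Case 2.1;
    theta = (alpha_1, ..., alpha_r, mu), states z coded 1, ..., r."""
    alphas, mu = theta[:r], theta[r]
    if np.any(alphas <= 0) or np.any(alphas >= 1) or mu <= 0:
        return np.inf                          # keep the search in bounds
    ll = 0.0
    for i in range(1, len(x)):
        a = alphas[z[i - 1] - 1]
        lam = mu * (1.0 - a)                   # innovation mean (Theorem 4)
        p = sum(comb(int(x[i - 1]), k) * a**k * (1 - a)**(x[i - 1] - k)
                * exp(-lam) * lam**(x[i] - k) / factorial(int(x[i] - k))
                for k in range(min(x[i - 1], x[i]) + 1))
        ll += log(max(p, 1e-300))
    return -ll

# hypothetical usage with counts x and assigned states z:
# theta0 = np.array([0.3] * r + [float(np.mean(x))])
# fit = minimize(neg_loglik_case21, theta0, args=(x, z, r), method="Nelder-Mead")
```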
5 Numerical simulations
A simulation study was conducted to evaluate the finite-sample performance of the YW and CML estimates. We simulated 500 replications with sample sizes n = 200, 400 for each model, and considered two practicable cases: three states and two states. For the estimation of the parameters, the most important step is to determine the Markov chain: we set the vector p as the discrete distribution of $z_1$, and the matrix P as the transition probability matrix of the process. This matrix sets the frequencies of the realized circumstances and shapes the dynamical structure of the RrINAR(1) and GRrINAR(1) processes. The following set-ups were considered:
For the Poisson RrINAR(1) model (P-RrINAR(1) for short) and the geometric RrINAR(1) model (G-RrINAR(1) for short):

(1) The P-R3INAR(1) model (three states random environment process based on the binomial thinning operator with Poisson marginal distribution, (2.1) with Case 1.1) and the G-R3INAR(1) model (three states random environment process based on the binomial thinning operator with geometric marginal distribution, (2.1) with Case 1.2) with the following three scenarios:

Scenario (a): $(\mu_1, \mu_2, \mu_3, \alpha) = (1, 2, 3, 0.2)$, $p = (0.33, 0.34, 0.33)$,
$$P = \begin{pmatrix} 0.4 & 0.3 & 0.3 \\ 0.3 & 0.4 & 0.3 \\ 0.3 & 0.3 & 0.4 \end{pmatrix};$$
6 Illustrative examples
In applications of INAR models, researchers may find that the time series are not stationary. Stationary INAR models may then not be the best choice, but they are usually unavoidable. We have
obtained two time series representing a monthly counting of drug reselling (DRUGS) from
the Forecasting Principles website (http://www.forecastingprinciples.com). These crimes are
reported in the 27th (Figure 1) and the 24th (Figure 5) police car beats in Pittsburgh from
January 1990 to December 2001, each constituting a sequence of 144 observations. The data
in the 27th police car beat was also discussed in Nastić, Laketa and Ristić (2016).
The first step in standard INAR modeling is to obtain the plots of the time series, the ACF and the partial ACF (PACF). From the ACF and PACF plots (Figures 1 and 5), we find that modeling the counts using INAR(1) is reasonable. In the time series plots, it is not difficult to see that, besides small jumps, there is a steady, permanent and significant increase in the last two years of the observed period. So we assume that these two counting sequences may arise from different environments, or may have different parameter turbulence through time, and we can apply our models to them.
Table 1 Mean of estimates, RMSE (within parentheses) for P-R3INAR(1) and G-R3INAR(1)

Model  Scenario  n    Method  μ̂₁               μ̂₂               μ̂₃               α̂
P      (a)       200  YW      1.0269 (0.1381)  2.0311 (0.2012)  3.0253 (0.2385)  0.1949 (0.1245)
                      CML     0.9988 (0.1299)  2.0008 (0.1852)  3.0224 (0.2199)  0.2041 (0.0657)
                 400  YW      1.0181 (0.0952)  2.0122 (0.1363)  3.0020 (0.1616)  0.1981 (0.0868)
                      CML     0.9976 (0.0925)  2.0030 (0.1292)  3.0079 (0.1688)  0.1975 (0.0435)
       (b)       200  YW      4.0198 (0.3497)  5.0080 (0.3536)  5.9958 (0.4200)  0.4787 (0.1200)
                      CML     4.0001 (0.3011)  5.0039 (0.3217)  5.9976 (0.3795)  0.4998 (0.0489)
                 400  YW      3.9892 (0.2537)  4.9965 (0.2327)  5.9856 (0.3091)  0.4893 (0.0901)
                      CML     4.0010 (0.2146)  4.9951 (0.2163)  6.0051 (0.2641)  0.5011 (0.0326)
       (c)       200  YW      1.0336 (0.1411)  2.0078 (0.1851)  6.0022 (0.3019)  0.0986 (0.1288)
                      CML     0.9922 (0.1326)  1.9928 (0.1836)  6.0192 (0.3309)  0.0991 (0.0639)
                 400  YW      1.0128 (0.0871)  2.0056 (0.1251)  5.9933 (0.2191)  0.1012 (0.0893)
                      CML     1.0019 (0.0954)  2.0019 (0.1277)  6.0021 (0.2187)  0.0990 (0.0431)
Geo    (a)       200  YW      1.0195 (0.1839)  2.0017 (0.3314)  2.9859 (0.4569)  0.1944 (0.1307)
                      CML     1.0014 (0.1804)  1.9817 (0.3188)  3.0203 (0.4303)  0.1986 (0.0402)
                 400  YW      1.0197 (0.1319)  2.0039 (0.2290)  3.0233 (0.3226)  0.1959 (0.0935)
                      CML     1.0010 (0.1237)  1.9948 (0.2135)  2.9906 (0.3020)  0.2001 (0.0272)
       (b)       200  YW      4.0079 (0.7932)  5.0184 (0.7712)  6.0141 (1.1440)  0.4712 (0.1458)
                      CML     3.9647 (0.6584)  4.9762 (0.7384)  5.9574 (0.9193)  0.5008 (0.0233)
                 400  YW      3.9964 (0.5490)  5.0242 (0.5713)  5.9948 (0.8045)  0.4813 (0.1027)
                      CML     4.0032 (0.4394)  4.9809 (0.4994)  5.9941 (0.6342)  0.4995 (0.0170)
       (c)       200  YW      1.0342 (0.1857)  2.0200 (0.3231)  5.9937 (0.7944)  0.1017 (0.1300)
                      CML     0.9813 (0.1810)  1.9675 (0.3006)  6.0350 (0.8026)  0.1011 (0.0314)
                 400  YW      1.0158 (0.1289)  2.0218 (0.2126)  5.9913 (0.5910)  0.0995 (0.0896)
                      CML     1.0022 (0.1213)  2.0003 (0.2173)  6.0031 (0.5616)  0.1015 (0.0211)
We model these two datasets using the RrINAR(1) in (2.1), the GRrINAR(1) in (2.2) and the geometric RrINAR(1) based on the negative binomial thinning operator (RrNGINAR(1) for short) from Nastić, Laketa and Ristić (2016). For Example 1 (27th police car beat), we fit the models mentioned above to the whole sequence of 144 observations, determine which random environment each observation belongs to, then estimate the parameters and generate the fitted data according to the known random environments. Finally, we calculate the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the RMSE between the 144 fitted values and the corresponding observations. For Example 2 (24th police car beat), we divide the sequence of 144 observations into two parts: the first consists of the former 132 observations, the second of the last 12 observations. The transition probability matrix is estimated by the method in Anderson and Goodman (1957). According to this matrix, we predict which random environments the last 12 observations belong to, and then generate the last 12 predictions. Finally, we calculate the AIC and BIC, as well as the RMSE between the last 12 predictions and the last 12 observations.
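The estimator of Anderson and Goodman (1957) for the transition matrix reduces to transition counts normalized by row sums; a sketch:

```python
import numpy as np

def estimate_transition_matrix(z, r):
    """MLE of the transition matrix (Anderson and Goodman, 1957):
    P_hat[i, j] = n_ij / n_i, transitions i -> j over visits to i.
    States are coded 1, ..., r."""
    counts = np.zeros((r, r))
    for a, b in zip(z[:-1], z[1:]):
        counts[a - 1, b - 1] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

# most likely next state from the last observed one:
# z_next = int(np.argmax(P_hat[z[-1] - 1])) + 1
```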
Example 1
We establish the above models based on the sequence of 144 observations, then calculate the AIC and BIC, as well as the RMSE between the observations and the predicted values.
Table 2 Mean of estimates, RMSE (within parentheses) for P-R2INAR(1) and G-R2INAR(1)
Table 3 Mean of estimates, RMSE (within parentheses) for P-GR3INAR(1) and G-GR3INAR(1)

Model  Scenario  n    Method  α̂₁               α̂₂               α̂₃               μ̂
P      (g)       200  CML     0.4989 (0.0896)  0.3003 (0.1147)  0.0972 (0.1318)  5.0119 (0.2088)
                 400  CML     0.5022 (0.0584)  0.2940 (0.0796)  0.0971 (0.0897)  4.9974 (0.1579)
       (h)       200  CML     0.6965 (0.0612)  0.4922 (0.0800)  0.2913 (0.1224)  1.9983 (0.1619)
                 400  CML     0.6966 (0.0427)  0.4970 (0.0577)  0.2925 (0.0817)  1.9941 (0.1242)
       (i)       200  CML     0.5946 (0.0731)  0.3926 (0.1006)  0.2047 (0.1126)  2.9990 (0.1723)
                 400  CML     0.5962 (0.0511)  0.3957 (0.0682)  0.1954 (0.0814)  3.0052 (0.1307)
Geo    (g)       200  CML     0.4976 (0.0443)  0.3024 (0.0526)  0.1003 (0.0659)  5.0250 (0.4941)
                 400  CML     0.4997 (0.0317)  0.3012 (0.0363)  0.0996 (0.0420)  4.9920 (0.3455)
       (h)       200  CML     0.7009 (0.0487)  0.5009 (0.0532)  0.3030 (0.0792)  2.0071 (0.2880)
                 400  CML     0.6997 (0.0340)  0.4982 (0.0407)  0.3036 (0.0524)  2.0018 (0.1970)
       (i)       200  CML     0.5992 (0.0491)  0.3987 (0.0605)  0.1995 (0.0735)  2.9910 (0.3526)
                 400  CML     0.5979 (0.0326)  0.3977 (0.0417)  0.2008 (0.0465)  3.0057 (0.2497)
For P-RrINAR(1), G-RrINAR(1) and RrNGINAR(1), we separate the sequence into two or three possible random states by the K-means clustering algorithm (Figure 2). For P-GRrINAR(1) and G-GRrINAR(1), we use the criterion in Section 4 to separate the random environment into two or three random states. After that, we use CML from Section 4.2 to estimate the parameters.
Table 4 Mean of estimates, RMSE (within parentheses) for P-GR2INAR(1) and G-GR2INAR(1)
The results in Table 5 for RrNGINAR(1) are not exactly the same as those in Nastić, Laketa and Ristić (2016); since the result of a K-means cluster analysis in R is often not unique and different algorithms may be adopted, this small difference is reasonable. According to the results in Table 5, P-R3INAR(1) has the smallest AIC, BIC and RMSE for this dataset, which means that P-R3INAR(1) performs best among these models.
Figure 3 27th DRUGS data (•) and fitted values from P-R3INAR(1).
The fitted P-R3INAR(1) model for the 27th DRUGS data is shown in Figure 3. To further examine the adequacy of the fitted model, we consider the Pearson residuals, defined by

$$r_{1t} = \frac{X_t - \hat{\alpha}X_{t-1}(z_{t-1}) - \hat{\mu}_{z_t} + \hat{\alpha}\hat{\mu}_{z_{t-1}}}{\bigl[\hat{\alpha}(1-\hat{\alpha})\bigl(X_{t-1}(z_{t-1}) - \hat{\mu}_{z_{t-1}}\bigr) + \hat{\mu}_{z_t} - \hat{\alpha}^2\hat{\mu}_{z_{t-1}}\bigr]^{1/2}}.$$

We can also calculate the mean square error of the Pearson residuals, which is equal to $\sum_{t=1}^{n} r_{1t}^2/(n-p)$, where p denotes the number of estimated parameters. Table 6 gives some characteristics of the residuals for the P-R3INAR(1) model. Figure 4 plots the ACF and PACF of the Pearson residuals. There is no evidence of any correlation within the residuals, a finding supported by the Ljung–Box statistic of 20.4391 based on 15 lags (because $\chi^2_{0.05}(14) = 23.6847$).
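A sketch computing these Pearson residuals from a fitted P-RrINAR(1) model, using the one-step conditional moments of Theorem 3 (names are ours):

```python
import numpy as np

def pearson_residuals(x, z, mu_hat, alpha_hat):
    """Pearson residuals r_{1t} for a fitted P-RrINAR(1) model, built from
    the one-step conditional mean and variance (Theorem 3, Case 1.1).
    States z are coded 1, ..., r and mu_hat[k-1] estimates mu_k."""
    x = np.asarray(x, dtype=float)
    m = np.array([mu_hat[k - 1] for k in z])
    num = x[1:] - alpha_hat * x[:-1] - m[1:] + alpha_hat * m[:-1]
    den = np.sqrt(alpha_hat * (1 - alpha_hat) * (x[:-1] - m[:-1])
                  + m[1:] - alpha_hat**2 * m[:-1])
    return num / den

# mean square error of the residuals, with p estimated parameters:
# mse = np.sum(res**2) / (len(x) - p)
```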
Example 2
We divide the sequence of 144 observations into two parts, the former 132 observations forming the first part and the last 12 observations the second. We establish the above models on the first part, then estimate the transition probability matrix between the environment states in the former 132 observations. Next, we forecast the 12 probable environment states using the transition probability matrix, and then forecast the related observations. Finally, we calculate the AIC and BIC on the former 132 observations, and the RMSE between the last 12 observations and the related predictions.
For P-RrINAR(1), G-RrINAR(1) and RrNGINAR(1), we separate the sequence into two or three random states by the K-means clustering algorithm (Figure 6). For P-GRrINAR(1) and G-GRrINAR(1), we use the criterion in Section 4 to separate the random environment into two or three random states. We estimate the transition probability matrix by the method in Anderson and Goodman (1957). After that, we use CML from Section 4.2 to estimate the parameters of the INAR models. According to the results in Table 7, P-R3INAR(1) has the smallest AIC and BIC and the second-smallest RMSE, which means that P-R3INAR(1) performs best among these models. Figure 7 gives the predictions of the last 12 observations based on ten models of the 24th DRUGS data.

Figure 4 Pearson residual analysis for the 27th DRUGS data: the autocorrelation function and the partial autocorrelation function of the residuals.
These two examples show that the P-R3INAR(1) model performs best among these models when fitting the above two datasets. The reason may be that the sample sizes are not large, the number of observations per state is small, and the variance of the Poisson marginal distribution is smaller than that of the geometric distribution with the same expectation; the larger fluctuation of the geometric distribution makes it more difficult to fit the marginal distribution. We also found that the number of states has a great impact on the estimation of these models. For example, in Example 1, P-R3INAR(1) performs better than G-R3INAR(1), but G-R2INAR(1) performs better than P-R2INAR(1). How to determine the number of random states before estimation is still an open question. If there is prior information, we can choose the model according to the actual situation. If not, estimation with the Poisson distribution is more accurate than with the geometric distribution in some real datasets, especially for small samples with sudden fluctuations.
7 Conclusion
In this paper, we propose two classes of INAR(1) models (RrINAR(1) and GRrINAR(1)) and illustrate that they perform better than the existing RrNGINAR(1) in some cases. There is still space for further study. First, the marginal distributions of $X_n(z_n)$ in RrINAR(1) may be generalized to other discrete distributions, such as the class of zero-modified distributions. Second, the prediction of the GRrINAR(1) models is not accurate enough, because it only captures whether the fluctuation is bigger or smaller, not the value of the data; using a signed binomial thinning operator may be more accurate.
Appendix
Proof of Theorem 1.

Case 1.1 When $z_{n-1} = i$ and $z_n = j$, where $i, j \in E_r$, we have $E(s^{X_n(j)}) = E(s^{\alpha\circ X_{n-1}(i)+\varepsilon_n(i,j)})$. The left-hand side becomes $E(s^{X_n(j)}) = e^{(s-1)\mu_j}$. Consider the right-hand side. Since the random variable $X_{n-1}(i)$ is independent of the random variable $\varepsilon_n(i,j)$, we have

$$E\bigl(s^{\alpha\circ X_{n-1}(i)+\varepsilon_n(i,j)}\bigr) = E\bigl(s^{\alpha\circ X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = E\bigl((E s^{U_1})^{X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = E\bigl((s\alpha+1-\alpha)^{X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = e^{(s-1)\alpha\mu_i}E\bigl(s^{\varepsilon_n(i,j)}\bigr).$$

The pgf of the random variable $\varepsilon_n(i,j)$ is thus given as

$$E\bigl(s^{\varepsilon_n(i,j)}\bigr) = e^{(s-1)(\mu_j-\alpha\mu_i)},$$

and if $0 \le \alpha \le \mu_j/\mu_i$, the random variable $\varepsilon_n(i,j)$ has the distribution given in Case 1.1. Since $i$ and $j$ are arbitrary numbers from the set $E_r$, it follows that the random variables $\varepsilon_n(1,1), \varepsilon_n(1,2), \ldots, \varepsilon_n(r,r)$ have well-defined distributions for $\alpha \in \bigcap_{k,l\in E_r}[0, \mu_k/\mu_l]$, that is, for $0 \le \alpha \le \min\{\mu_k/\mu_l;\, k, l \in E_r\}$.

Table 7 Transition matrix, parameter estimates, RMSE, AIC and BIC of the 24th DRUGS data
Case 1.2 When $z_{n-1} = i$ and $z_n = j$, where $i, j \in E_r$, we have $E(s^{X_n(j)}) = E(s^{\alpha\circ X_{n-1}(i)+\varepsilon_n(i,j)})$. The left-hand side becomes $E(s^{X_n(j)}) = \frac{1}{1+\mu_j-\mu_j s}$. Consider the right-hand side. Since the random variable $X_{n-1}(i)$ is independent of the random variable $\varepsilon_n(i,j)$, we have

$$E\bigl(s^{\alpha\circ X_{n-1}(i)+\varepsilon_n(i,j)}\bigr) = E\bigl(s^{\alpha\circ X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = E\bigl((E s^{U_1})^{X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = \frac{1}{1+\mu_i-\mu_i(s\alpha+1-\alpha)}E\bigl(s^{\varepsilon_n(i,j)}\bigr).$$

The pgf of the random variable $\varepsilon_n(i,j)$ is thus given as

$$E\bigl(s^{\varepsilon_n(i,j)}\bigr) = \frac{1+\alpha\mu_i-\alpha\mu_i s}{1+\mu_j-\mu_j s} = \Bigl(1-\frac{\alpha\mu_i}{\mu_j}\Bigr)\frac{1}{1+\mu_j-\mu_j s} + \frac{\alpha\mu_i}{\mu_j},$$

and if $0 \le \alpha \le \mu_j/\mu_i$, the random variable $\varepsilon_n(i,j)$ has the distribution in Case 1.2. Since $i$ and $j$ are arbitrary numbers from the set $E_r$, it follows that the random variables $\varepsilon_n(1,1), \varepsilon_n(1,2), \ldots, \varepsilon_n(r,r)$ have well-defined distributions for $\alpha \in \bigcap_{k,l\in E_r}[0, \mu_k/\mu_l]$, that is, for $0 \le \alpha \le \min\{\mu_k/\mu_l;\, k, l \in E_r\}$.

Figure 7 The prediction of the last 12 observations based on ten models of the 24th DRUGS data.
Proof of Corollaries 1 and 2. The results follow from the facts that $E(\varepsilon_n(i,j)) = \Phi'_{\varepsilon}(1)$ and $\mathrm{Var}(\varepsilon_n(i,j)) = \Phi''_{\varepsilon}(1) + \Phi'_{\varepsilon}(1)\bigl(1-\Phi'_{\varepsilon}(1)\bigr)$, where $\Phi_{\varepsilon}(s)$ is the pgf of the random variable $\varepsilon_n(i,j)$; one can then derive the results through a simple calculation.
Proof of Theorem 3. Let $\mu_{n+k|n} = E(X_{n+k}\,|\,X_n)$ and $\mu_{\varepsilon_n} = E(\varepsilon_n)$. According to Definition 1 and the independence of the random variables $X_{n+k-1}$ and $\varepsilon_{n+k}$, the conditional expectation of $X_{n+k}$ on $X_n$ satisfies the equation $\mu_{n+k|n} = \alpha\mu_{n+k-1|n} + \mu_{\varepsilon_{n+k}}$. Using this equation $k-1$ times and the fact that $\mu_{n|n} = X_n$, we obtain

$$\mu_{n+k|n} = \alpha^k X_n + \sum_{l=0}^{k-1}\alpha^l \mu_{\varepsilon_{n+k-l}}.$$

Using the result of Corollary 1 for the expectations of the random variables $\varepsilon_{n+k-l}$, $l \in \{0, 1, \ldots, k-1\}$, we obtain the expression for the conditional expectation.

Next consider the conditional variance. Let $\sigma^2_{n+k|n} = \mathrm{Var}(X_{n+k}(z_{n+k})\,|\,X_n(z_n))$ and $\sigma^2_{\varepsilon_{n+k}} = \mathrm{Var}(\varepsilon_{n+k})$. Using a similar argument and the properties of the binomial thinning operator, the conditional variance satisfies the equation

$$\sigma^2_{n+k|n} = \alpha^2\sigma^2_{n+k-1|n} + \alpha(1-\alpha)\mu_{n+k-1|n} + \sigma^2_{\varepsilon_{n+k}}.$$

Using this equation $k-1$ times, we obtain

$$\sigma^2_{n+k|n} = \alpha^{2k}\sigma^2_{n|n} + \alpha(1-\alpha)\sum_{l=0}^{k-1}\alpha^{2l}\mu_{n+k-1-l|n} + \sum_{l=0}^{k-1}\alpha^{2l}\sigma^2_{\varepsilon_{n+k-l}}.$$

Finally, using the fact that $\sigma^2_{n|n} = 0$ and Corollary 1 for the variances of the random variables $\varepsilon_{n+k-l}$, $l \in \{0, 1, \ldots, k-1\}$, the proof is complete.
Proof of Theorem 4.

Case 2.1 When $z_{n-1} = i$, $i \in E_r$, we have $E(s^{X_n}) = E(s^{\alpha_i\circ X_{n-1}+\varepsilon_n(i)})$. The left-hand side becomes $E(s^{X_n}) = e^{(s-1)\mu}$. Consider the right-hand side; since $X_{n-1}$ is independent of $\varepsilon_n(i)$, we have

$$E\bigl(s^{\alpha_i\circ X_{n-1}+\varepsilon_n(i)}\bigr) = E\bigl(s^{\alpha_i\circ X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = E\bigl((E s^{U_{1,i}})^{X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = E\bigl((s\alpha_i+1-\alpha_i)^{X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = e^{(s-1)\alpha_i\mu}E\bigl(s^{\varepsilon_n(i)}\bigr).$$

The pgf of the random variable $\varepsilon_n(i)$ is given as

$$E\bigl(s^{\varepsilon_n(i)}\bigr) = e^{(s-1)(\mu-\alpha_i\mu)}.$$

Since $i$ is an arbitrary number from the set $E_r$, it follows that $\varepsilon_n(1), \varepsilon_n(2), \ldots, \varepsilon_n(r)$ are well defined; if $0 \le \alpha_i \le 1$, then $\varepsilon_n(i)$ has the distribution in Case 2.1.

Case 2.2 When $z_{n-1} = i$, where $i \in E_r$, we have $E(s^{X_n}) = E(s^{\alpha_i\circ X_{n-1}+\varepsilon_n(i)})$. The left-hand side becomes $E(s^{X_n}) = \frac{1}{1+\mu-\mu s}$. Consider the right-hand side; since $X_{n-1}$ is independent of $\varepsilon_n(i)$, we have

$$E\bigl(s^{\alpha_i\circ X_{n-1}+\varepsilon_n(i)}\bigr) = E\bigl(s^{\alpha_i\circ X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = E\bigl((s\alpha_i+1-\alpha_i)^{X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = \frac{1}{1+\mu-\mu(s\alpha_i+1-\alpha_i)}E\bigl(s^{\varepsilon_n(i)}\bigr),$$

so the pgf of the random variable $\varepsilon_n(i)$ is given as

$$E\bigl(s^{\varepsilon_n(i)}\bigr) = \frac{1+\alpha_i\mu-\alpha_i\mu s}{1+\mu-\mu s} = (1-\alpha_i)\frac{1}{1+\mu-\mu s} + \alpha_i,$$

which is the pgf of the mixture given in Case 2.2.
Proof of Theorem 5. The proof is similar to the situation k = 1 in the second part of Theorem 3.
Acknowledgments
We thank the Editor and the anonymous referee for their constructive comments and suggestions that have greatly improved the paper. This work is supported by National Natural Science Foundation of China (Nos. 11871027, 11731015), Science and Technology Developing Plan of Jilin Province (No. 20170101057JC), Science and Technology Program of Jilin Educational Department during the “13th Five-Year” Plan Period (No. 2016-399), and Cultivation Plan for Excellent Young Scholar Candidates of Jilin University. Zhu is the corresponding author.
References
Al-Osh, M. A. and Alzaid, A. A. (1987). First order integer-valued autoregressive (INAR(1)) processes. Journal of Time Series Analysis 8, 261–275. MR0903755
Anderson, T. W. and Goodman, L. A. (1957). Statistical inference about Markov chains. The Annals of Mathematical Statistics 28, 89–110. MR0084903
Barczy, M., Ispány, M. and Pap, G. (2011). Asymptotic behavior of unstable INAR(p) processes. Stochastic Processes and Their Applications 121, 583–608. MR2763097
Bu, R., McCabe, B. P. M. and Hadri, K. (2008). Maximum likelihood estimation of higher-order integer-valued autoregressive processes. Journal of Time Series Analysis 29, 973–994. MR2464949
Drost, F. C., van den Akker, R. and Werker, B. J. M. (2008). Local asymptotic normality and efficient estimation for INAR(p) models. Journal of Time Series Analysis 29, 783–801. MR2450896
Drost, F. C., van den Akker, R. and Werker, B. J. M. (2009). Efficient estimation of auto-regression parameters and innovation distributions for semiparametric integer-valued AR(p) models. Journal of the Royal Statistical Society, Series B, Statistical Methodology 71, 467–485. MR2649605
Jazi, M. A., Jones, G. and Lai, C. D. (2012). First-order integer valued AR processes with zero inflated Poisson innovations. Journal of Time Series Analysis 33, 954–963. MR2991911
Joe, H. (1996). Time series models with univariate margins in the convolution-closed infinitely divisible class. Journal of Applied Probability 33, 664–677. MR1401464
Laketa, P. N., Nastić, A. S. and Ristić, M. M. (2018). Generalized random environment INAR models of higher order. Mediterranean Journal of Mathematics 15. MR3740339 https://doi.org/10.1007/s00009-017-1054-z
Latour, A. (1998). Existence and stochastic structure of a non-negative integer-valued autoregressive processes. Journal of Time Series Analysis 19, 439–455. MR1652193
Li, H., Yang, K., Zhao, S. and Wang, D. (2018). First-order random coefficients integer-valued threshold autoregressive processes. AStA Advances in Statistical Analysis 102, 305–331. MR3829551
McCabe, B. P. M., Martin, G. M. and Harris, D. (2011). Efficient probabilistic forecasts for counts. Journal of the Royal Statistical Society, Series B, Statistical Methodology 73, 253–272. MR2814495
McKenzie, E. (1986). Autoregressive moving-average processes with negative binomial and geometric distributions. Advances in Applied Probability 18, 679–705. MR0857325
Nastić, A. S., Laketa, P. N. and Ristić, M. M. (2016). Random environment integer-valued autoregressive process. Journal of Time Series Analysis 37, 267–287. MR3511585
Nastić, A. S., Laketa, P. N. and Ristić, M. M. (2018). Random environment INAR models of higher order. REVSTAT Statistical Journal. To appear.
Pedeli, X., Davison, A. C. and Fokianos, K. (2015). Likelihood estimation for the INAR(p) model by saddlepoint approximation. Journal of the American Statistical Association 110, 1229–1238. MR3420697
Qi, X., Li, Q. and Zhu, F. (2019). Modeling time series of count with excess zeros and ones based on INAR(1) model with zero-and-one inflated Poisson innovations. Journal of Computational and Applied Mathematics 346, 572–590. MR3864182 https://doi.org/10.1016/j.cam.2018.07.043
Steutel, F. W. and van Harn, K. (1979). Discrete analogues of self-decomposability and stability. Annals of Probability 7, 893–899. MR0542141
Tang, M. and Wang, Y. (2014). Asymptotic behavior of random coefficient INAR model under random environment defined by difference equation. Advances in Difference Equations 2014, 99.
Weiß, C. H. (2018). An Introduction to Discrete-Valued Time Series. Chichester: John Wiley & Sons.
Zheng, H., Basawa, I. V. and Datta, S. (2006). Inference for pth-order random coefficient integer-valued autoregressive processes. Journal of Time Series Analysis 27, 411–440. MR2328539
Z. Liu, F. Zhu
School of Mathematics
Jilin University
2699 Qianjin
Changchun 130012
China
E-mail: 914590404@qq.com; zfk8010@163.com

Q. Li
College of Mathematics
Changchun Normal University
Changchun 130032
China
E-mail: 46968158@qq.com