Abstract. To predict time series of counts with small values and remarkable fluctuations, one available model is the r states random environment process based on the negative binomial thinning operator and the geometric marginal. However, we argue that this model may suffer from the following two drawbacks. First, in the absence of prior information, the overdispersion of the geometric distribution may cause the predictions to fluctuate greatly. Second, because of the constraints on the model parameters, some estimated parameters are close to zero in real-data examples, which may not objectively reveal the correlation structure. To address the first drawback, an r states random environment process based on the binomial thinning operator and the Poisson marginal is introduced. To address the second drawback, we propose a generalized r states random environment integer-valued autoregressive model based on the binomial thinning operator to model the fluctuations of the data. Yule–Walker and conditional maximum likelihood estimates are considered, and their performances are assessed via simulation studies. Two real-data examples illustrate the better performance of the proposed models compared with some existing models.
1 Introduction
Integer-valued autoregressive (INAR) time series models have played an important role in theoretical research and real-life applications over the last few years. These models are usually constructed based on the binomial thinning operator introduced by Steutel and van Harn (1979), which is defined as follows:

$$\alpha \circ X := \sum_{i=1}^{X} \xi_i, \qquad X > 0, \tag{1.1}$$

where $\{\xi_i\}$ is a sequence of i.i.d. Bernoulli($\alpha$) random variables independent of $X$.
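Since the $\xi_i$ are i.i.d. Bernoulli($\alpha$), the thinning $\alpha \circ X$ is simply a Binomial($X, \alpha$) draw given $X$. A minimal simulation sketch (our own illustration in Python, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(2018)

def binomial_thinning(alpha: float, x: int) -> int:
    """alpha ∘ x: the sum of x i.i.d. Bernoulli(alpha) variables,
    i.e. a Binomial(x, alpha) draw (and 0 when x = 0)."""
    return int(rng.binomial(x, alpha)) if x > 0 else 0

# sanity check: E(alpha ∘ X) = alpha * E(X)
x = rng.poisson(5.0, size=10_000)
print(np.mean([binomial_thinning(0.3, xi) for xi in x]))  # ≈ 1.5
```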
Key words and phrases. Binomial thinning, INAR, random environment, time series of counts.
Received July 2018; accepted October 2018.
2 Definition

First, we introduce the model setting. The INAR process at time n, denoted by $\{X_n\}$, is obtained under a certain combination of environment conditions. We assume that there is a finite number of such combinations, denoted by $r \in \{1, 2, \ldots\}$, which also represents the number of different parameter values of the marginal distribution, and the possible sets of environment factors are represented by $E_r = \{1, 2, \ldots, r\}$. A sequence of random variables $\{Z_n\}_{n\in\mathbb{N}_0}$, where $\mathbb{N}_0 \equiv \mathbb{N} \cup \{0\}$ and $\mathbb{N} \equiv \{1, 2, \ldots\}$, is called the r states random environment process if it is a Markov chain taking values in $E_r$.

A sequence of non-negative integer-valued random variables $\{X_n(Z_n)\}_{n\in\mathbb{N}_0}$, where $X_n(Z_n)$ is defined as $\sum_{z_n=1}^{r} X_n(z_n) I_{\{Z_n=z_n\}}$, is called the r states random environment
INAR(1) process (see Definition 2 in Nastić, Laketa and Ristić (2016)) if it has the following expression:

$$X_n(Z_n) = \sum_{i=1}^{X_{n-1}(Z_{n-1})} U_i + \varepsilon_n(Z_{n-1}, Z_n), \qquad n \in \mathbb{N},$$

where

$$\varepsilon_n(Z_{n-1}, Z_n) = \sum_{z_{n-1}=1}^{r}\sum_{z_n=1}^{r} \varepsilon_n(z_{n-1}, z_n)\, I_{\{Z_{n-1}=z_{n-1},\, Z_n=z_n\}},$$

$\{U_i\}$ is a counting sequence of i.i.d. random variables, $\{Z_n\}$ is an r states random environment process, $z_n$ is the realization of the random environment state at time n, $I$ is the indicator function, and $\{\varepsilon_n(i,j)\}$, $n \in \mathbb{N}_0$, $i, j \in E_r$, are sequences of i.i.d. random variables which meet the following assumptions:

(A1) $\{Z_n\}$, $\{\varepsilon_n(1,1)\}, \{\varepsilon_n(1,2)\}, \ldots, \{\varepsilon_n(r,r)\}$ are mutually independent for all $n \in \mathbb{N}_0$;
(A2) $\{Z_m\}$ and $\varepsilon_n(i,j)$ are independent of $X_n(l)$ for $n < m$ and any $i, j, l \in E_r$.
The inspiration for our new INAR processes comes from Tang and Wang (2014) and Nastić, Laketa and Ristić (2016): the former discussed a first-order random coefficient INAR model under a random environment by introducing a Markov chain with finite state space, while the latter introduced a first-order random coefficient INAR model under a random environment with geometric distribution. An r states random environment INAR(1) process based on the binomial thinning operator with Poisson or geometric marginal distribution is given by the following definition.
Definition 1. A sequence $\{X_n(z_n)\}_{n\in\mathbb{N}_0}$, where $z_n$ is the realization of the r states random environment process $\{Z_n\}$ in time n, is the r states random environment INAR(1) process based on the binomial thinning operator (RrINAR(1)) if $X_n(z_n)$ is defined as

$$X_n(z_n) = \alpha \circ X_{n-1}(z_{n-1}) + \varepsilon_n(z_{n-1}, z_n), \qquad n \in \mathbb{N}, \tag{2.1}$$

where the counting sequence incorporated in $\alpha\circ$ consists of i.i.d. Bernoulli($\alpha$) random variables, and $X_n(z_n)$ has one of the following two marginal distributions:

Case 1.1
$$P\bigl(X_n(z_n) = x\bigr) = \frac{\mu_{z_n}^{x}}{x!}e^{-\mu_{z_n}}, \qquad x \in \mathbb{N}_0.$$

Case 1.2
$$P\bigl(X_n(z_n) = x\bigr) = \frac{\mu_{z_n}^{x}}{(1+\mu_{z_n})^{x+1}}, \qquad x \in \mathbb{N}_0.$$
Proposition 1. The bivariate time series {Xn (zn ), zn } given by (2.1) is a first-order Markov
time series.
Proposition 1 can be proved with arguments similar to those in Nastić, Laketa and Ristić (2016); similar techniques can also be found in Tang and Wang (2014).
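To make the definition concrete, the following sketch simulates a path of the Poisson case of (2.1) (Case 1.1), drawing the innovations from the distribution derived later in Theorem 1. The set-up mirrors Scenario (a) of Section 5; the function and variable names are ours:

```python
import numpy as np

rng = np.random.default_rng(421)

def simulate_p_rrinar1(n, mu, P, p0, alpha):
    """Simulate the Poisson RrINAR(1) process (2.1), Case 1.1:
    X_n(z_n) = alpha ∘ X_{n-1}(z_{n-1}) + eps_n(z_{n-1}, z_n),
    with eps_n(i, j) ~ P(mu_j - alpha * mu_i) (Theorem 1, Case 1.1)."""
    mu = np.asarray(mu, dtype=float)
    assert 0.0 <= alpha <= mu.min() / mu.max()   # innovations well defined
    r = len(mu)
    z = np.empty(n, dtype=int)
    x = np.empty(n, dtype=int)
    z[0] = rng.choice(r, p=p0)
    x[0] = rng.poisson(mu[z[0]])                 # Poisson(mu_{z_0}) marginal
    for t in range(1, n):
        z[t] = rng.choice(r, p=P[z[t - 1]])      # environment Markov chain
        thinned = rng.binomial(x[t - 1], alpha)  # alpha ∘ X_{t-1}(z_{t-1})
        x[t] = thinned + rng.poisson(mu[z[t]] - alpha * mu[z[t - 1]])
    return x, z

# Scenario (a) of Section 5:
P = np.array([[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]])
x, z = simulate_p_rrinar1(200, mu=[1, 2, 3], P=P, p0=[0.33, 0.34, 0.33], alpha=0.2)
```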
Definition 2. A sequence $\{X_n\}_{n\in\mathbb{N}_0}$ is the generalized r states random environment INAR(1) process based on the binomial thinning operator (GRrINAR(1)) if $X_n$ is defined as

$$X_n = \alpha_{z_{n-1}} \circ X_{n-1} + \varepsilon_n(z_{n-1}), \qquad n \in \mathbb{N}, \tag{2.2}$$

where $\alpha_{z_{n-1}} \in (0,1)$, $\alpha_{z_{n-1}} \circ X_{n-1} = \sum_{i=1}^{X_{n-1}} U_{i,z_{n-1}}$, the counting sequence $\{U_{i,z_{n-1}}\}_{i\in\mathbb{N}}$ incorporated in $\alpha\circ$ forms a sequence of i.i.d. random variables with pmf given as $P(U_{i,z_{n-1}} = 1) = \alpha_{z_{n-1}}$, $P(U_{i,z_{n-1}} = 0) = 1 - \alpha_{z_{n-1}}$, and $X_n$ has one of the following two marginal distributions:

Case 2.1
$$P(X_n = x) = \frac{\mu^{x}}{x!}e^{-\mu}, \qquad x \in \mathbb{N}_0.$$

Case 2.2
$$P(X_n = x) = \frac{\mu^{x}}{(1+\mu)^{x+1}}, \qquad x \in \mathbb{N}_0.$$
Note that Definition 1 considers the variation of the marginal distribution, while Defini-
tion 2 considers the variation of the thinning operator. Both models are dynamic, but they
have different emphasize: one focuses on marginal distribution, the other concentrates on the
fluctuation, which behave differently in prediction, see Figure 7 in Section 6 for an intuitive
explanation.
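For contrast with the sketch above, the following simulates a Poisson GRrINAR(1) path of (2.2) (Case 2.1), where the environment drives the thinning parameter rather than the marginal mean; the innovation law $P(\mu - \alpha_i\mu)$ is taken from Theorem 4 below (names are ours):

```python
import numpy as np

rng = np.random.default_rng(421)

def simulate_p_grrinar1(n, mu, alphas, P, p0):
    """Simulate the Poisson GRrINAR(1) process (2.2), Case 2.1:
    X_n = alpha_{z_{n-1}} ∘ X_{n-1} + eps_n(z_{n-1}),
    with eps_n(i) ~ P(mu - alpha_i * mu) (Theorem 4, Case 2.1)."""
    r = len(alphas)
    z = np.empty(n, dtype=int)
    x = np.empty(n, dtype=int)
    z[0] = rng.choice(r, p=p0)
    x[0] = rng.poisson(mu)                        # Poisson(mu) marginal
    for t in range(1, n):
        z[t] = rng.choice(r, p=P[z[t - 1]])
        a = alphas[z[t - 1]]                      # state-dependent thinning
        x[t] = rng.binomial(x[t - 1], a) + rng.poisson(mu * (1.0 - a))
    return x, z
```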
3 Properties
In this section, we derive some properties of models (2.1) and (2.2), such as the distribution of the innovations, the correlation structure and the conditional variance of the processes. Because of the structural similarity between the two kinds of models, these properties are similar. The proofs of all theorems and corollaries are given in the Appendix.
Theorem 1. Let $\{X_n(z_n)\}_{n\in\mathbb{N}_0}$ be the process given by (2.1), $\mu_1 > 0, \mu_2 > 0, \ldots, \mu_r > 0$. If $z_{n-1} = i$, $z_n = j$, $i, j \in E_r$, and $0 \le \alpha \le \min\{\mu_s/\mu_t;\, s, t \in E_r\}$, then the random variable $\varepsilon_n(i,j)$ has the following distribution.

Case 1.1
$$\varepsilon_n(i,j) \stackrel{d}{=} P(\mu_j - \alpha\mu_i).$$

Case 1.2
$$\varepsilon_n(i,j) \stackrel{d}{=} \begin{cases} \mathrm{Geom}\Bigl(\dfrac{\mu_j}{1+\mu_j}\Bigr), & \text{w.p. } 1 - \dfrac{\alpha\mu_i}{\mu_j}, \\[6pt] 0, & \text{w.p. } \dfrac{\alpha\mu_i}{\mu_j}. \end{cases}$$
Theorem 2. Let $\{X_n(z_n)\}_{n\in\mathbb{N}_0}$ be the process given by (2.1), $\mu_1 > 0, \mu_2 > 0, \ldots, \mu_r > 0$. Then:

(1) The covariance function of the random variables $X_n(z_n)$ and $X_{n-k}(z_{n-k})$, $k \in \{0, 1, \ldots, n\}$, is positive and is given as:

Case 1.1
$$\gamma_n(k) \equiv \mathrm{Cov}\bigl(X_n(z_n), X_{n-k}(z_{n-k})\bigr) = \alpha^k \mu_{z_{n-k}}.$$

Case 1.2
$$\gamma_n(k) \equiv \mathrm{Cov}\bigl(X_n(z_n), X_{n-k}(z_{n-k})\bigr) = \alpha^k \mu_{z_{n-k}}(1+\mu_{z_{n-k}}).$$

(2) The correlation function of the random variables $X_n(z_n)$ and $X_{n-k}(z_{n-k})$, $k \in \{0, 1, \ldots, n\}$, is positive, less than 1 and is given as:

Case 1.1
$$\rho_n(k) \equiv \mathrm{Corr}\bigl(X_n(z_n), X_{n-k}(z_{n-k})\bigr) = \alpha^k \sqrt{\frac{\mu_{z_{n-k}}}{\mu_{z_n}}}.$$

Case 1.2
$$\rho_n(k) \equiv \mathrm{Corr}\bigl(X_n(z_n), X_{n-k}(z_{n-k})\bigr) = \alpha^k \sqrt{\frac{\mu_{z_{n-k}}(1+\mu_{z_{n-k}})}{\mu_{z_n}(1+\mu_{z_n})}}.$$
The proof of Theorem 2 is similar to that of Theorem 3 in Nastić, Laketa and Ristić (2016), so the details are omitted here. The following theorem gives some regression properties of the RrINAR(1) process.
Theorem 3. Let $\{X_n(z_n)\}_{n\in\mathbb{N}_0}$ be the process given by (2.1), $\mu_1 > 0, \mu_2 > 0, \ldots, \mu_r > 0$. Then the k-step conditional mean and variance of $X_{n+k}(z_{n+k})$ on $X_n(z_n)$ are given by the following cases.

Case 1.1
$$E\bigl(X_{n+k}(z_{n+k})\,|\,X_n(z_n)\bigr) = \alpha^k X_n(z_n) + \mu_{z_{n+k}} - \alpha^k \mu_{z_n}, \qquad k \in \mathbb{N}_0,$$
$$\mathrm{Var}\bigl(X_{n+k}(z_{n+k})\,|\,X_n(z_n)\bigr) = \alpha^k\bigl(1-\alpha^k\bigr)\bigl(X_n(z_n) - \mu_{z_n}\bigr) + \mu_{z_{n+k}} - \alpha^{2k}\mu_{z_n}, \qquad k \in \mathbb{N}_0.$$

Case 1.2
$$E\bigl(X_{n+k}(z_{n+k})\,|\,X_n(z_n)\bigr) = \alpha^k X_n(z_n) + \mu_{z_{n+k}} - \alpha^k \mu_{z_n}, \qquad k \in \mathbb{N}_0,$$
$$\mathrm{Var}\bigl(X_{n+k}(z_{n+k})\,|\,X_n(z_n)\bigr) = \alpha^k\bigl(1-\alpha^k\bigr)\bigl(X_n(z_n) - \mu_{z_n}\bigr) + \mu_{z_{n+k}}(1+\mu_{z_{n+k}}) - \alpha^{2k}\mu_{z_n}(1+\mu_{z_n}), \qquad k \in \mathbb{N}_0.$$
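Theorem 3 translates directly into point forecasts with attached uncertainty. A sketch for Case 1.1, assuming the future state path $z_{n+1}, \ldots, z_{n+k}$ is known or has been predicted (names are ours):

```python
def forecast_p_rrinar1(x_n, z_n, z_future, mu, alpha):
    """k-step conditional mean and variance of Theorem 3 (Case 1.1).
    z_future holds the predicted states z_{n+1}, ..., z_{n+k};
    states are coded 0, ..., r-1 so that mu[z] is the state mean."""
    k = len(z_future)
    mean = alpha**k * x_n + mu[z_future[-1]] - alpha**k * mu[z_n]
    var = (alpha**k * (1 - alpha**k) * (x_n - mu[z_n])
           + mu[z_future[-1]] - alpha**(2 * k) * mu[z_n])
    return mean, var

# e.g. a two-step forecast from X_n = 4 observed in state 0:
mean, var = forecast_p_rrinar1(4, 0, [1, 2], mu=[1.0, 2.0, 3.0], alpha=0.2)
```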
Theorem 4. Let $\{X_n\}_{n\in\mathbb{N}_0}$ be the process given by (2.2), $\alpha_i \in (0,1)$, $i \in E_r$. If $z_{n-1} = i$, $i \in E_r$, then the random variable $\varepsilon_n(i)$ has the following distribution:

Case 2.1
$$\varepsilon_n(i) \stackrel{d}{=} P(\mu - \alpha_i\mu).$$

Case 2.2
$$\varepsilon_n(i) \stackrel{d}{=} \begin{cases} \mathrm{Geom}\Bigl(\dfrac{\mu}{1+\mu}\Bigr), & \text{w.p. } 1-\alpha_i, \\[6pt] 0, & \text{w.p. } \alpha_i. \end{cases}$$
Corollary 2. Let $z_{n-1} = i$, $i \in E_r$. The expectation and variance of the random variable $\varepsilon_n(i)$ in Definition 2 are given as:

Case 2.1 $E(\varepsilon_n(i)) = \mu - \alpha_i\mu$ and $\mathrm{Var}(\varepsilon_n(i)) = \mu - \alpha_i\mu$, respectively.
Case 2.2 $E(\varepsilon_n(i)) = \mu - \alpha_i\mu$ and $\mathrm{Var}(\varepsilon_n(i)) = \mu(1+\mu) - \alpha_i\mu(1+\alpha_i\mu)$, respectively.
Theorem 5. Let $\{X_n\}_{n\in\mathbb{N}_0}$ be the process given by Definition 2, $\alpha_i \in (0,1)$, $i \in E_r$. The one-step conditional variance of $X_{n+1}$ on $X_n$ is:

Case 2.1
$$\mathrm{Var}(X_{n+1}\,|\,X_n) = \alpha_{z_n}(1-\alpha_{z_n})X_n + (1-\alpha_{z_n})\mu.$$

Case 2.2
$$\mathrm{Var}(X_{n+1}\,|\,X_n) = \alpha_{z_n}(1-\alpha_{z_n})X_n + \mu(1+\mu) - \alpha_{z_n}\mu(1+\alpha_{z_n}\mu).$$
4 Estimation
The RrINAR(1) and GRrINAR(1) models are dynamic: they can adjust their marginal distributions or their innovation distributions to varying circumstances through time. Suppose we have data, a set of realizations $\{x_1, x_2, \ldots, x_N\}$, and we wish to estimate the parameters that fit the data. For RrINAR(1), we can use K-means clustering to partition the N observations into r clusters, in which each observation belongs to the cluster with the nearest mean; we use the statistical software R to obtain the sequence $z_1, z_2, \ldots, z_N$. For GRrINAR(1), we can use the relationship between $\alpha_{z_i}$ and the one-step conditional variance. Take r = 2 as an example and consider the absolute difference between $x_i$ and $x_{i+1}$: if $|x_{i+1} - x_i| \le \sigma$, where $\sigma$ is the standard deviation of the marginal distribution, then $z_i = 1$; otherwise $z_i = 2$. Similarly, for r = 3: if $|x_{i+1} - x_i| \le \sigma$, then $z_i = 1$; if $\sigma < |x_{i+1} - x_i| \le 2\sigma$, then $z_i = 2$; otherwise $z_i = 3$. Using this criterion, we can derive the sequence $z_1, z_2, \ldots, z_{N-1}$.
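A sketch of this jump-size rule, using the sample standard deviation as a stand-in for the marginal standard deviation $\sigma$ (the helper name is ours):

```python
import numpy as np

def assign_states(x, r):
    """Assign states z_1, ..., z_{N-1} from jump sizes: z_i = j when
    (j-1)*sigma < |x_{i+1} - x_i| <= j*sigma, capped at r.  The sample
    standard deviation stands in for the marginal standard deviation."""
    x = np.asarray(x, dtype=float)
    sigma = x.std(ddof=1)
    jumps = np.abs(np.diff(x))
    z = np.minimum(np.ceil(jumps / sigma), r).astype(int)
    return np.maximum(z, 1)        # zero jumps also belong to state 1
```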
We consider Yule–Walker (YW) estimation for the RrINAR(1) model, and conditional maximum likelihood (CML) estimation for both the RrINAR(1) and GRrINAR(1) models. For the GRrINAR(1) model, which contains several different thinning operators, YW estimation is not considered, since moment estimation usually shows low efficiency in small samples.
4.1 Yule–Walker estimation

For the subsample corresponding to circumstance k, the lag-one moment estimator computed on the part $R_{k,l}$ is

$$\hat{\gamma}_{1,l}^{(k)} = \frac{1}{c_{k,l}} \sum_{\{i,i+1\}\subseteq R_{k,l}} \bigl(X_{i+1}(k) - \hat{\mu}_{k,l}\bigr)\bigl(X_i(k) - \hat{\mu}_{k,l}\bigr).$$

Definition 3. Under circumstance k, the estimators obtained from the subsample $U_k$ are defined as

$$\hat{\mu}_k = \frac{1}{n_k}\sum_{i\in I_k} X_i(k), \qquad \hat{\gamma}_0^{(k)} = \frac{1}{n_k}\sum_{i\in I_k} \bigl(X_i(k) - \hat{\mu}_k\bigr)^2,$$

$$\hat{\gamma}_1^{(k)} = \frac{1}{s_k}\sum_{\{i,i+1\}\subseteq I_k} \bigl(X_{i+1}(k) - \hat{\mu}_k\bigr)\bigl(X_i(k) - \hat{\mu}_k\bigr),$$

where $s_k = \sum_{l\in I,\; n_{k,l}>1} c_{k,l}$.
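Under the natural reading that $I_k$ collects the time indices observed in circumstance k and that $\hat{\gamma}_1^{(k)}$ averages over consecutive pairs staying in that state, the estimators of Definition 3 can be sketched as follows; the last line uses $\gamma_1^{(k)} = \alpha\,\gamma_0^{(k)}$, which holds within a state in both marginal cases by Theorem 2:

```python
import numpy as np

def yw_subsample_estimates(x, z, k):
    """Moment estimates of Definition 3 for circumstance k, reading I_k as
    the set of time indices i with z_i = k and using the consecutive pairs
    {i, i+1} that stay inside I_k for the lag-one covariance."""
    x, z = np.asarray(x, dtype=float), np.asarray(z)
    idx = np.flatnonzero(z == k)
    mu_k = x[idx].mean()
    gamma0_k = np.mean((x[idx] - mu_k) ** 2)
    pairs = idx[np.isin(idx + 1, idx)]       # i such that i and i+1 are in I_k
    gamma1_k = (np.mean((x[pairs + 1] - mu_k) * (x[pairs] - mu_k))
                if len(pairs) else np.nan)
    alpha_k = gamma1_k / gamma0_k            # from gamma_1 = alpha * gamma_0
    return mu_k, gamma0_k, gamma1_k, alpha_k
```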
4.2 Conditional maximum likelihood estimation

We omit the factor $p_{i-1,i} = P(Z_i = z_i \mid Z_{i-1} = z_{i-1})$ and derive the joint log-likelihood function as follows:

$$\log L = \log L(x_1, z_1, \ldots, x_N, z_N \mid \mu_1, \mu_2, \ldots, \mu_r, \alpha) = \sum_{i=2}^{N} \log P\bigl(X_i(z_i) = x_i \mid X_{i-1}(z_{i-1}) = x_{i-1}\bigr).$$

Case 2.1

$$\log L = \sum_{i=2}^{N}\log\Biggl[\sum_{k=0}^{\min\{x_{i-1},x_i\}} \binom{x_{i-1}}{k}\, \alpha_{z_{i-1}}^{k}\bigl(1-\alpha_{z_{i-1}}\bigr)^{x_{i-1}-k}\, e^{-\mu(1-\alpha_{z_{i-1}})}\,\frac{\bigl(\mu(1-\alpha_{z_{i-1}})\bigr)^{x_i-k}}{(x_i-k)!}\Biggr],$$

where the innovation term follows from Theorem 4, Case 2.1.
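A sketch of the CML computation for Case 2.1: the log-likelihood above is evaluated directly and maximized numerically (a generic optimizer stands in for whatever routine the paper used; states are coded 1, ..., r, and all names are ours):

```python
import numpy as np
from math import comb, exp, factorial, log
from scipy.optimize import minimize

def neg_loglik_case21(theta, x, z, r):
    """Negative conditional log-likelihood of Case 2.1;
    theta = (alpha_1, ..., alpha_r, mu), states z coded 1, ..., r."""
    alphas, mu = theta[:r], theta[r]
    if np.any(alphas <= 0) or np.any(alphas >= 1) or mu <= 0:
        return np.inf                          # keep the search in bounds
    ll = 0.0
    for i in range(1, len(x)):
        a = alphas[z[i - 1] - 1]
        lam = mu * (1.0 - a)                   # innovation mean (Theorem 4)
        p = sum(comb(int(x[i - 1]), k) * a**k * (1 - a)**(x[i - 1] - k)
                * exp(-lam) * lam**(x[i] - k) / factorial(int(x[i] - k))
                for k in range(min(x[i - 1], x[i]) + 1))
        ll += log(max(p, 1e-300))
    return -ll

# hypothetical usage with counts x and assigned states z:
# theta0 = np.array([0.3] * r + [float(np.mean(x))])
# fit = minimize(neg_loglik_case21, theta0, args=(x, z, r), method="Nelder-Mead")
```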
5 Numerical simulations
A simulation study was conducted to evaluate the finite-sample performance of the YW and CML estimates. We simulated 500 replications with sample sizes n = 200, 400 for each model, and considered two practicable cases: three states and two states. For the estimation of the parameters, the most important step is to determine the Markov chain: we set the vector p as the discrete distribution of $z_1$, and the matrix P as the transition probability matrix of the process. This matrix sets the frequencies of the realized circumstances and shapes the dynamical structure of the RrINAR(1) and GRrINAR(1) processes. The following set-ups were considered:
For the Poisson RrINAR(1) model (P-RrINAR(1) for short) and the geometric RrINAR(1) model (G-RrINAR(1) for short):

(1) The P-R3INAR(1) model (three states random environment process based on the binomial thinning operator with Poisson marginal distribution, (2.1) with Case 1.1) and the G-R3INAR(1) model (three states random environment process based on the binomial thinning operator with geometric marginal distribution, (2.1) with Case 1.2) with the following three scenarios:

Scenario (a): $(\mu_1, \mu_2, \mu_3, \alpha) = (1, 2, 3, 0.2)$, $p = (0.33, 0.34, 0.33)$,
$$P = \begin{pmatrix} 0.4 & 0.3 & 0.3 \\ 0.3 & 0.4 & 0.3 \\ 0.3 & 0.3 & 0.4 \end{pmatrix};$$
6 Illustrative examples
In applications of INAR models, researchers may find that the time series are not stationary. Stationary INAR models may then not be the best choice, but they are usually unavoidable. We have
obtained two time series representing a monthly counting of drug reselling (DRUGS) from
the Forecasting Principles website (http://www.forecastingprinciples.com). These crimes are
reported in the 27th (Figure 1) and the 24th (Figure 5) police car beats in Pittsburgh from
January 1990 to December 2001, each constituting a sequence of 144 observations. The data
in the 27th police car beat was also discussed in Nastić, Laketa and Ristić (2016).
The first step in standard INAR modeling is to obtain the plots of the time series, the ACF and the partial ACF (PACF). From the ACF and PACF plots (Figures 1 and 5), we find that modeling the counts using INAR(1) is reasonable. In the time series plots, it is not difficult to see that, besides small jumps, there is a steady, permanent and significant increase in the last two years of the observed period. So we assume that these two counting sequences may arise from different environments, or may have different parameter turbulence through time, and we can apply our models to them.
Table 1 Mean of estimates, RMSE (within parentheses) for P-R3INAR(1) and G-R3INAR(1)

Model  Scenario  n    Method  μ̂₁               μ̂₂               μ̂₃               α̂
P      (a)       200  YW      1.0269 (0.1381)  2.0311 (0.2012)  3.0253 (0.2385)  0.1949 (0.1245)
                      CML     0.9988 (0.1299)  2.0008 (0.1852)  3.0224 (0.2199)  0.2041 (0.0657)
                 400  YW      1.0181 (0.0952)  2.0122 (0.1363)  3.0020 (0.1616)  0.1981 (0.0868)
                      CML     0.9976 (0.0925)  2.0030 (0.1292)  3.0079 (0.1688)  0.1975 (0.0435)
       (b)       200  YW      4.0198 (0.3497)  5.0080 (0.3536)  5.9958 (0.4200)  0.4787 (0.1200)
                      CML     4.0001 (0.3011)  5.0039 (0.3217)  5.9976 (0.3795)  0.4998 (0.0489)
                 400  YW      3.9892 (0.2537)  4.9965 (0.2327)  5.9856 (0.3091)  0.4893 (0.0901)
                      CML     4.0010 (0.2146)  4.9951 (0.2163)  6.0051 (0.2641)  0.5011 (0.0326)
       (c)       200  YW      1.0336 (0.1411)  2.0078 (0.1851)  6.0022 (0.3019)  0.0986 (0.1288)
                      CML     0.9922 (0.1326)  1.9928 (0.1836)  6.0192 (0.3309)  0.0991 (0.0639)
                 400  YW      1.0128 (0.0871)  2.0056 (0.1251)  5.9933 (0.2191)  0.1012 (0.0893)
                      CML     1.0019 (0.0954)  2.0019 (0.1277)  6.0021 (0.2187)  0.0990 (0.0431)
Geo    (a)       200  YW      1.0195 (0.1839)  2.0017 (0.3314)  2.9859 (0.4569)  0.1944 (0.1307)
                      CML     1.0014 (0.1804)  1.9817 (0.3188)  3.0203 (0.4303)  0.1986 (0.0402)
                 400  YW      1.0197 (0.1319)  2.0039 (0.2290)  3.0233 (0.3226)  0.1959 (0.0935)
                      CML     1.0010 (0.1237)  1.9948 (0.2135)  2.9906 (0.3020)  0.2001 (0.0272)
       (b)       200  YW      4.0079 (0.7932)  5.0184 (0.7712)  6.0141 (1.1440)  0.4712 (0.1458)
                      CML     3.9647 (0.6584)  4.9762 (0.7384)  5.9574 (0.9193)  0.5008 (0.0233)
                 400  YW      3.9964 (0.5490)  5.0242 (0.5713)  5.9948 (0.8045)  0.4813 (0.1027)
                      CML     4.0032 (0.4394)  4.9809 (0.4994)  5.9941 (0.6342)  0.4995 (0.0170)
       (c)       200  YW      1.0342 (0.1857)  2.0200 (0.3231)  5.9937 (0.7944)  0.1017 (0.1300)
                      CML     0.9813 (0.1810)  1.9675 (0.3006)  6.0350 (0.8026)  0.1011 (0.0314)
                 400  YW      1.0158 (0.1289)  2.0218 (0.2126)  5.9913 (0.5910)  0.0995 (0.0896)
                      CML     1.0022 (0.1213)  2.0003 (0.2173)  6.0031 (0.5616)  0.1015 (0.0211)
We model these two datasets using the RrINAR(1) in (2.1), the GRrINAR(1) in (2.2) and the geometric RrINAR(1) based on the negative binomial thinning operator (RrNGINAR(1) for short) from Nastić, Laketa and Ristić (2016). For Example 1 (27th police car beat), we fit the models mentioned above to the whole sequence of 144 observations, determine which random environment each observation belongs to, then estimate the parameters and generate the fitted data according to the known random environments. Finally, we calculate the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the RMSE between the 144 fitted values and the corresponding observations. For Example 2 (24th police car beat), we divide the sequence of 144 observations into two parts: the first consists of the former 132 observations, the second of the last 12 observations. The transition probability matrix is estimated by the method in Anderson and Goodman (1957). According to this matrix, we predict which random environments the last 12 observations belong to, and then generate the last 12 predictions. Finally, we calculate the AIC and BIC, as well as the RMSE between the last 12 predictions and the last 12 observations.
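The estimator of Anderson and Goodman (1957) for the transition matrix reduces to transition counts normalized by row sums; a sketch:

```python
import numpy as np

def estimate_transition_matrix(z, r):
    """MLE of the transition matrix (Anderson and Goodman, 1957):
    P_hat[i, j] = n_ij / n_i, transitions i -> j over visits to i.
    States are coded 1, ..., r."""
    counts = np.zeros((r, r))
    for a, b in zip(z[:-1], z[1:]):
        counts[a - 1, b - 1] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

# most likely next state from the last observed one:
# z_next = int(np.argmax(P_hat[z[-1] - 1])) + 1
```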
Example 1
We establish the above models based on the sequence of 144 observations, then calculate the AIC and BIC, as well as the RMSE between the observations and the predicted values.
Table 2 Mean of estimates, RMSE (within parentheses) for P-R2INAR(1) and G-R2INAR(1)
Table 3 Mean of estimates, RMSE (within parentheses) for P-GR3INAR(1) and G-GR3INAR(1)

Model  Scenario  n    Method  α̂₁               α̂₂               α̂₃               μ̂
P      (g)       200  CML     0.4989 (0.0896)  0.3003 (0.1147)  0.0972 (0.1318)  5.0119 (0.2088)
                 400  CML     0.5022 (0.0584)  0.2940 (0.0796)  0.0971 (0.0897)  4.9974 (0.1579)
       (h)       200  CML     0.6965 (0.0612)  0.4922 (0.0800)  0.2913 (0.1224)  1.9983 (0.1619)
                 400  CML     0.6966 (0.0427)  0.4970 (0.0577)  0.2925 (0.0817)  1.9941 (0.1242)
       (i)       200  CML     0.5946 (0.0731)  0.3926 (0.1006)  0.2047 (0.1126)  2.9990 (0.1723)
                 400  CML     0.5962 (0.0511)  0.3957 (0.0682)  0.1954 (0.0814)  3.0052 (0.1307)
Geo    (g)       200  CML     0.4976 (0.0443)  0.3024 (0.0526)  0.1003 (0.0659)  5.0250 (0.4941)
                 400  CML     0.4997 (0.0317)  0.3012 (0.0363)  0.0996 (0.0420)  4.9920 (0.3455)
       (h)       200  CML     0.7009 (0.0487)  0.5009 (0.0532)  0.3030 (0.0792)  2.0071 (0.2880)
                 400  CML     0.6997 (0.0340)  0.4982 (0.0407)  0.3036 (0.0524)  2.0018 (0.1970)
       (i)       200  CML     0.5992 (0.0491)  0.3987 (0.0605)  0.1995 (0.0735)  2.9910 (0.3526)
                 400  CML     0.5979 (0.0326)  0.3977 (0.0417)  0.2008 (0.0465)  3.0057 (0.2497)
For P-RrINAR(1), G-RrINAR(1) and RrNGINAR(1), we separate the sequence into two or three possible random states by the K-means clustering algorithm (Figure 2). For P-GRrINAR(1) and G-GRrINAR(1), we use the criterion in Section 4 to separate the random environment into two or three random states. After that, we use CML from Section 4.2 to estimate the parameters.
Table 4 Mean of estimates, RMSE (within parentheses) for P-GR2INAR(1) and G-GR2INAR(1)
The results in Table 5 for RrNGINAR(1) are not exactly the same as those in Nastić, Laketa and Ristić (2016); since the result of a K-means cluster analysis in R is often not unique and different algorithms may be adopted, this small difference is reasonable. According to the results in Table 5, P-R3INAR(1) has the smallest AIC, BIC and RMSE for this dataset, which means that P-R3INAR(1) performs best among these models.
Figure 3 27th DRUGS data (•) and fitted values from P-R3INAR(1).
The fitted P-R3INAR(1) model for the 27th DRUGS data is shown in Figure 3. To further examine the adequacy of the fitted model, we consider the Pearson residuals, defined by

$$r_{1t} = \frac{X_t - \hat{\alpha}X_{t-1}(z_{t-1}) - \hat{\mu}_{z_t} + \hat{\alpha}\hat{\mu}_{z_{t-1}}}{\bigl[\hat{\alpha}(1-\hat{\alpha})\bigl(X_{t-1}(z_{t-1}) - \hat{\mu}_{z_{t-1}}\bigr) + \hat{\mu}_{z_t} - \hat{\alpha}^2\hat{\mu}_{z_{t-1}}\bigr]^{1/2}}.$$

We can also calculate the mean square error of the Pearson residuals, which is equal to $\sum_{t=1}^{n} r_{1t}^2/(n-p)$, where p denotes the number of estimated parameters. Table 6 gives some characteristics of the residuals for the P-R3INAR(1) model. Figure 4 plots the ACF and PACF of the Pearson residuals. There is no evidence of any correlation within the residuals, a finding supported by the Ljung–Box statistic of 20.4391 based on 15 lags (because $\chi^2_{0.05}(14) = 23.6847$).
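A sketch computing these Pearson residuals from a fitted P-RrINAR(1) model, using the one-step conditional moments of Theorem 3 (names are ours):

```python
import numpy as np

def pearson_residuals(x, z, mu_hat, alpha_hat):
    """Pearson residuals r_{1t} for a fitted P-RrINAR(1) model, built from
    the one-step conditional mean and variance (Theorem 3, Case 1.1).
    States z are coded 1, ..., r and mu_hat[k-1] estimates mu_k."""
    x = np.asarray(x, dtype=float)
    m = np.array([mu_hat[k - 1] for k in z])
    num = x[1:] - alpha_hat * x[:-1] - m[1:] + alpha_hat * m[:-1]
    den = np.sqrt(alpha_hat * (1 - alpha_hat) * (x[:-1] - m[:-1])
                  + m[1:] - alpha_hat**2 * m[:-1])
    return num / den

# mean square error of the residuals, with p estimated parameters:
# mse = np.sum(res**2) / (len(x) - p)
```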
Example 2
We divide the sequence of 144 observations into two parts, the former 132 observations forming the first part and the last 12 observations the second. We establish the above models on the first part, then estimate the transition probability matrix between the environment states in the former 132 observations. Next, we forecast the 12 probable environment states using the transition probability matrix, and then forecast the related observations. Finally, we calculate the AIC and BIC on the former 132 observations, and the RMSE between the last 12 observations and the related predictions.
For P-RrINAR(1), G-RrINAR(1) and RrNGINAR(1), we separate the sequence into two or three random states by the K-means clustering algorithm (Figure 6). For P-GRrINAR(1) and G-GRrINAR(1), we use the criterion in Section 4 to separate the random environment into two or three random states. We estimate the transition probability matrix by the method in Anderson and Goodman (1957). After that, we use CML from Section 4.2 to estimate the parameters of the INAR models. According to the results in Table 7, P-R3INAR(1) has the smallest AIC and BIC and the second-smallest RMSE, which means that P-R3INAR(1) performs best among these models. Figure 7 gives the predictions of the last 12 observations based on ten models of the 24th DRUGS data.

Figure 4 Pearson residual analysis for the 27th DRUGS data: the autocorrelation function and the partial autocorrelation function of the residuals.
These two examples show that the P-R3INAR(1) model performs best among these models when fitting the above two datasets. The reason may be that the sample sizes are not large, the number of observations per state is small, and the variance of the Poisson marginal distribution is smaller than that of the geometric distribution with the same expectation; the larger fluctuation of the geometric distribution makes it more difficult to fit the marginal distribution. We also found that the number of states has a great impact on the estimation of these models. For example, in Example 1, P-R3INAR(1) performs better than G-R3INAR(1), but G-R2INAR(1) performs better than P-R2INAR(1). How to determine the number of random states before estimation is still an open question. If there is prior information, we can choose the model according to the actual situation. If not, estimation with the Poisson distribution is more accurate than with the geometric distribution in some real datasets, especially for small samples with sudden fluctuations.
7 Conclusion
In this paper, we propose two classes of INAR(1) models (RrINAR(1) and GRrINAR(1)) and illustrate that they perform better than the existing RrNGINAR(1) in some cases. There is still space for further study. First, the marginal distributions of $X_n(z_n)$ in RrINAR(1) may be generalized to other discrete distributions, such as the class of zero-modified distributions. Second, the prediction of the GRrINAR(1) models is not accurate enough, because it only captures whether the fluctuation is bigger or smaller, not the value of the data; using a signed binomial thinning operator may be more accurate.
Appendix
Proof of Theorem 1.

Case 1.1 When $z_{n-1} = i$ and $z_n = j$, where $i, j \in E_r$, we have $E(s^{X_n(j)}) = E(s^{\alpha\circ X_{n-1}(i)+\varepsilon_n(i,j)})$. The left-hand side becomes $E(s^{X_n(j)}) = e^{(s-1)\mu_j}$. Consider the right-hand side. Since the random variable $X_{n-1}(i)$ is independent of the random variable $\varepsilon_n(i,j)$, we have

$$E\bigl(s^{\alpha\circ X_{n-1}(i)+\varepsilon_n(i,j)}\bigr) = E\bigl(s^{\alpha\circ X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = E\bigl((E s^{U_1})^{X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = E\bigl((s\alpha+1-\alpha)^{X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = e^{(s-1)\alpha\mu_i}E\bigl(s^{\varepsilon_n(i,j)}\bigr).$$

The pgf of the random variable $\varepsilon_n(i,j)$ is thus given as

$$E\bigl(s^{\varepsilon_n(i,j)}\bigr) = e^{(s-1)(\mu_j-\alpha\mu_i)},$$

and if $0 \le \alpha \le \mu_j/\mu_i$, the random variable $\varepsilon_n(i,j)$ has the distribution given in Case 1.1. Since $i$ and $j$ are arbitrary numbers from the set $E_r$, it follows that the random variables $\varepsilon_n(1,1), \varepsilon_n(1,2), \ldots, \varepsilon_n(r,r)$ have well-defined distributions for $\alpha \in \bigcap_{k,l\in E_r}[0, \mu_k/\mu_l]$, that is, for $0 \le \alpha \le \min\{\mu_k/\mu_l;\, k, l \in E_r\}$.

Table 7 Transition matrix, parameter estimates, RMSE, AIC and BIC of the 24th DRUGS data
Case 1.2 When $z_{n-1} = i$ and $z_n = j$, where $i, j \in E_r$, we have $E(s^{X_n(j)}) = E(s^{\alpha\circ X_{n-1}(i)+\varepsilon_n(i,j)})$. The left-hand side becomes $E(s^{X_n(j)}) = \frac{1}{1+\mu_j-\mu_j s}$. Consider the right-hand side. Since the random variable $X_{n-1}(i)$ is independent of the random variable $\varepsilon_n(i,j)$, we have

$$E\bigl(s^{\alpha\circ X_{n-1}(i)+\varepsilon_n(i,j)}\bigr) = E\bigl(s^{\alpha\circ X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = E\bigl((E s^{U_1})^{X_{n-1}(i)}\bigr)E\bigl(s^{\varepsilon_n(i,j)}\bigr) = \frac{1}{1+\mu_i-\mu_i(s\alpha+1-\alpha)}E\bigl(s^{\varepsilon_n(i,j)}\bigr).$$

The pgf of the random variable $\varepsilon_n(i,j)$ is thus given as

$$E\bigl(s^{\varepsilon_n(i,j)}\bigr) = \frac{1+\alpha\mu_i-\alpha\mu_i s}{1+\mu_j-\mu_j s} = \Bigl(1-\frac{\alpha\mu_i}{\mu_j}\Bigr)\frac{1}{1+\mu_j-\mu_j s} + \frac{\alpha\mu_i}{\mu_j},$$

and if $0 \le \alpha \le \mu_j/\mu_i$, the random variable $\varepsilon_n(i,j)$ has the distribution in Case 1.2. Since $i$ and $j$ are arbitrary numbers from the set $E_r$, it follows that the random variables $\varepsilon_n(1,1), \varepsilon_n(1,2), \ldots, \varepsilon_n(r,r)$ have well-defined distributions for $\alpha \in \bigcap_{k,l\in E_r}[0, \mu_k/\mu_l]$, that is, for $0 \le \alpha \le \min\{\mu_k/\mu_l;\, k, l \in E_r\}$.

Figure 7 The prediction of the last 12 observations based on ten models of the 24th DRUGS data.
Proof of Corollaries 1 and 2. The results follow from the facts that $E(\varepsilon_n(i,j)) = \Phi'_{\varepsilon}(1)$ and $\mathrm{Var}(\varepsilon_n(i,j)) = \Phi''_{\varepsilon}(1) + \Phi'_{\varepsilon}(1)\bigl(1-\Phi'_{\varepsilon}(1)\bigr)$, where $\Phi_{\varepsilon}(s)$ is the pgf of the random variable $\varepsilon_n(i,j)$; one can then derive the results through a simple calculation.
Proof of Theorem 3. Let $\mu_{n+k|n} = E(X_{n+k}\,|\,X_n)$ and $\mu_{\varepsilon_n} = E(\varepsilon_n)$. According to Definition 1 and the independence of the random variables $X_{n+k-1}$ and $\varepsilon_{n+k}$, the conditional expectation of $X_{n+k}$ on $X_n$ satisfies the equation $\mu_{n+k|n} = \alpha\mu_{n+k-1|n} + \mu_{\varepsilon_{n+k}}$. Using this equation $k-1$ times and the fact that $\mu_{n|n} = X_n$, we obtain

$$\mu_{n+k|n} = \alpha^k X_n + \sum_{l=0}^{k-1}\alpha^l \mu_{\varepsilon_{n+k-l}}.$$

Using the result of Corollary 1 for the expectations of the random variables $\varepsilon_{n+k-l}$, $l \in \{0, 1, \ldots, k-1\}$, we obtain the expression for the conditional expectation.

Next consider the conditional variance. Let $\sigma^2_{n+k|n} = \mathrm{Var}(X_{n+k}(z_{n+k})\,|\,X_n(z_n))$ and $\sigma^2_{\varepsilon_{n+k}} = \mathrm{Var}(\varepsilon_{n+k})$. Using a similar argument and the properties of the binomial thinning operator, the conditional variance satisfies the equation

$$\sigma^2_{n+k|n} = \alpha^2\sigma^2_{n+k-1|n} + \alpha(1-\alpha)\mu_{n+k-1|n} + \sigma^2_{\varepsilon_{n+k}}.$$

Using this equation $k-1$ times, we obtain

$$\sigma^2_{n+k|n} = \alpha^{2k}\sigma^2_{n|n} + \alpha(1-\alpha)\sum_{l=0}^{k-1}\alpha^{2l}\mu_{n+k-1-l|n} + \sum_{l=0}^{k-1}\alpha^{2l}\sigma^2_{\varepsilon_{n+k-l}}.$$

Finally, using the fact that $\sigma^2_{n|n} = 0$ and Corollary 1 for the variances of the random variables $\varepsilon_{n+k-l}$, $l \in \{0, 1, \ldots, k-1\}$, the proof is complete.
Proof of Theorem 4.

Case 2.1 When $z_{n-1} = i$, $i \in E_r$, we have $E(s^{X_n}) = E(s^{\alpha_i\circ X_{n-1}+\varepsilon_n(i)})$. The left-hand side becomes $E(s^{X_n}) = e^{(s-1)\mu}$. Consider the right-hand side; since $X_{n-1}$ is independent of $\varepsilon_n(i)$, we have

$$E\bigl(s^{\alpha_i\circ X_{n-1}+\varepsilon_n(i)}\bigr) = E\bigl(s^{\alpha_i\circ X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = E\bigl((E s^{U_{1,i}})^{X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = E\bigl((s\alpha_i+1-\alpha_i)^{X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = e^{(s-1)\alpha_i\mu}E\bigl(s^{\varepsilon_n(i)}\bigr).$$

The pgf of the random variable $\varepsilon_n(i)$ is given as

$$E\bigl(s^{\varepsilon_n(i)}\bigr) = e^{(s-1)(\mu-\alpha_i\mu)}.$$

Since $i$ is an arbitrary number from the set $E_r$, it follows that $\varepsilon_n(1), \varepsilon_n(2), \ldots, \varepsilon_n(r)$ are well defined; if $0 \le \alpha_i \le 1$, then $\varepsilon_n(i)$ has the distribution in Case 2.1.

Case 2.2 When $z_{n-1} = i$, where $i \in E_r$, we have $E(s^{X_n}) = E(s^{\alpha_i\circ X_{n-1}+\varepsilon_n(i)})$. The left-hand side becomes $E(s^{X_n}) = \frac{1}{1+\mu-\mu s}$. Consider the right-hand side; since $X_{n-1}$ is independent of $\varepsilon_n(i)$, we have

$$E\bigl(s^{\alpha_i\circ X_{n-1}+\varepsilon_n(i)}\bigr) = E\bigl(s^{\alpha_i\circ X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = E\bigl((s\alpha_i+1-\alpha_i)^{X_{n-1}}\bigr)E\bigl(s^{\varepsilon_n(i)}\bigr) = \frac{1}{1+\mu-\mu(s\alpha_i+1-\alpha_i)}E\bigl(s^{\varepsilon_n(i)}\bigr),$$

so the pgf of the random variable $\varepsilon_n(i)$ is given as

$$E\bigl(s^{\varepsilon_n(i)}\bigr) = \frac{1+\alpha_i\mu-\alpha_i\mu s}{1+\mu-\mu s} = (1-\alpha_i)\frac{1}{1+\mu-\mu s} + \alpha_i,$$

which is the pgf of the mixture given in Case 2.2.
Proof of Theorem 5. The proof is similar to the situation k = 1 in the second part of Theorem 3.
Acknowledgments
We thank the Editor and the anonymous referee for their constructive comments and suggestions that have greatly improved the paper. This work is supported by National Natural Science Foundation of China (Nos. 11871027, 11731015), Science and Technology Developing Plan of Jilin Province (No. 20170101057JC), Science and Technology Program of Jilin Educational Department during the “13th Five-Year” Plan Period (No. 2016-399), and Cultivation Plan for Excellent Young Scholar Candidates of Jilin University. Zhu is the corresponding author.
References
Al-Osh, M. A. and Alzaid, A. A. (1987). First order integer-valued autoregressive (INAR(1)) processes. Journal of Time Series Analysis 8, 261–275. MR0903755
Anderson, T. W. and Goodman, L. A. (1957). Statistical inference about Markov chains. The Annals of Mathematical Statistics 28, 89–110. MR0084903
Barczy, M., Ispány, M. and Pap, G. (2011). Asymptotic behavior of unstable INAR(p) processes. Stochastic Processes and Their Applications 121, 583–608. MR2763097
Bu, R., McCabe, B. P. M. and Hadri, K. (2008). Maximum likelihood estimation of higher-order integer-valued autoregressive processes. Journal of Time Series Analysis 29, 973–994. MR2464949
Drost, F. C., van den Akker, R. and Werker, B. J. M. (2008). Local asymptotic normality and efficient estimation for INAR(p) models. Journal of Time Series Analysis 29, 783–801. MR2450896
Drost, F. C., van den Akker, R. and Werker, B. J. M. (2009). Efficient estimation of auto-regression parameters and innovation distributions for semiparametric integer-valued AR(p) models. Journal of the Royal Statistical Society, Series B, Statistical Methodology 71, 467–485. MR2649605
Jazi, M. A., Jones, G. and Lai, C. D. (2012). First-order integer valued AR processes with zero inflated Poisson innovations. Journal of Time Series Analysis 33, 954–963. MR2991911
Joe, H. (1996). Time series models with univariate margins in the convolution-closed infinitely divisible class. Journal of Applied Probability 33, 664–677. MR1401464
Laketa, P. N., Nastić, A. S. and Ristić, M. M. (2018). Generalized random environment INAR models of higher order. Mediterranean Journal of Mathematics 15. MR3740339 https://doi.org/10.1007/s00009-017-1054-z
Latour, A. (1998). Existence and stochastic structure of a non-negative integer-valued autoregressive processes. Journal of Time Series Analysis 19, 439–455. MR1652193
Li, H., Yang, K., Zhao, S. and Wang, D. (2018). First-order random coefficients integer-valued threshold autoregressive processes. AStA Advances in Statistical Analysis 102, 305–331. MR3829551
McCabe, B. P. M., Martin, G. M. and Harris, D. (2011). Efficient probabilistic forecasts for counts. Journal of the Royal Statistical Society, Series B, Statistical Methodology 73, 253–272. MR2814495
McKenzie, E. (1986). Autoregressive moving-average processes with negative binomial and geometric distributions. Advances in Applied Probability 18, 679–705. MR0857325
Nastić, A. S., Laketa, P. N. and Ristić, M. M. (2016). Random environment integer-valued autoregressive process. Journal of Time Series Analysis 37, 267–287. MR3511585
Nastić, A. S., Laketa, P. N. and Ristić, M. M. (2018). Random environment INAR models of higher order. REVSTAT Statistical Journal. To appear.
Pedeli, X., Davison, A. C. and Fokianos, K. (2015). Likelihood estimation for the INAR(p) model by saddlepoint approximation. Journal of the American Statistical Association 110, 1229–1238. MR3420697
Qi, X., Li, Q. and Zhu, F. (2019). Modeling time series of count with excess zeros and ones based on INAR(1) model with zero-and-one inflated Poisson innovations. Journal of Computational and Applied Mathematics 346, 572–590. MR3864182 https://doi.org/10.1016/j.cam.2018.07.043
Steutel, F. W. and van Harn, K. (1979). Discrete analogues of self-decomposability and stability. Annals of Probability 7, 893–899. MR0542141
Tang, M. and Wang, Y. (2014). Asymptotic behavior of random coefficient INAR model under random environment defined by difference equation. Advances in Difference Equations 2014, 99.
Weiß, C. H. (2018). An Introduction to Discrete-Valued Time Series. Chichester: John Wiley & Sons.
Zheng, H., Basawa, I. V. and Datta, S. (2006). Inference for pth-order random coefficient integer-valued autoregressive processes. Journal of Time Series Analysis 27, 411–440. MR2328539
Z. Liu, F. Zhu
School of Mathematics
Jilin University
2699 Qianjin
Changchun 130012
China
E-mail: 914590404@qq.com; zfk8010@163.com

Q. Li
College of Mathematics
Changchun Normal University
Changchun 130032
China
E-mail: 46968158@qq.com