Published by De Gruyter May 9, 2019

On the performance of information criteria for model identification of count time series

  • Christian H. Weiß and Martin H.-J.M. Feld


Model fitting for count time series is of great relevance for many economic applications. Here, we focus on the step of model selection, where information criteria like AIC and BIC are commonly used in practice. Previous studies of their model selection abilities concentrated on real-valued time series, but here, we comprehensively investigate AIC and BIC in a count time series context. In our simulations, we consider diverse scenarios of model selection, like the identification of serial (in)dependence, overdispersion, zero inflation or a trend, order selection within a given model family, and model selection across model families. We apply our findings to economic count time series of monthly numbers of strikes in the US, and of monthly numbers of corporate insolvencies in the districts of Rhineland-Palatinate.


The authors thank the referees for carefully reading the article and for their comments, which greatly improved the article.

Appendix A

Some models for count time series

This appendix briefly summarizes the definitions and relevant properties of the count time series models being considered in this article; more details and references can be found in the book by Weiß (2018).

In the simplest case, the generated counts are independent and identically distributed (i. i. d.). But most of the models considered here are types of regression models. The integer-valued autoregressive conditional heteroskedasticity (INARCH) models in Sections 2 to 5 assume the conditional mean at time $t$, $M_t := E[X_t \mid X_{t-1}, \ldots]$, to be a linear function of the last $p$ observations, i.e. $M_t = \beta + \sum_{i=1}^{p} \alpha_i X_{t-i}$, whereas the marginal regression models in Section 6 assume a linear trend in time: $M_t = E[X_t] = a + b\,t$. Given the mean at time $t$, the actual count is generated from either the Poisson distribution $\mathrm{Poi}(M_t)$ (Poi-INARCH or Poi-Reg model, respectively) or the negative binomial distribution $\mathrm{NB}\big(M_t\,\frac{\pi}{1-\pi},\, \pi\big)$ (NB-INARCH or NB-Reg model, respectively). Here, the NB-parameter $\pi$ controls the degree of overdispersion, because the dispersion index of an NB-variate is equal to $1/\pi$. Some of our analyses related to the INARCH model also use the zero-inflated Poisson distribution $\mathrm{ZIP}\big(\frac{1}{1-\omega}\,M_t,\, \omega\big)$ (ZIP-INARCH), where $\omega$ determines the extent of zero inflation. In view of likelihood computation as required for Appendix B, the following formulae for computing (conditional) probabilities are relevant:

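$$
\begin{array}{l|l|l}
 & \text{INARCH}(p):\quad P(X_t = x \mid x_{t-1}, \ldots) = & \text{Reg}:\quad P(X_t = x) = \\ \hline
\text{Poi-} & \dfrac{e^{-m_t}\, m_t^{x}}{x!}, \quad \text{where } m_t = \beta + \sum_{i=1}^{p} \alpha_i x_{t-i} & \dfrac{e^{-m_t}\, m_t^{x}}{x!}, \quad \text{where } m_t = a + b\,t \\[2ex]
\text{NB-} & \dfrac{(n_t + x - 1)_{(x)}}{x!}\, (1-\pi)^{x}\, \pi^{n_t}, \quad \text{where } n_t = \Big(\beta + \sum_{i=1}^{p} \alpha_i x_{t-i}\Big)\, \dfrac{\pi}{1-\pi} & \dfrac{(n_t + x - 1)_{(x)}}{x!}\, (1-\pi)^{x}\, \pi^{n_t}, \quad \text{where } n_t = (a + b\,t)\, \dfrac{\pi}{1-\pi} \\[2ex]
\text{ZIP-} & \omega\, \mathbb{1}\{x = 0\} + (1-\omega)\, \dfrac{e^{-\lambda_t}\, \lambda_t^{x}}{x!}, \quad \text{where } \lambda_t = \Big(\beta + \sum_{i=1}^{p} \alpha_i x_{t-i}\Big)\, \dfrac{1}{1-\omega} &
\end{array}
\tag{5}
$$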

where $k_{(l)} = k \cdots (k-l+1)$ denotes the falling factorials, and $\mathbb{1}\{\cdot\}$ the indicator function.
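
The INARCH column of (5) can be sketched in a few lines of Python (standard library only). This is an illustration, not the authors' R code: the function names are hypothetical, a positive conditional mean is assumed, and the falling factorial ratio $(n_t + x - 1)_{(x)}/x!$ is evaluated through log-gamma functions so that non-integer $n_t$ is handled.

```python
import math

def poi_pmf(x, mean):
    """Poisson probability e^{-m} m^x / x!, computed on the log scale (mean > 0)."""
    return math.exp(-mean + x * math.log(mean) - math.lgamma(x + 1))

def nb_pmf(x, n, pi):
    """NB probability (n+x-1)_(x) / x! * (1-pi)^x * pi^n for real n > 0.

    Uses (n+x-1)_(x) = Gamma(n+x) / Gamma(n).
    """
    log_p = (math.lgamma(n + x) - math.lgamma(n) - math.lgamma(x + 1)
             + x * math.log(1.0 - pi) + n * math.log(pi))
    return math.exp(log_p)

def zip_pmf(x, lam, omega):
    """Zero-inflated Poisson: omega * 1{x=0} + (1-omega) * Poi(lam) probability."""
    return omega * (x == 0) + (1.0 - omega) * poi_pmf(x, lam)

def inarch_cond_mean(past, beta, alphas):
    """m_t = beta + sum_i alpha_i * x_{t-i}; past[0] is x_{t-1}, past[1] is x_{t-2}, ..."""
    return beta + sum(a * xp for a, xp in zip(alphas, past))

def inarch_cond_pmf(x, past, beta, alphas, family="poi", pi=None, omega=None):
    """Conditional probability P(X_t = x | past) for the three INARCH variants in (5)."""
    m = inarch_cond_mean(past, beta, alphas)
    if family == "poi":
        return poi_pmf(x, m)
    if family == "nb":
        return nb_pmf(x, m * pi / (1.0 - pi), pi)    # n_t = m_t * pi / (1 - pi)
    if family == "zip":
        return zip_pmf(x, m / (1.0 - omega), omega)  # lambda_t = m_t / (1 - omega)
    raise ValueError("family must be 'poi', 'nb' or 'zip'")
```

With these parametrizations, all three variants share the conditional mean $m_t$; for the NB case the ratio of conditional variance to conditional mean equals $1/\pi$, in line with the dispersion index stated above.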

In Sections 2 to 4, we also consider types of integer-valued autoregressive (INAR) models of order 1. The INAR(1) model is defined by the recursion $X_t = \alpha \circ X_{t-1} + \epsilon_t$, where the binomial thinning operator "$\alpha\,\circ$" is defined by the conditional distribution $\alpha \circ X \mid X \sim \mathrm{Bin}(X, \alpha)$, and where the innovations $(\epsilon_t)_{\mathbb{N}}$ are i. i. d. count random variables. We consider either Poisson or negative binomial or zero-inflated Poisson innovations, $\epsilon_t \sim \mathrm{Poi}(\lambda)$ or $\epsilon_t \sim \mathrm{NB}(n, \pi)$ or $\epsilon_t \sim \mathrm{ZIP}(\lambda, \omega)$, leading to the Poi- or NB- or ZIP-INAR(1) model, respectively. The corresponding conditional probabilities are

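$$
\begin{aligned}
P(X_t = x \mid x_{t-1}, \ldots) &= \sum_{j=0}^{\min\{x,\, x_{t-1}\}} \binom{x_{t-1}}{j}\, \alpha^{j}\, (1-\alpha)^{x_{t-1}-j}\; P(\epsilon_t = x-j), \\[1ex]
\text{where}\quad P(\epsilon_t = x-j) &=
\begin{cases}
\dfrac{e^{-\lambda}\, \lambda^{x-j}}{(x-j)!} & \text{if Poi-INAR(1)}, \\[2ex]
\dfrac{(n + x - j - 1)_{(x-j)}}{(x-j)!}\, (1-\pi)^{x-j}\, \pi^{n} & \text{if NB-INAR(1)}, \\[2ex]
\omega\, \mathbb{1}\{x-j = 0\} + (1-\omega)\, \dfrac{e^{-\lambda}\, \lambda^{x-j}}{(x-j)!} & \text{if ZIP-INAR(1)}.
\end{cases}
\end{aligned}
\tag{6}
$$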

Note that the Poi-INAR(1) model also has a Poisson marginal distribution, $X_t \sim \mathrm{Poi}\big(\frac{\lambda}{1-\alpha}\big)$, whereas the marginal distributions of the NB- and ZIP-INAR(1) models (as well as of all INARCH models) are not known explicitly.
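
The convolution in (6) can be illustrated with a minimal Python sketch (standard library only). The function names are hypothetical, and the innovation distribution is passed in as a function so that any of the three cases in (6) can be plugged in; only the Poisson case is spelled out here.

```python
import math

def poi_pmf(k, lam):
    """Poisson probability e^{-lam} lam^k / k! (lam > 0)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def binom_pmf(j, n, alpha):
    """Binomial thinning probability P(alpha o n = j)."""
    return math.comb(n, j) * alpha**j * (1.0 - alpha)**(n - j)

def inar1_trans_prob(x, x_prev, alpha, innov_pmf):
    """P(X_t = x | X_{t-1} = x_prev): survivors j from thinning plus innovation x - j."""
    return sum(binom_pmf(j, x_prev, alpha) * innov_pmf(x - j)
               for j in range(min(x, x_prev) + 1))

# Usage: Poi-INAR(1) with illustrative parameter values (not taken from the article).
lam, alpha = 1.5, 0.4
p_3_given_5 = inar1_trans_prob(3, 5, alpha, lambda k: poi_pmf(k, lam))
```

As a numerical plausibility check, averaging these transition probabilities over $\mathrm{Poi}\big(\frac{\lambda}{1-\alpha}\big)$-distributed values of $x_{t-1}$ reproduces the same Poisson probabilities, which is the stationary marginal distribution of the Poi-INAR(1) model noted above.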

Appendix B

About the simulations

The simulation studies of the present article were carried out in R (R Core Team 2018). In contrast to previous works like Emiliano, Vivanco, and de Menezes (2014) and Rinke and Sibbertsen (2016), we used as many as 10,000 replications per simulated scenario. The time series generated according to the data-generating process (DGP) were then used to fit all of the specified candidate models via maximum likelihood (ML) estimation, where we computed the required log-likelihood based on formulae (5) and (6). For the autoregressive models of order p, we actually used a conditional ML approach, i.e. we maximized the log-likelihood conditioned on the first p observations. To correct for this reduced number of unconditional observations, we followed the suggestion in Weiß (2018) and multiplied the maximized log-likelihood by the factor T/(T − p); then we computed the values of AIC and BIC for each candidate model.
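
The conditional log-likelihood and its correction can be sketched as follows for the Poi-INARCH(p) case. This is a Python illustration under stated assumptions, not the authors' R implementation: the function names are hypothetical, AIC and BIC are taken in their textbook forms $-2\ell + 2k$ and $-2\ell + k \ln T$ (with $k$ the number of parameters), and the use of the full length $T$ in the BIC penalty is an assumption.

```python
import math

def poi_inarch_cond_loglik(xs, beta, alphas):
    """Log-likelihood of Poi-INARCH(p), conditioned on the first p observations."""
    p = len(alphas)
    ll = 0.0
    for t in range(p, len(xs)):
        # m_t = beta + alpha_1 * x_{t-1} + ... + alpha_p * x_{t-p}
        m = beta + sum(alphas[i] * xs[t - 1 - i] for i in range(p))
        ll += -m + xs[t] * math.log(m) - math.lgamma(xs[t] + 1)
    return ll

def corrected_loglik(cond_loglik, T, p):
    """Rescale a log-likelihood based on T - p terms by the factor T / (T - p)."""
    return cond_loglik * T / (T - p)

def aic(loglik, n_params):
    """AIC in its textbook form (assumed here)."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, T):
    """BIC in its textbook form, penalty based on the full length T (assumed here)."""
    return -2.0 * loglik + n_params * math.log(T)

# Usage with a toy series and illustrative parameter values (not ML estimates).
xs = [2, 0, 3, 1, 4, 2, 1, 0, 2, 3]
ll = corrected_loglik(poi_inarch_cond_loglik(xs, 1.0, [0.5]), T=len(xs), p=1)
criteria = (aic(ll, n_params=2), bic(ll, n_params=2, T=len(xs)))
```

In an actual fit, the conditional log-likelihood would first be maximized over the parameters, and the correction would be applied to the maximized value before the criteria are computed.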

To get a comprehensive picture about the information criteria's model selection abilities, we fixed the length $T$ as well as marginal properties per given scenario, but we randomly drew the autoregressive parameters from the possible range (in case of the autoregressive INAR and INARCH models). More precisely, for order $p = 1$, we drew $\alpha_1$ (or $\alpha$, respectively) from a uniform distribution on the interval (0.1; 0.9) ⊂ (0; 1) (we avoided more extreme values of $\alpha_1$ to circumvent computational problems). Similarly, $\alpha_1, \ldots, \alpha_p$ were drawn considering the stationarity condition $\sum_{i=1}^{p} \alpha_i < 1$ together with the additional constraint $0.1 < \alpha_i < 0.9$. To ensure that the DGP (nearly) reached its stationary state, we always used a prerun of length 250.
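
A possible way to draw such parameters and to generate a series with a prerun is sketched below in Python (standard library only) for the Poi-INARCH(p) case. Several points are assumptions rather than the authors' procedure: the function names are hypothetical, the constrained draw is done by simple rejection sampling, the marginal mean $\mu$ is fixed via the stationary mean relation $\mu = \beta / (1 - \sum_i \alpha_i)$, and Poisson variates are generated by inversion, which is adequate only for moderate means.

```python
import math
import random

def draw_alphas(p, rng, low=0.1, high=0.9):
    """Rejection sampling: uniform draws on (low, high), kept only if they sum to < 1."""
    while True:
        alphas = [rng.uniform(low, high) for _ in range(p)]
        if sum(alphas) < 1.0:
            return alphas

def beta_from_mean(mu, alphas):
    """Intercept giving stationary mean mu, from mu = beta / (1 - sum(alphas))."""
    return mu * (1.0 - sum(alphas))

def rpois(mean, rng):
    """Poisson draw by inversion (sequential search over the CDF)."""
    u, k = rng.random(), 0
    prob = math.exp(-mean)
    cdf = prob
    while u > cdf and prob > 0.0:   # 'prob > 0' guards against rounding stalls
        k += 1
        prob *= mean / k
        cdf += prob
    return k

def simulate_poi_inarch(T, beta, alphas, rng, prerun=250):
    """Simulate Poi-INARCH(p); the first 'prerun' generated values are discarded."""
    p = len(alphas)
    mu = beta / (1.0 - sum(alphas))
    xs = [rpois(mu, rng) for _ in range(p)]          # start-up values
    for _ in range(prerun + T):
        m = beta + sum(alphas[i] * xs[-1 - i] for i in range(p))
        xs.append(rpois(m, rng))
    return xs[-T:]

# Usage: one series of length 100 with marginal mean 2.5 and random order-2 dependence.
rng = random.Random(2019)
alphas = draw_alphas(2, rng)
series = simulate_poi_inarch(100, beta_from_mean(2.5, alphas), alphas, rng)
```

Rejection sampling keeps the accepted parameter vectors uniformly distributed on the constrained region, at the cost of discarding more draws as $p$ grows.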

Considering these issues, the architecture of our simulation codes is as follows: Having specified the candidate models and the DGP,

  1. we simulated 10,000 time series according to the DGP,

  2. we fitted the candidate models by ML estimation and stored the resulting maximized log-likelihood values,

  3. we computed the considered information criteria and used these for model selection.
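
The three steps above can be sketched as a small Python skeleton (standard library only). It is a toy version under stated assumptions, not the authors' R code: all names are hypothetical, the DGP is i. i. d. Poisson, the two candidates are i. i. d. Poisson and Poi-INARCH(1), the INARCH(1) fit uses a crude grid search as a stand-in for a proper numerical optimizer, and only a handful of replications are run.

```python
import math
import random

def rpois(mean, rng):
    """Poisson draw by inversion; adequate for moderate means."""
    u, k = rng.random(), 0
    prob = math.exp(-mean)
    cdf = prob
    while u > cdf and prob > 0.0:
        k += 1
        prob *= mean / k
        cdf += prob
    return k

def pois_loglik(xs, means):
    """Sum of Poisson log-probabilities of xs with (positive) means."""
    return sum(-m + x * math.log(m) - math.lgamma(x + 1) for x, m in zip(xs, means))

def fit_iid_poisson(xs):
    """Exact ML fit: the MLE of the mean is the sample mean. Returns (loglik, k)."""
    mean = max(sum(xs) / len(xs), 1e-8)
    return pois_loglik(xs, [mean] * len(xs)), 1

def fit_poi_inarch1(xs, n_alpha=20, mu_factors=(0.8, 0.9, 1.0, 1.1, 1.2)):
    """Approximate conditional ML fit by grid search over (alpha, mu), beta = mu*(1-alpha)."""
    T = len(xs)
    xbar = max(sum(xs) / T, 1e-8)
    best = -math.inf
    for i in range(n_alpha + 1):
        alpha = 0.95 * i / n_alpha
        for f in mu_factors:
            beta = f * xbar * (1.0 - alpha)
            means = [beta + alpha * xs[t - 1] for t in range(1, T)]
            best = max(best, pois_loglik(xs[1:], means))
    return best * T / (T - 1), 2        # correction factor T/(T-p) with p = 1

def run_study(simulate, candidates, n_rep, rng):
    """Count how often each candidate attains the minimal AIC and the minimal BIC."""
    counts = {crit: {name: 0 for name in candidates} for crit in ("AIC", "BIC")}
    for _ in range(n_rep):
        xs = simulate(rng)                                            # step 1
        T = len(xs)
        fits = {name: fit(xs) for name, fit in candidates.items()}    # step 2
        aic = {name: -2.0 * ll + 2.0 * k for name, (ll, k) in fits.items()}
        bic = {name: -2.0 * ll + k * math.log(T) for name, (ll, k) in fits.items()}
        counts["AIC"][min(aic, key=aic.get)] += 1                     # step 3
        counts["BIC"][min(bic, key=bic.get)] += 1
    return counts

# Usage: 20 replications of length T = 100 from an i.i.d. Poisson DGP with mean 2.
candidates = {"iid Poisson": fit_iid_poisson, "Poi-INARCH(1)": fit_poi_inarch1}
result = run_study(lambda r: [rpois(2.0, r) for _ in range(100)],
                   candidates, n_rep=20, rng=random.Random(1))
print(result)
```

Passing the DGP and the candidate fits as functions keeps the loop itself independent of the scenario, so other model families would only require further fit functions returning the (corrected) maximized log-likelihood and the number of parameters.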

The full R codes are available from the authors upon request.


The online version of this article offers supplementary material (DOI: https://doi.org/10.1515/snde-2018-0012).

Published Online: 2019-05-09

