The dynamic study of count data arises in many applications (modeling and forecasting the number of patients infected by a disease over time, the daily number of certain financial transactions, the monthly number of firms defaulting on payments...) and requires time series models that are both integer-valued and sufficiently parsimonious. This twofold constraint calls for a specific approach, and it is often harder to define analogues of the models used for real-valued time series. In this talk, we will begin by presenting the best-known models in this area, in particular integer-valued autoregressive processes (INAR, Poisson GARCH...). In a second part, we will discuss a new approach for defining non-stationary extensions of these models, in which the parameters are allowed to evolve over time. Finally, when the state space is bounded, we will discuss some models (logistic or multinomial autoregressive) suited more generally to the study of categorical-valued time series.
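As a minimal sketch of the INAR family mentioned above, the following simulates a first-order INAR process by binomial thinning. The order 1, the Poisson innovations, and all function names and parameter values are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the small intensities used in this sketch.
    """
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def thin(rng, count, alpha):
    """Binomial thinning: each of `count` units survives with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_inar1(n, alpha, lam, seed=0, burn_in=200):
    """Simulate X_t = alpha o X_{t-1} + eps_t, with eps_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    # With Poisson innovations the stationary marginal is Poisson(lam / (1 - alpha)).
    x = poisson_draw(rng, lam / (1.0 - alpha))
    path = []
    for t in range(n + burn_in):
        x = thin(rng, x, alpha) + poisson_draw(rng, lam)
        if t >= burn_in:
            path.append(x)
    return path


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation; close to alpha for a long INAR(1) path."""
    mean = sum(x) / len(x)
    num = sum((a - mean) * (b - mean) for a, b in zip(x[1:], x[:-1]))
    den = sum((a - mean) ** 2 for a in x)
    return num / den
```

The simulated path stays integer-valued by construction, and the model has only two parameters, which illustrates the twofold constraint (integer values and parsimony) described in the abstract.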
Consider an integer-valued stochastic process $\{X_t, t∈\mathbb{Z}\}$. Assume a parametric form for the conditional mean: \begin{align*} E(X_t \mid X_{t-1}, X_{t-2}, \ldots) &= λ(X_{t-1}, X_{t-2}, \ldots; θ_0) = λ_t(θ_0) = λ_t, \quad t∈\mathbb{Z}. \end{align*} For most count time series models, in particular the Poisson INteger GARCH (INGARCH), the Negative Binomial INGARCH and the INteger AR (INAR), this conditional mean has a linear form. Estimating $θ_0$ is of primary importance, in particular for predicting the future values of $X_t$. The maximum-likelihood estimator (MLE) is often readily computable (except for parameter-driven models like the INAR model), but it requires specifying a conditional distribution (for instance Poisson or Negative Binomial). In practice, the choice of the conditional distribution is an issue. There is in fact no natural choice for the conditional distribution, or even for the conditional variance $v_t$. For example, the choice of the Poisson distribution with intensity $λ_t$ entails $v_t=λ_t$, and is thus questionable since it has been empirically observed that numerous count time series exhibit conditional overdispersion (see e.g. Christou and Fokianos, 2014). Moreover, the choice of a wrong conditional distribution may affect the efficiency, or even the consistency, of the misspecified MLE of $θ_0$. In the present work, we focus on the estimation of the parameter $θ_0$ of the conditional mean, without assuming a specific form for the conditional distribution of the observations (such as the Poisson distribution). In particular, we are interested in estimators that remain consistent even if the conditional variance is misspecified. An example of such a misspecification-consistent estimator is the Poisson QMLE (PQMLE). 
This estimator coincides with the MLE when the conditional distribution of the observations is Poisson $P(λ_t)$, but the PQMLE is consistent and asymptotically normal (CAN) for a much broader class of conditional distributions (see Ahmad and Francq, 2016). However, this estimator is in general inefficient when $v_t≠λ_t$. Motivated by the existence of overdispersed series for which $v_t>λ_t$, Aknouche, Bendjeddou and Touche (2018) studied the profile Negative Binomial QMLE (NBQMLE). This estimator is also consistent for estimating $θ_0$ under very mild regularity conditions, but may be inefficient. We propose and study alternative weighted least squares estimators (WLSE), which enjoy the same consistency property as the PQMLE and NBQMLE when the conditional distribution is misspecified, but gain in efficiency when $v_t$ is well specified. The asymptotic and finite-sample properties of these estimators have been studied. Compared with the above-mentioned QMLEs, the WLSE has the advantages of 1) being more efficient in some situations; 2) being asymptotically efficient when the conditional distribution belongs to the linear exponential family; 3) having a standard asymptotic normal distribution even when one or several coefficients of the conditional mean are equal to zero; 4) being explicit and requiring no optimisation routine in INARCH models.
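To illustrate why a WLSE can be explicit in INARCH models, here is a minimal sketch for an INARCH(1) conditional mean $λ_t = ω + α X_{t-1}$. Because $λ_t$ is linear in $(ω, α)$, minimising $\sum_t (X_t - λ_t)^2 / w_t$ for fixed weights reduces to a 2x2 linear system. The two-stage choice of weights below (unit weights first, then the fitted conditional mean as a variance proxy) is an assumption made for this example and is not taken from the abstract; the function names and the simulated Poisson data are likewise hypothetical.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_inarch1(n, omega, alpha, seed=0, burn_in=200):
    """Simulate X_t | past ~ Poisson(omega + alpha * X_{t-1})."""
    rng = random.Random(seed)
    x = int(round(omega / (1.0 - alpha)))  # start near the stationary mean
    path = []
    for t in range(n + burn_in):
        x = poisson_draw(rng, omega + alpha * x)
        if t >= burn_in:
            path.append(x)
    return path


def wls_inarch1(x, weights):
    """Closed-form minimiser of sum_t (x_t - omega - alpha * x_{t-1})^2 / w_t."""
    s00 = s01 = s11 = b0 = b1 = 0.0
    for prev, cur, w in zip(x[:-1], x[1:], weights):
        s00 += 1.0 / w
        s01 += prev / w
        s11 += prev * prev / w
        b0 += cur / w
        b1 += prev * cur / w
    # Solve the 2x2 normal equations [[s00, s01], [s01, s11]] (omega, alpha)' = (b0, b1)'.
    det = s00 * s11 - s01 * s01
    omega = (s11 * b0 - s01 * b1) / det
    alpha = (s00 * b1 - s01 * b0) / det
    return omega, alpha


def two_stage_wls(x, floor=1e-6):
    """Stage 1: unit weights. Stage 2: weights set to the fitted conditional mean.

    Using the fitted mean as the weight is an illustrative assumption
    (it matches the Poisson case v_t = lambda_t); other variance proxies are possible.
    """
    n_pairs = len(x) - 1
    omega1, alpha1 = wls_inarch1(x, [1.0] * n_pairs)
    weights = [max(omega1 + alpha1 * prev, floor) for prev in x[:-1]]
    return wls_inarch1(x, weights)
```

For instance, `two_stage_wls(simulate_inarch1(5000, 2.0, 0.5))` should return values close to the true $(2.0, 0.5)$, up to sampling error. No numerical optimiser is involved at either stage.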
Count time series analysis is nowadays an active area of time series research. Numerous models and methods have recently been introduced to account for the main characteristics exhibited by count time series observed in practice, such as overdispersion, small values, an excess of zeros, locally constant behavior and asymmetric marginal distributions. Following the fundamental classification by Cox (1981), a time series model is called observation-driven or parameter-driven depending on whether the corresponding conditional distribution is specified conditionally on an observable or an unobservable process. Since count time series models are almost always related to the Poisson process with a given parametric intensity, Cox's categorization may reduce, for integer-valued models, to the following dichotomy: observable intensity models versus unobservable intensity models. As is well known, observable intensity models, which include in particular integer-valued GARCH (INGARCH) processes, are reputed to be easy to estimate, particularly by maximum likelihood-type methods, but their probabilistic structure (e.g., ergodicity, existence of moments) is quite difficult to establish. In contrast, unobservable intensity models generally have a simple structure and offer great flexibility in representing serial autocorrelation. However, their estimation by the maximum likelihood method is computationally very demanding, if not infeasible. Moreover, unlike observable intensity models (e.g., INGARCH models), unobservable intensity models do not admit a weak ARMA representation. Nevertheless, unobservable intensity models, as a class of parameter-driven models, may in principle be estimated by filtering-based methods such as Bayesian MCMC and EM-type algorithms. In this work we propose a class of unobservable intensity models that we call integer-valued stochastic volatility (INSV). 
The corresponding conditional distribution is a Poisson mixture where the logarithm of the intensity follows a drifted Gaussian autoregression. The proposed model allows for a wide range of possible conditional distributions such as Poisson, negative binomial, double Poisson, etc. In the pure Poisson case, the INSV model is the discrete-valued analog of the stochastic volatility (SV) model proposed by Taylor (1982). Our model may be viewed as an alternative to the observation-driven Poisson INGARCH model, in which the intensity depends only on the past of the process. Similar unobserved intensity models have been proposed earlier, but they differ from our model in their parametrization. We first study the probabilistic structure of the INSV model, including ergodicity, covariance structure and existence of moments. Parameter estimation is then carried out using the Bayesian Griddy-Gibbs sampler in both the Poisson and negative binomial cases. In particular, the autoregressive parameters are sampled using conjugate priors, while the unobservable intensity and the variance of the mixing process are sampled indirectly using the Griddy-Gibbs scheme. The unobserved intensities are sampled element by element in the spirit of Jacquier et al. (1994), while model selection is performed using the Deviance Information Criterion. An application to Bayesian intensity forecasting on simulated and real count series is given.
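The structure of the INSV model in the pure Poisson case can be sketched by simulation. The code below assumes a first-order autoregression $\log λ_t = φ_0 + φ_1 \log λ_{t-1} + σ e_t$ with standard Gaussian $e_t$, and $X_t \mid λ_t \sim$ Poisson$(λ_t)$. The order 1, the timing of the innovation, the symbols $φ_0, φ_1, σ$ and the function names are assumptions for illustration; the abstract does not give the exact parametrization.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the moderate intensities produced by the parameters used here.
    """
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_insv(n, phi0, phi1, sigma, seed=0, burn_in=200):
    """Simulate counts whose log-intensity follows a drifted Gaussian AR(1)."""
    rng = random.Random(seed)
    log_lam = phi0 / (1.0 - phi1)  # start at the stationary mean of log-intensity
    counts, intensities = [], []
    for t in range(n + burn_in):
        log_lam = phi0 + phi1 * log_lam + sigma * rng.gauss(0.0, 1.0)
        lam = math.exp(log_lam)
        x = poisson_draw(rng, lam)
        if t >= burn_in:
            counts.append(x)
            intensities.append(lam)
    return counts, intensities


def stationary_mean(phi0, phi1, sigma):
    """E[X_t] = E[lambda_t] = exp(m + s2 / 2) for a log-normal intensity.

    m and s2 are the stationary mean and variance of the Gaussian AR(1).
    """
    m = phi0 / (1.0 - phi1)
    s2 = sigma ** 2 / (1.0 - phi1 ** 2)
    return math.exp(m + 0.5 * s2)
```

Since $\mathrm{Var}(X_t) = E(λ_t) + \mathrm{Var}(λ_t)$ for a Poisson mixture, a long simulated path should show a sample variance larger than its sample mean, that is, marginal overdispersion, even though the conditional distribution given $λ_t$ is Poisson.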
Integer-valued time series modeling has evolved considerably in recent decades, with the integer-valued generalized autoregressive conditionally heteroskedastic (INGARCH) process playing an important role. Statistical analysis of the INGARCH model has mainly been based on quasi-maximum likelihood estimates, which are computationally simple and have good asymptotic properties. However, in finite samples, these estimates may be less efficient than estimates based on the Bayesian approach, which allows reproducibility and makes simulation-based forecasting appealing. This work proposes a Bayesian MCMC estimate for INGARCH models under three conditional distributions: the Poisson distribution, the negative binomial distribution and the quasi-Poisson distribution. The estimate is based on the Griddy-Gibbs sampler under diffuse priors. Model selection is carried out using the Deviance Information Criterion, and some MCMC diagnostic tools are presented to assess the performance of the proposed estimate. The proposed estimate is compared with the Poisson QMLE (Ahmad and Francq, 2016) and the negative binomial QMLE (Aknouche et al., 2018) under the three distributions mentioned. A Bayesian in-sample and out-of-sample forecasting procedure is proposed for some real count time series data.
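The Griddy-Gibbs idea can be sketched as follows for a Poisson INGARCH(1,1) with $λ_t = ω + α X_{t-1} + β λ_{t-1}$: each parameter is updated in turn by evaluating its conditional posterior on a grid, building an approximate CDF by the trapezoid rule, and drawing by inverse transform. This is a simplified illustration, not the procedure of the work described above. The flat priors on the region $α + β < 1$, the bound `omega_max`, the grid sizes, the initialisation of $λ_t$ at the sample mean and all function names are assumptions.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_ingarch11(n, omega, alpha, beta, seed=0, burn_in=200):
    """Simulate X_t | past ~ Poisson(lam_t), lam_t = omega + alpha*X_{t-1} + beta*lam_{t-1}."""
    rng = random.Random(seed)
    lam = omega / (1.0 - alpha - beta)
    x = int(round(lam))
    path = []
    for t in range(n + burn_in):
        lam = omega + alpha * x + beta * lam
        x = poisson_draw(rng, lam)
        if t >= burn_in:
            path.append(x)
    return path


def poisson_loglik(x, omega, alpha, beta, lam0):
    """Poisson log-likelihood up to an additive constant (the log x_t! terms)."""
    lam, prev, ll = lam0, x[0], 0.0
    for cur in x[1:]:
        lam = omega + alpha * prev + beta * lam
        ll += cur * math.log(lam) - lam
        prev = cur
    return ll


def linspace(lo, hi, m):
    step = (hi - lo) / (m - 1)
    return [lo + i * step for i in range(m)]


def griddy_draw(rng, grid, log_post):
    """Inverse-CDF draw from a density known up to a constant at the grid points."""
    top = max(log_post)
    dens = [math.exp(lp - top) for lp in log_post]
    cdf = [0.0]  # trapezoid-rule CDF, left unnormalised
    for i in range(1, len(grid)):
        cdf.append(cdf[-1] + 0.5 * (dens[i] + dens[i - 1]) * (grid[i] - grid[i - 1]))
    u = rng.random() * cdf[-1]
    for i in range(1, len(grid)):
        if u <= cdf[i]:
            width = cdf[i] - cdf[i - 1]
            frac = (u - cdf[i - 1]) / width if width > 0 else 0.0
            return grid[i - 1] + frac * (grid[i] - grid[i - 1])
    return grid[-1]


def griddy_gibbs(x, n_iter=200, n_grid=25, seed=0, omega_max=10.0):
    """Cycle through omega, alpha, beta; flat priors restricted to alpha + beta < 1."""
    rng = random.Random(seed)
    lam0 = sum(x) / len(x)  # assumed initial intensity
    omega, alpha, beta = 1.0, 0.2, 0.2
    draws = []
    for _ in range(n_iter):
        grid = linspace(0.01, omega_max, n_grid)
        omega = griddy_draw(rng, grid, [poisson_loglik(x, g, alpha, beta, lam0) for g in grid])
        grid = linspace(0.001, max(0.001, 0.999 - beta), n_grid)
        alpha = griddy_draw(rng, grid, [poisson_loglik(x, omega, g, beta, lam0) for g in grid])
        grid = linspace(0.001, max(0.001, 0.999 - alpha), n_grid)
        beta = griddy_draw(rng, grid, [poisson_loglik(x, omega, alpha, g, lam0) for g in grid])
        draws.append((omega, alpha, beta))
    return draws
```

Posterior summaries can then be computed from `draws` after discarding an initial burn-in portion. The stationarity constraint is imposed through the grid bounds rather than through a rejection step, which keeps every conditional draw inside the admissible region.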