International Journal of Statistics and Applied Mathematics 2018; 3(5): 20-27 On skew generalized extreme value-ARMA model: An application to average monthly temperature (1901- 2016) in Nigeria

Rasaki Olawale

ISSN: 2456-1452
Maths 2018; 3(5): 20-27
© 2018 Stats & Maths
www.mathsjournal.com
Received: 03-07-2018
Accepted: 04-08-2018

Olanrewaju O Rasaki, Department of Statistics, University of Ibadan, Ibadan, Nigeria
Oseni Ezekiel, Department of Banking and Finance, University of Lagos, Lagos, Nigeria
Adekola Lanrewaju Olumide, Department of Physical Sciences, Bells University of Technology, Ota, Nigeria
Oyinloye Adedeji Adigun, Department of Statistics, University of Ibadan, Ibadan, Nigeria

Abstract
This study describes an approach for modeling extreme and lengthy time-varying series with an Autoregressive Moving Average of order (p, q) whose white noise follows a Skew Generalized Extreme Value (SGEV) distribution. The approach establishes the procedure for estimating the parameters of the SGEV-ARMA (p, q) model and their standard errors via the iterative Fisher information scores derived from Maximum Likelihood Estimation, for a chosen optimal degree of flexibility (bandwidth) $\delta$. The method was applied to a lengthy series of average monthly temperature (reported in °C) of Lagos, Nigeria, from January 1901 to December 2016, with 1381 data points. The SGEV-ARMA (3, 3) recorded lower model performance errors on the AIC, BIC and HQIC indexes (103.02, 141.35 and 124.50 respectively) than the Gaussian white noise ARMA (3, 3) (108, 144.4 and 129.26 respectively). In addition, the forecast error indexes under the SGEV white noise were smaller than those under the Gaussian white noise.
Keywords: Autoregressive moving average, bandwidth, maximum likelihood estimation, skew generalized extreme value, temperature, white noise

Correspondence: Olanrewaju O Rasaki, Department of Statistics, University of Ibadan, Ibadan, Nigeria

1. Introduction
Extreme value theory took its course from Gnedenko (1943) [10], who studied the maxima of series of Gaussian variables under a general hypothesis on the limiting distribution. That limiting distribution, the Generalized Extreme Value (GEV) distribution, applies to series of extremes, lengthy series, lengthy observations, ecological observations, climate observations and so on. Its use was extended to unusual or usual events, regardless of whether or not they are catastrophic, and to events that cause catastrophes (Farago and Katz, 1990 [6]; Faranda et al., 2012 [7]). However, its development can be traced back to the work of Bernoulli (1975) [3], as noted by Kotz and Nadarajah (2000) [11], and its first application was made by Fuller in 1914 [8]. The theory is based on large deviations from the median of probability distributions, so that it assesses the type of probability distribution generated by processes. Rieder (2014) [17] affirmed that the limiting distributions (which are distinct from the normal distribution) are the Extreme Value Distributions (EVD) for maximum, minimum or extreme lengthy contaminated series, or for collections of observations of either dependent random variables or covariates.

Extreme value theory is widely used in modeling phenomena in disciplines such as structural hydrology, meteorology, engineering, finance, earth sciences, traffic prediction and risk management. Estimates of extreme precipitation are routinely used in planning infrastructure such as dams and in flood frequency forecasting. Engineers often need such statistics for the design of flood protection structures, using Areal Reduction Factors (ARFs) to convert quantiles of point rainfall to the corresponding quantiles of areal rainfall.
ARFs have been derived empirically by estimating the areal rainfall as a function of point rainfall measurements, e.g. Natural Environment Research Council (NERC) (1975) [13] and Bell (1976) [2], or by statistical modeling (Bacchi and Ranzi, 1996) [1].

Three approaches exist for the practical application of extreme value theory. The first derives block maxima (or minima) series as a preliminary step; it relies partly on the Fisher-Tippett-Gnedenko theorem, which leads to the Generalized Extreme Value Distribution (GEVD) being selected for fitting. The second extracts "Peak over Threshold" (POT) values, that is, peak values above or below a certain threshold, from a continuous record. The third tries to strike a balance between the block maxima and peak over threshold approaches via the r-th largest order statistics (Ragulina and Reitan, 2017) [16].

Chavez-Demoulin and Davison (2012) [5] studied monthly maximum river flow at the station Muota-Ingenbohl in Switzerland for the years 1923-2008 by fitting a nonparametric GEV with a time-dependent location parameter, where a smoothing parameter plays the same role as the bandwidth in local likelihood estimation of the random effects model. In addition, Laurini and Pauli (2009) [12] applied it using Bayesian computational tools. Ning and Bloomfield (2017) [4] analysed a water flow dataset collected from the French Broad River at Asheville, North Carolina. The dataset contains the annual maximum water flow level from 1941 to 2009, and the Dependent Generalized Extreme Value (DGEV) model was used as the white noise of the autoregressive model to explain the water flow phenomena.
Hence, this article adopts the Skew Generalized Extreme Value (SGEV) distribution as the stochastic error distribution of the Autoregressive Moving Average (ARMA) model, taken as a deterministic time-varying model for trend or non-trend effects. It uses the iterative Fisher information scores, via the maximum likelihood method of estimation, to estimate the parameters' standard errors for a chosen optimal degree of flexibility (bandwidth) $\delta$.

2. Specification of the skew generalized extreme value-autoregressive moving average (SGEV-ARMA)
A stochastic process $y_t$ is said to follow an Autoregressive Moving Average (ARMA) of order (p, q) if it satisfies

$$y_t = \phi_0 + \sum_{i=1}^{p}\phi_i y_{t-i} + \sum_{i=0}^{q}\theta_i \varepsilon_{t-i} \tag{1}$$

where $\varepsilon_t$ is independently and identically distributed with mean zero and variance $\sigma_\varepsilon^2$. The standard Gaussian white noise of zero mean and variance $\sigma_\varepsilon^2$ for $\varepsilon_t$ is forgone and replaced by the Skew Generalized Extreme Value (SGEV) distribution with mean and variance

$$E(y_t) = \mu - \frac{\sigma}{\xi}\left[1 - \Gamma(1-\xi)\right]; \qquad v(y_t) = \frac{\sigma^2}{\xi^2}\left[\Gamma(1-2\xi) - \Gamma^2(1-\xi)\right]$$

where $\Gamma$ is the Gamma function with identity $\Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\,dx$ for $\alpha > 0$; $\mu$ is the location parameter of the GEV; $\sigma$ is the scale parameter; $\xi$ is the shape parameter; and $\delta$ is the degree of flexibility (bandwidth) measure.

With $\theta_0 = 1$ in equation (1), $y_t$ will be stable and equals

$$\left(1 - \sum_{i=1}^{p}\phi_i L^i\right) y_t = \phi_0 + \sum_{i=0}^{q}\theta_i \varepsilon_{t-i} \tag{2}$$

So, equation (1) becomes

$$y_t = \phi_0 + \sum_{i=1}^{p}\phi_i y_{t-i} + \sum_{i=0}^{q}\theta_i \varepsilon_{t-i}, \qquad \varepsilon_t \sim \mathrm{SGEV} \tag{3}$$

that is, SGEV-ARMA (p, q). Ribereau et al. (2011) [15] defined the Probability Density Function (PDF) of a random variable $y_t$ with Skew Generalized Extreme Value (SGEV) distribution to be

$$f(y_t) = (\delta + 1)\, g(y_t)\, G^{\delta}(y_t) \tag{4}$$

where $g(y_t)$ and $G(y_t)$ are the PDF and Cumulative Distribution Function (CDF) of the Generalized Extreme Value (GEV) distribution respectively, and $\delta$ must be strictly greater than -1.

$$G(y_t) = \exp\left\{-\left[1 + \xi\left(\frac{y_t - \mu}{\sigma}\right)\right]^{-\frac{1}{\xi}}\right\} \tag{5}$$

$$g(\mu, \sigma, \xi; y_t) = \frac{1}{\sigma}\left[1 + \xi\left(\frac{y_t - \mu}{\sigma}\right)\right]^{-\frac{1}{\xi}-1} \exp\left\{-\left[1 + \xi\left(\frac{y_t - \mu}{\sigma}\right)\right]^{-\frac{1}{\xi}}\right\} \tag{6}$$

So,

$$f(\sigma, \xi; y_t) = \frac{\delta + 1}{\sigma}\left[1 + \xi\left(\frac{y_t - \mu}{\sigma}\right)\right]^{-\frac{1}{\xi}-1} \exp\left\{-(\delta + 1)\left[1 + \xi\left(\frac{y_t - \mu}{\sigma}\right)\right]^{-\frac{1}{\xi}}\right\} \tag{7}$$

such that $\sigma > 0$ and $\xi \neq 0$.

3. Parameter Estimation
Given that $y_t = \phi_0 + \sum_{i=1}^{p}\phi_i y_{t-i} + \sum_{i=0}^{q}\theta_i \varepsilon_{t-i}$ follows SGEV$(E(y_t), v(y_t))$ with order (p, q), substituting the ARMA expression for $y_t$ in (7) gives

$$f(\phi, \theta, \sigma, \xi; y) = \frac{\delta + 1}{\sigma}\, z_t^{-\frac{1}{\xi}-1} \exp\left\{-(\delta + 1)\, z_t^{-\frac{1}{\xi}}\right\}, \qquad z_t = 1 + \xi\left(\frac{\phi_0 + \sum_{i=1}^{p}\phi_i y_{t-i} + \sum_{i=0}^{q}\theta_i \varepsilon_{t-i} - \mu}{\sigma}\right) \tag{8}$$

The log-likelihood function is

$$L(\Theta) = \log L(\phi_i, \theta_i, \sigma, \xi; y_t) = n\log(\delta + 1) - n\log\sigma - \left(\frac{1}{\xi} + 1\right)\sum_{t=1}^{n}\log z_t - (\delta + 1)\sum_{t=1}^{n} z_t^{-\frac{1}{\xi}} \tag{9}$$

where $\Theta = (\phi_i, \theta_i, \sigma, \xi; y_t)$.

Equations (10) to (18) are the first-order and second-order partial derivatives of $L(\Theta)$ with respect to $\phi_i$, $\theta_i$, $\sigma$ and $\xi$, obtained by differentiating (9) term by term. The second-order derivatives populate the Hessian matrix of the parameter space.
The Hessian matrix of the parameter space is

$$H(\Theta) = \nabla^2 L(\Theta) = \begin{pmatrix} \dfrac{\partial^2 L(\Theta)}{\partial \phi_i^2} & & & \\ & \dfrac{\partial^2 L(\Theta)}{\partial \theta_i^2} & & \\ & & \dfrac{\partial^2 L(\Theta)}{\partial \sigma^2} & \\ & & & \dfrac{\partial^2 L(\Theta)}{\partial \xi^2} \end{pmatrix} \tag{19}$$

$H(\Theta)$ is a block diagonal, square matrix of n by n dimension that depends on the order (p, q) of the SGEV-ARMA. The Fisher scoring algorithm is then

$$\Theta^{(m+1)} = \Theta^{(m)} + \left[-E\big(H(\Theta)\big)\right]^{-1} \frac{\partial L(\Theta)}{\partial \Theta}\bigg|_{\Theta = \Theta^{(m)}} = \Theta^{(m)} + \left[I_n^{(m)}\right]^{-1} S^{(m)} \tag{20}$$

where $I_n$ and $S^{(m)}$ are the Fisher information and score matrices respectively, evaluated via the Newton-Raphson iterative procedure for a chosen value of the degree of flexibility measure $\delta$, which must be strictly greater than -1.

3.1 Model identification criteria
Model selection via information criteria is defined as

$$\text{criteria}(m) = (\text{Maximized likelihood}) + f(n, m) \tag{21}$$

Tsay (2016) [18] defined the information criteria in terms of the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and the Hannan and Quinn Information Criterion (HQIC). The criteria are extended here by substituting the maximized likelihood of the SGEV-ARMA for the residual variance they require.

4. Analysis and discussion of results
The average monthly temperature series of Lagos, Nigeria, from January 1901 to December 2016 was used. Observations are reported in degrees Celsius (°C) and were recorded on a monthly basis, starting from the inception of the meteorological section under the ministry of environment. The model estimation and the exploratory data analysis used one thousand three hundred and eighty-one (1381) data points of the average monthly temperature. All the data points share the same unit of measurement, °C.

4.1 Preliminary and descriptive analysis

Fig 1: Time Plot of the Temperature Series from 1901 to 2016.
Fig 1 shows that the average monthly temperature ranges approximately from 21 to 32 °C throughout the period. The recorded temperatures maintained a near steady and constant trend around the 90th to 100th percentiles of the range. The minimum temperature was recorded within the first quarter of 1920, and the trend around the 10th to 20th percentiles was not steady or constant.

Table 1: Descriptive statistics and stationarity tests of the temperature series.

| Statistic | Estimate | P-value |
|---|---|---|
| Mean | 26.91 | |
| Min. | 21.90 | |
| Max. | 31.57 | |
| Standard deviation | 1.8075 | |
| Skewness | 3.3803 | |
| Kurtosis | -0.6504 | |
| Augmented Dickey-Fuller test | -16.666 | 0.01 |
| Phillips-Perron unit root test | -14.747 | 0.01 |
| KPSS test | 0.5154 | 0.0382 |
| Box-Pierce test | 592.44 | 0.0023 |
| Cox-Stuart trend test | 370 | 0.0086 |

Table 1 gives the descriptive measurements of the meteorological temperature. The temperatures clustered around a mean of 26.91 °C, with a minimum of 21.90 °C and a maximum of 31.57 °C. The recorded temperatures were affected by extreme values (possibly lower outliers), as indicated by the estimated skewness of 3.3803, which is strictly greater than three. The estimated kurtosis, reflected in Fig 2, corresponds to the unusual peakedness or flatness of the frequency distribution of the meteorological data, especially in the concentration of values near the mean as compared with the normal distribution. The Augmented Dickey-Fuller and Phillips-Perron unit root tests, each with a p-value of 0.01, reject the null hypothesis of a unit root at the 5% level, and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test returned a p-value of 0.0382; on this basis the series was treated as stationary.
Similarly, the Box-Pierce and Cox-Stuart trend tests for invertibility and trend effect, with p-values of 0.0023 and 0.0086 respectively, indicate the invertibility of the embedded moving average component together with a constant trend.

Fig 2: Kernel density plot of the temperature series.

The kernel density plot confirmed the unusual fat-tailed shape of the frequency distribution of the meteorological series. It resembles a merger of mesokurtic distributions.

4.2 Model estimation and analysis

Table 2: Minimum lag (order) selection and white noise test.

| | AR0 | AR1 | AR2 | AR3 |
|---|---|---|---|---|
| MA0 | 5460.5 | 4682.3 | 4030.6 | 4031.4 |
| MA1 | 4567.7 | 4322.5 | 4031.8 | 4015.6 |
| MA2 | 4177.5 | 4142.6 | 3994.4 | 3995.5 |
| MA3 | 4106.1 | 4105.8 | 3965.4 | 3961.0 |

Check Non-White-Noise

| Lag | LB | P-value |
|---|---|---|
| 1 | 594 | 0.0000 |
| 2 | 605 | 0.0000 |
| 3 | 746 | 0.0000 |
| 4 | 929 | 0.0000 |
| 5 | 989 | 0.0000 |
| 6 | 1006 | 0.0000 |

Table 2 presents the minimum (optimal) lag selection for the weather series. An alternative to reading the exponential decay of the partial autocorrelation function when choosing the order of an ARMA model is to select the order with the minimum criterion value over the Autoregressive (AR) orders crossed with the Moving Average (MA) orders. The ideal order for the weather series was (3, 3), which has the smallest value (3961.0) in the table. Furthermore, Table 2 reports the non-white-noise test for the series from lag one to lag six; the hypothesis of a "non-white noise series" was accepted at every lag from one to six, where the test was cut off.

Table 3: Coefficients of the Gaussian-ARMA (3, 3) and SGEV-ARMA (3, 3)

Gaussian-ARMA (3, 3)

| Coefficient | Estimate | SD | Z-ratio | P-value |
|---|---|---|---|---|
| $\phi_1$ | 0.36058 | 0.1386 | 9.8184 | 0.0000 |
| $\phi_2$ | -0.85391 | 0.1653 | -5.1664 | 0.0099 |
| $\phi_3$ | 0.1849 | 0.10051 | 1.8401 | 0.0078 |
| $\theta_1$ | 0.3275 | 0.1360 | 2.4065 | 0.0621 |
| $\theta_2$ | -0.0138 | 0.04247 | -0.3270 | 0.0000 |
| $\theta_3$ | 0.2685 | 0.0369 | 7.2724 | 0.0000 |

Log-likelihood = -46.98, AIC = 108, BIC = 144.4, HQIC = 129.26.
RMSE = 19.7837; MAPE = 23.147; MPE = 24.147; MAE = 20.9082; R-squared = 88.8334.

SGEV-ARMA (3, 3)

| Coefficient | Estimate | SD | Z-ratio | P-value |
|---|---|---|---|---|
| $\phi_1$ | 0.7933 | 0.0624 | 12.5027 | 0.0000 |
| $\phi_2$ | -0.5297 | 0.0089 | 15.4093 | 0.0036 |
| $\phi_3$ | 0.9442 | 0.0305 | -2.6028 | 0.0041 |
| $\theta_1$ | 0.5209 | 0.0007 | -10.002 | 0.0000 |
| $\theta_2$ | -0.2821 | 0.0620 | -8.3091 | 0.0000 |
| $\theta_3$ | 0.4893 | 0.1209 | 9.8943 | 0.0000 |

Log-likelihood = -50.34, AIC = 103.02, BIC = 141.35, HQIC = 124.50.
RMSE = 17.0088; MAPE = 21.818; MPE = 21.346; MAE = 18.2028; R-squared = 89.4605.

The fitted models are

$$y_t = 0.7933\,y_{t-1} - 0.5297\,y_{t-2} + 0.9442\,y_{t-3} + 0.5209\,\varepsilon_{t-1} - 0.2821\,\varepsilon_{t-2} + 0.4893\,\varepsilon_{t-3}, \qquad \varepsilon_t \sim \mathrm{SGEV}(26.004,\ 1.0392)$$

$$y_t = 0.3606\,y_{t-1} - 0.85391\,y_{t-2} + 0.1849\,y_{t-3} + 0.3275\,\varepsilon_{t-1} - 0.0138\,\varepsilon_{t-2} + 0.2685\,\varepsilon_{t-3}, \qquad \varepsilon_t \sim \mathrm{GAUSSIAN}(26.91,\ 1.8075)$$

Table 3 presents the Skew Generalized Extreme Value-Autoregressive Moving Average model, SGEV-ARMA (3, 3), in comparison with the conventional Gaussian-ARMA (3, 3) in terms of parameterization and indexes. First, no differencing of any form was applied to the series, since it was stationary in its raw form. The crucial step in assessing model performance is the determination of the optimal model, that is, the one with the lowest performance criteria. The AIC, BIC and HQIC for the Gaussian white noise were 108, 144.4 and 129.26 respectively at the selected order (3, 3), whereas the SGEV white noise gave lower AIC, BIC and HQIC of 103.02, 141.35 and 124.50 respectively, and so best described the series and captured a smaller stochastic error. In addition, the means of the two models differ by 0.906, indicating that the two models cluster around nearly the same mean temperature, but the variation in the SGEV-ARMA (3, 3) model (1.0392) was small compared to that of the Gaussian-ARMA (3, 3) (1.8075). Furthermore, the forecast error indexes of the two models of the weather series were estimated.
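The model comparison indexes discussed here can be computed with a short script. The sketch below uses the textbook definitions of AIC, BIC and HQIC and of the forecast error measures; the paper does not spell out its exact formulas, so these definitions, the function names, the reading of RMSE as a root mean squared error, and the sign convention (actual minus fitted) are assumptions made for illustration.

```python
import math


def information_criteria(loglik, k, n):
    """Textbook AIC, BIC and HQIC from a maximized log-likelihood.

    loglik: maximized log-likelihood, k: number of estimated parameters,
    n: number of observations.
    """
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * math.log(n)
    hqic = -2.0 * loglik + 2.0 * k * math.log(math.log(n))
    return aic, bic, hqic


def forecast_errors(actual, fitted):
    """RMSE, MAE, MPE and MAPE, with errors taken as actual minus fitted."""
    if len(actual) != len(fitted) or not actual:
        raise ValueError("actual and fitted must be non-empty and equally long")
    errors = [a - f for a, f in zip(actual, fitted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Percentage measures divide by the actual value, so actual must be non-zero.
    mpe = 100.0 / n * sum(e / a for e, a in zip(errors, actual))
    mape = 100.0 / n * sum(abs(e / a) for e, a in zip(errors, actual))
    return {"RMSE": rmse, "MAE": mae, "MPE": mpe, "MAPE": mape}


if __name__ == "__main__":
    # Hypothetical inputs, used only to show the call pattern.
    print(information_criteria(loglik=-50.0, k=4, n=100))
    print(forecast_errors([20.0, 25.0, 30.0], [18.0, 26.0, 33.0]))
```

Under these definitions a lower value of each index indicates the better model, which is the sense in which the two white noise specifications are compared.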
The Residual Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), Mean Percentage Error (MPE) and Mean Absolute Error (MAE) were estimated as 19.7837, 23.147, 24.147 and 20.9082 respectively for the Gaussian white noise ARMA (3, 3), with an R-squared of 88.8334. The SGEV-ARMA (3, 3) gave smaller forecast error indexes for the same RMSE, MAPE, MPE and MAE of 17.0088, 21.818, 21.346 and 18.2028 respectively. The forecast error of the SGEV-ARMA is lower than that of the Gaussian-ARMA, suggesting a more robust evaluation and performance when the SGEV is incorporated as the noise distribution.

5. Conclusions
In conclusion, the recorded temperature series in °C maintained a near steady and constant trend around the 90th to 100th percentiles of the range. The minimum temperature was recorded within the first quarter of 1920, and the trend around the 10th to 20th percentiles was not steady or constant. The unit root tests returned a p-value of 0.01, so the series was treated as stationary. In addition, the SGEV white noise accommodated the variation reflected in the skewness and kurtosis, and its model performance was superior to that of the conventional Gaussian error term in the ARMA model.

6. References
1. Bacchi B, Ranzi R. On the derivation of the areal reduction factor of storms. Atmospheric Research. 1996; 42:123-135.
2. Bell FC. The areal reduction factors in rainfall frequency estimation. Institute of Hydrology, Wallingford, 1976, 35.
3. Bernoulli N. De Usu Artis conjectandi in lure. Doctoral Thesis, Basel. In: Die Werke von Jakob Bernoulli. 1975; 3(1):287-326.
4. Ning B, Bloomfield P. Bayesian inference for generalized extreme value distribution with Gaussian copula dependence. Cornell University Library, 2017. arXiv:1703.00968v1 [stat.ME].
5. Chavez-Demoulin V, Davison AC. Modeling Time Series Extremes. REVSTAT-Statistical Journal. 2012; 10(1):109-133.
6. Faragó T, Katz RW. Extremes and design values in climatology. World Meteorological Organization WMO-TD. 1990; 386(14):46.
7. Faranda D, Lucarini V, Turchetti G, Vaienti S. Generalized extreme value distribution parameters as dynamical indicators of stability. International Journal of Bifurcation and Chaos. 2012; 22(11):1793-6551. doi.org/10.1142/S0218127412502768.
8. Fuller WE. Flood flows. ASCE Trans. 1914; 77:567-617.
9. Galina R, Trond R. Generalized extreme value shape parameter and its nature for extreme precipitation using long time series and the Bayesian approach. Hydrological Sciences Journal. 2017; 62(6):863-879. doi:10.1080/02626667.2016.1260134. 2150-3435.
10. Gnedenko B. Sur la distribution limite du terme maximum d'une série aléatoire. The Annals of Mathematics. 1943; 44:423-453. https://www.jstor.org/stable/1968974
11. Kotz S, Nadarajah S. Extreme Value Distributions: Theory and Applications. London: Imperial College Press, 2000.
12. Laurini F, Pauli F. Smoothing sample extremes: The mixed model approach. Computational Statistics and Data Analysis. 2009; 53:3842-3854.
13. Natural Environment Research Council (NERC). Flood studies report. NERC, London, 1975:2.
14. Padoan SA, Wand MP. Mixed model-based additive models for sample extremes. Statistical Probability Letter. 2008; 78:2850-2858.
15. Ribereau P, Masiello E, Naveau P. Skew generalized extreme value distribution: probability weighted moments estimation and application to block maxima procedure. Communication in Statistics-Theory and Methods, Taylor & Francis, 2014, 1-25. https://hal.archives-ouvertes.fr/hal-01018877
16. Ragulina G, Reitan T. Generalized extreme value shape parameter and its nature for extreme precipitation using long time series and the Bayesian approach. Hydrological Sciences Journal. 2017; 4(1):2150-3435. doi:10.1080/02626667.2016.1260134.
17. Rieder HE. Extreme Value Theory: A primer. Lamont-Doherty Earth Observatory, 2014.
18. Tsay RS. Lecture Notes of Bus 41202 (Spring) Analysis of Financial Time Series, 2016.
