Saturday, November 19, 2016

Financial products with capital protection barrier - part 7

Unit Root and Stationarity Tests

Abstract

In this section, I verify whether our log-returns time series shows one of the following trend flavours:

  • stochastic trend

  • deterministic trend

  • no trends at all

A stochastic trend can be detected by unit root tests.

A deterministic trend can be detected by trend-stationarity tests.

More specifically, trend-stationarity tests distinguish two scenarios:

  • deterministic trend, when there is a deterministic linear relationship with time

  • deterministic level, when the average of our observations is different from zero

Two common trend removal or de-trending procedures are first differencing and time-trend regression.

First differencing is appropriate for I(1) time series, and in general n-differencing for I(n) ones.

Time-trend regression is appropriate for trend stationary I(0) time series.
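As a minimal sketch of the two de-trending procedures, here is a base R example on simulated data. The series rw and ts_trend, the seed, the slope and the sample size are illustrative assumptions and are not taken from this analysis.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
n  <- 500
tt <- seq_len(n)                   # time index

# I(1) series: a random walk, i.e. cumulated white noise (stochastic trend)
rw <- cumsum(rnorm(n))

# Trend-stationary I(0) series: linear function of time plus stationary noise
ts_trend <- 0.05 * tt + rnorm(n)

# First differencing removes the stochastic trend of the I(1) series
rw_diff <- diff(rw)

# Time-trend regression removes the deterministic trend of the I(0) series
ts_detrended <- residuals(lm(ts_trend ~ tt))

c(length(rw_diff), length(ts_detrended))
```

Differencing costs one observation, while the regression residuals keep the original length and are uncorrelated with the time index by construction.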

Unit root tests can be used to determine if trending data should be first differenced or regressed on deterministic functions of time to render the data stationary.

Unit root tests take as null hypothesis that the time series is I(1), while stationarity tests take as null hypothesis that it is I(0), possibly around a deterministic trend.

Analysis

For this purpose, I am going to take advantage of:

  • the Augmented Dickey-Fuller test to check for I(1), in other words, a stochastic trend

  • the KPSS test to check for I(0), in other words, trend stationarity

Below, I load the saved environment and determine the number of lags based on Schwert's suggestion, ref. [1] eq. 4.5.

load(file="structured-product-3.RData")
invisible(lapply(ts.package, function(x) {
  suppressPackageStartupMessages(library(x, character.only=TRUE)) }))

# Schwert suggestion ref. [1]  eq. 4.5
(urdftest_lag = floor(12* (length(GSPC_log_returns)/100)^0.25))
## [1] 19
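
To make the lag rule easier to reuse, it can be wrapped in a small helper. The function name schwert_lag is my own, and the sample sizes below are hypothetical: 640 is chosen only because it reproduces the 19 lags reported above.

```r
# Schwert's rule of thumb: maximum lag = floor(12 * (n / 100)^(1/4))
schwert_lag <- function(n) floor(12 * (n / 100)^0.25)

schwert_lag(100)    # 12, since (100 / 100)^0.25 = 1
schwert_lag(640)    # 19, since 12 * 6.4^0.25 is about 19.09
schwert_lag(1000)   # 21, since 12 * 10^0.25 is about 21.34
```

The lag count grows slowly with the sample size: a tenfold increase from 100 to 1000 observations adds only nine lags.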

I run all types of the Augmented Dickey-Fuller test: constant, no constant, and constant plus trend. The null hypothesis is the presence of a unit root, in other words an I(1) time series.

  • type = c

The hypotheses and test statistics of the Dickey-Fuller unit root test of type "constant" are the following (see Ref. 4).

\[ \begin{cases} \text{Model: } \Delta y_{t} = a_{0} + \gamma y_{t-1} + \epsilon_{t} \\ \\ H_{0}: \gamma = 0 \qquad \text{test statistic: } \tau_{2} \\ \\ H_{0}: a_{0} = \gamma = 0 \qquad \text{test statistic: } \phi_{1} \end{cases} \]
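
To show where the tau statistic comes from, here is a minimal base R sketch of the same regression, augmented with lagged differences, run by hand on simulated white noise. The data, the seed and the choice of p = 2 lags are illustrative assumptions; the variable names only mimic those in the test output.

```r
set.seed(1)                          # arbitrary seed
y <- rnorm(300)                      # simulated stationary series (white noise)
p <- 2                               # number of lagged differences (illustrative)

dy <- diff(y)                        # Delta y_t
m  <- embed(dy, p + 1)               # columns: Delta y_t, Delta y_{t-1}, ..., Delta y_{t-p}
z_diff     <- m[, 1]                 # dependent variable
z_diff_lag <- m[, -1, drop = FALSE]  # lagged differences
z_lag_1    <- y[(p + 1):(length(y) - 1)]   # y_{t-1}, aligned with z_diff

# Intercept a0 is included by default; gamma is the coefficient on z_lag_1
fit  <- lm(z_diff ~ z_lag_1 + z_diff_lag)
tau2 <- coef(summary(fit))["z_lag_1", "t value"]
tau2
```

The statistic is the ordinary t value of the lagged level, but it must be compared with the Dickey-Fuller critical values rather than with the Student t ones.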

urdfTest(GSPC_log_returns, lags = urdftest_lag, type = c("c"), doplot = TRUE)

## 
## Title:
##  Augmented Dickey-Fuller Unit Root Test
## 
## Test Results:
##   
##   Test regression drift 
##   
##   Call:
##   lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
##   
##   Residuals:
##         Min        1Q    Median        3Q       Max 
##   -0.050818 -0.002715 -0.000186  0.002682  0.031036 
##   
##   Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
##   (Intercept)   0.0003677  0.0002395   1.535    0.125    
##   z.lag.1      -1.1965334  0.1987630  -6.020 3.04e-09 ***
##   z.diff.lag1   0.0344884  0.1929247   0.179    0.858    
##   z.diff.lag2   0.0503757  0.1868002   0.270    0.788    
##   z.diff.lag3   0.0098653  0.1808014   0.055    0.957    
##   z.diff.lag4  -0.0365646  0.1747815  -0.209    0.834    
##   z.diff.lag5   0.0005171  0.1683473   0.003    0.998    
##   z.diff.lag6   0.0078356  0.1620555   0.048    0.961    
##   z.diff.lag7   0.0073623  0.1555110   0.047    0.962    
##   z.diff.lag8   0.0475661  0.1489661   0.319    0.750    
##   z.diff.lag9   0.0837395  0.1428675   0.586    0.558    
##   z.diff.lag10  0.0864162  0.1373661   0.629    0.530    
##   z.diff.lag11  0.0255999  0.1315184   0.195    0.846    
##   z.diff.lag12 -0.0070726  0.1247852  -0.057    0.955    
##   z.diff.lag13  0.0087764  0.1169121   0.075    0.940    
##   z.diff.lag14  0.0445574  0.1084233   0.411    0.681    
##   z.diff.lag15  0.0393628  0.0988245   0.398    0.691    
##   z.diff.lag16  0.0576455  0.0870806   0.662    0.508    
##   z.diff.lag17  0.0465397  0.0747122   0.623    0.534    
##   z.diff.lag18  0.0314942  0.0602086   0.523    0.601    
##   z.diff.lag19  0.0373990  0.0395026   0.947    0.344    
##   ---
##   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   
##   Residual standard error: 0.005687 on 599 degrees of freedom
##   Multiple R-squared:  0.5904,   Adjusted R-squared:  0.5767 
##   F-statistic: 43.16 on 20 and 599 DF,  p-value: < 2.2e-16
##   
##   
##   Value of test-statistic is: -6.0199 18.157 
##   
##   Critical values for test statistics: 
##         1pct  5pct 10pct
##   tau2 -3.43 -2.86 -2.57
##   phi1  6.43  4.59  3.78
## 
## Description:
##  Sat Aug 19 20:00:14 2017 by user: egargio

  • type = nc

The hypothesis and test statistic of the Dickey-Fuller unit root test of type "no constant" are the following (see Ref. 4).

\[ \begin{cases} \text{Model: } \Delta y_{t} = \gamma y_{t-1} + \epsilon_{t} \\ \\ H_{0}: \gamma = 0 \qquad \text{test statistic: } \tau_{1} \end{cases} \]

urdfTest(GSPC_log_returns, lags = urdftest_lag, type = c("nc"), doplot = TRUE)

## 
## Title:
##  Augmented Dickey-Fuller Unit Root Test
## 
## Test Results:
##   
##   Test regression none 
##   
##   Call:
##   lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
##   
##   Residuals:
##         Min        1Q    Median        3Q       Max 
##   -0.050520 -0.002375  0.000117  0.003018  0.031214 
##   
##   Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
##   z.lag.1      -1.105455   0.189920  -5.821 9.55e-09 ***
##   z.diff.lag1  -0.052854   0.184556  -0.286    0.775    
##   z.diff.lag2  -0.032492   0.179036  -0.181    0.856    
##   z.diff.lag3  -0.068722   0.173602  -0.396    0.692    
##   z.diff.lag4  -0.110791   0.168153  -0.659    0.510    
##   z.diff.lag5  -0.068955   0.162336  -0.425    0.671    
##   z.diff.lag6  -0.057010   0.156633  -0.364    0.716    
##   z.diff.lag7  -0.052847   0.150656  -0.351    0.726    
##   z.diff.lag8  -0.007925   0.144679  -0.055    0.956    
##   z.diff.lag9   0.032617   0.139091   0.235    0.815    
##   z.diff.lag10  0.039235   0.134037   0.293    0.770    
##   z.diff.lag11 -0.017654   0.128611  -0.137    0.891    
##   z.diff.lag12 -0.046034   0.122316  -0.376    0.707    
##   z.diff.lag13 -0.025360   0.114908  -0.221    0.825    
##   z.diff.lag14  0.015262   0.106852   0.143    0.886    
##   z.diff.lag15  0.014946   0.097647   0.153    0.878    
##   z.diff.lag16  0.038358   0.086267   0.445    0.657    
##   z.diff.lag17  0.032093   0.074201   0.433    0.666    
##   z.diff.lag18  0.021905   0.059951   0.365    0.715    
##   z.diff.lag19  0.032963   0.039441   0.836    0.404    
##   ---
##   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   
##   Residual standard error: 0.005694 on 600 degrees of freedom
##   Multiple R-squared:  0.5887,   Adjusted R-squared:  0.575 
##   F-statistic: 42.95 on 20 and 600 DF,  p-value: < 2.2e-16
##   
##   
##   Value of test-statistic is: -5.8206 
##   
##   Critical values for test statistics: 
##         1pct  5pct 10pct
##   tau1 -2.58 -1.95 -1.62
## 
## Description:
##  Sat Aug 19 20:00:14 2017 by user: egargio

  • type = ct

The hypotheses and test statistics of the Dickey-Fuller unit root test of type "constant with trend" are the following (see Ref. 4).

\[ \begin{cases} \text{Model: } \Delta y_{t} = a_{0} + \gamma y_{t-1} + a_{2}t + \epsilon_{t} \\ \\ H_{0}: \gamma = 0 \qquad \text{test statistic: } \tau_{3} \\ \\ H_{0}: \gamma = a_{2} = 0 \qquad \text{test statistic: } \phi_{3} \\ \\ H_{0}: a_{0} = \gamma = a_{2} = 0 \qquad \text{test statistic: } \phi_{2} \end{cases} \]

urdfTest(GSPC_log_returns, lags = urdftest_lag, type = c("ct"), doplot = TRUE)

## 
## Title:
##  Augmented Dickey-Fuller Unit Root Test
## 
## Test Results:
##   
##   Test regression trend 
##   
##   Call:
##   lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
##   
##   Residuals:
##         Min        1Q    Median        3Q       Max 
##   -0.050724 -0.002759 -0.000206  0.002569  0.031211 
##   
##   Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
##   (Intercept)   6.425e-04  5.027e-04   1.278    0.202    
##   z.lag.1      -1.219e+00  2.021e-01  -6.032 2.84e-09 ***
##   tt           -8.083e-07  1.300e-06  -0.622    0.534    
##   z.diff.lag1   5.623e-02  1.962e-01   0.287    0.774    
##   z.diff.lag2   7.144e-02  1.899e-01   0.376    0.707    
##   z.diff.lag3   3.023e-02  1.838e-01   0.164    0.869    
##   z.diff.lag4  -1.689e-02  1.777e-01  -0.095    0.924    
##   z.diff.lag5   1.949e-02  1.712e-01   0.114    0.909    
##   z.diff.lag6   2.609e-02  1.648e-01   0.158    0.874    
##   z.diff.lag7   2.470e-02  1.581e-01   0.156    0.876    
##   z.diff.lag8   6.383e-02  1.513e-01   0.422    0.673    
##   z.diff.lag9   9.888e-02  1.450e-01   0.682    0.496    
##   z.diff.lag10  1.005e-01  1.393e-01   0.721    0.471    
##   z.diff.lag11  3.857e-02  1.332e-01   0.290    0.772    
##   z.diff.lag12  4.685e-03  1.263e-01   0.037    0.970    
##   z.diff.lag13  1.920e-02  1.182e-01   0.162    0.871    
##   z.diff.lag14  5.354e-02  1.094e-01   0.489    0.625    
##   z.diff.lag15  4.687e-02  9.961e-02   0.470    0.638    
##   z.diff.lag16  6.350e-02  8.763e-02   0.725    0.469    
##   z.diff.lag17  5.087e-02  7.507e-02   0.678    0.498    
##   z.diff.lag18  3.424e-02  6.040e-02   0.567    0.571    
##   z.diff.lag19  3.859e-02  3.957e-02   0.975    0.330    
##   ---
##   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   
##   Residual standard error: 0.00569 on 598 degrees of freedom
##   Multiple R-squared:  0.5906,   Adjusted R-squared:  0.5762 
##   F-statistic: 41.08 on 21 and 598 DF,  p-value: < 2.2e-16
##   
##   
##   Value of test-statistic is: -6.0318 12.2211 18.2943 
##   
##   Critical values for test statistics: 
##         1pct  5pct 10pct
##   tau3 -3.96 -3.41 -3.12
##   phi2  6.09  4.68  4.03
##   phi3  8.27  6.25  5.34
## 
## Description:
##  Sat Aug 19 20:00:14 2017 by user: egargio

Comparing the test statistics with the critical values at the 5% significance level, we reject the unit root null hypothesis for every type of Dickey-Fuller test: each tau statistic (-6.02, -5.82, -6.03) lies below its critical value, and each phi statistic lies above its critical value. The same holds at the 1% and 10% levels, so we reject at every reported significance level.
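
The comparison can be laid out as a small table. The numbers are copied from the three outputs above; the data frame adf and its column names are my own, introduced only for this illustration.

```r
# Statistics and 5% critical values copied from the outputs above
adf <- data.frame(
  type      = c("c", "c", "nc", "ct", "ct", "ct"),
  statistic = c("tau2", "phi1", "tau1", "tau3", "phi2", "phi3"),
  value     = c(-6.0199, 18.157, -5.8206, -6.0318, 12.2211, 18.2943),
  crit_5pct = c(-2.86, 4.59, -1.95, -3.41, 4.68, 6.25)
)

# tau statistics are left-tailed: reject when value < critical value.
# phi statistics are F-type, right-tailed: reject when value > critical value.
is_tau <- startsWith(adf$statistic, "tau")
adf$reject_h0 <- ifelse(is_tau,
                        adf$value < adf$crit_5pct,
                        adf$value > adf$crit_5pct)
adf
```

Every row ends with reject_h0 equal to TRUE, which is the rejection stated above.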

The KPSS test (Kwiatkowski, Phillips, Schmidt and Shin) takes as null hypothesis that the time series under analysis is I(0).

I am going to test both available types, passing the type parameter equal to mu for a constant level and equal to tau for a time trend.

  • type = mu

urkpssTest(GSPC_log_returns, type = c("mu"), lags = c("long"),  doplot = TRUE)

## 
## Title:
##  KPSS Unit Root Test
## 
## Test Results:
##   
##   Test is of type: mu with 19 lags. 
##   
##   Value of test-statistic is: 0.1804 
##   
##   Critical value for a significance level of: 
##                   10pct  5pct 2.5pct  1pct
##   critical values 0.347 0.463  0.574 0.739
## 
## Description:
##  Sat Aug 19 20:00:14 2017 by user: egargio

  • type = tau

urkpssTest(GSPC_log_returns, type = c("tau"), lags = c("long"),  doplot = TRUE)

## 
## Title:
##  KPSS Unit Root Test
## 
## Test Results:
##   
##   Test is of type: tau with 19 lags. 
##   
##   Value of test-statistic is: 0.0719 
##   
##   Critical value for a significance level of: 
##                   10pct  5pct 2.5pct  1pct
##   critical values 0.119 0.146  0.176 0.216
## 
## Description:
##  Sat Aug 19 20:00:15 2017 by user: egargio

The test statistics of both KPSS tests are below their 10% critical values (0.1804 < 0.347 for mu, 0.0719 < 0.119 for tau), so we cannot reject the I(0) null hypothesis at any reported significance level; hence our log-returns time series is either level stationary or trend stationary.
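
To give an idea of what the KPSS statistic measures, here is a minimal base R sketch of the level ("mu") version: scaled partial sums of the demeaned series, divided by a Bartlett-weighted long-run variance. The function name kpss_mu_stat, the simulated series, the seed and the choice of 8 lags are my own assumptions, and this is not claimed to be the exact code behind urkpssTest().

```r
# KPSS level ("mu") statistic with a Bartlett-weighted long-run variance
kpss_mu_stat <- function(y, lags) {
  n <- length(y)
  e <- y - mean(y)                  # residuals from a regression on a constant
  s <- cumsum(e)                    # partial sums of the residuals
  num <- sum(s^2) / n^2
  lrv <- sum(e^2) / n               # lag-0 variance
  for (j in seq_len(lags)) {
    w   <- 1 - j / (lags + 1)       # Bartlett weight
    lrv <- lrv + 2 * w * sum(e[(j + 1):n] * e[1:(n - j)]) / n
  }
  num / lrv
}

set.seed(7)                          # arbitrary seed
wn <- rnorm(500)                     # stationary series
rw <- cumsum(rnorm(500))             # I(1) series
c(white_noise = kpss_mu_stat(wn, 8), random_walk = kpss_mu_stat(rw, 8))
```

For a stationary series the partial sums stay small and so does the statistic, while for an I(1) series they wander and the statistic generally grows well beyond the critical values.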

As an additional check, I fit the log-returns time series against a deterministic trend, taking advantage of the auto.arima() function made available by the forecast package. This is the same approach as in ref. 3, example 9-2.

suppressPackageStartupMessages(library(forecast))
GSPC_log_ret_det_trend <- auto.arima(GSPC_log_returns, d=0, xreg=1:length(GSPC_log_returns), seasonal=FALSE)
summary(GSPC_log_ret_det_trend)
## Series: GSPC_log_returns 
## Regression with ARIMA(2,0,1) errors 
## 
## Coefficients:
##           ar1      ar2     ma1  intercept  xreg
##       -1.1183  -0.1448  0.9714      8e-04     0
## s.e.   0.0714   0.0398  0.0603      4e-04     0
## 
## sigma^2 estimated as 3.354e-05:  log likelihood=2391.28
## AIC=-4770.56   AICc=-4770.43   BIC=-4743.79
## 
## Training set error measures:
##                        ME        RMSE         MAE MPE MAPE      MASE
## Training set -4.27733e-06 0.005768567 0.003926029 NaN  Inf 0.6413649
##                     ACF1
## Training set 0.006948835

In the summary above, the coefficient estimated for the time index 1:length(GSPC_log_returns) (xreg) is zero at the printed precision, hence no deterministic time trend shows up in our time series.
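
Since a coefficient printed as 0 only says that it is small at the printed precision, a ratio of estimate to standard error is more informative. Below is a minimal sketch with the base R arima() function on simulated data; the series x, the seed, the AR(1) order and the regressor name trend are assumptions for illustration and do not reproduce the auto.arima() fit above.

```r
set.seed(3)                           # arbitrary seed
x <- rnorm(400)                       # simulated trend-free series
trend <- cbind(trend = seq_along(x))  # time index as a named regressor

# Regression on the time index with AR(1) errors
fit <- arima(x, order = c(1, 0, 0), xreg = trend)

est <- unname(coef(fit)["trend"])
se  <- unname(sqrt(diag(vcov(fit)))["trend"])
z   <- est / se                       # approximate z-ratio of the trend coefficient
signif(c(estimate = est, std_error = se, z_ratio = z), 3)
```

An absolute z-ratio well below 2 would point to no significant deterministic time trend, whatever the number of digits printed for the coefficient.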

Based on the same reference and example, I now check for the presence of a stochastic trend.

GSPC_log_ret_sto_trend <- auto.arima(GSPC_log_returns, d = 1, seasonal = FALSE)
summary(GSPC_log_ret_sto_trend)
## Series: GSPC_log_returns 
## ARIMA(1,1,0) with drift         
## 
## Coefficients:
##           ar1  drift
##       -0.5825  0e+00
## s.e.   0.0323  2e-04
## 
## sigma^2 estimated as 5.129e-05:  log likelihood=2250.12
## AIC=-4494.24   AICc=-4494.2   BIC=-4480.86
## 
## Training set error measures:
##                        ME        RMSE        MAE MPE MAPE      MASE
## Training set 1.229948e-05 0.007144885 0.00510128 NaN  Inf 0.8333566
##                    ACF1
## Training set -0.1877412

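The drift estimate reported by the summary is zero at the printed precision (standard error 2e-04), so the differenced model shows no drift. Together with the unit root rejections above, this gives no evidence of a stochastic trend in our time series.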

Conclusions

We reject the null hypothesis of a unit root in our log-returns time series.

There is no deterministic trend, as shown by the regression on time.

Ultimately, constant-level stationarity has been confirmed.
