Sunday, October 30, 2016

Financial products with capital protection barrier - part 2

Scenario

Abstract

Suppose we consider a structured product whose underlying is the Goldman Sachs Group Inc Pc ADR (ISIN: US38144X6094) share price. I download from Yahoo the last three years of daily price history. I base my analysis on the adjusted close price log returns since, under the hypothesis of log-normally distributed prices, this transformation removes the trend from the original time series, yielding the trend stationarity property.

In any case, stationarity will be explicitly verified in a dedicated exploratory analysis post.

Analysis

suppressPackageStartupMessages(library(quantmod))
suppressPackageStartupMessages(library(xts))
suppressPackageStartupMessages(library(timeSeries))

getSymbols("GSPC", src = "yahoo", from = as.Date("2013-07-01"), to = as.Date("2016-06-30"))
## [1] "GSPC"
# adjusted close price
GSPC_AdjClose <- Ad(GSPC)
# continuously compounded (log) returns, the default of timeSeries::returns
r <- xts(returns(GSPC_AdjClose), order.by = index(GSPC))
# window out the first observation, whose return is not defined
GSPC_log_returns <- window(r, start = as.Date("2013-07-02"), end = as.Date("2016-06-30"))
summary(GSPC_log_returns)
##      Index            GSPC.Adjusted       
##  Min.   :2013-07-02   Min.   :-5.164e-02  
##  1st Qu.:2014-04-01   1st Qu.:-2.908e-03  
##  Median :2014-12-30   Median : 0.000e+00  
##  Mean   :2014-12-31   Mean   : 5.001e-05  
##  3rd Qu.:2015-09-30   3rd Qu.: 3.354e-03  
##  Max.   :2016-06-30   Max.   : 3.315e-02

I will generally refer to the underlying adjusted close price, or to its log return values, with the term time series. Our time series comprises 756 daily observations. Plots of both the adjusted close price and its log returns are shown below.

par(mfrow=c(1,2))
plot(GSPC)
plot(GSPC_log_returns, type='l', main="GSPC log returns")

par(mfrow=c(1,1))

The exploratory analysis that will follow in the next posts comprises the verifications listed below; a short code sketch for each one follows the list.

  1. Structural changes: a structural change implies a shift in mean, in trend or, more generally, in the structural nature of the stochastic process. Structural breaks can seriously complicate the generation of predictions. Specifically, I will take advantage of the CUSUM, Chow and F-statistics tests made available by the strucchange package. I run this kind of test at the beginning in order to identify the most recent data subset within which no structural change occurs. That way, dealing with a time series of homogeneous structure (in terms of mean values, for example), I can compute metrics that are valid overall.

  2. Outliers: it is important to separate values falling within the expected distribution from anomalous ones, known as outliers. Outlier treatment may involve, for example, deletion or replacement. In this case, however, I will try to determine some basic properties driving the frequency of outlier occurrence, so that in principle I could simulate outlier generation as a process separate from the rest.

  3. Trend stationarity: if trend stationarity is confirmed, the stochastic process tends to revert to a constant mean, which guarantees that the properties we observe in the process do not change with time. The Augmented Dickey-Fuller test will be used for this purpose to check for the presence of unit roots.

  4. Homoscedasticity: when this property holds, the variance does not change with time, and the standard deviation we compute can therefore be used as a constant parameter for simulation purposes. The McLeod-Li test, as made available by the TSA package, will be used to check for heteroscedasticity. If heteroscedasticity is confirmed, a GARCH fit of the process will be performed. If not, I will infer the structure of the stochastic process by inspecting the ACF and PACF.

  5. Auto and partial correlations: I will take advantage of total and partial autocorrelation plots, combined with the Ljung-Box test (LjungBoxTest), to check for significance. If our observations do not show significant auto or partial correlation, each sample is independent of the previous ones. In that case, further analysis shall be carried out to verify the white noise hypothesis. If that hypothesis is rejected, an ARMA representation of the process is needed.

  6. Normality: to verify that the sample distribution is normal. This is a necessary condition for the white Gaussian noise hypothesis. Further, it justifies the use of normal random number generation for simulation. I will take advantage of quantile-quantile plots and of the Jarque-Bera and Lilliefors tests. Skewness and kurtosis will be computed as well.

  7. Fractional ARIMA: to verify whether we are dealing with a long-memory process, which matters when determining the best way to simulate our observations. As with the normality tests, this kind of test can shed light on the shape of our distribution, in particular revealing whether fat tails are present.
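
Below are minimal code sketches for each of the checks above, run against the GSPC_log_returns series computed earlier. They are illustrative sketches under stated assumptions, not necessarily the exact calls that the upcoming posts will use.

For structural changes, a sketch based on a constant-mean model (an illustrative assumption of mine; the actual posts may regress on a different model), using the strucchange package:

suppressPackageStartupMessages(library(strucchange))

y <- as.numeric(GSPC_log_returns)

# OLS-based CUSUM test: the empirical fluctuation process should stay
# within its boundaries if no structural change occurs
cusum_fit <- efp(y ~ 1, type = "OLS-CUSUM")
plot(cusum_fit)
sctest(cusum_fit)

# Chow-type F statistics over all admissible break dates
fs <- Fstats(y ~ 1)
plot(fs)
sctest(fs, type = "supF")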
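
For outliers, a minimal screen based on base R's boxplot.stats (Tukey fences at 1.5 times the interquartile range); this is a simple stand-in of mine, while the posts rely on the extremevalues package:

y <- as.numeric(GSPC_log_returns)
bs <- boxplot.stats(y)
# positions of returns falling beyond the whiskers
idx <- which(y %in% bs$out)
# rough frequency of outlier occurrence
length(idx) / length(y)
head(data.frame(date = index(GSPC_log_returns)[idx], log_return = y[idx]))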
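
For trend stationarity, the Augmented Dickey-Fuller test from the tseries package; its null hypothesis is the presence of a unit root, so a small p-value supports stationarity of the log returns:

suppressPackageStartupMessages(library(tseries))
adf.test(as.numeric(GSPC_log_returns))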
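
For homoscedasticity, the McLeod-Li test from the TSA package, which plots Ljung-Box p-values computed on the squared series; the GARCH(1,1) fit via tseries::garch is an illustrative choice of order on my part:

suppressPackageStartupMessages(library(TSA))
suppressPackageStartupMessages(library(tseries))

y <- as.numeric(GSPC_log_returns)

# p-values below 0.05 point at conditional heteroscedasticity (ARCH effects)
McLeod.Li.test(y = y, gof.lag = 20)

# if heteroscedasticity is confirmed, fit a GARCH(1,1) as one natural choice
garch_fit <- garch(y, order = c(1, 1), trace = FALSE)
summary(garch_fit)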
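
For auto and partial correlations, ACF/PACF plots plus a Ljung-Box test; here I use base R's Box.test as a stand-in for FitAR::LjungBoxTest, with lag 20 as an arbitrary illustrative choice:

y <- as.numeric(GSPC_log_returns)

par(mfrow = c(1, 2))
acf(y, main = "ACF of log returns")    # total autocorrelation
pacf(y, main = "PACF of log returns")  # partial autocorrelation
par(mfrow = c(1, 1))

# H0: no autocorrelation up to the chosen lag
Box.test(y, lag = 20, type = "Ljung-Box")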
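
For normality, a Q-Q plot together with the Jarque-Bera and Lilliefors tests, plus skewness and kurtosis from the moments package (whose kurtosis convention is about 3 for a normal distribution):

suppressPackageStartupMessages(library(tseries))   # jarque.bera.test
suppressPackageStartupMessages(library(nortest))   # lillie.test
suppressPackageStartupMessages(library(moments))   # skewness, kurtosis

y <- as.numeric(GSPC_log_returns)

qqnorm(y); qqline(y)   # departures from the line flag non-normality
jarque.bera.test(y)    # H0: normality, based on skewness and kurtosis
lillie.test(y)         # Lilliefors (Kolmogorov-Smirnov) normality test
skewness(y)            # ~0 for a symmetric distribution
kurtosis(y)            # ~3 for a normal distribution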
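
For long memory, an estimate of the fractional differencing parameter d with the fracdiff package; nar = nma = 0 is a simplifying assumption of mine (a pure fractionally differenced noise model):

suppressPackageStartupMessages(library(fracdiff))
fd_fit <- fracdiff(as.numeric(GSPC_log_returns), nar = 0, nma = 0)
fd_fit$d   # d significantly above 0 indicates a long-memory process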

To clarify which tests and related packages will be used, I show a summary table below.

# map each test to the package providing it
test_name <- c("adf.test", "McLeod.Li.test", "acf", "pacf", "qqnorm",
               "qqline", "shapiro.test", "jarque.bera.test", "lillie.test",
               "outlier", "sctest", "LjungBoxTest", "kurtosis", "skewness",
               "fitdist", "urdfTest, urkpssTest", "fracdiff", "bptest", 
               "e.divisive")
package_name <- c("tseries", "TSA", "stats", "stats", "stats", "stats", "stats",
                  "tseries", "nortest", "extremevalues", "strucchange", "FitAR",
                  "moments", "moments", "fitdistrplus", "fUnitRoots", "fracdiff", "lmtest",
                  "ecp")
test_package <- data.frame(test_name, package_name)

# packages to be loaded: drop "stats" (always attached) and add a few utilities
ts.package <- unique(package_name)
ts.package <- setdiff(ts.package, "stats")
ts.package <- c("quantmod", "knitr", "rootSolve", "timeSeries", "EnvStats",
                ts.package)
# load all packages quietly
invisible(lapply(ts.package, function(x) {
  suppressPackageStartupMessages(library(x, character.only=TRUE)) }))

kable(test_package)
test_name              package_name
---------------------  --------------
adf.test               tseries
McLeod.Li.test         TSA
acf                    stats
pacf                   stats
qqnorm                 stats
qqline                 stats
shapiro.test           stats
jarque.bera.test       tseries
lillie.test            nortest
outlier                extremevalues
sctest                 strucchange
LjungBoxTest           FitAR
kurtosis               moments
skewness               moments
fitdist                fitdistrplus
urdfTest, urkpssTest   fUnitRoots
fracdiff               fracdiff
bptest                 lmtest
e.divisive             ecp

ts.package
##  [1] "quantmod"      "knitr"         "rootSolve"     "timeSeries"   
##  [5] "EnvStats"      "tseries"       "TSA"           "nortest"      
##  [9] "extremevalues" "strucchange"   "FitAR"         "moments"      
## [13] "fitdistrplus"  "fUnitRoots"    "fracdiff"      "lmtest"       
## [17] "ecp"

I save the package list and the GSPC adjusted close prices and log returns, to have them available for the next posts.

save(ts.package, GSPC_log_returns, GSPC_AdjClose, file="structured-product-1.RData")

Conclusions

I have identified a financial product to be used as the underlying. The next task is to determine its basic properties through a detailed exploratory analysis, in order to ease its simulation. Evaluating the simulation results will then allow us to determine gain/loss percentages and barrier breach events.

The next post will show the first step of the exploratory analysis.