Scenario
Abstract
Suppose we consider a structured product whose underlying is the share price of Goldman Sachs Group Inc Pc ADR (ISIN: US38144X6094). I download the last three years of daily price history from Yahoo and base my analysis on the log returns of the adjusted close price. Under the hypothesis of log-normally distributed prices, this transformation removes the trend in the original time series and provides the trend stationarity property.
Stationarity itself will be verified in a dedicated exploratory analysis post.
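As a minimal sketch of why log returns are a convenient starting point, the following base R snippet simulates log-normally distributed prices and shows that differencing the log prices recovers the underlying shocks. The drift, volatility, seed and starting price are hypothetical values chosen only for illustration; they are not estimated from the GSPC data.

```r
# Simulate a geometric random walk: log prices follow a random walk with drift.
set.seed(42)                        # arbitrary seed, for reproducibility only
n     <- 756                        # same length as a three-year daily sample
mu    <- 0.0005                     # hypothetical daily drift of the log price
sigma <- 0.01                       # hypothetical daily volatility
eps   <- rnorm(n, mean = mu, sd = sigma)
price <- 100 * exp(cumsum(eps))     # log-normally distributed prices

# Log return: r_t = log(P_t) - log(P_{t-1}) = log(P_t / P_{t-1})
log_ret   <- diff(log(price))
ratio_ret <- log(price[-1] / price[-n])

# Both forms of the definition agree ...
same_definition <- isTRUE(all.equal(log_ret, ratio_ret))

# ... and the log returns are exactly the simulated shocks,
# so the trend in the price level no longer appears in them.
recovers_shocks <- isTRUE(all.equal(log_ret, eps[-1]))
```

The price level drifts over time, while `log_ret` has the same mean and variance at every date by construction. Whether the real series behaves this way is what the stationarity tests are meant to check.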
Analysis
suppressPackageStartupMessages(library(quantmod))
suppressPackageStartupMessages(library(xts))
suppressPackageStartupMessages(library(timeSeries))
getSymbols("GSPC", src = "yahoo", from = as.Date("2013-07-01"), to = as.Date("2016-06-30"))
## [1] "GSPC"
GSPC_AdjClose <- Ad(GSPC)
r <- xts(returns(GSPC_AdjClose), order.by=index(GSPC))
GSPC_log_returns <- window(r, start=as.Date("2013-07-02"), end=as.Date("2016-06-30"))
summary(GSPC_log_returns)
## Index GSPC.Adjusted
## Min. :2013-07-02 Min. :-5.164e-02
## 1st Qu.:2014-04-01 1st Qu.:-2.908e-03
## Median :2014-12-30 Median : 0.000e+00
## Mean :2014-12-31 Mean : 5.001e-05
## 3rd Qu.:2015-09-30 3rd Qu.: 3.354e-03
## Max. :2016-06-30 Max. : 3.315e-02
I will generally use the term time series for either the underlying's adjusted close price or its log returns. Our time series consists of 756 daily observations. Plots of the adjusted close price and of its log returns are shown below.
par(mfrow=c(1,2))
plot(GSPC)
plot(GSPC_log_returns, type='l', main="GSPC log returns")
par(mfrow=c(1,1))
The exploratory analysis that will follow in the next posts comprises the following checks:
Structural changes: a structural change implies a change in mean or trend or, more generally, in the structural nature of the stochastic process. Structural breaks can seriously complicate the generation of predictions. Specifically, I will use the CUSUM, Chow and F-statistics tests provided by the strucchange package. I run these tests first in order to identify the most recent data subset in which no structural change occurs. Working with a time series whose structure is homogeneous, in terms of mean values for example, lets me compute metrics that are valid over the whole subset.
Outliers: it is important to distinguish values that fall within the expected distribution from anomalous ones, known as outliers. Outlier treatment may consist of deletion or replacement, for example. In this case, however, I will try to determine some basic properties driving the frequency of outlier occurrence, so that in principle I could simulate outlier generation as a process separate from the rest.
Trend stationarity: if trend stationarity is confirmed, the stochastic process tends to return to a constant mean (around a deterministic trend, if one is present). In that case the properties we observe in the stochastic process do not change with time. The Dickey-Fuller test will be used for this purpose, to check for the presence of unit roots.
Homoscedasticity: when this property holds, the variance does not change with time, so the standard deviation we compute can be used as a constant parameter for simulation purposes. The McLeod-Li test, provided by the TSA package, will be used to check for heteroscedasticity. If heteroscedasticity is confirmed, a GARCH fit of the process will be performed. If not, I will infer the structure of the stochastic process from the ACF and PACF.
Auto and partial correlations: I will use autocorrelation and partial autocorrelation plots combined with LjungBoxTest to check for significance. If our observations show no significant autocorrelation or partial autocorrelation, there is no evidence that a sample depends linearly on the previous ones. In that case, further analysis will be carried out to verify the white noise hypothesis. If that hypothesis is rejected, an ARMA representation of the process is needed.
Normality: to verify that the values are normally distributed. That is a necessary condition for the white Gaussian noise hypothesis. Further, it justifies generating normally distributed values for simulation. I will use quantile-quantile plots and the Jarque-Bera and Lilliefors tests. Skewness and kurtosis will also be computed.
Fractional ARIMA: to verify whether we are dealing with a long-memory process. That is important in order to determine the best way to simulate our observations. Like the normality tests, this kind of test can shed light on the structure of our distributions, in particular revealing whether fat tails are present.
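Several of the checks listed above can be sketched with base R alone, before bringing in the dedicated packages. The snippet below is only an illustration on simulated white noise, not on the GSPC series. It uses `Box.test` from stats for the Ljung-Box test, applies the same test to squared values (the idea behind the McLeod-Li test), and computes the Jarque-Bera and Dickey-Fuller statistics by hand. The helper names `jarque_bera` and `df_tstat`, the seed, the sample size and the lag of 10 are my own assumptions; the actual posts are meant to rely on the package functions listed in the table that follows.

```r
set.seed(1)                              # arbitrary seed
x <- rnorm(1000, mean = 0, sd = 0.01)    # stand-in for a log-return series

# Autocorrelation: Ljung-Box test, H0 = no autocorrelation up to lag 10
lb_returns <- Box.test(x, lag = 10, type = "Ljung-Box")

# Heteroscedasticity: Ljung-Box on the squared series (McLeod-Li idea);
# autocorrelated squares would suggest time-varying variance
lb_squared <- Box.test(x^2, lag = 10, type = "Ljung-Box")

# Normality: Jarque-Bera statistic from sample skewness and kurtosis,
# compared with a chi-squared distribution with 2 degrees of freedom
jarque_bera <- function(v) {
  n    <- length(v)
  z    <- v - mean(v)
  s2   <- mean(z^2)
  skew <- mean(z^3) / s2^1.5
  kurt <- mean(z^4) / s2^2
  stat <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
  c(statistic = stat,
    p.value   = pchisq(stat, df = 2, lower.tail = FALSE))
}
jb <- jarque_bera(x)

# Unit root: t statistic of g in the Dickey-Fuller regression
#   diff(y)_t = a + g * y_{t-1} + e_t
# It must be compared with Dickey-Fuller critical values, not Student's t.
df_tstat <- function(y) {
  dy    <- diff(y)
  y_lag <- y[-length(y)]
  fit   <- lm(dy ~ y_lag)
  coef(summary(fit))["y_lag", "t value"]
}
df_noise <- df_tstat(x)           # stationary series: strongly negative
df_walk  <- df_tstat(cumsum(x))   # random walk: contains a unit root
```

For the regression with a constant, the usual asymptotic 5% critical value is about -2.86, so a statistic well below that value rejects the unit root hypothesis. Large p-values in `lb_returns`, `lb_squared` and `jb` would be consistent with Gaussian white noise, which is what the simulated input is by construction.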
To clarify which tests and related packages will be used, the following table summarizes them.
test_name <- c("adf.test", "McLeod.Li.Test", "acf", "pacf", "qqnorm",
               "qqline", "shapiro.test", "jarque.bera.test", "lillie.test",
               "outlier", "sctest", "LjungBoxTest", "kurtosis", "skewness",
               "fitdist", "urdfTest, urkpssTest", "fracdiff", "bptest",
               "e.divisive")
package_name <- c("tseries", "TSA", "stats", "stats", "stats", "stats", "stats",
                  "tseries", "nortest", "extremevalues", "strucchange", "FitAR",
                  "moments", "moments", "fitdistrplus", "fUnitRoots", "fracdiff", "lmtest",
                  "ecp")
test_package <- data.frame(test_name, package_name)
ts.package <- unique(package_name)
ts.package <- setdiff(ts.package, "stats")
ts.package <- c("quantmod", "knitr", "rootSolve", "timeSeries", "EnvStats",
                ts.package)
invisible(lapply(ts.package, function(x) {
  suppressPackageStartupMessages(library(x, character.only=TRUE)) }))
kable(test_package)
test_name | package_name |
---|---|
adf.test | tseries |
McLeod.Li.Test | TSA |
acf | stats |
pacf | stats |
qqnorm | stats |
qqline | stats |
shapiro.test | stats |
jarque.bera.test | tseries |
lillie.test | nortest |
outlier | extremevalues |
sctest | strucchange |
LjungBoxTest | FitAR |
kurtosis | moments |
skewness | moments |
fitdist | fitdistrplus |
urdfTest, urkpssTest | fUnitRoots |
fracdiff | fracdiff |
bptest | lmtest |
e.divisive | ecp |
ts.package
## [1] "quantmod" "knitr" "rootSolve" "timeSeries"
## [5] "EnvStats" "tseries" "TSA" "nortest"
## [9] "extremevalues" "strucchange" "FitAR" "moments"
## [13] "fitdistrplus" "fUnitRoots" "fracdiff" "lmtest"
## [17] "ecp"
I save the package list, the GSPC adjusted close and the log returns so that they are available in my next posts.
save(ts.package, GSPC_log_returns, GSPC_AdjClose, file="structured-product-1.RData")
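In a later post, the saved objects could be restored along these lines. This is a sketch under stated assumptions: the three objects here are small hypothetical stand-ins (not the real series), the file is written to a temporary directory, and the package list contains only packages that ship with R so that the snippet runs anywhere.

```r
# Hypothetical stand-ins for the real objects, for illustration only
ts.package       <- c("stats", "utils")
GSPC_log_returns <- c(0.001, -0.002, 0.0005)
GSPC_AdjClose    <- c(100, 100.1, 99.9, 99.95)

# Save to a temporary location instead of the working directory
rdata_file <- file.path(tempdir(), "structured-product-1.RData")
save(ts.package, GSPC_log_returns, GSPC_AdjClose, file = rdata_file)

# Later session: restore into a dedicated environment so that
# nothing in the global workspace is overwritten silently
restored     <- new.env()
loaded_names <- load(rdata_file, envir = restored)

# Attach every package named in the restored vector
invisible(lapply(restored$ts.package, function(p) {
  suppressPackageStartupMessages(library(p, character.only = TRUE))
}))
```

`load` returns the names of the restored objects, which gives a quick way to confirm that the file contains what the next post expects.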
Conclusions
I have identified a financial product to be used as the underlying. The next task is to identify its basic properties through a detailed exploratory analysis, in order to ease its simulation. Evaluating the simulation results will then allow us to determine the percentage of gains and losses and the barrier breakage events.
The next post will show the first step of the exploratory analysis.