Sunday, March 5, 2017

Financial products with capital protection barrier - part 11

Product Log Returns Simulation

Abstract

In my previous post I showed how to compute a standard deviation \(\sigma_{dom}\) value for the normal distribution in order to have:

  1. first order dominance of \(T_{n}(B_{\sigma})\) with respect to \(T_{n}(Y)\)

  2. first order dominance of \(Q_{n}(B_{\sigma})\) with respect to \(Q_{n}(Y)\)

The two stochastic dominance conditions above are sufficient conditions for the upper bounds:

\[ \begin{equation} \begin{cases} E(T_{n}(Y))\ \le\ E(T_{n}(B_{\sigma})) \\ \\ E(Q_{n}(Y))\ \le\ E(Q_{n}(B_{\sigma})) \\ \end{cases} \end{equation} \]

and the consequent finite quadratic variation of the process \(Y\) under analysis. For definitions and a more detailed explanation, see my previous post.

Here below is the plot of our log returns distribution density against the normal distribution having standard deviation \(\sigma_{dom}\), as computed in the previous post.

# restore the environment saved at the end of the previous post
load(file="structured-product-4.RData")
# load the required packages quietly
invisible(lapply(ts.package, function(x) {
  suppressPackageStartupMessages(library(x, character.only=TRUE))}))
# fix the random seed for reproducibility
my_seed <- 1023
set.seed(my_seed)

# empirical log returns density vs. normal density with sd = sigma_dom
plot(density(GSPC_log_returns), main = "Observations Distribution Density vs. Normal Distribution")
lines(density(rnorm(100000, 0, sigma_dom)), col='blue')

In this post, I am going to run simulations of the evolution with time of our financial product and compute its rate of success or failure. That translates into collecting results in terms of:

  1. product prices at each observation instant

  2. barrier break event frequencies for both European and American style products

  3. success/failure profit gain event frequencies

Simulation

The approach adopted here is to simulate a pool of Brownian motions and, for each of them, evaluate the metrics listed above.

Specifically, a Brownian motion \(B_{N}(t)\), composed of \(N\) steps and lasting \(T\) units of time, from \(t\ =\ 0\) to \(t\ =\ T\), is determined as:

\[ \begin{equation} \begin{cases} N\ =\ \dfrac{T}{\Delta\ t} \\ \\ B_{N}(t)\ =\ \sum_{i=1}^N \Large\epsilon_{i} \\ \end{cases} \end{equation} \]

Each single \(\Large\epsilon_{i}\) is obtained by generating a random variable with normal distribution and standard deviation determined as follows:

\[ \begin{equation} \begin{cases} {\Delta\ t}\ =\ \dfrac{T}{N} \\ \\ \Large \epsilon_{i} \normalsize =\ \textbf{N} (0,\ \sqrt{\Delta\ t})\\ \\ \end{cases} \end{equation} \]

Each generated \(\epsilon_{i}\) is summed with the previous ones to determine the Brownian motion value \(B\). The formulas that determine the current simulated value \(X(t_{m})\) at step \(m\) of the simulation (at time \(t_{m}\)) are:

\[ \begin{equation} \begin{cases} t_{m}\ = m\Delta t \\ \\ B(t_{m})\ =\ \sum_{i=1}^m \Large\epsilon_{i}\\ \\ X(t_{m})\ =\ (\mu\ - \dfrac{\sigma^2}{2})\ t_{m} + \sigma\ B(t_{m}) \\ \end{cases} \end{equation} \]
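As a minimal sketch of the construction above (the variable names and the example values of \(T\), \(N\), \(\mu\) and \(\sigma\) are illustrative only; the vectorized function sim.run() used for the actual runs is defined later), a single path can be generated as follows.

# illustrative single-path construction of X(t_m)
T_len   <- 252                 # example: one year of trading days
N       <- 252                 # number of steps
delta_t <- T_len/N             # time step
mu      <- 0                   # drift
sigma   <- 0.006               # example daily volatility

eps <- rnorm(N, mean = 0, sd = sqrt(delta_t))   # increments epsilon_i
B   <- cumsum(eps)                              # Brownian motion B(t_m)
t_m <- (1:N)*delta_t                            # time axis t_m = m*delta_t
X   <- (mu - 0.5*sigma^2)*t_m + sigma*B         # simulated log return X(t_m)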

As output of our simulations, we can define the following indicator variables, specifying whether a profit occurs at some observation time \(i=1..6\) and whether a barrier break occurs for European or American style products.

There is a substantial difference in how barrier breaks are evaluated for European and American style products. For European products, they are evaluated only at periodic fixed observation times (the same observation times at which the price is compared with the initial value for product expiration and profit determination). For American products, instead, barrier break events are evaluated at any time during the lifetime of the product.

If the product has expired at some observation time \(j\), because its price was above its initial value, we are not interested in evaluating the subsequent time series values for barrier breaks or gains.

Here below, I summarize the indicator variables for gain, European barrier break and American barrier break events. The observation times are indexed by \(i\); considering a product with a three-year maximum lifetime and observation times occurring every six months, we have six observation times at most.

\[ \begin{equation} \begin{cases} I_{G}(i)\ \in\ \{0,\ 1\}\ \ i=1..6\\ \\ I_{B}(Europe)\ \in\ \{0,\ 1\} \\ \\ I_{B}(USA)\ \in\ \{0,\ 1\} \\ \end{cases} \end{equation} \]
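To make these indicators concrete, here is a minimal sketch of how they could be computed on a single simulated path; the names x and obs are illustrative only, early expiration after a gain event is ignored for simplicity, and the actual evaluation is carried out by the sim.evaluate() function defined further below.

# hypothetical single path of daily log returns over three years
x   <- cumsum(rnorm(756, 0, 0.006))
# six-monthly observation indices (126 trading days apart)
obs <- (1:6)*126
log_barrier <- log(0.85)

I_G     <- as.integer(x[obs] > 0)                 # gain indicator at each observation time i
I_B_eur <- as.integer(any(x[obs] < log_barrier))  # European: barrier checked only at observation times
I_B_usa <- as.integer(any(x < log_barrier))       # American: barrier checked at every instant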

The number of simulation runs is determined in order to achieve a 99% confidence interval for those indicator variables, which can assume only two values, 0 or 1. As a consequence, the corresponding confidence interval is based on the binomial proportion, using the normal approximation depicted below.

\[ \begin{equation} \begin{aligned} CI\ =\ Z_{0.995}*\large\sqrt{\frac{p(1-p)}{N}} \end{aligned} \end{equation} \]
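Solving the expression above for \(N\), with the worst case \(p\ =\ 0.5\) and a target half-width \(CI\ =\ 0.05\), gives the minimum number of runs directly:

\[ \begin{equation} \begin{aligned} N\ \ge\ \left(\dfrac{Z_{0.995}\ \sqrt{p(1-p)}}{CI}\right)^2\ =\ \left(\dfrac{2.5758\ \times\ 0.5}{0.05}\right)^2\ \approx\ 663.5 \end{aligned} \end{equation} \]

The same value is obtained numerically below by root finding.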

The function (closure) ci.fun.param returns the function ci.fun, parametrized on the confidence level alpha, the success probability prob and the target confidence interval half-width p.value; the returned function takes the number of simulations N as its argument.

The function ci.fun is probed for a zero root in order to determine the number N of simulations needed to match the required alpha confidence.

# closure returning the function ci.fun, parametrized on alpha, prob and the
# target half-width p.value; the returned function takes N as its argument
ci.fun.param <- function(param) {
  f <- function(N) {
    alpha <- param$alpha
    prob <- param$prob
    p.value <- param$p.value
    qnorm(alpha)*sqrt(prob*(1-prob)/N) - p.value
  }
  f
}

param <- list(alpha=0.995, prob=0.5, p.value=0.05)
ci.fun <- ci.fun.param(param)

(u.res <- uniroot(ci.fun, c(10, 10000)))
## $root
## [1] 663.4897
## 
## $f.root
## [1] -9.997558e-14
## 
## $iter
## [1] 11
## 
## $init.it
## [1] NA
## 
## $estim.prec
## [1] 6.103516e-05
(n.sim <- ceiling(u.res$root))
## [1] 664

We therefore need 664 simulations to achieve the required 99% confidence interval for the simulations' binary outcomes. Let us proceed with the simulation set-up, run the simulations and collect the results.

The average return mu is set to zero, consistently with the martingale hypothesis for log returns: the expected future value is the current value. Any trend that could be spotted up to the current time is transitory, as random fluctuations will prevail.

The random increments for the multiple simulations of the random walk are stored in the u matrix.

The Brownian motion is computed by the diffinv() function applied to the columns of the u matrix.

The discrete time axis is computed, and the log returns Xt are based on the well-known drift correction \(\mu - \sigma^{2}/2\) of the log-normal model.

Further, the observation points are defined, at which European style product prices are evaluated against the barrier level, set as a percentage of the initial price. For American style products, such a comparison is done at every instant of time.

# number of trading days per year
trading_days_per_year <<- 252
year_n <<- 3
time_frame <<- trading_days_per_year*year_n

sim.run <- function(years, sigma) {
  # maximum number of trading days for the investment
  N <- trading_days_per_year*years
  # average daily log return (martingale hypothesis)
  mu <- 0
  # matrix of i.i.d. normal increments, one column per simulated path;
  # the time step is one trading day, hence sd = sqrt(delta_t) = 1
  u <- matrix(rnorm(n.sim*N, 0, 1), nrow=N, ncol=n.sim)
  # brownian motion matrix (diffinv cumulates the increments and prepends zero)
  Bt <- apply(u, 2, diffinv)
  # discrete time axis in trading days
  ts <- 0:N
  # simulated log returns
  log_rets <- (mu - 0.5*sigma^2)*ts + sigma*Bt
  log_rets
}

sigma_dom
## [1] 0.006247077
Xt <- sim.run(year_n, sigma_dom)

The observation times and the product protection barrier are set in the following way.

# observation interval length (every six months)
m6 <- trading_days_per_year/2

# observation times (every six months) expressed in number of trading days
obs.point <- (1:(2*year_n))*m6

# 85% barrier
barrier.level <- 0.85
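Since the simulated paths are expressed as log returns, the barrier has to be compared on the log scale; this is the level marked by the red dotted line in the plot below and used by the evaluation function later on.

# the 85% barrier expressed in log-return terms (approximately -0.1625)
log(barrier.level)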

The simulated log returns are stored in the columns of the matrix Xt. I plot some of the resulting paths.

ylim <- c(-0.40, max(Xt[,1:4]))
xlim <- c(0, time_frame+70)
plot(Xt[,1], type='l', col="black", xlim=xlim, ylim=ylim, ylab="log returns", main="Log Returns Simulations")
lines(Xt[,2], col="darkgreen", ylim=ylim)
lines(Xt[,6], col="pink", ylim=ylim)
lines(Xt[,15], col="blue", ylim=ylim)
lines(Xt[,20], col="yellow", ylim=ylim)
abline(h = log(barrier.level), lty='dotted', lwd =2, col='red')
abline(h = 0, lty='dotted', lwd=2, col='lightgreen')
abline(v=obs.point[1:5], lty='dotted', col='gray')
abline(v=obs.point[6], lty='dotted', col='black')
legend("topright", legend=as.character(c(1,2,6,15,20)), fill=c("black", "darkgreen", "pink", "blue", "yellow"))

The function sim.evaluate() produces the results, stating whether a gain has occurred and, if so, at which observation window it occurred.

The observation times occur every six months; hence, since the product lasts three years at most, we have six observation times at which to evaluate the product price and establish its expiration or continuation.

If at any observation time the product price is higher than the initial one, the product expires and the profit is determined.

Also, we are interested in knowing about barrier breaks in both European and American style structured products.

# simulation investment results evaluation
sim.evaluate <- function(sim, point, barrier.level) {
  # investment price at observation points (one each six months)
  sim.obs <- sim[point,]

  # gain events at observation points
  gain.events <- apply(sim.obs, 2, function(x) { which (x > 0) })

  # first available gain event per simulation
  gain.first <- sapply(gain.events, function(x) { ifelse(length(x) > 0, x[1], NA)})
  
  # barrier level in log scale
  log_barrier <- log(barrier.level)
  
  # vector to store barrier breaks flag for european style product
  barrier.europe <- vector(mode="logical", length = n.sim)
  
  # barrier breaks are computed within the product life timespan not afterwards
  for (i in 1:ncol(sim.obs)) {
    if (is.na(gain.first[i])) {
       barrier.europe[i] <- any(sim.obs[,i] < log_barrier)
    } else {
       barrier.europe[i] <- any(sim.obs[1:gain.first[i],i] < log_barrier)
    }
  }

  # vector to store barrier breaks flag for american style product
  barrier.us <- vector(mode="logical", length = n.sim)

  # barrier breaks are computed within the product life timespan not afterwards
  for (i in 1:ncol(sim)) {
    if (is.na(gain.first[i])) {
       barrier.us[i] <- any(sim[,i] < log_barrier)
    } else {
       barrier.us[i] <- any(sim[1:(m6*gain.first[i]),i] < log_barrier)
    }
  }

  data.frame("simulation_no" = 1:n.sim,
             "gain_event_observation_time" = gain.first,
             "barrier_break_europe" = barrier.europe,              
             "barrier_break_us" = barrier.us)
}

sim_res <- sim.evaluate(Xt, obs.point, barrier.level)
kable(head(sim_res, 10), caption="simulation results", align='c')
simulation results

 simulation_no   gain_event_observation_time   barrier_break_europe   barrier_break_us
--------------  ----------------------------  ---------------------  -----------------
             1                             1                  FALSE              FALSE
             2                             2                  FALSE              FALSE
             3                             1                  FALSE              FALSE
             4                            NA                   TRUE               TRUE
             5                             2                  FALSE              FALSE
             6                            NA                   TRUE               TRUE
             7                             1                  FALSE              FALSE
             8                             1                  FALSE              FALSE
             9                             1                  FALSE              FALSE
            10                             5                  FALSE               TRUE
ggplot(sim_res, aes(x=factor(gain_event_observation_time))) + geom_bar(stat='count') + xlab("gain events")

The barplot above shows the results of the 664 runs in terms of gain events associated with each observation window [1..6]; NA corresponds to runs where no gain event occurred. In the following, further metrics are computed.

# tabularization of the simulation results
tb <- base::table(sim_res$gain_event_observation_time, useNA="always")

# computing frequency of gain events and no gain event
freq_v <- as.vector(tb)/sum(tb)

# barrier break events for both european and american style products
(brk1_tbl <- colSums(sim_res[,c(3,4)]))
## barrier_break_europe     barrier_break_us 
##                  140                  170
# data frame collecting simulation results
sim_df <- data.frame(gain_event_observation_time = c(seq(1,6), NA), 
                     profit_no = as.vector(tb), frequency = freq_v)
kable(sim_df, caption="Investment simulation results", align='c')
Investment simulation results

 gain_event_observation_time   profit_no   frequency
----------------------------  ----------  ----------
                           1         316   0.4759036
                           2          80   0.1204819
                           3          45   0.0677711
                           4          33   0.0496988
                           5          13   0.0195783
                           6          11   0.0165663
                          NA         166   0.2500000

The total profit frequency can then be computed.

total_profit_frequency <- sim_df %>% filter(!is.na(gain_event_observation_time)) %>% select(frequency) %>% colSums()
total_profit_frequency
## frequency 
##      0.75
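As a quick sanity check, a sketch of a 99% confidence interval around this estimated total profit frequency can be obtained with the same normal-approximation formula used above to size the number of runs.

# 99% confidence interval around the estimated total profit frequency
p_hat   <- as.numeric(total_profit_frequency)
ci_half <- qnorm(0.995)*sqrt(p_hat*(1 - p_hat)/n.sim)
# with p_hat = 0.75 and n.sim = 664 this is roughly [0.71, 0.79]
round(c(lower = p_hat - ci_half, upper = p_hat + ci_half), 4)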

It is then interesting to compare the results obtained by using the stochastic dominance standard deviation with the ones obtained by using the empirical standard deviation directly.

(sigma_rets <- sd(GSPC_log_returns))
## [1] 0.005840571
X2t <- sim.run(3, sigma_rets)
sim_res2 <- sim.evaluate(X2t, obs.point, barrier.level)
kable(head(sim_res2, 10), caption="Investment simulation results", align='c')
Investment simulation results

 simulation_no   gain_event_observation_time   barrier_break_europe   barrier_break_us
--------------  ----------------------------  ---------------------  -----------------
             1                            NA                   TRUE               TRUE
             2                            NA                   TRUE               TRUE
             3                             3                  FALSE              FALSE
             4                             6                  FALSE              FALSE
             5                             1                  FALSE              FALSE
             6                             1                  FALSE              FALSE
             7                             1                  FALSE              FALSE
             8                            NA                   TRUE               TRUE
             9                             2                  FALSE              FALSE
            10                             2                  FALSE              FALSE
ggplot(sim_res2, aes(x=factor(gain_event_observation_time))) + geom_bar(stat='count') + xlab("gain events")

tb2 <- base::table(sim_res2$gain_event_observation_time, useNA="always")
freq2_v <- as.vector(tb2)/sum(tb2)

(brk2_tbl <- colSums(sim_res2[,c(3,4)]))
## barrier_break_europe     barrier_break_us 
##                  115                  153
sim2_df <- data.frame(gain_event_observation_time = c(seq(1,6), NA), 
                      profit_no = as.vector(tb2), frequency = freq2_v)
kable(sim2_df, caption="Investment results based on simulations", align='c')
Investment results based on simulations

 gain_event_observation_time   profit_no   frequency
----------------------------  ----------  ----------
                           1         314   0.4728916
                           2          84   0.1265060
                           3          46   0.0692771
                           4          26   0.0391566
                           5          23   0.0346386
                           6          12   0.0180723
                          NA         159   0.2394578

The results look very similar to the ones obtained with the previous simulation set-up. As before, the total profit frequency is computed.

total2_profit_frequency <- sim2_df %>% filter(!is.na(gain_event_observation_time)) %>% select(frequency) %>% colSums()
total2_profit_frequency
## frequency 
## 0.7605422

To establish whether there is any substantial difference, the following tests are run.

res_sim_all_df <- as.data.frame(cbind(sim_df[,2], sim2_df[,2]))
colnames(res_sim_all_df) <- c("sim_1", "sim_2")
res_sim_all_df
##   sim_1 sim_2
## 1   316   314
## 2    80    84
## 3    45    46
## 4    33    26
## 5    13    23
## 6    11    12
## 7   166   159
chisq.test(res_sim_all_df)$p.value
## [1] 0.687849
brk_sim_all_df <- as.data.frame(cbind(brk1_tbl, brk2_tbl))
colnames(brk_sim_all_df) <- c("sim_1", "sim_2")
brk_sim_all_df
##                      sim_1 sim_2
## barrier_break_europe   140   115
## barrier_break_us       170   153
chisq.test(brk_sim_all_df)$p.value
## [1] 0.6458822

Based on the p-values, there is no statistically significant difference in the number of gain events or barrier break events. That highlights a substantial robustness of the simulation results with respect to the standard deviation value.

Since the results will be further scrutinized in my next post, I save the environment as usual.

save(ts.package, GSPC_log_returns, GSPC_AdjClose, outliers_l, outliers_r, sigma_dom, sim_res, sim_df, total_profit_frequency, file="structured-product-5.RData")

Conclusions

In this post, I defined the set-up to run simulations of future prices based on Brownian motion.

The dominant standard deviation value has been used, and the results have been compared with those obtained by using the empirical standard deviation value.

Results produced by multiple simulation runs have been collected.

Counts and frequencies of the events of interest have been computed.

In the next post, a more in-depth evaluation of the results will be carried out in order to understand the risk associated with holding the product in the investor's portfolio.
