Saturday, September 5, 2015

Financial data exploratory analysis (part 4)

Building a trading model

In the previous posts, I went through some basic features of the quantmod package for downloading and charting financial OHLC data. In this post, I will show some basic quantmod capabilities that are useful for building a trading model.

The first step is to download historical share price data from one of the available sources and take a look at the OHLC values.

setwd("~/R/aroundrblog/Finance")
suppressWarnings(library(quantmod))
## Loading required package: xts
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## 
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## Loading required package: TTR
## Version 0.4-0 included new data defaults. See ?getSymbols.
suppressWarnings(library(rpart))
suppressWarnings(library(rpart.plot))

ticker <- "YHOO"
getSymbols(ticker, src='yahoo')
##     As of 0.4-0, 'getSymbols' uses env=parent.frame() and
##  auto.assign=TRUE by default.
## 
##  This  behavior  will be  phased out in 0.5-0  when the call  will
##  default to use auto.assign=FALSE. getOption("getSymbols.env") and 
##  getOptions("getSymbols.auto.assign") are now checked for alternate defaults
## 
##  This message is shown once per session and may be disabled by setting 
##  options("getSymbols.warning4.0"=FALSE). See ?getSymbols for more details.
## [1] "YHOO"
head(YHOO)
##            YHOO.Open YHOO.High YHOO.Low YHOO.Close YHOO.Volume
## 2007-01-03     25.85     26.26    25.26      25.61    26352700
## 2007-01-04     25.64     26.92    25.52      26.85    32512200
## 2007-01-05     26.70     27.87    26.66      27.74    64264600
## 2007-01-08     27.70     28.04    27.43      27.92    25713700
## 2007-01-09     28.00     28.05    27.41      27.58    25621500
## 2007-01-10     27.48     28.92    27.44      28.70    40240000
##            YHOO.Adjusted
## 2007-01-03         25.61
## 2007-01-04         26.85
## 2007-01-05         27.74
## 2007-01-08         27.92
## 2007-01-09         27.58
## 2007-01-10         28.70

The second step is to choose the predictors in order to build a formula for the next day's ticker gain. For now, I take as the target the next trading day's Open-to-Close gain, (Cl-Op)/Op. That assumes one is able to trade at very precise moments of the trading day (the open and the close), which may not be the case for the average investor. Later I will refine this concept to reach a more realistic assumption.

Anyway, to start with, I propose the following formula:

Next(OpCl(YHOO)) ~ (Lag(LoHi(YHOO),0:3))

The left term is the arithmetic gain of the share price on the next trading day. It is expressed by two quantmod functions:

  • OpCl, computes (Cl-Op)/Op, hence an arithmetic gain of the share within the day
  • Next, which advances the series by one period, so that each row holds the value of the following trading day

Let us take a quick look to get familiar with them.

head(data.frame(Op(YHOO), Cl(YHOO), OpCl(YHOO)))
##            YHOO.Open YHOO.Close    OpCl.YHOO
## 2007-01-03     25.85      25.61 -0.009284294
## 2007-01-04     25.64      26.85  0.047191929
## 2007-01-05     26.70      27.74  0.038951272
## 2007-01-08     27.70      27.92  0.007942202
## 2007-01-09     28.00      27.58 -0.015000000
## 2007-01-10     27.48      28.70  0.044395961
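
To make the two building blocks concrete, here is a minimal base R sketch that reproduces the arithmetic by hand on the six sessions shown above. It does not call quantmod; the vector names (op, cl, opcl, next_opcl) are illustrative, and the one-step shift is only a rough stand-in for what Next does on a time series.

```r
# First six YHOO sessions, copied from the head(YHOO) output above
op <- c(25.85, 25.64, 26.70, 27.70, 28.00, 27.48)
cl <- c(25.61, 26.85, 27.74, 27.92, 27.58, 28.70)

# Same arithmetic as OpCl(): intraday gain relative to the open
opcl <- (cl - op) / op

# Rough stand-in for Next(): pair each day with the following day's value.
# The last day has no successor, hence NA.
next_opcl <- c(opcl[-1], NA)

print(round(cbind(opcl, next_opcl), 6))
```

Small differences in the last digits with respect to the quantmod output are to be expected, since the prices typed here are rounded to two decimals.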

The right term is the (Hi-Lo)/Lo ratio for the current day and the previous three days. The educated guess is that there is a relationship between historical Low-to-High price ranges and tomorrow's gain.

In the third step, I build the corresponding quantmod model:

q.model <- specifyModel(Next(OpCl(YHOO)) ~ (Lag(LoHi(YHOO),0:3)))

The returned object, here q.model, is of class quantmod. It can be used to build the data set of target and predictor values from the downloaded OHLC data. That is the fourth step:

model.data <- modelData(q.model)
head(model.data)
##            Next.OpCl.YHOO Lag.LoHi.YHOO.0.3.Lag.0 Lag.LoHi.YHOO.0.3.Lag.1
## 2007-01-08    -0.01500000              0.02223846              0.04538638
## 2007-01-09     0.04439596              0.02334911              0.02223846
## 2007-01-10     0.01529906              0.05393582              0.02334911
## 2007-01-11     0.01621812              0.02334495              0.05393582
## 2007-01-12    -0.01974558              0.03545104              0.02334495
## 2007-01-16    -0.01190480              0.03786030              0.03545104
##            Lag.LoHi.YHOO.0.3.Lag.2 Lag.LoHi.YHOO.0.3.Lag.3
## 2007-01-08              0.05485893              0.03958828
## 2007-01-09              0.04538638              0.05485893
## 2007-01-10              0.02223846              0.04538638
## 2007-01-11              0.02334911              0.02223846
## 2007-01-12              0.05393582              0.02334911
## 2007-01-16              0.02334495              0.05393582

Note that the starting date has been automatically adjusted to the first day for which all lagged values are available.
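
The alignment of the predictor columns can be sketched by hand in base R. This is only an illustration of the idea, not quantmod's implementation: lag_k is a hypothetical helper, the column names are made up, and only the predictor side of the data set is rebuilt.

```r
# High and Low of the first six YHOO sessions, from the head(YHOO) output above
hi <- c(26.26, 26.92, 27.87, 28.04, 28.05, 28.92)
lo <- c(25.26, 25.52, 26.66, 27.43, 27.41, 27.44)
dates <- c("2007-01-03", "2007-01-04", "2007-01-05",
           "2007-01-08", "2007-01-09", "2007-01-10")

# Same arithmetic as LoHi(): daily range relative to the low
lohi <- (hi - lo) / lo

# Hypothetical helper: shift a vector k steps back in time, padding with NA
lag_k <- function(x, k) c(rep(NA, k), head(x, length(x) - k))

predictors <- data.frame(
  lag0 = lag_k(lohi, 0),
  lag1 = lag_k(lohi, 1),
  lag2 = lag_k(lohi, 2),
  lag3 = lag_k(lohi, 3),
  row.names = dates
)

# Keep only the rows where every lag is available
complete <- predictors[complete.cases(predictors), ]
print(round(complete, 8))
```

The first surviving row is 2007-01-08, the fourth trading day, which is the first date with three earlier sessions behind it.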

Now that the model has been specified and the corresponding data set has been produced, the next step is to build a machine learning model by choosing the method and the training period. As an example, I choose the rpart regression tree method and train on the 2013 trading days from February 1 to December 31.

model <- buildModel(q.model, method='rpart', training.per=c('2013-02-01','2013-12-31'))
rpart.plot(model@fitted.model)
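
For readers who want to see the tree-fitting step in isolation, below is a minimal sketch that fits rpart directly on synthetic data. It assumes that buildModel with method='rpart' essentially hands the training window of the model data to rpart(); the exact arguments it passes are not shown in this post. The data frame, its column names and the made-up target are all illustrative.

```r
library(rpart)  # recommended package, shipped with standard R distributions

set.seed(42)
n <- 250

# Synthetic stand-ins for the four lagged (Hi-Lo)/Lo predictors
train <- data.frame(
  lag0 = runif(n, 0.01, 0.06),
  lag1 = runif(n, 0.01, 0.06),
  lag2 = runif(n, 0.01, 0.06),
  lag3 = runif(n, 0.01, 0.06)
)

# Made-up target: weak dependence on lag0 plus noise (not real market data)
train$next_opcl <- 0.2 * (train$lag0 - 0.035) + rnorm(n, sd = 0.015)

# Regression tree: method = "anova" is rpart's setting for a numeric target
fit <- rpart(next_opcl ~ lag0 + lag1 + lag2 + lag3,
             data = train, method = "anova")

# Each terminal node predicts one value: the mean target of its training rows
pred <- predict(fit, newdata = train)
print(table(round(pred, 4)))
```

Because a tree can only output as many distinct values as it has terminal nodes, the predictions come in a handful of levels, and it is those levels that get compared against the trading thresholds.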

The result is a fitted regression tree in which not all terminal nodes yield a prediction significant enough to drive a buy or sell decision. That is why, when trading the model, we may want to specify thresholds that restrict action to price changes we consider significant. In particular, we sell when the predicted daily change is a decrease of more than 1 percent, and we buy when the predicted gain is above 2 percent. It is a rather prudent investment strategy. So let us trade all of 2014 with it.

trade.dates <- c('2014-01-01','2014-12-31')
trade <- tradeModel(model, signal.threshold=c(-0.01, 0.02), trade.dates=trade.dates)
trade
## 
##   Model:  rpart1441446786.86816 
## 
##   C.A.G.R.:  10.86%  H.P.R.:  15.96% 
## 
##   Returns by period summary:
## 
##             weekly monthly quarterly yearly
##     Max.     7.43%   3.60%     7.23% 15.96%
##     3rd Qu.  1.31%   2.93%     6.58% 15.96%
##     Mean     0.30%   1.26%     3.82% 15.96%
##     Median   0.00%   2.05%     3.99% 15.96%
##     2rd Qu. -0.59%  -0.36%     1.23% 15.96%
##     Min.    -4.25%  -2.06%     0.05% 15.96%
## 
##   Period to date returns:
## 
##              weekly monthly quarterly yearly
##              -1.46%   3.59%     6.36% 15.96%
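
The threshold rule used above can be sketched as a small function. This is an illustration of the decision logic as described in the text, not the code inside tradeModel; to_signal is a hypothetical helper and the predictions are made-up numbers.

```r
# Hypothetical helper, not part of quantmod:
# -1 (sell) below the lower threshold, +1 (buy) above the upper one, 0 otherwise
to_signal <- function(pred, lower = -0.01, upper = 0.02) {
  ifelse(pred < lower, -1, ifelse(pred > upper, 1, 0))
}

# Made-up predicted next-day gains
pred <- c(-0.025, -0.005, 0.000, 0.015, 0.031)
signal <- to_signal(pred)
print(signal)
```

Predictions that fall between the two thresholds produce no action, which is why most rows of a signal series built this way are zero.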

It looks like a good trade. However, we have to take into account that the returns are computed from trade$signal, which does not consider the status of our share portfolio and the likely need to open a long position first. In fact, inspecting that variable reveals that the first non-zero signals are equal to -1.

trade$signal
##            Next.OpCl.YHOO signal.zoo
## 2014-01-03  -0.0009960408          0
## 2014-01-06  -0.0029962298          0
## 2014-01-07   0.0209579830          0
## 2014-01-08  -0.0065391376          0
## 2014-01-09  -0.0099202512          0
## 2014-01-10   0.0068375823          0
## 2014-01-13  -0.0284256074         -1
## 2014-01-14   0.0231285756         -1
## 2014-01-15   0.0002435217          0
## 2014-01-16  -0.0022260698          0
## 2014-01-17  -0.0027417997          0
## 2014-01-21  -0.0115057529          0
## 2014-01-22   0.0131114473          1
## 2014-01-23   0.0020350546          0
## 2014-01-24  -0.0196534274          0
## 2014-01-27  -0.0252658524         -1
## 2014-01-28   0.0377409428          1
## 2014-01-29  -0.0246016494          0
## 2014-01-30   0.0120378909         -1
## 2014-01-31   0.0380512839         -1
## 2014-02-03  -0.0289370348          0
## 2014-02-04   0.0156650238         -1
## 2014-02-05  -0.0030897755          0
## 2014-02-06   0.0165497887          0
## 2014-02-07   0.0158253197          0
## 2014-02-10  -0.0063158421          0
## 2014-02-11   0.0091742590          0
## 2014-02-12  -0.0126942235          0
## 2014-02-13   0.0158228384          0
## 2014-02-14  -0.0052042675         -1
## 2014-02-18   0.0000000000         -1
## 2014-02-19  -0.0065685758          0
## 2014-02-20  -0.0010573883          0
## 2014-02-21  -0.0160950123          0
## 2014-02-24   0.0051033575          0
## 2014-02-25  -0.0058698506          0
## 2014-02-26   0.0072289428         -1
## 2014-02-27   0.0177249211          0
## 2014-02-28   0.0031128146          0
## 2014-03-03   0.0159362010          0
## 2014-03-04   0.0224458990         -1
## 2014-03-05  -0.0082852619          0
## 2014-03-06   0.0015152021          0
## 2014-03-07  -0.0254343497          0
## 2014-03-10  -0.0150142890         -1
## 2014-03-11  -0.0180391895          0
## 2014-03-12   0.0077936310          0
## 2014-03-13  -0.0215505656          0
## 2014-03-14   0.0248023719          0
## 2014-03-17   0.0028205385          0
## 2014-03-18   0.0115384872          0
## 2014-03-19  -0.0264750126          0
## 2014-03-20  -0.0156371909         -1
## 2014-03-21  -0.0041994490          0
## 2014-03-24  -0.0347368421          0
## 2014-03-25  -0.0289189189          0
## 2014-03-26  -0.0217991434          0
## 2014-03-27   0.0025352113         -1
## 2014-03-28   0.0036343864          0
## 2014-03-31  -0.0153592160          0
## 2014-04-01   0.0091261615          0
## 2014-04-02  -0.0010905398          0
## 2014-04-03  -0.0245499727         -1
## 2014-04-04  -0.0485976145         -1
## 2014-04-07  -0.0304896209          0
## 2014-04-08   0.0220545028          0
## 2014-04-09   0.0198888570         -1
## 2014-04-10  -0.0424311628          0
## 2014-04-11   0.0070465688          0
## 2014-04-14  -0.0029805664         -1
## 2014-04-15   0.0082522546          0
## 2014-04-16  -0.0170362899          0
## 2014-04-17   0.0024800220         -1
## 2014-04-21  -0.0054643719          0
## 2014-04-22  -0.0155271048          0
## 2014-04-23  -0.0177384414          0
## 2014-04-24  -0.0161920156          0
## 2014-04-25  -0.0157007998         -1
## 2014-04-28  -0.0196133845          0
## 2014-04-29   0.0424789945         -1
## 2014-04-30   0.0016718306          0
## 2014-05-01   0.0068946501          0
## 2014-05-02   0.0076523367          0
## 2014-05-05   0.0062704471          0
## 2014-05-06  -0.0121818357         -1
## 2014-05-07  -0.0533482049          0
## 2014-05-08   0.0011805490          0
## 2014-05-09  -0.0073507796         -1
## 2014-05-12   0.0135333620          0
## 2014-05-13  -0.0008712751          0
## 2014-05-14  -0.0089907773         -1
## 2014-05-15  -0.0111176419          0
## 2014-05-16  -0.0074272133          0
## 2014-05-19   0.0143669261         -1
## 2014-05-20  -0.0035305382          0
## 2014-05-21   0.0105882647          0
## 2014-05-22   0.0028902603         -1
## 2014-05-23   0.0048781064          0
## 2014-05-27   0.0034285429          0
## 2014-05-28  -0.0105264005         -1
## 2014-05-29   0.0000000000          1
## 2014-05-30  -0.0077318447         -1
## 2014-06-02   0.0051888154          0
## 2014-06-03  -0.0043102587          0
## 2014-06-04   0.0072505800         -1
## 2014-06-05   0.0043115262          0
## 2014-06-06   0.0245292919          0
## 2014-06-09   0.0050195202          1
## 2014-06-10   0.0122665741          0
## 2014-06-11   0.0104827862          0
## 2014-06-12   0.0076712055          0
## 2014-06-13   0.0016268438         -1
## 2014-06-16  -0.0054285429         -1
## 2014-06-17  -0.0106321555          0
## 2014-06-18   0.0077877420          0
## 2014-06-19  -0.0130904671          0
## 2014-06-20  -0.0218328635          0
## 2014-06-23  -0.0143569290          0
## 2014-06-24  -0.0091743412         -1
## 2014-06-25  -0.0038945775          0
## 2014-06-26   0.0123308271          0
## 2014-06-27   0.0118168988          0
## 2014-06-30   0.0057257658          1
## 2014-07-01  -0.0042254085          0
## 2014-07-02   0.0072993264          0
## 2014-07-03   0.0019406432          0
## 2014-07-07  -0.0174274403         -1
## 2014-07-08  -0.0311447820          0
## 2014-07-09   0.0049019031          0
## 2014-07-10   0.0174773657          0
## 2014-07-11   0.0137338766          0
## 2014-07-14  -0.0027932403          0
## 2014-07-15  -0.0030795072          0
## 2014-07-16  -0.0183032259          0
## 2014-07-17  -0.0180366943          0
## 2014-07-18   0.0045208559         -1
## 2014-07-21  -0.0020989207          0
## 2014-07-22   0.0035841697          0
## 2014-07-23   0.0275310843          0
## 2014-07-24   0.0307779424          0
## 2014-07-25   0.0033333056          0
## 2014-07-28  -0.0091084184          0
## 2014-07-29  -0.0064049011          0
## 2014-07-30   0.0183639126          1
## 2014-07-31  -0.0124102875          0
## 2014-08-01  -0.0019613338          0
## 2014-08-04   0.0229627562          0
## 2014-08-05  -0.0170704570          0
## 2014-08-06   0.0059021638         -1
## 2014-08-07  -0.0094444444          0
## 2014-08-08   0.0050377834          0
## 2014-08-11  -0.0085871750          0
## 2014-08-12  -0.0078212013         -1
## 2014-08-13   0.0063959957          1
## 2014-08-14   0.0011013491          0
## 2014-08-15   0.0074585633         -1
## 2014-08-18   0.0165896383         -1
## 2014-08-19   0.0071885248          0
## 2014-08-20  -0.0029247806          0
## 2014-08-21  -0.0002656839         -1
## 2014-08-22   0.0082227319         -1
## 2014-08-25  -0.0112742530          0
## 2014-08-26   0.0007945710          0
## 2014-08-27  -0.0031331332         -1
## 2014-08-28   0.0057758204          0
## 2014-08-29  -0.0015556650          0
## 2014-09-02   0.0095115162         -1
## 2014-09-03  -0.0157002524          0
## 2014-09-04   0.0012774655          0
## 2014-09-05   0.0138284511         -1
## 2014-09-08   0.0364402826          0
## 2014-09-09  -0.0292787207          0
## 2014-09-10   0.0021924483         -1
## 2014-09-11   0.0058507557          0
## 2014-09-12   0.0275581356          0
## 2014-09-15  -0.0325148022         -1
## 2014-09-16   0.0023468199          0
## 2014-09-17   0.0051923768          0
## 2014-09-18  -0.0222996289          0
## 2014-09-19  -0.0355796191          0
## 2014-09-22  -0.0281618808          0
## 2014-09-23   0.0235910079          0
## 2014-09-24   0.0157922321         -1
## 2014-09-25  -0.0154196154          0
## 2014-09-26   0.0422969004          0
## 2014-09-29   0.0027220985          0
## 2014-09-30   0.0041892063          0
## 2014-10-01  -0.0083620266          0
## 2014-10-02   0.0064611826          0
## 2014-10-03   0.0058837459         -1
## 2014-10-06   0.0077669658          0
## 2014-10-07  -0.0031661227          0
## 2014-10-08   0.0019512683         -1
## 2014-10-09   0.0048898775          0
## 2014-10-10  -0.0277437270          0
## 2014-10-13  -0.0288461285         -1
## 2014-10-14  -0.0178478789          0
## 2014-10-15   0.0147571774          0
## 2014-10-16   0.0316643564          0
## 2014-10-17  -0.0074858282         -1
## 2014-10-20   0.0210553153          0
## 2014-10-21   0.0133669098          0
## 2014-10-22  -0.0099009434          0
## 2014-10-23   0.0047168866          0
## 2014-10-24   0.0228074541          0
## 2014-10-27   0.0320942038          0
## 2014-10-28   0.0191068882          0
## 2014-10-29  -0.0111014151          0
## 2014-10-30   0.0092900245          0
## 2014-10-31  -0.0023830373          0
## 2014-11-03   0.0062975246          0
## 2014-11-04   0.0237008035          0
## 2014-11-05  -0.0033599329          0
## 2014-11-06   0.0118218495          0
## 2014-11-07   0.0135698742          0
## 2014-11-10   0.0125000207         -1
## 2014-11-11   0.0098826230          0
## 2014-11-12   0.0257449006          0
## 2014-11-13  -0.0090266682          0
## 2014-11-14   0.0243467933          0
## 2014-11-17   0.0104186182          0
## 2014-11-18  -0.0101377010          0
## 2014-11-19  -0.0128805616          0
## 2014-11-20   0.0128458898          0
## 2014-11-21  -0.0182727633         -1
## 2014-11-24   0.0113171122          0
## 2014-11-25  -0.0050019046          0
## 2014-11-26   0.0071760860         -1
## 2014-11-28  -0.0025062079          0
## 2014-12-01  -0.0258604317         -1
## 2014-12-02   0.0079569922          0
## 2014-12-03  -0.0084795900          0
## 2014-12-04   0.0043833633          0
## 2014-12-05  -0.0007837939          0
## 2014-12-08  -0.0178147466          0
## 2014-12-09   0.0361025231          0
## 2014-12-10  -0.0222531881          0
## 2014-12-11   0.0080742429          0
## 2014-12-12   0.0141300159          0
## 2014-12-15  -0.0119000005          0
## 2014-12-16  -0.0131313535          0
## 2014-12-17   0.0224398001          0
## 2014-12-18  -0.0003926959          0
## 2014-12-19  -0.0035252643          0
## 2014-12-22   0.0031378700         -1
## 2014-12-23  -0.0279828805         -1
## 2014-12-24   0.0091652323          0
## 2014-12-26   0.0041460808          0
## 2014-12-29  -0.0027629565         -1
## 2014-12-30   0.0172791069         -1
## 2014-12-31  -0.0199845359          0

This signal is used together with the model data by the internal modelReturn() routine (called inside tradeModel()), which computes the individual returns and then compounds them through a cumulative product to build the returns per period displayed by the trade variable above.

In my next post, I will develop a new R function, also built on the quantmod package, to compute the portfolio evolution.
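
The compounding idea can be sketched in a few lines of base R, using four consecutive rows of the signal table above (2014-01-10 to 2014-01-15). This is a simplified illustration, assuming the per-period return is the signal times the realised gain; it ignores how quantmod aligns dates internally and is not the modelReturn() code.

```r
# Realised next-day Open-to-Close gains and signals, 2014-01-10 to 2014-01-15
actual <- c(0.0068375823, -0.0284256074, 0.0231285756, 0.0002435217)
signal <- c(0, -1, -1, 0)

# Assumed per-period strategy return: position times realised gain
strategy <- signal * actual

# Compound the individual returns into a cumulative growth factor
equity <- cumprod(1 + strategy)
print(equity)
```

A short signal on a falling day adds to the growth factor, while a short signal on a rising day subtracts from it, regardless of whether any shares were actually held.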

For now, I will show how to build a weekly trading model, which may suit the average investor who does not have the time to follow the market every day but can spend some time on weekends reviewing the trading week and deciding what to do on Monday. It is enough to convert the daily YHOO OHLC data into weekly data by calling to.weekly and then apply the same workflow to the result, here with the current week's Open-to-Close gain as the only predictor.
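
The daily-to-weekly conversion follows the usual OHLC aggregation rule, which can be checked by hand on the first trading week of 2007 (three sessions, from the head(YHOO) output above). The sketch below is base R only and is not how to.weekly is implemented; the data frame and its column names are illustrative.

```r
# Sessions of 2007-01-03, 2007-01-04 and 2007-01-05
daily <- data.frame(
  open   = c(25.85, 25.64, 26.70),
  high   = c(26.26, 26.92, 27.87),
  low    = c(25.26, 25.52, 26.66),
  close  = c(25.61, 26.85, 27.74),
  volume = c(26352700, 32512200, 64264600)
)

# Weekly bar: first open, highest high, lowest low, last close, total volume
weekly <- data.frame(
  open   = daily$open[1],
  high   = max(daily$high),
  low    = min(daily$low),
  close  = daily$close[nrow(daily)],
  volume = sum(daily$volume)
)
print(weekly)
```

The resulting bar agrees with the 2007-01-05 row that to.weekly returns for YHOO.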

YHOO.TW <- to.weekly(YHOO)
head(YHOO.TW)
##            YHOO.Open YHOO.High YHOO.Low YHOO.Close YHOO.Volume
## 2007-01-05     25.85     27.87    25.26      27.74   123129500
## 2007-01-12     27.70     29.50    27.41      29.45   141003800
## 2007-01-19     29.88     29.88    27.55      27.64    90871600
## 2007-01-26     27.85     29.20    26.88      28.04   197636400
## 2007-02-02     28.05     28.92    27.61      28.77    78924200
## 2007-02-09     28.67     30.24    28.36      29.74    98566600
##            YHOO.Adjusted
## 2007-01-05         27.74
## 2007-01-12         29.45
## 2007-01-19         27.64
## 2007-01-26         28.04
## 2007-02-02         28.77
## 2007-02-09         29.74
q.model <- specifyModel(Next(OpCl(YHOO.TW)) ~ OpCl(YHOO.TW))
model.data <- modelData(q.model)
head(model.data)
##            Next.OpCl.YHOO.TW OpCl.YHOO.TW
## 2007-01-05       0.063176893  0.073114120
## 2007-01-12      -0.074966535  0.063176893
## 2007-01-19       0.006822298 -0.074966535
## 2007-01-26       0.025668486  0.006822298
## 2007-02-02       0.037321242  0.025668486
## 2007-02-09       0.089450287  0.037321242
model <- buildModel(q.model, method='rpart', training.per=c('2013-02-01','2013-12-31'))
rpart.plot(model@fitted.model)

trade.dates <- c('2014-01-01','2014-12-31')
trade <- tradeModel(model, signal.threshold=c(-0.01, 0.02), trade.dates=trade.dates)
trade
## 
##   Model:  rpart1441446787.38972 
## 
##   C.A.G.R.:  14.52%  H.P.R.:  20.72% 
## 
##   Returns by period summary:
## 
##             weekly monthly quarterly yearly
##     Max.    13.08%  10.60%    21.18% 20.72%
##     3rd Qu.  0.99%   5.20%    11.26% 20.72%
##     Mean     0.43%   1.76%     5.53% 20.72%
##     Median   0.00%   2.64%     6.77% 20.72%
##     2rd Qu. -0.13%  -1.28%     1.03% 20.72%
##     Min.    -6.90%  -9.19%   -12.59% 20.72%
## 
##   Period to date returns:
## 
##              weekly monthly quarterly yearly
##              -0.25%  -1.11%     5.57% 20.72%
