Saturday, April 11, 2015

Financial data exploratory analysis (part 1)

Introducing the quantmod package

In this series of posts, I will show how to download share price data from the Yahoo Finance website and perform some basic analysis on it. As a first step, this post introduces the quantmod package and some of its basic functions for exploring financial data.

library(quantmod)

Here I download the 2014 Yahoo share prices from the Yahoo Finance website using the getSymbols() function. A start date and an end date are specified, together with a dedicated environment in which to store the data.

ticker <- "YHOO"
stock <- new.env()
startDate <- as.Date("2014-01-01")
endDate <- as.Date("2014-12-31")
getSymbols(ticker, src = "yahoo", env = stock, from = startDate, to = endDate)
##     As of 0.4-0, 'getSymbols' uses env=parent.frame() and
##  auto.assign=TRUE by default.
## 
##  This  behavior  will be  phased out in 0.5-0  when the call  will
##  default to use auto.assign=FALSE. getOption("getSymbols.env") and 
##  getOptions("getSymbols.auto.assign") are now checked for alternate defaults
## 
##  This message is shown once per session and may be disabled by setting 
##  options("getSymbols.warning4.0"=FALSE). See ?getSymbol for more details
## [1] "YHOO"
share <- stock[[ticker]]
class(share)
## [1] "xts" "zoo"
head(share)
##            YHOO.Open YHOO.High YHOO.Low YHOO.Close YHOO.Volume
## 2014-01-02     40.37     40.49    39.31      39.59    21504200
## 2014-01-03     40.16     40.44    39.82      40.12    15755200
## 2014-01-06     40.05     40.32    39.75      39.93    12467500
## 2014-01-07     40.08     41.20    40.08      40.92    14100000
## 2014-01-08     41.29     41.72    41.02      41.02    18638200
## 2014-01-09     41.33     41.35    40.61      40.92    12897300
##            YHOO.Adjusted
## 2014-01-02         39.59
## 2014-01-03         40.12
## 2014-01-06         39.93
## 2014-01-07         40.92
## 2014-01-08         41.02
## 2014-01-09         40.92

What getSymbols() returns is a time series (class xts, zoo) that models an OHLCV object: its columns report Open, High, Low, Close and Volume data, plus the Adjusted close. Whether an object qualifies is determined by its column names, which must contain all of the substrings Open, High, Low, Close and Volume. This can be verified with the quantmod function is.OHLCV().

is.OHLCV(share)
## [1] TRUE
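Since the check relies only on column names, it can be tried on a small hand-built object. The sketch below rebuilds the first rows printed above; the names sample.share and YHOO.Traded are mine, used for illustration only. It shows the test failing once no column name contains "Volume".

```r
library(quantmod)

# First six rows of the YHOO data, rebuilt by hand (no download needed)
dates <- as.Date(c("2014-01-02", "2014-01-03", "2014-01-06",
                   "2014-01-07", "2014-01-08", "2014-01-09"))
sample.share <- xts(
  cbind(YHOO.Open   = c(40.37, 40.16, 40.05, 40.08, 41.29, 41.33),
        YHOO.High   = c(40.49, 40.44, 40.32, 41.20, 41.72, 41.35),
        YHOO.Low    = c(39.31, 39.82, 39.75, 40.08, 41.02, 40.61),
        YHOO.Close  = c(39.59, 40.12, 39.93, 40.92, 41.02, 40.92),
        YHOO.Volume = c(21504200, 15755200, 12467500,
                        14100000, 18638200, 12897300)),
  order.by = dates)

is.OHLCV(sample.share)   # all five name patterns are found

# Hypothetical rename: the data is unchanged, only the label differs
renamed <- sample.share
colnames(renamed)[5] <- "YHOO.Traded"
is.OHLCV(renamed)        # the "Volume" pattern no longer matches
is.OHLC(renamed)         # Open, High, Low and Close are still recognised
```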

It is interesting to look at how the is.OHLCV() function works. Each has.XX() function checks whether the corresponding column is present in our data, first through an attribute of the object and then through a case-insensitive search of the column names.

is.OHLCV
## function (x) 
## {
##     all(has.Op(x), has.Hi(x), has.Lo(x), has.Cl(x), has.Vo(x))
## }
## <environment: namespace:quantmod>
has.Op
## function (x, which = FALSE) 
## {
##     colAttr <- attr(x, "Op")
##     if (!is.null(colAttr)) 
##         return(if (which) colAttr else TRUE)
##     loc <- grep("Open", colnames(x), ignore.case = TRUE)
##     if (!identical(loc, integer(0))) {
##         return(if (which) loc else TRUE)
##     }
##     else FALSE
## }
## <environment: namespace:quantmod>
has.Hi
## function (x, which = FALSE) 
## {
##     colAttr <- attr(x, "Hi")
##     if (!is.null(colAttr)) 
##         return(if (which) colAttr else TRUE)
##     loc <- grep("High", colnames(x), ignore.case = TRUE)
##     if (!identical(loc, integer(0))) {
##         return(if (which) loc else TRUE)
##     }
##     else FALSE
## }
## <environment: namespace:quantmod>
has.Lo
## function (x, which = FALSE) 
## {
##     colAttr <- attr(x, "Lo")
##     if (!is.null(colAttr)) 
##         return(if (which) colAttr else TRUE)
##     loc <- grep("Low", colnames(x), ignore.case = TRUE)
##     if (!identical(loc, integer(0))) {
##         return(if (which) loc else TRUE)
##     }
##     else FALSE
## }
## <environment: namespace:quantmod>
has.Cl
## function (x, which = FALSE) 
## {
##     colAttr <- attr(x, "Cl")
##     if (!is.null(colAttr)) 
##         return(if (which) colAttr else TRUE)
##     loc <- grep("Close", colnames(x), ignore.case = TRUE)
##     if (!identical(loc, integer(0))) {
##         return(if (which) loc else TRUE)
##     }
##     else FALSE
## }
## <environment: namespace:quantmod>
has.Vo
## function (x, which = FALSE) 
## {
##     colAttr <- attr(x, "Vo")
##     if (!is.null(colAttr)) 
##         return(if (which) colAttr else TRUE)
##     loc <- grep("Volume", colnames(x), ignore.case = TRUE)
##     if (!identical(loc, integer(0))) {
##         return(if (which) loc else TRUE)
##     }
##     else FALSE
## }
## <environment: namespace:quantmod>
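The which argument visible in the listings above makes each has.XX() function return the matching column position instead of a flag. A minimal sketch on a made-up object follows; the name demo and its column names are hypothetical, chosen to show that neither column order nor letter case matters.

```r
library(quantmod)

# Made-up two-day series with shuffled, mixed-case column names
demo <- xts(
  cbind(demo.volume = c(1000, 1200),
        demo.close  = c(10.5, 10.8),
        demo.OPEN   = c(10.0, 10.6)),
  order.by = as.Date(c("2014-01-02", "2014-01-03")))

has.Op(demo)                 # TRUE: "OPEN" matches because of ignore.case
has.Op(demo, which = TRUE)   # position of the Open column
has.Cl(demo, which = TRUE)   # position of the Close column
has.Hi(demo)                 # FALSE: no column name contains "High"
```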

The has.OHLCV() function returns a logical vector with one flag for each of the expected Open, High, Low, Close and Volume columns. To test for the Adjusted close, use the has.Ad() function.

has.OHLCV(share)
## [1] TRUE TRUE TRUE TRUE TRUE
has.Ad(share)
## [1] TRUE

If Volume is not present, we have an OHLC object; if Open is missing as well, an HLC object. One or more columns of interest can be extracted with the Op(), Hi(), Lo(), Cl(), Vo() and Ad() functions.

head(share.open <- Op(share))
##            YHOO.Open
## 2014-01-02     40.37
## 2014-01-03     40.16
## 2014-01-06     40.05
## 2014-01-07     40.08
## 2014-01-08     41.29
## 2014-01-09     41.33
head(share.high <- Hi(share))
##            YHOO.High
## 2014-01-02     40.49
## 2014-01-03     40.44
## 2014-01-06     40.32
## 2014-01-07     41.20
## 2014-01-08     41.72
## 2014-01-09     41.35
head(share.low <- Lo(share))
##            YHOO.Low
## 2014-01-02    39.31
## 2014-01-03    39.82
## 2014-01-06    39.75
## 2014-01-07    40.08
## 2014-01-08    41.02
## 2014-01-09    40.61
head(share.close <- Cl(share))
##            YHOO.Close
## 2014-01-02      39.59
## 2014-01-03      40.12
## 2014-01-06      39.93
## 2014-01-07      40.92
## 2014-01-08      41.02
## 2014-01-09      40.92
plot(share.close, type='l', main="Yahoo shares close daily price")

head(share.volume <- Vo(share))
##            YHOO.Volume
## 2014-01-02    21504200
## 2014-01-03    15755200
## 2014-01-06    12467500
## 2014-01-07    14100000
## 2014-01-08    18638200
## 2014-01-09    12897300
plot(share.volume, type='l', main="Yahoo shares daily volume")

head(share.Ad <- Ad(share))
##            YHOO.Adjusted
## 2014-01-02         39.59
## 2014-01-03         40.12
## 2014-01-06         39.93
## 2014-01-07         40.92
## 2014-01-08         41.02
## 2014-01-09         40.92
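The OHLC and HLC cases mentioned earlier can be reproduced by dropping columns from a small hand-built object. This is only a sketch: sample.share, no.volume and no.open are illustrative names, and the values are the first three rows printed above.

```r
library(quantmod)

sample.share <- xts(
  cbind(YHOO.Open   = c(40.37, 40.16, 40.05),
        YHOO.High   = c(40.49, 40.44, 40.32),
        YHOO.Low    = c(39.31, 39.82, 39.75),
        YHOO.Close  = c(39.59, 40.12, 39.93),
        YHOO.Volume = c(21504200, 15755200, 12467500)),
  order.by = as.Date(c("2014-01-02", "2014-01-03", "2014-01-06")))

# Without Volume the object is OHLC but no longer OHLCV
no.volume <- sample.share[, c("YHOO.Open", "YHOO.High",
                              "YHOO.Low", "YHOO.Close")]
is.OHLCV(no.volume)
is.OHLC(no.volume)

# Without Open as well, only the HLC test passes
no.open <- no.volume[, -1]
is.OHLC(no.open)
is.HLC(no.open)

# HLC() extracts the three columns in one call
hlc <- HLC(sample.share)
colnames(hlc)
```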

Above, I did not take advantage of quantmod's graphic capabilities for drawing chart series, something I will introduce in my next post. It is also possible to combine these functions to obtain the percentage change over one period. For example, ClCl() returns the relative change between the close prices of consecutive periods.

head(ClCl(share))
##              ClCl.share
## 2014-01-02           NA
## 2014-01-03  0.013387219
## 2014-01-06 -0.004735793
## 2014-01-07  0.024793388
## 2014-01-08  0.002443793
## 2014-01-09 -0.002437835
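To make the definition concrete, the same numbers can be recomputed by hand as close[t] / close[t-1] - 1. The sketch below uses the six close prices printed above; close.only and manual are illustrative names of mine.

```r
library(quantmod)

close.only <- xts(
  cbind(YHOO.Close = c(39.59, 40.12, 39.93, 40.92, 41.02, 40.92)),
  order.by = as.Date(c("2014-01-02", "2014-01-03", "2014-01-06",
                       "2014-01-07", "2014-01-08", "2014-01-09")))

# For xts objects, lag(x, k = 1) shifts the previous value onto each date
manual <- close.only / lag(close.only, k = 1) - 1

# Same values as ClCl(), including the leading NA
all.equal(as.numeric(manual), as.numeric(ClCl(close.only)))
```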

The relative difference between the Open and Close prices of the same period (a day in our case) can be computed with the OpCl() function.

head(OpCl(share))
##               OpCl.share
## 2014-01-02 -0.0193212782
## 2014-01-03 -0.0009960159
## 2014-01-06 -0.0029962547
## 2014-01-07  0.0209580838
## 2014-01-08 -0.0065391136
## 2014-01-09 -0.0099201549
plot(OpCl(share), type='l', main="Yahoo share Open-to-Close daily price relative increments")
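As a cross-check, the open-to-close change is (Close - Open) / Open, which is what quantmod's Delt() helper gives when called with two series. A small sketch on the first three rows of the data; the name oc is mine.

```r
library(quantmod)

oc <- xts(
  cbind(YHOO.Open  = c(40.37, 40.16, 40.05),
        YHOO.Close = c(39.59, 40.12, 39.93)),
  order.by = as.Date(c("2014-01-02", "2014-01-03", "2014-01-06")))

# By hand: relative move from the open to the close of the same day
manual <- (Cl(oc) - Op(oc)) / Op(oc)

# Delt(x1, x2) with the default k = 0 computes (x2 - x1) / x1
via.delt <- Delt(Op(oc), Cl(oc))

all.equal(as.numeric(manual), as.numeric(OpCl(oc)))
all.equal(as.numeric(via.delt), as.numeric(OpCl(oc)))
```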

The rows holding the highest and lowest open share price values (together with the rest of the data for those days) can be retrieved from an OHLCV object as follows.

seriesHi(share)
##            YHOO.Open YHOO.High YHOO.Low YHOO.Close YHOO.Volume
## 2014-11-18     52.28     52.62    51.34      51.75    26847300
##            YHOO.Adjusted
## 2014-11-18         51.75
seriesLo(share)
##            YHOO.Open YHOO.High YHOO.Low YHOO.Close YHOO.Volume
## 2014-04-11     32.64     33.48    32.15      32.87    28040700
##            YHOO.Adjusted
## 2014-04-11         32.87

Compare the results shown above with the following.

max(Op(share))
## [1] 52.28
min(Op(share))
## [1] 32.64
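When the date of the extreme value matters more than the value itself, one option is to pass a single column to seriesHi() or seriesLo(), which makes the choice of column explicit. A sketch on the six open prices printed earlier; open.only is an illustrative name.

```r
library(quantmod)

open.only <- xts(
  cbind(YHOO.Open = c(40.37, 40.16, 40.05, 40.08, 41.29, 41.33)),
  order.by = as.Date(c("2014-01-02", "2014-01-03", "2014-01-06",
                       "2014-01-07", "2014-01-08", "2014-01-09")))

# One-row results carrying both the date and the value
hi.row <- seriesHi(open.only)
lo.row <- seriesLo(open.only)
index(hi.row)
index(lo.row)

# The values agree with plain max() and min() on the same column
as.numeric(hi.row) == max(open.only)
as.numeric(lo.row) == min(open.only)
```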

It is possible to compute a logical value for each time period reporting whether the price is increasing (TRUE) or not (FALSE). Here I show that using the Close share price.

share.incr <- seriesIncr(Cl(share))
head(share.incr)
##            YHOO.Close
## 2014-01-02         NA
## 2014-01-03       TRUE
## 2014-01-06      FALSE
## 2014-01-07       TRUE
## 2014-01-08       TRUE
## 2014-01-09      FALSE
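In effect, the flag says whether the first difference of the series is positive. The sketch below checks that reading against seriesIncr() and seriesDecr() on the six close prices shown above; close.only is an illustrative name.

```r
library(quantmod)

close.only <- xts(
  cbind(YHOO.Close = c(39.59, 40.12, 39.93, 40.92, 41.02, 40.92)),
  order.by = as.Date(c("2014-01-02", "2014-01-03", "2014-01-06",
                       "2014-01-07", "2014-01-08", "2014-01-09")))

# diff() on an xts object pads the first period with NA
step <- as.vector(coredata(diff(close.only)))

incr <- as.vector(coredata(seriesIncr(close.only)))
decr <- as.vector(coredata(seriesDecr(close.only)))

identical(incr, step > 0)   # increasing means a positive difference
identical(decr, step < 0)   # decreasing means a negative difference
```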

Other functions in the same family detect whether the values are decreasing, accelerating or decelerating: seriesDecr(), seriesAccel() and seriesDecel(). Further, returns can be computed with the -Return() functions, which give arithmetic returns by default and log returns when called with type = "log". Here I show the daily returns computation.

price.daily.ret <- dailyReturn(share.close)
head(price.daily.ret)
##            daily.returns
## 2014-01-02   0.000000000
## 2014-01-03   0.013387219
## 2014-01-06  -0.004735793
## 2014-01-07   0.024793388
## 2014-01-08   0.002443793
## 2014-01-09  -0.002437835
plot(price.daily.ret, type='l', main = "Yahoo share close price daily return")
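Note that the values above match the ClCl() output, so they are arithmetic returns, which is the default of dailyReturn(). Log returns are obtained with the type argument. A brief sketch comparing the two on the six close prices shown earlier; close.only, arith and logr are illustrative names.

```r
library(quantmod)

close.only <- xts(
  cbind(YHOO.Close = c(39.59, 40.12, 39.93, 40.92, 41.02, 40.92)),
  order.by = as.Date(c("2014-01-02", "2014-01-03", "2014-01-06",
                       "2014-01-07", "2014-01-08", "2014-01-09")))

arith <- dailyReturn(close.only)                 # type = "arithmetic" by default
logr  <- dailyReturn(close.only, type = "log")   # log(P[t] / P[t-1])

# The two are linked by log.return = log(1 + arithmetic.return)
all.equal(as.numeric(logr), log(1 + as.numeric(arith)))
```

For daily moves of a few percent the two series are numerically close, which is why the difference is easy to overlook in a plot.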

Returns can also be computed over other periods (weekly, monthly, quarterly and annual/yearly) with the other -Return functions of the quantmod package.

head(weeklyReturn(share.close))
##            weekly.returns
## 2014-01-03     0.01338722
## 2014-01-10     0.02766700
## 2014-01-17    -0.02959010
## 2014-01-24    -0.05248688
## 2014-01-31    -0.05011870
## 2014-02-07     0.03387948
head(monthlyReturn(share.close))
##            monthly.returns
## 2014-01-31    -0.090426875
## 2014-02-28     0.073868370
## 2014-03-31    -0.071631756
## 2014-04-30     0.001392758
## 2014-05-30    -0.036161335
## 2014-06-30     0.013852814
head(quarterlyReturn(share.close))
##            quarterly.returns
## 2014-03-31       -0.09320535
## 2014-06-30       -0.02144847
## 2014-09-30        0.15997723
## 2014-12-31        0.23950920
head(annualReturn(share.close))
##            yearly.returns
## 2014-12-31      0.2758272
head(yearlyReturn(share.close))
##            yearly.returns
## 2014-12-31      0.2758272
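A longer-period arithmetic return is the compounded product of the shorter-period ones. The sketch below checks this for the two weeks covered by the first six close prices; note that the second week is incomplete here, so its value differs from the 2014-01-10 figure above. The names close.only, daily, weekly and compound are mine.

```r
library(quantmod)

close.only <- xts(
  cbind(YHOO.Close = c(39.59, 40.12, 39.93, 40.92, 41.02, 40.92)),
  order.by = as.Date(c("2014-01-02", "2014-01-03", "2014-01-06",
                       "2014-01-07", "2014-01-08", "2014-01-09")))

daily  <- dailyReturn(close.only)
weekly <- weeklyReturn(close.only)

# Compound the daily arithmetic returns that fall inside a date range
compound <- function(r) prod(1 + as.numeric(r)) - 1

week1 <- compound(daily["2014-01-02/2014-01-03"])
week2 <- compound(daily["2014-01-06/2014-01-09"])

all.equal(as.numeric(weekly), c(week1, week2))
```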

In the next posts, I will look further into the capabilities of the quantmod package and present more exploratory analysis examples on financial data.
