Category Archives: R stuff

Heston model for Options pricing with ESGtoolkit

Hi everyone! Best wishes for 2016!

In this post, I’ll show you how to use ESGtoolkit for the simulation of the Heston stochastic volatility model for stock prices.

If you’re interested in seeing other examples of use of ESGtoolkit, you can read these two posts: the Hull and White short rate model and the 2-factor Hull and White short rate model (G2++).

The Heston model was introduced in Steven Heston’s paper A closed-form solution for options with stochastic volatility with applications to bond and currency options (1993). For a fixed risk-free interest rate r, it’s described as:

dS_t = r S_t dt + \sqrt{v_t}S_t dW^S_t

dv_t = \kappa(\theta - v_t) dt + \sigma \sqrt{v_t}dW^v_t

where dW^S_t dW^v_t = \rho dt.

In this model, under a certain probability measure, the stock price’s return over a very short period of time of length dt is the risk-free rate plus a random fluctuation driven by the terms dW^S_t and v_t. The dW_t terms can be thought of (very simply put) as the Gaussian increments of a random walk: they are centered, with a variance equal to dt.

On the other hand, v_t is the stochastic variance of the stock price. v_t is always positive, and tends to revert to a fixed level \theta at a speed controlled by \kappa. The variance of v_t is controlled by \sigma, which is called the volatility of volatility. Finally, dW^S_t and dW^v_t have an instantaneous correlation equal to \rho.

By using this model, one can derive prices for European call options, as described in Calibrating Option Pricing Models with Heuristics. The authors provide a useful function called ‘callHestoncf’, which calculates these prices in R and Matlab.

Here’s the function’s description. I won’t reproduce the function here, please refer to the paper for details:

callHestoncf(S, X, tau, r, q, v0, vT, rho, k, sigma){
# S = Spot, X = Strike, tau = time to maturity
# r = risk-free rate, q = dividend yield
# v0 = initial variance, vT = long run variance (theta)
# rho = correlation, k = speed of mean reversion (kappa)
# sigma = volatility of volatility
}

Now, it’s time to use ESGtoolkit for Monte Carlo pricing. We are going to price 3 European call options on (S_t)_{t \geq 0}, with 3 different exercise prices. We use 100,000 simulations over 15 years, with monthly steps. Here are the parameters that will be useful for the simulation:

rm(list=ls())
library(ESGtoolkit)

#Initial stock price
S0 <- 100
# Number of simulations (feel free to reduce this)
n <- 100000
# Sampling frequency
freq <- "monthly"
# volatility mean-reversion speed
kappa <- 0.003
# volatility of volatility
volvol <- 0.009
# Correlation between stoch. vol and spot prices
rho <- -0.5
# Initial variance
V0 <- 0.04
# long-term variance
theta <- 0.04
#Initial short rate
r0 <- 0.015

# Options maturities
horizon <- 15
# Options' exercise prices
strikes <- c(140, 100, 60)

For the simulation of the Heston model with ESGtoolkit, we first need to define how to simulate the terms dW^S_t and dW^v_t. This is done with the package’s function ‘simshocks’, in which you can define the type of dependence between the models’ increments:

# Simulation of shocks with given correlation
set.seed(5) # reproducibility seed
shocks <- simshocks(n =  n,
horizon =  horizon,
frequency =  freq,
method = "anti",
family = 1, par =  rho)

This function provides a list with 2 components, each containing simulated random gaussian increments. Both of these components will be useful for the simulation of (v_t)_{t \geq 0} and (S_t)_{t \geq 0}.
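As a quick sanity check (a sketch; I am assuming here that each component of the list is a matrix of simulated increments, with time steps in rows and simulations in columns), the empirical correlation between the two components should be close to the target \rho = -0.5:

# Empirical correlation between the two simulated shock components;
# it should be close to rho = -0.5
cor(as.vector(shocks[[1]]), as.vector(shocks[[2]]))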

#  Stochastic volatility  simulation
sim.vol <- simdiff(n =  n, horizon =  horizon,
frequency =  freq, model = "CIR", x0 =  V0,
theta1 =  kappa*theta, theta2 =  kappa,
theta3 =  volvol, eps =  shocks[[1]])

# Stock prices simulation
sim.price <- simdiff(n = n, horizon = horizon,
frequency = freq, model = "GBM", x0 = S0,
theta1 = r0, theta2 = sqrt(sim.vol),
eps = shocks[[2]])
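A note on the parameterization, as I read the simdiff arguments used above: with model = "CIR", the simulated process solves dv_t = (\theta_1 - \theta_2 v_t) dt + \theta_3 \sqrt{v_t}dW^v_t, so that theta1 = kappa*theta, theta2 = kappa and theta3 = volvol recover the Heston variance dynamics dv_t = \kappa(\theta - v_t) dt + \sigma \sqrt{v_t}dW^v_t. Likewise, with model = "GBM", theta1 = r0 is the drift and theta2 = sqrt(sim.vol) plugs the simulated \sqrt{v_t} into dS_t = r S_t dt + \sqrt{v_t}S_t dW^S_t.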

We are now able to calculate the options’ prices for the 3 different exercise prices. You’ll need to have imported ‘callHestoncf’ for this piece of code to work.

# Stock price at maturity (15 years)
S_T <- sim.price[nrow(sim.price), ]

### Monte Carlo prices
#### Estimated Monte Carlo price
discounted.payoff <- function(x)
{
(S_T - x)*(S_T - x > 0)*exp(-r0*horizon)
}
mcprices <- sapply(strikes, function(x)
mean(discounted.payoff(x)))

#### 95% Confidence interval around the estimation
mcprices95 <- sapply(strikes,  function(x)
t.test(discounted.payoff(x),
conf.level = 0.95)$conf.int)

#### 'Analytical' prices given by 'callHestoncf'
pricesAnalytic <- sapply(strikes, function(x)
callHestoncf(S = S0, X = x, tau = horizon,
r = r0, q = 0, v0 = V0, vT = theta,
rho = rho, k = kappa, sigma = volvol))

results <- data.frame(cbind(strikes, mcprices,
t(mcprices95), pricesAnalytic))
colnames(results) <- c("strikes", "mcprices", "lower95",
"upper95", "pricesAnalytic")

print(results)

strikes mcprices  lower95  upper95 pricesAnalytic
1     140 25.59181 25.18569 25.99793         25.96174
2     100 37.78455 37.32418 38.24493         38.17851
3      60 56.53187 56.02380 57.03995         56.91809

From these results, we see that the Monte Carlo prices for the 3 options are
fairly close to the prices calculated with the function ‘callHestoncf’
(which uses the pricing formulas directly). Each 95% confidence interval
contains the corresponding theoretical price. Do not hesitate to change the
seed and re-run the previous code.

Below are the option prices, as functions of the number of simulations.
The theoretical price calculated by ‘callHestoncf’ is drawn in blue,
the average Monte Carlo price in red, and the shaded region represents
the 95% confidence interval around the mean (the Monte Carlo price).
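For reference, here is one possible way to produce such a convergence plot for a single strike, reusing ‘discounted.payoff’, ‘strikes’ and ‘pricesAnalytic’ from above (a sketch only; the original figure may have been generated differently):

# Running Monte Carlo estimate and a rough 95% band for one strike
K.index <- 2  # strike = 100
payoffs <- discounted.payoff(strikes[K.index])
nb.sims <- seq_along(payoffs)
running.mean <- cumsum(payoffs)/nb.sims
running.se <- sd(payoffs)/sqrt(nb.sims)  # approximate standard error of the running mean
plot(nb.sims, running.mean, type = "l", col = "red",
     xlab = "number of simulations", ylab = "estimated price",
     main = paste("Monte Carlo convergence, strike =", strikes[K.index]))
lines(nb.sims, running.mean + 1.96*running.se, lty = 2)
lines(nb.sims, running.mean - 1.96*running.se, lty = 2)
abline(h = pricesAnalytic[K.index], col = "blue", lwd = 2)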

[Figure: RConvergenceplot]


Filed under ESGtoolkit, R stuff

Calibrated Hull and White short-rates with RQuantLib and ESGtoolkit

In this post, I use R packages RQuantLib and ESGtoolkit for the calibration and simulation of the famous Hull and White short-rate model.

QuantLib is an open source C++ library for quantitative analysis, modeling, trading, and risk management of financial assets. RQuantLib is built upon it, providing R users with an interface to the library.

ESGtoolkit provides tools for building Economic Scenario Generators (ESG) for insurance. The package is primarily built for research purposes, and comes with no warranty. For an introduction to ESGtoolkit, you can read this slideshare, or this blog post. A development version of the package is available on GitHub with, I must admit, only 2 or 3 commits for now.

The Hull and White (1994) model was proposed to address the Vasicek model’s poor fit of the initial term structure of interest rates. The model is defined as:

dr(t) = \left( \theta(t) - a r(t) \right) dt + \sigma dW(t)

where a and \sigma are positive constants, and (W(t))_{t \geq 0} is a standard Brownian motion under a risk-neutral probability. t \mapsto \theta(t), which is constant in Vasicek’s model, is a function constructed so as to exactly match the initial term structure of interest rates.

An alternative and convenient representation of the model is:

r(t) = x(t) + \alpha(t),

where

dx(t) = - a x(t) dt + \sigma dW(t),

x(0) = 0,

\alpha(t) = f^M(0, t) + \frac{\sigma^2}{2 a^2}(1 - e^{-at})^2

and f^M(0, t) are market-implied instantaneous forward rates for maturities t \geq 0.
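For reference, \theta(t) itself can then be written explicitly in terms of this market forward curve (a standard result, see e.g. Brigo and Mercurio):

\theta(t) = \frac{\partial f^M(0, t)}{\partial t} + a f^M(0, t) + \frac{\sigma^2}{2a}\left(1 - e^{-2at}\right)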

In insurance market-consistent pricing, the model is often calibrated to swaptions, as there are no market prices for the embedded options and guarantees found in insurance liabilities.

Two parameters a and \sigma are surely not enough to reproduce the whole swaption volatility surface (or even the ATM swaptions). But a perfect calibration to market-quoted swaptions isn’t vital and may lead to unnecessary overfitting. A more complex model may fit the swaption volatility surface more precisely, but could still turn out to be more wrong for the purpose at hand: insurance liabilities do not exactly match market swaptions’ characteristics anyway.

It’s worth mentioning that the yield curve bootstrapping procedure currently implemented in RQuantLib makes the implicit assumptions that LIBOR is a good proxy for risk-free rates, and that collateral doesn’t matter. These assumptions are still widely used in insurance, along with simple (parallel) adjustments for credit/liquidity risk, but were abandoned by the markets after the 2007 subprime crisis.

For more details on the new multiple-curve approach, see for example http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2219548. In this paper, the authors introduce a multiple-curve bootstrapping procedure, available in QuantLib.

# Cleaning the workspace
rm(list=ls())

# RQuantLib loading 
suppressPackageStartupMessages(library(RQuantLib))
# ESGtoolkit loading
suppressPackageStartupMessages(library(ESGtoolkit))

# Frequency of simulation and interpolation
freq <- "monthly" 
delta_t <- 1/12

# This data is taken from sample code shipped with QuantLib 0.3.10.
params <- list(tradeDate=as.Date('2002-2-15'),
               settleDate=as.Date('2002-2-19'),
               payFixed=TRUE,
               dt=delta_t,
               strike=.06,
               method="HWAnalytic",
               interpWhat="zero",
               interpHow= "spline")

# Market data used to construct the term structure of interest rates
# Deposits and swaps
tsQuotes <- list(d1w =0.0382,
                 d1m =0.0372,
                 d3m = 0.0363,
                 d6m = 0.0353,
                 d9m = 0.0348,
                 d1y = 0.0345,
                 s2y = 0.037125,
                 s3y =0.0398,
                 s5y =0.0443,
                 s10y =0.05165,
                 s15y =0.055175)

# Swaption volatility matrix with corresponding maturities and tenors
swaptionMaturities <- c(1,2,3,4,5)
swapTenors <- c(1,2,3,4,5)
volMatrix <- matrix(
  c(0.1490, 0.1340, 0.1228, 0.1189, 0.1148,
    0.1290, 0.1201, 0.1146, 0.1108, 0.1040,
    0.1149, 0.1112, 0.1070, 0.1010, 0.0957,
    0.1047, 0.1021, 0.0980, 0.0951, 0.1270,
    0.1000, 0.0950, 0.0900, 0.1230, 0.1160),
  ncol=5, byrow=TRUE)

# Pricing the Bermudan swaptions
pricing <- RQuantLib::BermudanSwaption(params, tsQuotes,
                            swaptionMaturities, swapTenors, volMatrix)
summary(pricing)

# Constructing the spot term structure of interest rates 
# based on input market data
times <- seq(from = delta_t, to = 5, by = delta_t)
curves <- RQuantLib::DiscountCurve(params, tsQuotes, times)
maturities <- curves$times
marketzerorates <- curves$zerorates
marketprices <- curves$discounts

############# Hull-White short-rates simulation

# Horizon, number of simulations, frequency
horizon <- 5 # I take horizon = 5 because of swaptions maturities
nb.sims <- 10000

# Calibrated Hull-White parameters from RQuantLib
a <- pricing$a
sigma <- pricing$sigma

# Simulation of gaussian shocks with ESGtoolkit
set.seed(4)
eps <- ESGtoolkit::simshocks(n = nb.sims, horizon = horizon,
                             frequency = freq)

# Simulation of the factor x with ESGtoolkit
x <- ESGtoolkit::simdiff(n = nb.sims, horizon = horizon, 
                         frequency = freq,  
                         model = "OU", 
                         x0 = 0, theta1 = 0, 
                         theta2 = a, 
                         theta3 = sigma,
                         eps = eps)

# I use RQuantLib's forward rates. At this monthly granularity,
# I treat them as instantaneous forward rates
fwdrates <- ts(replicate(nb.sims, curves$forwards), 
                start = start(x), 
                deltat = deltat(x))

# alpha
t.out <- seq(from = 0, to = horizon, by = delta_t)
param.alpha <- ts(replicate(nb.sims, 0.5*(sigma^2)*(1 - exp(-a*t.out))^2/(a^2)), 
                start = start(x), deltat = deltat(x))
alpha <- fwdrates + param.alpha

# The short-rate
r <- x + alpha

# Stochastic discount factors (numerical integration currently is very basic)
Dt <- ESGtoolkit::esgdiscountfactor(r = r, X = 1)

# Monte Carlo prices and zero rates deduced from stochastic discount factors
montecarloprices <- rowMeans(Dt)
montecarlozerorates <- -log(montecarloprices)/maturities # RQuantLib uses continuous compounding

# Confidence interval for the difference between market and monte carlo prices
conf.int <- t(apply((Dt - marketprices)[-1, ], 1, function(x) t.test(x)$conf.int))

# Viz
par(mfrow = c(2, 2))
# short-rate quantiles
ESGtoolkit::esgplotbands(r, xlab = "maturities", ylab = "short-rate quantiles", 
                         main = "short-rate quantiles") 
# monte carlo vs market zero rates
plot(maturities, montecarlozerorates, type='l', col = 'blue', lwd = 3,
     main = "monte carlo vs market \n zero rates")
points(maturities, marketzerorates, col = 'red')
# monte carlo vs market zero-coupon prices
plot(maturities, montecarloprices, type ='l', col = 'blue', lwd = 3, 
     main = "monte carlo vs market \n zero-coupon prices")
points(maturities, marketprices, col = 'red')
# confidence interval for the price difference
matplot(maturities[-1], conf.int, type = 'l', 
        main = "confidence interval \n for the price difference")



Filed under R stuff

Monte Carlo simulation of a 2-factor interest rates model with ESGtoolkit

The whole post can be found here on RPubs, as (in my opinion!) this website provides a nice display for R code, figures and LaTeX formulas. Plus, you can publish directly on RPubs, simply by using Markdown within RStudio.

Concerning the post’s content, it’s worth saying that there is also a numerical integration error, induced by the calculation of the discount factors.

For more resources on ESGtoolkit, you can read these slides of a talk that I gave last month at Institut de Sciences Financières et d’Assurances (Université Lyon 1), and the package vignette.

Please cite the package whenever you use it, according to citation(“ESGtoolkit”). And do not hesitate to report bugs or send feature requests 😉

[Figure: simG2plus]


Filed under Conference Talks, R stuff

Impact of correlated predictions on the variance of an ensemble model

Let X_1 and X_2 be the prediction errors of two statistical/machine learning algorithms. X_1 and X_2 have relatively low bias, and high variances \sigma^2_1 and \sigma^2_2. They are also correlated, having a Pearson correlation coefficient equal to \rho_{1, 2}.

Aggregating models 1 and 2 might result in a predictive model with a lower prediction error variance than either 1 or 2. But not always. For those who attended statistics/probability/portfolio optimization classes, this probably sounds obvious; you can jump directly to the illustrative R part below.

Let Z := \alpha X_1 + (1-\alpha) X_2, with \alpha \in \left[0, 1\right], be the prediction error of the ensemble model built with 1 and 2. We have :

Var(Z) = \alpha^2 \sigma^2_1 + (1 - \alpha)^2\sigma^2_2 + 2\alpha(1-\alpha)Cov(X_1, X_2)

And from the fact that :

Cov(X_1, X_2) = \rho_{1, 2} \sigma_1\sigma_2

We get :

Var(Z) = \alpha^2 \sigma^2_1 + (1 - \alpha)^2\sigma^2_2 + 2\alpha(1-\alpha) \rho_{1, 2} \sigma_1\sigma_2

Now, let’s see how Var(Z) changes with an increase of \alpha, the ensemble’s allocation for model 1 :

\frac{\partial Var(Z)}{\partial \alpha}= 2\alpha \sigma^2_1 - 2 (1 - \alpha) \sigma^2_2 + 2(1-2\alpha) \rho_{1, 2} \sigma_1\sigma_2

When \alpha is close to 0, that is, when the ensemble contains almost only model 2, we have :

\frac{\partial Var(Z)}{\partial \alpha}_{|\alpha = 0}= 2 \left( \rho_{1, 2} \sigma_1\sigma_2 - \sigma^2_2 \right) = 2 \sigma^2_2\left( \rho_{1, 2} \frac{\sigma_1}{\sigma_2} - 1 \right)

That’s the marginal change in the variance of the ensemble’s prediction error induced by introducing model 1 into an ensemble containing almost only model 2. Hence, if \rho_{1, 2} = \frac{\sigma_2}{\sigma_1}, a small allocation to model 1 won’t increase or decrease the variance of the ensemble prediction at all. The variance will decrease if the model 1 we introduce into the ensemble satisfies \rho_{1, 2} < \frac{\sigma_2}{\sigma_1}. If \rho_{1, 2} \geq \frac{\sigma_2}{\sigma_1}, it won’t decrease, no matter how X_1 and X_2 are combined.
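For completeness: as long as \sigma^2_1 + \sigma^2_2 - 2 \rho_{1, 2} \sigma_1\sigma_2 > 0, Var(Z) is a convex quadratic in \alpha, and setting the derivative to zero gives the classical two-asset minimum-variance allocation

\alpha^* = \frac{\sigma^2_2 - \rho_{1, 2} \sigma_1\sigma_2}{\sigma^2_1 + \sigma^2_2 - 2 \rho_{1, 2} \sigma_1\sigma_2}

(truncated to \left[0, 1\right] if it falls outside).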

For a simple illustrative example in R, I create simulated data observations :

  # number of observations
  n <- 100
  u <- 1:n

  # Simulated observed data
  intercept <- 5 
  slope <- 0.2
  set.seed(2)

  ## data
  y <-  intercept + slope*u + rnorm(n)
  plot(u, y, type = 'l', main = "Simulated data observations")
  points(u, y, pch = 19)

[Figure: graph1 — simulated data observations]

I fit a linear regression model to the data, as a benchmark model :

  # Defining training and test sets
  train <- 1:(0.7*n)
  test <- -train
  u.train <- u[(train)]
  y.train <- y[(train)]
  u.test <- u[(test)]
  y.test <- y[(test)]

  # Fitting the benchmark model to the training set 
  fit.lm <- lm(y.train ~ u.train)

  (slope.coeff <- fit.lm$coefficients[2])
  (intercept.coeff <- fit.lm$coefficients[1])

## u.train
## 0.1925
## (Intercept)
## 5.292

Obtaining the predicted values from the benchmark model, and the prediction error :

  # Predicted values from benchmark model on the test set
  y.pred <- intercept.coeff + slope.coeff*u.test   
  
  # prediction error from linear regression
  pred.err <- y.pred - y.test
  (mean.pred.err <- mean(pred.err))
  (var.pred.err <- var(pred.err))

## [1] -0.1802
## [1] 1.338

Now, I consider two other models, 1 and 2 :

y = a + b \times u + sin(u) + \epsilon_1

and

y = a + b  \times u - 0.35  \times sin(u) + \epsilon_2

with \epsilon_1 and \epsilon_2 being 2 correlated Gaussian noises with zero mean and constant variances (well, not really “models” that I fit, but they help me build various fictitious predictions with different correlations for the illustration). The slope and intercept are those obtained from the benchmark model.

 # Alternative model 1 (low bias, high variance, oscillating)
  m <- length(y.pred)
  eps1 <- rnorm(m, mean = 0, sd = 1.5)
  y.pred1 <- intercept.coeff + slope.coeff*u.test + sin(u.test) + eps1

 # prediction error for model 1
  pred.err1 <- y.pred1 - y.test

We can visualize the predictions of models 1 and 2 and the benchmark, with different prediction error correlations between models 1 and 2, to get an intuition of the possible ensemble predictions :

  # Different prediction errors correlations for 1 and 2
  rho.vec <- c(-1, -0.8, 0.6, 1)  

  # Independent random gaussian numbers defining model 2 errors
  eps <- rnorm(m, mean = 0, sd = 2)

  # Plotting the predictions with different correlations
  par(mfrow=c(2, 2))  
  for (i in 1:4)
  {
    rho <- rho.vec[i]

    # Correlated gaussian numbers (Cholesky decomposition)
    eps2 <- rho*eps1 + sqrt(1 - rho^2)*eps

    # Alternative  model 2 (low bias, higher variance than 1, oscillating)
    y.pred2 <- intercept.coeff + slope.coeff*u.test - 0.35*sin(u.test) + eps2    

    # prediction error for model 2
    pred.err2 <- y.pred2 - y.test    

    # predictions from 1 & 2 correlation
    corr.pred12 <- round(cor(pred.err1, pred.err2), 2)

    # Plot
    plot(u.test, y.test, type = "p", 
         xlab = "test set values", ylab = "predicted values",
         main = paste("models 1 & 2 pred. values \n correlation :", 
                      corr.pred12))
    points(u.test, y.test, pch = 19)
    lines(u.test, y.pred, lwd = 2)
    lines(u.test, y.pred1, 
          col = "blue", lwd = 2)
    lines(u.test, y.pred2, 
          col = "red", lwd = 2)
  }

[Figure: graph2v2 — models 1 & 2 predicted values for different correlations]

Now including allocations of models 1 and 2, we can see how the ensemble variance evolves as a function of allocation and correlation :

  # Allocation for model 1 in the ensemble
  alpha.vec <- seq(from = 0, to = 1, by = 0.05)  

  # Correlations between model 1 and model 2
  rho.vec <- seq(from = -1, to = 1, by = 0.05)  

  # Results matrices
  nb.row <- length(alpha.vec)
  nb.col <- length(rho.vec)  

  ## Average prediction errors of the ensemble
  mean.pred.err.ens <- matrix(0, nrow = nb.row, ncol = nb.col)
  rownames(mean.pred.err.ens) <- paste0("pct. 1 : ", alpha.vec*100, "%")
  colnames(mean.pred.err.ens) <- paste0("corr(1, 2) : ", rho.vec)

  ## Variance of prediction error of the ensemble
  var.pred.err.ens <- matrix(0, nrow = nb.row, ncol = nb.col)
  rownames(var.pred.err.ens) <- paste0("pct. 1 : ", alpha.vec*100, "%")
  colnames(var.pred.err.ens) <- paste0("corr(1, 2) : ", rho.vec)

  # loop on correlations and allocations 
  for (i in 1:nb.row)
  {
    for (j in 1:nb.col)
    {
      alpha <- alpha.vec[i]
      rho <- rho.vec[j]

      # Alternative model 2 (low bias, higher variance, oscillating)
      eps2 <- rho*eps1 + sqrt(1 - rho^2)*eps
      y.pred2 <- intercept.coeff + slope.coeff*u.test - 0.35*sin(u.test) + eps2
      pred.err2 <- y.pred2 - y.test

      # Ensemble prediction error
      z <- alpha*pred.err1 + (1-alpha)*pred.err2
      mean.pred.err.ens[i, j] <- mean(z)      
      var.pred.err.ens[i, j] <-  var(z)
    }
  }
  res.var <- var.pred.err.ens
 
  # Heat map for the variance of the ensemble
  filled.contour(alpha.vec, rho.vec, res.var, color = terrain.colors, 
                 main = "prediction error variance for the ensemble", 
                 xlab = "allocation in x1", ylab = "correlation between 1 and 2")

[Figure: graph3v2 — prediction error variance for the ensemble (heat map)]

Hence, the lower the correlation between 1 and 2, the lower the variance of the ensemble model’s prediction. This, combined with an allocation to model 1 between 35% and 50%, seems to be the building block for the final model. The ensemble models’ biases help in making the choice of allocation (this is actually an optimization problem that can be solved directly with portfolio theory; see the sketch below).
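Here is a sketch of that portfolio-theory shortcut, for one given correlation (I reuse eps1, eps and the benchmark coefficients from above; rho = -0.7 is the value retained below):

# Portfolio-theory shortcut (sketch): minimum-variance allocation to model 1
rho <- -0.7
eps2 <- rho*eps1 + sqrt(1 - rho^2)*eps
pred.err2 <- intercept.coeff + slope.coeff*u.test - 0.35*sin(u.test) + eps2 - y.test
s1 <- sd(pred.err1)
s2 <- sd(pred.err2)
r12 <- cor(pred.err1, pred.err2)
(alpha.star <- (s2^2 - r12*s1*s2)/(s1^2 + s2^2 - 2*r12*s1*s2))

The result can be compared to the allocation read off the heat map.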

Finding the final ensemble, with lower variance and lower bias :

# Final ensemble

## Allocation
alpha.vec[which.min(as.vector(res.var))]
## [1] 0.45

## Correlation
res.bias <- abs(mean.pred.err.ens)
which.min(res.bias[which.min(as.vector(res.var)), ])
## corr(1, 2) : -0.7 
##                 7

Creating the final model with these parameters :

    rho <- -0.7
    eps2 <- rho*eps1 + sqrt(1 - rho^2)*eps

    # Alternative model 2 (low bias, higher variance, oscillating)
    y.pred2 <- intercept.coeff + slope.coeff*u.test - 0.35*sin(u.test) + eps2        

    # Final ensemble prediction
    y.pred.ens <- 0.45*y.pred1 + 0.55*y.pred2
  
    # Plot
    plot(u.test, y.test, type = "p", 
         xlab = "test set", ylab = "predicted values",
         main = "Final ensemble model (green)")
    points(u.test, y.test, pch = 19)
    # benchmark 
    lines(u.test, y.pred, lwd = 2)
    # model 1
    lines(u.test, y.pred1, col = "blue", lwd = 2)
    # model 2
    lines(u.test, y.pred2, col = "red", lwd = 2)
    # ensemble model with 1 and 2
    points(u.test, y.pred.ens, col = "green", pch = 19)
    lines(u.test, y.pred.ens, col = "green", lwd = 2)

[Figure: graph4v2 — final ensemble model (green)]

Performance of the final model :

    # Benchmark
    pred.err <- y.pred - y.test
    # Model 1 
    pred1.err <- y.pred1 - y.test
    # Model 2
    pred2.err <- y.pred2 - y.test
    # Ensemble model
    pred.ens.err <- y.pred.ens - y.test
    
    # Bias comparison 
    bias.ens <- mean(y.pred.ens - y.test)  
    bias.ens_vsbench <- (bias.ens/mean(y.pred - y.test) - 1)*100
    bias.ens_vs1 <- (bias.ens/mean(y.pred1 - y.test) - 1)*100
    bias.ens_vs2 <- (bias.ens/mean(y.pred2 - y.test) - 1)*100
  
    # Variance comparison
    var.ens <- var(y.pred.ens - y.test)
    var.ens_vsbench <- (var.ens/var(y.pred - y.test) - 1)*100
    var.ens_vs1 <- (var.ens/var(y.pred1 - y.test) - 1)*100
    var.ens_vs2 <- (var.ens/var(y.pred2 - y.test) - 1)*100
  
   cbind(c(bias.ens_vsbench, bias.ens_vs1, bias.ens_vs2), c(var.ens_vsbench, var.ens_vs1, var.ens_vs2))
##        [,1]  [,2]
## [1,] -95.31 46.27
## [2,] -112.12 -62.93
## [3,] -88.33 -45.53


Filed under R stuff

Rmetrics Workshop on R in Finance and Insurance, Paris 2014

Here are the slides of the talk that I gave yesterday with Prof. Frédéric Planchet at the 8th Rmetrics workshop in Insurance and Finance :

 

http://www.ressources-actuarielles.net/C1256F13006585B2/0/39B54166464089AFC12572B0003D88C2/$FILE/Rmetrics.pdf?OpenElement

 


The R code can be found here :

  • For ESG

http://www.ressources-actuarielles.net/EXT/ISFA/fp-isfa.nsf/34a14c286dfb0903c1256ffd00502d73/a5e99e9abf5d3674c125772f00600f6c/$FILE/examplesESG.R

  • For ESGtoolkit

http://www.ressources-actuarielles.net/EXT/ISFA/fp-isfa.nsf/34a14c286dfb0903c1256ffd00502d73/a5e99e9abf5d3674c125772f00600f6c/$FILE/examplesESGtoolkit.R

I also submitted (a bit late, maybe) a Shiny app for the Shiny App contest (it’s the example on page 55 of the slides).

https://contest.shinyapps.io/ShinyALMapp/

The username is : contest.

The password is : rmetrics.

However, in my opinion, my app is very slow. This is due to the way I dealt with global/local variables. In the first section, ‘Simulation’, I make projections of the portfolio assets; let’s call the associated R variables S.CAC and S.SP. In server.R, the plot is obtained with output$plotSimulation. In the second section, ‘Portfolio’, I had to duplicate the code for the simulation (very annoying… here’s the bottleneck), because S.CAC and S.SP couldn’t be seen in the scope of output$plotPortfolio defined in server.R.

I haven’t had the time to investigate further yet. But if somebody knows how to deal with this in Shiny, I’ll be happy to hear about it !
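For what it’s worth, the usual way to avoid this duplication is to run the simulation once inside a reactive expression and have both outputs read from it. Below is a minimal sketch of a server.R along those lines; the body of the reactive is a placeholder for the actual projection code producing S.CAC and S.SP:

library(shiny)

shinyServer(function(input, output) {
  # Run the projections once; both plots read from this reactive expression
  simulations <- reactive({
    # ... actual projection code producing S.CAC and S.SP goes here ...
    list(S.CAC = cumsum(rnorm(120)), S.SP = cumsum(rnorm(120)))  # placeholder values
  })

  output$plotSimulation <- renderPlot({
    sims <- simulations()
    plot(sims$S.CAC, type = "l", main = "Simulation")
  })

  output$plotPortfolio <- renderPlot({
    sims <- simulations()  # the simulation is not re-run here
    plot(sims$S.SP, type = "l", main = "Portfolio")
  })
})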


Filed under Conference Talks, R stuff

Solvency II yield curve ‘fits exactly’. Too tightly to explain ?

For the QIS5 and the LTGA (Quantitative Impact Studies), the European insurance supervisor (EIOPA) suggested the use of the Smith-Wilson method for the purpose of discounting future cash flows. A description of this method can be found here. The technical specifications for the QIS5 argued about this method that : “the term structure is fitted exactly to all observed zero coupon bond prices”. The method was subsequently used for the LTGA.

But is fitting exactly a desirable feature for a yield curve ? I won’t give a definitive answer here, just some hints, as the question could be answered in a lot of different ways (well… this post is mainly an excuse to show you ycinterextra in action, so that you can tell me how to improve the package).

I’ll fit the Nelson-Siegel (NS), Svensson (SV) and Smith-Wilson (SW) models (implemented in ycinterextra) to the data from Diebold, F. X., & Li, C. (2006), Forecasting the term structure of government bond yields, Journal of Econometrics, 130(2), 337-364. And we’ll see how accurate the values they produce are on out-of-sample data. I mixed ideas from Diebold and Li (2006) with Gilli, Manfred and Schumann, Enrico, A Note on ‘Good Starting Values’ in Numerical Optimisation (June 3, 2010). Available at SSRN: http://ssrn.com/abstract=1620083.

The data can be found here : http://www.ssc.upenn.edu/~fdiebold/papers/paper49/FBFITTED.txt.

# I put the data in a text file.
data.diebold.li <- scan("FBFITTED.txt", skip = 14)

Now, loading the package and formatting the data :

library(ycinterextra)

# Number of monthly observation dates, from January 1970 through December 2000
nbdates <- 372

# Time to maturities for each observation date
mats <- round(c(1, 3, 6, 9, 12, 15, 18, 21, 24, 30, 36, 48, 60, 72, 84, 96, 108, 120)/12, 3)
nbmats <- length(mats)
nbcol <- nbmats + 1

# Formatting the data
data.diebold.li.final <- NULL
for(i in seq_len(nbdates))
{
 data.diebold.li.final <- rbind(data.diebold.li.final, data.diebold.li[(nbcol*(i -1)+1):(nbcol*(i -1)+nbcol)])
}

A 3D plot of the observed data can be obtained with the following function :

x <- mats
y <- data.diebold.li.final[,1]
z <- data.diebold.li.final[,-1]

graph3D <- function(x, y, z)
{
 par(bg = "white")
 nrz <- nrow(z)
 ncz <- ncol(z)
 # Create a function interpolating colors in the range of specified colors
 jet.colors <- colorRampPalette( c("blue", "green") )
 # Generate the desired number of colors from this palette

 nbcol <- 100
 color <- jet.colors(nbcol)
 # Compute the z-value at the facet centres
 zfacet <- z[-1, -1] + z[-1, -ncz] + z[-nrz, -1] + z[-nrz, -ncz]
 # Recode facet z-values into color indices
 facetcol <- cut(zfacet, nbcol)
 persp(y, x, z, col = color[facetcol], phi = 30, theta = -30, ticktype= 'detailed', nticks = 3,
 xlab = "observation date", ylab = "maturity", zlab = "zero rates in pct.")
}

graph3D(x, y, z)

[Figure: image1 — 3D plot of the observed zero rates]

Now, let’s calibrate the 3 models, NS, SV and SW to the observed yield curves (this takes a lot of time, as there are 372 curves, each with 18 maturities) :

nrz <- nrow(z)
ncz <- ncol(z)
yM.NS <- matrix(NA, nrow = nrz, ncol = ncz)
yM.SV <- matrix(NA, nrow = nrz, ncol = ncz)
yM.SW <- matrix(NA, nrow = nrz, ncol = ncz)

# Loop over the dates, from January 1970 through December 2000
for (t in seq_len(nbdates))
{
 # market yields
 yM <- as.numeric(data.diebold.li.final[t, -1])

 yM.NS[t, ] <- fitted(ycinter(yM = yM, matsin = mats, matsout = mats,
 method = "NS", typeres = "rates"))

 yM.SV[t, ] <- fitted(ycinter(yM = yM, matsin = mats, matsout = mats,
 method = "SV", typeres = "rates"))

 yM.SW[t, ] <- fitted(ycinter(yM = yM, matsin = mats, matsout = mats,
 method = "SW", typeres = "rates"))
}
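Before looking at the fitted surfaces, here is a quick numeric summary of the in-sample fit (a sketch, reusing the matrices just computed):

# In-sample root mean squared errors of the three fits
c(NS = sqrt(mean((yM.NS - z)^2)),
  SV = sqrt(mean((yM.SV - z)^2)),
  SW = sqrt(mean((yM.SW - z)^2)))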

After a little while… the SW curve seems to fit very well indeed, and the plot below looks pretty similar to the one we made before :

graph3D(x, y, yM.SW)

[Figure: image2 — 3D plot of the Smith-Wilson fitted rates]

With a plot of the average yield curve (average of the 372 observed curves as in Diebold and Li (2006)), we can see that all of the three models seem to provide fairly good fits to the data :

# Average yield curves
plot(mats, apply(z, 2, mean), pch = 3, xlab = "time to maturity",
 ylab = "yield to maturity",
 main = paste("Actual and fitted (model-based)", "\n", "average yield curves"))
lines(mats, apply(yM.NS, 2, mean), col = "blue", lwd= 2)
lines(mats, apply(yM.SV, 2, mean), col = "red", lwd= 2)
lines(mats, apply(yM.SW, 2, mean), col = "green", lwd=2, lty = 2)
legend("bottomright", c("Actual", "NS", "SV", "SW"),
 col=c("black", "blue", "red", "green"), text.col = c("black", "blue", "red", "green"),
 lty=c(NA, 1, 1, 2), lwd = c(1, 2, 2, 2), pch=c(3, NA, NA, NA))

[Figure: image3 — actual and fitted (model-based) average yield curves]

Here are plots of the residuals for NS and SW (for SV, the plot is pretty similar to the plot for NS) :

# residuals
graph3D(x, y, yM.NS-z)
graph3D(x, y, yM.SW-z)

[Figures: image4, image5 — NS and SW residuals]

Now, taking a closer look at some given cross-sections (here on 3/31/1989), we see what’s actually going on (click for a better resolution) :

t <- 231

par(mfrow=c(2,2))

plot(mats, data.diebold.li.final[t, -1], xlab = "time to maturity",
 ylab = "yield to maturity", col="red", main = "Observed data")
points(mats, data.diebold.li.final[t, -1], col="red")

plot(mats, yM.NS [t,], type='l', xlab = "time to maturity",
 ylab = "yield to maturity", col = "blue", main = "Nelson-Siegel fit")
points(mats, data.diebold.li.final[t, -1], col="red")

plot(mats, yM.SV[t,], type='l', xlab = "time to maturity",
 ylab = "yield to maturity", col = "blue", main = "Svensson fit")
points(mats, data.diebold.li.final[t, -1], col="red")

plot(mats, yM.SW[t,], type='l', xlab = "time to maturity",
 ylab = "yield to maturity", col = "blue", main = "Smith-Wilson fit")
points(mats, data.diebold.li.final[t, -1], col="red")

[Figure: image6 — observed data and NS, SV, SW fits on 3/31/1989]

And for 7/31/1989 (with t == 235) :

[Figure: image7 — observed data and NS, SV, SW fits on 7/31/1989]

NS fits well. SV fits a little bit better, due to its extra parameters. SW fits each point perfectly, due to its large number of parameters.

Let’s see how accurate the values produced by the 3 methods are on out-of-sample data, using a simple machine learning test set approach (a method like k-fold cross-validation could be used as well; it’s computationally more expensive, but still simple). I divide the sample data into 2 sets :

  •  A learning set with the zero rates observed at time to maturities : c(1, 6, 9, 15, 18, 21, 24, 30, 48, 72, 84, 96, 120)/12
  • A test set with the zero rates observed at time to maturities : c(3, 12, 36, 60, 108)/12

The models’ parameters are estimated using the learning set, and the test set is used to assess how well (with how little error) each model predicts unknown values :

# Maturities for test set
matsoutsample <- c(3, 12, 36, 60, 108)/12
matchoutsample <- match(matsoutsample, mats)
m <- length(matsoutsample)

# Maturities for learning set
matsinsample <- mats[-matchoutsample]

 

# Residuals matrices
res.NS <- matrix(NA, nrow = nrz, ncol = length(matchoutsample))
res.SV <- matrix(NA, nrow = nrz, ncol = length(matchoutsample))
res.SW <- matrix(NA, nrow = nrz, ncol = length(matchoutsample))

# Loop over the dates, from January 1970 through December 2000
for (t in seq_len(nbdates))
{
 yM.t <- as.numeric(data.diebold.li.final[t, -c(1, matchoutsample)])

 yM.t.bechm <- as.numeric(data.diebold.li.final[t, -1])

 res.NS[t, ] <- (fitted(ycinter(yM = yM.t, matsin = matsinsample, matsout = mats,
 method = "NS", typeres = "rates")) - yM.t.bechm)[matchoutsample]

 res.SV[t, ] <- (fitted(ycinter(yM = yM.t, matsin = matsinsample, matsout = mats,
 method = "SV", typeres = "rates")) - yM.t.bechm)[matchoutsample]

 res.SW[t, ] <- (fitted(ycinter(yM = yM.t, matsin = matsinsample, matsout = mats,
 method = "SW", typeres = "rates")) - yM.t.bechm)[matchoutsample]
}

For each model, we get the 372 mean squared errors as follows :

mse.ns <- apply((res.NS), 1, function(x) crossprod(x))/m
mse.sv <- apply((res.SV), 1, function(x) crossprod(x))/m
mse.sw <- apply((res.SW), 1, function(x) crossprod(x))/m

A summary of these mean squared errors shows that SW actually overfits, and fails to predict the out-of-sample data :

summary(mse.ns)
 Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0005895 0.0054050 0.0117700 0.0227100 0.0198400 0.5957000
summary(mse.sv)
 Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0003694 0.0075310 0.0154200 0.0329400 0.0319700 0.6079000
summary(mse.sw)
 Min. 1st Qu. Median Mean 3rd Qu. Max.
 0.057 0.556 1.478 431.200 10.820 29040.000
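A quick way to visualize the same comparison (a sketch; the log scale is only there so that the SW outliers don’t crush the other two):

# Side-by-side comparison of the out-of-sample mean squared errors
boxplot(list(NS = log(mse.ns), SV = log(mse.sv), SW = log(mse.sw)),
        ylab = "log of out-of-sample MSE")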


Filed under R stuff, Yield curve

How much faster is calibration with parallel processing and/or R byte-code compiling ?

The Svensson model is a 6-parameter yield curve model that was derived from the Nelson-Siegel model in : Svensson, L. E. (1995). Estimating forward interest rates with the extended Nelson & Siegel method. Sveriges Riksbank Quarterly Review, 3(1):13-26.

In this post, I compare the performance of the R package mcGlobaloptim on this model’s calibration to observed (fictitious) rates.

This will be done in pure R, with sequential or parallel processing, and with or without byte-code compilation (in short, a compact intermediate code between source code and machine code, into which the source code is ‘translated’; please refer to this article or this one).

Here are the R packages required for this experiment :

# Support for simple parallel computing in R
require(snow)
# Global optimization using Monte Carlo and Quasi 
# Monte Carlo simulation
require(mcGlobaloptim)
# The R Compiler Package
require(compiler)
# Sub microsecond accurate timing functions
require(microbenchmark)
# Benchmarking routine for R
require(rbenchmark)

The R Code for the objective function (function to be minimized) used here, appears in : Gilli, Manfred and Schumann, Enrico, A Note on ‘Good Starting Values’ in Numerical Optimisation (June 3, 2010). Available at SSRN: http://ssrn.com/abstract=1620083 or http://dx.doi.org/10.2139/ssrn.1620083

### Calibrating the Nelson-Siegel-Svensson model
### : Svensson model
NSS2 <- function(betaV, mats) {
    gam1 <- mats/betaV[5]
    gam2 <- mats/betaV[6]
    aux1 <- 1 - exp(-gam1)
    aux2 <- 1 - exp(-gam2)
    betaV[1] + betaV[2] * (aux1/gam1) +
        betaV[3] * (aux1/gam1 + aux1 - 1) +
        betaV[4] * (aux2/gam2 + aux2 - 1)
}

# True parameters
betaTRUE <- c(5, -2, 5, -5, 1, 3)

# Input maturities
mats <- c(1, 3, 6, 9, 12, 15, 18, 21, 24, 30, 36, 48,
          60, 72, 84, 96, 108, 120)/12

# Corresponding yield to maturities
yM <- NSS2(betaTRUE, mats)
dataList <- list(yM = yM, mats = mats, model = NSS2)

# Bounds for parameters' search
settings <- list(min = c(0, -15, -30, -30, 0, 3), 
max = c(15, 30, 30, 30, 3, 6), d = 6)

# Objective function
OF <- function(betaV, dataList) {
    mats <- dataList$mats
    yM <- dataList$yM
    model <- dataList$model
    y <- model(betaV, mats)
    aux <- y - yM
    crossprod(aux)
}

A compiled version of the objective function is then defined, using the package compiler from L. Tierney (now a part of base R) :

# define compiled objective function
OFcmp <- cmpfun(OF)
OFcmp

## function(betaV, dataList) {
##     mats <- dataList$mats
##     yM <- dataList$yM
##     model <- dataList$model
##     y <- model(betaV, mats)
##     aux <- y - yM
##     crossprod(aux)
## }
## bytecode: 0x000000000dfa98b8

The 4 situations being tested are the following :
* Optimization with sequential processing and no byte-code compilation
* Optimization with sequential processing and byte-code compilation
* Optimization with parallel processing and no byte-code compilation
* Optimization with parallel processing and byte-code compilation

Thus, the performance of multiStartoptim will be tested, based on the 4 following functions :

OF_classic <- function(.n) {
    multiStartoptim(objectivefn = OF, data = dataList, 
lower = settings$min, upper = settings$max, 
method = "nlminb", nbtrials = .n, 
typerunif = "sobol")
}

OF_cmp <- function(.n) {
    multiStartoptim(objectivefn = OFcmp, data = dataList, 
lower = settings$min, upper = settings$max, 
method = "nlminb", 
nbtrials = .n, typerunif = "sobol")
}

OF_classic_parallel <- function(.n) {
    multiStartoptim(objectivefn = OF, data = dataList, 
lower = settings$min, upper = settings$max, 
method = "nlminb", 
nbtrials = .n, typerunif = "sobol", 
nbclusters = 2)
}

OF_cmp_parallel <- function(.n) {
    multiStartoptim(objectivefn = OFcmp, data = dataList, lower = settings$min, 
        upper = settings$max, method = "nlminb", 
nbtrials = .n, typerunif = "sobol", 
nbclusters = 2)
}

First of all, it's important to notice that parallel processing is not necessarily faster than sequential processing when nbtrials is low :

nbtrials <- 5
benchmark1 <- benchmark(OF_classic(nbtrials), 
OF_classic_parallel(nbtrials), 
columns = c("test", "replications", 
"elapsed", "relative"), 
replications = 10, 
relative = "elapsed")

benchmark1

##                            test replications elapsed relative
## 1          OF_classic(nbtrials)           10    0.38     1.00
## 2 OF_classic_parallel(nbtrials)           10    7.39    19.45

Moreover, a little attention should be paid to the following points (see the sketch after this list) :
* The number of clusters should be chosen according to what is actually available on the machine
* More clusters doesn't necessarily mean a faster calibration.
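A quick way to see what is available on the machine (a sketch; the 'parallel' package ships with base R):

# Number of cores detected on this machine; nbclusters = 2 is used in
# the parallel versions above, and going beyond the number of physical
# cores rarely helps, while cluster start-up always adds some overhead
parallel::detectCores()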

Now, increasing the number of trials, we verify first that the results obtained in the 4 cases are all the same :

nbtrials <- 500
t1 <- OF_classic(nbtrials)
t2 <- OF_cmp(nbtrials)
t3 <- OF_classic_parallel(nbtrials)
t4 <- OF_cmp_parallel(nbtrials)

We have :

t1$par
## [1]  5 -2  5 -5  1  3
t1$objective
## [1] 2.795e-25

and :

isTRUE(all.equal(t1$objective, t2$objective)) &&
  isTRUE(all.equal(t2$objective, t3$objective)) &&
  isTRUE(all.equal(t3$objective, t4$objective))
## [1] TRUE

And now, the final comparison of the 4 cases :

benchmark2 <- microbenchmark(OF_classic(nbtrials), 
OF_cmp(nbtrials), OF_classic_parallel(nbtrials), 
OF_cmp_parallel(nbtrials), times = 100)
print(benchmark2, unit = "ms", order = "median")
## Unit: milliseconds
##  expr                min   lq   median uq     max
##  OF_cmp_parallel     2697  2753   2787 2883   3194
##  OF_classic_parallel 2849  2925   2971 3061   4121
##  OF_cmp              3712  3753   3809 3860   4430
##  OF_classic          4060  4121   4153 4228   4801
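Using the median timings printed above, rough speed-up factors relative to the slowest case can be computed as follows (a sketch based on the figures of this particular run):

# Median timings in milliseconds, copied from the table above
medians <- c(OF_classic = 4153, OF_cmp = 3809,
             OF_classic_parallel = 2971, OF_cmp_parallel = 2787)
# Speed-up factor of each variant relative to the slowest one
round(max(medians)/medians, 2)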

Here is how nicely ggplot2 displays it :

require(ggplot2)
plt <- ggplot2::qplot(y = time, data = benchmark2, 
colour = expr, xlab = "execution date", 
    ylab = "execution time in milliseconds")
plt <- plt + ggplot2::scale_y_log10()
print(plt)

[Figure: parallelorbytecodeggplot — execution times of the 4 cases]

With this being said, a sequential implementation in C code might be much faster than these examples in pure R.


Filed under R stuff