Kings College, Jan 2020
● Python code: https://gist.github.com/robcarver17/5204ea7b1a7d5723da0b01b8ba413e72
● Data: https://drive.google.com/file/d/1IBgMNuYivYR4hb_rHuIopZKnyOR7W5yb/view?usp=sharing
Today's talk: Portfolio optimisation
● The classic Markowitz optimisation
● Two major problems
● Building a better optimiser
The problem is to find the portfolio weights for a set of assets that
maximise [something], given estimates of likely future returns
import numpy as np

def addem(weights):
    # Used for constraints: portfolio weights must sum to one
    return 1.0 - sum(weights)

def sigma_from_corr_and_std(stdev_list, corrmatrix):
    # Useful for working in standard deviation and correlation space
    # rather than covariance space: builds the covariance matrix sigma
    sigma = np.diag(stdev_list).dot(corrmatrix).dot(np.diag(stdev_list))
    return sigma
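For concreteness, here is a minimal sketch of what optimise_with_corr_and_std (used in the examples below) could look like: maximise the portfolio Sharpe Ratio, subject to long-only weights that sum to one. The real implementation is in the gist linked above; the helper neg_sharpe_ratio and the solver details here are assumptions.

from scipy.optimize import minimize

def neg_sharpe_ratio(weights, sigma, mus):
    # Negative Sharpe Ratio of the portfolio (we minimise this)
    weights = np.array(weights)
    estimated_return = weights.dot(mus)
    estimated_stdev = np.sqrt(weights.dot(sigma).dot(weights))
    return -estimated_return / estimated_stdev

def optimise_with_corr_and_std(mean_list, stdev_list, corrmatrix):
    sigma = sigma_from_corr_and_std(stdev_list, corrmatrix)
    mus = np.array(mean_list)
    number_assets = len(mean_list)
    start_weights = [1.0 / number_assets] * number_assets
    bounds = [(0.0, 1.0)] * number_assets          # long-only
    constraints = [dict(type='eq', fun=addem)]     # weights sum to one
    result = minimize(neg_sharpe_ratio, start_weights, args=(sigma, mus),
                      method='SLSQP', bounds=bounds,
                      constraints=constraints, tol=1e-10)
    return result.x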
Instability
optimise_with_corr_and_std([.04, .04, .04], [0.08, 0.08, 0.08],
    np.array([[1, .9, .9], [.9, 1, .9], [.9, .9, 1]]))
optimise_with_corr_and_std([.04, .04, .04], [0.08, 0.08, 0.08],
    np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
optimise_with_corr_and_std([.04, .04, .04], [0.08, 0.08, 0.08],
    np.array([[1, 0.7, 0], [0.7, 1, 0], [0, 0, 1]]))
optimise_with_corr_and_std([.04, .04, .04], [0.08, 0.08, 0.12],
    np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
optimise_with_corr_and_std([.04, .04, .04], [0.08, 0.08, 0.085],
    np.array([[1, .9, .9], [.9, 1, .9], [.9, .9, 1]]))
optimise_with_corr_and_std([.04, .04, .045], [0.08, 0.08, 0.08],
    np.array([[1, .9, .9], [.9, 1, .9], [.9, .9, 1]]))
optimise_with_corr_and_std([.04, .04, .06], [0.08, 0.08, 0.08],
    np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
Instability...
High correlation + a small difference in μ (mean) or σ (standard deviation) →
extreme portfolios
Parameter sampling distributions
● All statistics estimated from past data are subject to uncertainty
● Less data → more uncertainty
● We can quantify the uncertainty by bootstrapping the data

Portfolio weights are calculated from statistical estimates based on past
data… and so are subject to sampling uncertainty!
Estimating with real data
import pandas as pd

WEEKS_IN_YEAR = 365.25/7

def annual_returns_from_weekly_data(data):
    return data.mean()*WEEKS_IN_YEAR

def annual_stdev_from_weekly_data(data):
    return data.std()*(WEEKS_IN_YEAR**.5)

def sharpe_ratio(data):
    return annual_returns_from_weekly_data(data)/annual_stdev_from_weekly_data(data)

def optimise_with_data(data):
    mean_list = annual_returns_from_weekly_data(data).values
    stdev_list = annual_stdev_from_weekly_data(data).values
    corrmatrix = data.corr().values
    weights = optimise_with_corr_and_std(mean_list, stdev_list, corrmatrix)
    return weights

data = pd.read_csv("returns.csv")   # weekly returns; columns: SP500, US10, US5
import random

def optimisation_with_random_bootstrap(data):
    bootstrapped_data = get_bootstrap_series(data)
    weights = optimise_with_data(bootstrapped_data)
    return weights

def get_bootstrap_series(data):
    # Sample rows with replacement to produce a series of the same length
    length_of_series = len(data.index)
    random_indices = [int(random.uniform(0, length_of_series))
                      for _unused in range(length_of_series)]
    bootstrap_data = data.iloc[random_indices]
    return bootstrap_data

def bootstrap_optimisation_distributions(data, monte_count=100):
    # Repeat the optimisation over many bootstrapped samples
    dist_of_weights = []
    for i in range(monte_count):
        single_bootstrap_weights = optimisation_with_random_bootstrap(data)
        dist_of_weights.append(single_bootstrap_weights)
    dist_of_weights = pd.DataFrame(dist_of_weights)
    dist_of_weights.columns = data.columns
    return dist_of_weights
We can plot:
● The sampling distribution of the relevant input
● The effect on the portfolio weights of changing that input (keeping all
  other inputs the same, 'ceteris paribus')
Sampling distribution code
# Plot the bootstrapped sampling distribution of an input (left panel) and
# the portfolio weights as that input is varied (right panel)
ax1.hist(factor_distribution, bins=50)
weights_data.plot(ax=ax2)

# Build the sampling distribution of one input ("factor") for one
# instrument or correlation pair ("code") by bootstrapping
factor_func = func_dict[factor]
factor_distr = []
for not_used in range(monte_length):
    bootstrap_data = get_bootstrap_series(data)
    factor_estimate = factor_func(bootstrap_data)
    factor_estimate_for_code = factor_estimate[code]
    factor_distr.append(factor_estimate_for_code)
return factor_distr
def split_corr_code(code):
    # "SP500/US10" -> ("SP500", "US10")
    split_code = code.split("/")
    instr1 = split_code[0]
    instr2 = split_code[1]
    return instr1, instr2

def correlation_for_code(data):
    # Dict of pairwise correlations, keyed by codes like "SP500/US10"
    corr = data.corr()
    instruments = data.columns
    results_dict = {}
    for instr1 in instruments:
        for instr2 in instruments:
            code = join_corr_code(instr1, instr2)
            results_dict[code] = corr.loc[instr1, instr2]
    return results_dict
def get_weights_data(code, factor, data, factor_distribution, points_to_use=100):
    # Re-run the optimisation over a grid of values spanning the
    # bootstrapped range of the chosen input
    factor_range = [np.min(factor_distribution), np.max(factor_distribution)]
    factor_step = (factor_range[1] - factor_range[0]) / points_to_use
    factor_values_to_test = np.arange(start=factor_range[0],
                                      stop=factor_range[1], step=factor_step)
    weight_results = []
    for factor_value_to_use in factor_values_to_test:
        weights = optimise_with_replaced_factor_value(data, code, factor,
                                                      factor_value_to_use)
        weight_results.append(weights)
    # nice format
    weight_results = pd.DataFrame(weight_results)
    weight_results.columns = data.columns
    return weight_results
def optimise_with_replaced_factor_value(data, code, factor, factor_value_to_use):
    # Estimate all the inputs from the data, replace the chosen input ("factor")
    # for the chosen instrument or pair ("code") with factor_value_to_use,
    # then optimise as before (body abbreviated on the slide)
    return weights

# Excerpt: replacing a single Sharpe Ratio estimate
sharpe_list = sharpe_ratio(data)
if factor == "sharpe":
    sharpe_list[code] = factor_value_to_use
sharpe_list = sharpe_list.values

# Excerpt: the (possibly replaced) correlation matrix is returned as values
corrmatrix = corrmatrix.values
return corrmatrix
def analyse_changeling_results(code, factor, factor_distribution, data):
    # Report the 90% range of the input estimate, and the optimised weights
    # at each end of that range
    factor5 = np.percentile(factor_distribution, 5)
    factor95 = np.percentile(factor_distribution, 95)
    print("There is a 90%% chance that %s for %s was between %.2f and %.2f" %
          (factor, code, factor5, factor95))
    weights5 = optimise_with_replaced_factor_value(data, code, factor, factor5)
    weights95 = optimise_with_replaced_factor_value(data, code, factor, factor95)
    if factor == "corr":
        code = split_corr_code(code)[0]
    instruments = list(data.columns)
    code_index = instruments.index(code)
    weight5_code = weights5[code_index]
    weight95_code = weights95[code_index]
plot_changeling("SP500", "sharpe", data)
There is a 90% chance that sharpe for SP500 was between -0.18 and 0.50
Giving weights for SP500 between 0.098 and 0.170

plot_changeling("US10", "sharpe", data)
There is a 90% chance that sharpe for US10 was between 0.31 and 0.98
Giving weights for US10 between 0.000 and 0.846

plot_changeling("US5", "sharpe", data)
There is a 90% chance that sharpe for US5 was between 0.37 and 1.05
Giving weights for US5 between 0.000 and 0.860

plot_changeling("SP500", "stdev", data)
There is a 90% chance that stdev for SP500 was between 0.16 and 0.18
Giving weights for SP500 between 0.149 and 0.142

plot_changeling("US10", "stdev", data)
There is a 90% chance that stdev for US10 was between 0.05 and 0.06
Giving weights for US10 between 0.000 and 0.000

plot_changeling("US5", "stdev", data)
There is a 90% chance that stdev for US5 was between 0.04 and 0.04
Giving weights for US5 between 0.857 and 0.853

plot_changeling("SP500/US10", "corr", data)
There is a 90% chance that corr for SP500/US10 was between -0.32 and -0.19
Giving weights for SP500 between 0.145 and 0.145

plot_changeling("SP500/US5", "corr", data)
There is a 90% chance that corr for SP500/US5 was between -0.34 and -0.22
Giving weights for SP500 between 0.160 and 0.128

plot_changeling("US5/US10", "corr", data)
There is a 90% chance that corr for US5/US10 was between 0.95 and 0.96
Giving weights for US5 between 0.855 and 0.855
So what?
● Always:
  – Uncertainty in standard deviation estimates (which is tiny) has essentially
    zero effect on portfolio weights
● For pairs of assets with high correlation:
  – Uncertainty in correlations (which is tiny) has essentially zero effect on
    portfolio weights
  – Uncertainty in Sharpe Ratios (which is large) has a massive effect on
    weights: we see flipping from 0% to ~85% over a tiny range for bonds
● For pairs of assets with low correlation:
  – Uncertainty in correlations (which is medium-sized) has no significant
    effect on portfolio weights: for SP500/US5 there is a 90% chance the
    correct SP500 weight was between 12.8% and 16.0%
  – Uncertainty in Sharpe Ratios (which is large) has a significant effect on
    weights: for SP500 there is a 90% chance the correct weight was between
    9.8% and 17.0%
If you prefer closed forms...
How uncertain are SR estimates?
● Under the assumption of independent Gaussian returns, the variance ω_SR of
  the SR estimate is:
    ω_SR = (1 + 0.5·SR²) / N
  where SR is the estimated Sharpe Ratio and N is the number of observations
● A two-sided 95% confidence interval is:
    (SR - 1.96·√ω_SR, SR + 1.96·√ω_SR)
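A minimal sketch of this interval in Python (the function name is illustrative; N is the number of return observations, as on the slide):

import numpy as np

def sharpe_ratio_confidence_interval(sr_estimate, n_observations):
    # Variance of the SR estimate under independent Gaussian returns
    variance_of_sr = (1 + 0.5 * sr_estimate**2) / n_observations
    half_width = 1.96 * np.sqrt(variance_of_sr)
    return sr_estimate - half_width, sr_estimate + half_width

# e.g. an SR of 0.5 estimated from roughly ten years of weekly data
print(sharpe_ratio_confidence_interval(0.5, 520))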
How uncertain are estimates of σ?
The two-sided confidence interval is:
    [ s·√((n-1) / χ²_{1-α/2, n-1}) , s·√((n-1) / χ²_{α/2, n-1}) ]
where α is the significance level (0.05 for a two-sided 95% confidence
interval), n is the sample size, s is the standard deviation estimate and
χ²_{p, n-1} is the pth quantile of the chi-squared distribution with n-1
degrees of freedom.
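A minimal sketch using scipy's chi-squared quantile function (scipy.stats.chi2.ppf); the function name is illustrative:

import numpy as np
from scipy.stats import chi2

def stdev_confidence_interval(s, n, alpha=0.05):
    # Two-sided (1 - alpha) confidence interval for the true standard deviation
    lower = s * np.sqrt((n - 1) / chi2.ppf(1 - alpha / 2, n - 1))
    upper = s * np.sqrt((n - 1) / chi2.ppf(alpha / 2, n - 1))
    return lower, upper

# e.g. an annualised stdev of 0.16 estimated from roughly ten years of weekly data
print(stdev_confidence_interval(0.16, 520))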
How uncertain are correlation estimates?
We use the Fisher transformation:
    z_r = 0.5·ln[(1 + r) / (1 - r)]
z_r then has lower and upper confidence limits:
    z_L = z_r - z_{1-α/2}·√(1/(n-3)),  z_U = z_r + z_{1-α/2}·√(1/(n-3))
where z_k is the kth quantile of the standard normal distribution. We then
transform these back into correlation space:
    r_L = [exp(2·z_L) - 1] / [exp(2·z_L) + 1],
    r_U = [exp(2·z_U) - 1] / [exp(2·z_U) + 1]
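A minimal sketch using scipy's normal quantile function (scipy.stats.norm.ppf); the function name is illustrative:

import numpy as np
from scipy.stats import norm

def correlation_confidence_interval(r, n, alpha=0.05):
    # Fisher transformation of the estimated correlation
    z_r = 0.5 * np.log((1 + r) / (1 - r))
    half_width = norm.ppf(1 - alpha / 2) * np.sqrt(1.0 / (n - 3))
    z_lower, z_upper = z_r - half_width, z_r + half_width

    def back_to_r(z):
        # Transform back into correlation space
        return (np.exp(2 * z) - 1) / (np.exp(2 * z) + 1)

    return back_to_r(z_lower), back_to_r(z_upper)

# e.g. a correlation of 0.95 estimated from roughly ten years of weekly data
print(correlation_confidence_interval(0.95, 520))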
An aside
● We have been talking about "the uncertainty of the past"
● But what matters is the future. How well can we predict the future inputs
  to the optimisation process?
● It turns out the ranking is the same as for past uncertainty:
  – Standard deviation is very predictable (R² ≈ 0.38, using last month's
    value to predict next month's)
  – Correlations are somewhat predictable (R² ≈ 0.2, between 1 and 6 months)
  – Sharpe Ratios (and means) are very hard to predict (R² < 0.05 at all
    horizons)
Today's talk: Portfolio optimisation
● The classic Markowitz optimisation
● Two major problems
● Building a better optimiser
Don't:
● Use constraints: they are in-sample, and they break the optimiser
● Tweak inputs
Try:
● Bootstrapping the weights
Bootstrap code
weight_distr = bootstrap_optimisation_distributions(data, monte_count=1000)
weight_distr.mean()
Try:
● Bootstrapping the weights
  – Can also bootstrap a distribution
  – Can also bootstrap the efficient frontier
  – Computationally slow
  – Doesn't use constraints efficiently
  – Doesn't distinguish between different forms of uncertainty
● Set some/all inputs to identical values
Set some/all inputs to identical values
● Use equal portfolio weights, if correlations and standard deviations are
  sufficiently similar (e.g. S&P 500 stocks)
● Inverse volatility weighting (takes account of standard deviations but not
  correlations); see the sketch below
● Set all Sharpe Ratios equal to each other in the inputs to the optimisation
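A minimal sketch of inverse volatility weighting, reusing annual_stdev_from_weekly_data from earlier (the function name inverse_volatility_weights is illustrative):

def inverse_volatility_weights(data):
    # Weights proportional to 1 / standard deviation, normalised to sum to one
    stdev = annual_stdev_from_weekly_data(data)
    raw_weights = 1.0 / stdev
    return raw_weights / raw_weights.sum()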
def optimise_with_identical_values(data, identical_SR=False,
                                   identical_stdev=False, identical_corr=False):
    if identical_stdev:
        stdev_list = get_identical_stdev(data)
    else:
        stdev_list = annual_stdev_from_weekly_data(data).values
    if identical_SR:
        mean_list = get_means_assuming_identical_SR(data, stdev_list)
    else:
        mean_list = annual_returns_from_weekly_data(data).values
    if identical_corr:
        corr_matrix = get_identical_corr(data)
    else:
        corr_matrix = data.corr().values
    weights = optimise_with_corr_and_std(mean_list, stdev_list, corr_matrix)
    return weights
def get_identical_corr(data):
    # Replace all off-diagonal correlations with their average
    instrument_count = len(data.columns)
    estimated_corr = data.corr().values
    avg_corr = get_avg_corr(estimated_corr)
    corrmatrix = boring_corr_matrix(instrument_count, offdiag=avg_corr)
    return corrmatrix

def get_identical_stdev(data):
    # Replace all standard deviations with their average
    estimated_stdev = annual_stdev_from_weekly_data(data)
    instrument_count = len(data.columns)
    average_stdev = estimated_stdev.mean()
    stdev_list = [average_stdev]*instrument_count
    return stdev_list

def get_means_assuming_identical_SR(data, using_stdev):
    # Means consistent with every asset having the average Sharpe Ratio
    average_SR = sharpe_ratio(data).mean()
    mean_list = [stdev*average_SR for stdev in using_stdev]
    return mean_list
from copy import copy

def get_avg_corr(sigma):
    # Average of the off-diagonal elements of a correlation matrix
    new_sigma = copy(sigma)
    np.fill_diagonal(new_sigma, np.nan)
    if np.all(np.isnan(new_sigma)):
        return np.nan
    avg_corr = np.nanmean(new_sigma)
    return avg_corr
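The helper boring_corr_matrix used above is not shown on the slides; a minimal sketch, assuming it builds a matrix with ones on the diagonal and a constant value elsewhere:

def boring_corr_matrix(size, offdiag=0.0):
    # Correlation matrix with 1s on the diagonal and offdiag everywhere else
    corrmatrix = np.full((size, size), offdiag)
    np.fill_diagonal(corrmatrix, 1.0)
    return corrmatrix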
Try:
● Bootstrapping the weights
● Set some/all inputs to identical values
  – But what if the data clearly shows differences?
● Bayesian
Bayesian shrinkage
Average between the estimate and a prior:
    m = w·m_E + (1 - w)·m_P
where m_E is the estimated mean, m_P is the prior, and w is the shrinkage
factor (w = 0: use just the prior; w = 1: use just the estimate).

# Excerpt: shrinking the correlation matrix towards a prior
# (here corr_shrinkage = 1 means use the prior entirely)
prior_corr_matrix = get_identical_corr(data)
estimated_corr_matrix = data.corr().values
corr_matrix = prior_corr_matrix*corr_shrinkage + estimated_corr_matrix*(1 - corr_shrinkage)
# ... then optimise as before with the shrunk inputs
return weights
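Putting the pieces together, a minimal sketch of a shrinkage optimiser following the same pattern as optimise_with_identical_values above (the function and argument names are illustrative; here a shrinkage of 1 means "use only the prior"):

def optimise_with_shrinkage(data, SR_shrinkage=0.5, corr_shrinkage=0.5):
    stdev_list = annual_stdev_from_weekly_data(data).values

    # Shrink estimated means towards a prior of identical Sharpe Ratios
    prior_mean_list = np.array(get_means_assuming_identical_SR(data, stdev_list))
    estimated_mean_list = annual_returns_from_weekly_data(data).values
    mean_list = prior_mean_list*SR_shrinkage + estimated_mean_list*(1 - SR_shrinkage)

    # Shrink estimated correlations towards a prior of identical correlations
    prior_corr_matrix = get_identical_corr(data)
    estimated_corr_matrix = data.corr().values
    corr_matrix = prior_corr_matrix*corr_shrinkage + estimated_corr_matrix*(1 - corr_shrinkage)

    weights = optimise_with_corr_and_std(mean_list, stdev_list, corr_matrix)
    return weights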
Full shrinkage on all inputs                           → Equal weights
Full shrinkage on Sharpe Ratios and correlations;
none on std. deviation                                 → Inverse volatility portfolio
No shrinkage on anything                               → Normal optimisation
Bayesian
Advantages:
– Intuitive results (with no shrinkage we recover the original optimisation
  results; with full shrinkage we recover the prior; shrinkage is related to
  uncertainty)
– Can use different shrinkage for different parameter estimates
– Can be used with constraints
– Computationally fast (single optimisation)
Disadvantages:
– What prior to use? (no cheating!) (Black-Litterman?)
– How much shrinkage? (depends on the data, and on the underlying stability
  of the problem)
Try:
● Bootstrapping the weights
● Set some/all inputs to identical values
● Bayesian
  – Black-Litterman
Try:
● Bootstrapping the weights
● Set some/all inputs to identical values
● Bayesian
  – Black-Litterman
● Hierarchical methods
  – HRP
  – https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2708678
  – https://mlfinlab.readthedocs.io/en/latest/implementations/portfolio_optimisation.html#hierarchical-risk-parity-hrp
● Heuristic methods
  – Handcrafting
  – https://qoppac.blogspot.com/2018/12/portfolio-construction-through.html
● Neural networks, machine learning, …
  – Be careful!
Conclusions...
● Understand your tools
● Understand your data
● Be careful out there!
My website: systematicmoney.org
My code: github.com/robcarver17/
My blog: qoppac.blogspot.com
Twittering: @investingidiocy