Safe Portfolio Optimisation
Robert Macrae
Chris Watkins
Arcus Investment Limited
Royal Holloway, University of London
Abstract
We show how to optimise portfolios safely using limited amounts of historic price data,
by regularising an estimate of the historical covariance. We describe a simple form of
regularisation with a particularly direct financial interpretation, and explain how this
avoids various problems commonly associated with optimised portfolios, including
under-estimated risk, poor capture of expected returns and over-trading.
Introduction
Many authors describe how unconstrained optimisation using the historic covariance
matrix produces portfolios containing extreme long and short positions. These portfolios
are alarmingly sensitive to small changes in the covariance matrix, which leads to
overtrading [1]. Some authors [2] recommend introducing constraints on maximum allowed
position sizes to prevent this. This can improve performance, but is clearly not
addressing the root of the problem. We do not believe that it is widely known that a
simple modification of the historic covariance matrix improves performance
enormously. We explain why the raw covariance matrix needs to be changed, and show
how this improves optimisation. The paper is divided into two sections, which introduce
the problem and our proposed solution and illustrate them with synthetic data.
The Optimisation Problem
For simplicity we discuss only the case of unconstrained optimisation with longs and
shorts. The same considerations apply to long-only portfolios but constraints tend to
obscure the problem. Formally, let the investor’s anticipated returns relative to the
risk-free rate for securities 1 ... p be α = (α1, ..., αp). The anticipated return of a
portfolio consisting of an amount x1, ..., xp of each security is then α1x1 + ... + αpxp.
Let the covariance matrix of future returns be V, so that the variance of the return of
the i-th security is Vii and the covariance of securities i and j is Vij (which is equal
to Vji). The variance of the portfolio return is then Σij Vij xi xj, so the Anticipated
Sharpe Ratio (ASR) is:
ASR = (Σi αi xi) / (Σi,j xi xj Vij)^(1/2) = αᵀx / (xᵀVx)^(1/2)

The ASR is maximised when x = k V⁻¹α, where k is an arbitrary constant, as the ASR
of a portfolio does not depend on its size. The ASR of the optimal portfolio is
(αᵀV⁻¹α)^(1/2).
The covariance matrix V of future returns is unknown when the portfolio is optimised,
and optimisation is performed using some estimated or approximated covariance matrix
A. The portfolio chosen is proportional to A⁻¹α. Many methods of portfolio construction
can be expressed in these terms -- for example the widely used and very robust method
of weighting assets proportional to α corresponds to A = I. This is the asymptote of the
regularisation of A that we will investigate. Initially, however, we will start with the
simplest choice of making A equal to the historical covariance matrix C in order to
explain the problems mentioned in the introduction.
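As a concrete check of these formulas, the optimal weights and the resulting ASR can be computed directly. This is a minimal sketch using numpy; the covariance matrix and alphas below are illustrative, not taken from the paper:

```python
import numpy as np

def optimal_weights(A, alpha):
    """Unconstrained mean-variance weights, proportional to A^{-1} alpha."""
    return np.linalg.solve(A, alpha)

def anticipated_sharpe(x, alpha, V):
    """ASR = alpha'x / sqrt(x'Vx); independent of the overall size of x."""
    return (alpha @ x) / np.sqrt(x @ V @ x)

# Two uncorrelated securities of equal variance (illustrative numbers).
V = np.diag([0.04, 0.04])
alpha = np.array([0.10, 0.05])

x = optimal_weights(V, alpha)          # here A = V, the true covariance
asr = anticipated_sharpe(x, alpha, V)

# The optimal ASR equals (alpha' V^{-1} alpha)^(1/2), and scaling x leaves it unchanged.
assert np.isclose(asr, np.sqrt(alpha @ np.linalg.solve(V, alpha)))
assert np.isclose(anticipated_sharpe(5.0 * x, alpha, V), asr)
```

Because the ASR is scale-free, any k > 0 in x = k V⁻¹α gives the same ratio; k is fixed separately by a risk or capital budget.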
Optimisation Using the Historic Covariance Matrix
To understand the implications of inverting a matrix it is useful to conduct a principal
components analysis. In practice it is easiest to base the calculation on a singular
value decomposition of the matrix of returns R into two orthonormal matrices U, V and a
diagonal weight matrix W:

R = U W Vᵀ

If the mean return is subtracted from each series, we have

C = RᵀR = V W Uᵀ U W Vᵀ = V W² Vᵀ

This decomposition has a particularly simple interpretation. Each column of V is a
portfolio for which the sum of squared weights is one, and which is uncorrelated with all
other such portfolios. For convenience we refer to these as “eigenportfolios”. The
corresponding diagonal element of W² is the historic variance of this portfolio. Each
eigenportfolio can be treated as a synthetic asset with alpha given by Vi·α and variance
W²ii. Optimisation of these uncorrelated assets gives them weights proportional to
Vi·α / W²ii, so there will be large weights given to those eigenportfolios with small
estimated variances.
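The decomposition can be carried out numerically with a standard SVD routine. The sketch below uses random illustrative returns and numpy; the alphas are assumed, not part of the original:

```python
import numpy as np

rng = np.random.default_rng(0)
T, p = 500, 4                         # observations and securities (illustrative)
R = rng.standard_normal((T, p)) * 0.01
R -= R.mean(axis=0)                   # subtract the mean return from each series

# R = U W V^t: rows of Vt are the unit sum-of-squares "eigenportfolios".
U, w, Vt = np.linalg.svd(R, full_matrices=False)

# C = R^t R = V W^2 V^t, so w**2 holds each eigenportfolio's historic variance.
C = R.T @ R
assert np.allclose(C, Vt.T @ np.diag(w ** 2) @ Vt)

# Optimising the uncorrelated eigenportfolios weights each one by (V_i . alpha) / W^2_ii,
# so the smallest-variance eigenportfolios receive the largest weights.
alpha = np.full(p, 0.05)              # illustrative anticipated returns
eig_weights = (Vt @ alpha) / (w ** 2)
```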
Any practical historic covariance matrix is likely to be near singular, so some W²ii will
be very small indeed, resulting in the familiar list of problems outlined in the
introduction. Assuming for ease of description that the matrix is singular:
1) Gross underestimate of Volatility -- all of the portfolio weight is applied to
eigenportfolios in the null space of C, so estimated risk is zero.
2) Wasted Alpha -- alpha that lies along directions not in the null space will be
ignored in the pursuit of the apparently riskless portfolios. The optimiser will
give up real alpha exposure in the pursuit of illusory zero variance.
3) Excess Volatility -- possibilities of diversification into low, but non-zero,
volatility portfolios are not exploited.
4) Overtrading -- the null space will shift with every trivial modification to C,
causing excessive trading.
The same problems apply to all near-singular matrices, with the subspace in which there
is little variation taking the role of the null space.
Synthetic Example
To see if the problem was serious with practical amounts of data, we generated artificial
random returns data for various numbers of uncorrelated securities of equal volatility.
We chose to examine the range of 5 to 50 assets, with a number of observations ranging
from 1 to 50 times the number of assets. To permit comparison of portfolios with
different numbers of assets, we normalised the sum-of-squares position sizes to give all
portfolios equal true volatility of 5%.
For each set of data we construct the covariance matrix and decompose it to find the
eigenportfolios with maximum and minimum estimated volatilities. Figure 1 was
generated by replicating each case many times and plotting the average values for both
the maximum and minimum risk portfolios.
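A minimal replication of this experiment, under the stated assumptions (uncorrelated securities of equal true volatility, unit sum-of-squares positions), can be sketched with numpy; the specific counts below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
p, trials = 10, 200
T = 2 * p                      # only twice as many observations as assets
true_vol = 0.05                # every unit-norm portfolio has this true volatility

ratios = []
for _ in range(trials):
    R = rng.standard_normal((T, p)) * true_vol
    R -= R.mean(axis=0)
    C = R.T @ R / T
    vals, vecs = np.linalg.eigh(C)       # eigenportfolios of the sample covariance
    in_sample = np.sqrt(vals[0])         # estimated vol of the minimum-risk portfolio
    out_of_sample = true_vol             # its true vol: the assets are uncorrelated
    ratios.append(out_of_sample / in_sample)

# With a data-to-assets ratio of 2, out-of-sample volatility is typically
# several times the in-sample estimate.
assert np.median(ratios) > 1.5
```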
Fig 1: Volatility Estimates as a Function of Data to Assets Ratio
[Figure: average estimated volatility (0% to 10%) of the maximum- and minimum-risk
eigenportfolios for 5, 10, 20 and 50 assets, plotted against the ratio of observations
to assets on a log scale from 1 to 100, with the theoretical volatility of 5% shown for
reference.]
In practice we are only concerned with low-risk portfolios, for which the error is almost
independent of the number of assets involved. This permits us to construct Figure 2,
which shows quartiles of the ratio of out-of-sample to in-sample volatility for the
minimum risk portfolio.
Fig 2: Volatility Underestimate for Optimised Portfolios due to Finite Data
[Figure: upper and lower quartiles of the factor (0 to 10) by which volatility is
underestimated, plotted against the ratio of series length to number of assets from
1 to 10.]
There are two key conclusions from these graphs:
1) The required historic data is roughly proportional to the number of assets.
So long as you have ten times as many data points as you have assets, selection bias
leads to an average volatility underestimate of about 40%, which is not unreasonable in
comparison with the other uncertainties of the risk-control process. However, if you use
only twice as many, selection bias may lead to out-of-sample volatility being 2.5 to 3.5
times as large as you estimate. This result has been found to hold for much more general
artificial cases, including those with a shared "Market" factor and an APT structure, and
we believe it to be a good general guide. This guide can, however, be optimistic with
long-tailed and heteroskedastic distributions, since relatively few of the observations
make any significant contribution to the covariance and the effective number of data
points may be far lower than the total number.
2) Sufficient historic data is often unavailable in practice.
At first sight a requirement of 10 data points per asset does not appear too demanding,
since daily data is widely available, but there are pitfalls in using high-frequency
data. As a rule of thumb, if you are interested in forecasting financial series on a
certain time-scale, there are limited benefits to using data spaced more closely than
1/10 of this time-scale (since short-term effects dominate and closer points cease to be
independent) or extending further back than 10 times this time-scale (since
non-stationarity becomes a problem and older points cease to be relevant). If we were to
take these guides as hard limits then we would be limited to 100 independent and relevant
data points per security, so we could safely estimate risk for optimised portfolios
containing at most 10 securities! This conclusion is over-pessimistic, but it indicates
that optimisation using historic covariances may perform poorly for any practical
optimisation involving hundreds of securities, even if several years of daily data are
available.
Regularisation
This analysis suggests that regularisation is required. We have investigated several
methods of regularisation but, unless you have a strong prior about the form of the
covariance matrix, there seems little benefit in using complex methods. We prefer the
single-parameter approach of decomposing C = V W² Vᵀ as above and then setting
W²new,ii = w²min + W²ii. This method is equivalent to approximating A = C + w²min·I and
can be regarded as regularising by moving from C towards I to an extent given by w²min.
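A sketch of the edit itself, assuming numpy; the data and the value of w²min below are illustrative:

```python
import numpy as np

def plausibility_edit(C, w2_min):
    """Return A = C + w2_min * I.

    Equivalent to adding w2_min to every eigenportfolio variance, so no
    unit-norm portfolio can have an estimated variance below w2_min."""
    return C + w2_min * np.eye(C.shape[0])

rng = np.random.default_rng(2)
R = rng.standard_normal((12, 10)) * 0.05     # fewer observations than ideal
R -= R.mean(axis=0)
C = R.T @ R / len(R)                         # near-singular sample covariance

A = plausibility_edit(C, w2_min=5e-4)

# Every eigenvalue of A is at least w2_min; small eigenvalues are lifted,
# large ones are almost unchanged.
assert np.linalg.eigvalsh(A).min() >= 5e-4 - 1e-12
```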
How should the regularising parameter be chosen? Too little regularisation will leave
the problems above incompletely cured, but too much will ignore the historic covariance
structure. Choosing an appropriate amount of regularisation is a compromise, on which
there is a large research literature in statistics. We have experimented with three
general-purpose methods: leave-one-out cross-validation, a bootstrap method, and an
approximate Bayesian criterion. All of these methods tended to under-regularise, but the
approach is reassuringly robust to the level of regularisation chosen, and any of these
methods, or indeed an educated guess, is sufficient to achieve sensible results in
practice. We call this process “plausibility editing”, because it removes implausible
eigenportfolio variances that are implicit in the historic covariance matrix.
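As an illustration of parameter selection, the sketch below scores candidate values of w²min by the realised variance of the resulting minimum-variance portfolio on held-out data. This simple holdout scheme is a stand-in for the leave-one-out, bootstrap and Bayesian procedures mentioned above, not a reproduction of them; all the numbers are illustrative:

```python
import numpy as np

def min_var_weights(A):
    """Unit-sum minimum-variance weights under covariance estimate A."""
    x = np.linalg.solve(A, np.ones(A.shape[0]))
    return x / x.sum()

rng = np.random.default_rng(3)
p = 8
R = rng.standard_normal((80, p)) * 0.05      # illustrative uncorrelated returns
train, test = R[:40], R[40:]
C = np.cov(train, rowvar=False)

# Score each candidate by the held-out variance of the portfolio it produces.
candidates = [0.0, 1e-5, 1e-4, 1e-3, 1e-2]
scores = [(test @ min_var_weights(C + w2 * np.eye(p))).var() for w2 in candidates]
best = candidates[int(np.argmin(scores))]
```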
Benefits of Plausibility Editing
1) Guaranteed Floor on Estimate of Risk -- by construction no unit-norm
portfolio can have a variance estimate below w²min.
2) Efficient Use of Alpha -- alpha which lies along any direction i will be
exploited to an extent given by w²min / w²i.
3) Efficient Use of Diversification Opportunities -- only portfolios with
w²i >> w²min are unavailable for diversification.
4) No Overtrading -- the portfolio is insensitive to shifts in the null space.
In practical cases the most volatile few eigenportfolios will account for a large
proportion of the total volatility. Their volatilities will be almost unaffected by any
sensible w²min, so these benefits come at a very limited cost to the accuracy with which
the historic volatility structure is reproduced.
Plausibility editing has also been applied to the construction of a wide range of more
complex synthetic examples and to practical portfolios, and has produced consistently
sensible results. We believe it to have very broad applicability.
Conclusion
Principal components analysis of the historic covariance matrix shows why it is
inappropriate to use this matrix in portfolio optimisation, and suggests plausibility
editing as a form of regularisation which solves the associated problems. This technique
can easily be shown to work on synthetic data.
[1] Black, F. and Litterman, R., “Global Portfolio Optimisation”, Financial Analysts
Journal, September-October 1992. The authors give a description of what happens during
optimisation with an unregularised covariance matrix. They describe an optimisation
method in which the historic covariance matrix is unregularised, but the investor’s
views (the αs) are modified to achieve a similar effect.
[2] Frost, P.A. and Savarino, J.E., “For Better Performance, Constrain Portfolio
Weights”, Journal of Portfolio Management, Fall 1988, pp. 29-34.