- A Review of Financial Studies paper examines the pitfalls of mean-variance portfolio optimisation.
- It confirms a well-known result: historical sample moments are poor proxies for the true means and variances, so optimisation methods that plug in these past values have poor forecasting performance.
- Instead, the authors propose a new approach, coined the ‘Galton’ strategy, which uses out-of-sample forecasting errors to correct for changing relationships between past values and ex-post realisations.
- The Galton approach places fewer restrictions on past data and so avoids suffering from some of the well-documented concerns in Modern Portfolio Theory.
American economist Harry Markowitz passed away last month. His work on Modern Portfolio Theory (MPT) remains relevant today. A Review of Financial Studies paper shows how to calibrate mean-variance inputs so that the resulting portfolio delivers performance in line with ex-ante expected values – a rare feat for optimised portfolios.
The process is called the ‘Galton’ correction. In essence, it corrects for historical deviations between the actual (realised) means and variances of assets from their forecasted values – akin to Bayesian learning.
Relative to other mean-variance portfolio optimisation methods, a Galton-optimised portfolio has better Sharpe Ratios, smaller risk forecast errors, and more attractive performance after transaction costs.
The correction is also applicable in a machine learning and AI context to help nonlinear portfolio optimisation, and the authors suggest it is more robust in a real-world context than other methods.
Modern Portfolio Theory
In Modern Portfolio Theory (MPT), portfolio allocation decisions are traditionally based on mean-variance analysis, which helps investors balance expected returns with risk. Weights operationalise this trade-off and determine the portfolio’s asset class composition. Selecting the ‘best’ weights according to a set of constraints is known as portfolio optimisation.
A strong critique of MPT is that the inputs (means and variances) used to solve the optimisation problem are unobserved. Instead, the true means and variances are proxied by their sample counterparts over some rolling window. Under this method, the historical sample moments are implicitly assumed to be the best predictors of future realisations.
The authors show why this is not necessarily the case (Chart 1). Take Panel B, for example. A traditional portfolio optimiser who uses Markowitz’s ‘plug-in’ approach (relative weights are determined by a set of covariance matrices and mean returns) assumes past correlations between two assets are good predictors of future values and so lie on the blue 45-degree line. However, plotting ex-post realisations against historical values shows only a slightly positive relation for correlations and none for mean returns (Panel D).
The large differences between the historical estimates of optimisation inputs and the values they take out-of-sample (OOS) give rise to portfolio optimisation forecast errors. For example, using past correlations to proxy true correlations would, according to Chart 1, lead to excessive forecasts, which would clearly be suboptimal.
Instead, the authors propose using the OOS forecast errors as inputs to improve forecast accuracy. In essence, each forecast period of, say, mean returns still uses historical mean returns but corrects for differences in the relationship between past and ex-post values. This ‘Galton’ correction minimises the mean-squared forecast error through OLS and would be visually represented as a line of best fit through the green crosses above.
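As an illustration, this correction can be sketched as an OLS regression of realised values on their historical estimates, with the fitted line replacing the 45-degree ‘plug-in’ assumption. This is a minimal sketch under stated assumptions, not the paper’s implementation – the function names and the toy data are ours:

```python
import numpy as np

def galton_fit(past_estimates, realised_values):
    """OLS of realised values on past estimates: realised ~ a + b * past.
    b = 1, a = 0 recovers the plug-in assumption; b < 1 shrinks forecasts
    toward the mean; b = 0 says the past estimate is uninformative."""
    X = np.column_stack([np.ones_like(past_estimates), past_estimates])
    (a, b), *_ = np.linalg.lstsq(X, realised_values, rcond=None)
    return a, b

def galton_forecast(a, b, past_estimate):
    """Corrected forecast: the fitted line, not the raw historical value."""
    return a + b * past_estimate

# Toy data: past sample means that carry no information about the future
# (as Panel D suggests for mean returns) - both centred on a 5% true mean.
rng = np.random.default_rng(0)
past = 0.05 + 0.02 * rng.standard_normal(500)
realised = 0.05 + 0.02 * rng.standard_normal(500)

a, b = galton_fit(past, realised)
# With uninformative inputs, the estimated slope b is close to zero, so
# every corrected forecast collapses toward the global mean of about 5%,
# regardless of an asset's past performance.
```

The same regression, run separately for means, variances, and correlations, yields the three shrinkage intensities discussed below.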
The Galton Corrections
Continuously correcting for differences between past and realised input values is akin to Bayesian estimation, in which priors and posteriors are dynamically updated as new information arrives. It has the advantage of converging (or ‘shrinking’) to a truer forecast than simply using past values.
The crux of the Galton corrections, then, is that they allow the covariance matrix that underpins the portfolio weighting system to change dynamically over time. The method does this by estimating the relationship of historical means, variances, and correlations to ex-post realisations. These are the three shrinkage parameters, and in each period, OLS computes them by minimising their squared errors from ex-post realisations.
Intuitively, the method weights each input by how far the realised moments of assets have deviated from their forecasts. If the historical estimate has proven to be a ‘bad’ input in the past, the procedure will give less weight to its predictions and more weight to a prediction that says all elements of the input vector are equal to their global mean. The opposite is true for a ‘good’ input.
As an example, the regression slope in Panel D of Chart 1 is negative for the mean returns of individual securities. The corrected forecast will account for this and suggest that the expected return input vector for portfolio optimisation should have small or no differences across assets. These inputs will be more accurate than forecasts that overweight historical top performers and are exposed to mean reversion.
According to the authors, there are three benefits to this approach. (i) It assumes very little about the data – better forecasts are tied to past errors, not raw real-world observations. (ii) All past errors are relevant, so the usable historical sample is larger – the more errors, the more accurate the shrinkage intensity. (iii) It can be applied to a wider set of optimisation methods that are built on estimates of expected returns and covariances.
Also, because the new information that enters the covariance matrix is based on errors from past forecast performance, the Galton correction places fewer constraints on the distribution of the underlying data. This means the Galton approach performs better in real-world environments than other portfolio optimisers that update/estimate the covariance matrix with actual data, which often assume a Gaussian distribution.
Applying Galton Corrections
The Galton portfolios are simply the result of plain Markowitz optimisation applied to corrected inputs. Two portfolios are constructed: (i) the Galton mean-variance portfolio (MV), and (ii) the Galton global minimum variance portfolio (GMV). The approach also nests other portfolio rules, such as the Talmud portfolio, the sample global variance portfolio, and the Markowitz ex-post tangency portfolio.
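The Markowitz optimisation step applied to the corrected inputs has standard closed forms. A minimal sketch, assuming an unconstrained investor and a hypothetical risk-aversion parameter `gamma` (the corrected mean vector `mu` and covariance matrix `sigma` would come from the Galton step; the toy numbers are assumptions):

```python
import numpy as np

def mv_weights(mu, sigma, gamma=5.0):
    """Unconstrained mean-variance weights: w = (1/gamma) * Sigma^-1 mu."""
    return np.linalg.solve(sigma, mu) / gamma

def gmv_weights(sigma):
    """Global minimum-variance weights: w = Sigma^-1 1 / (1' Sigma^-1 1).
    Ignores the mean vector entirely, so it needs only corrected risk inputs."""
    ones = np.ones(sigma.shape[0])
    w = np.linalg.solve(sigma, ones)
    return w / w.sum()

# Toy two-asset example with hypothetical corrected inputs.
mu = np.array([0.06, 0.04])
sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
w_gmv = gmv_weights(sigma)   # sums to 1 by construction
w_mv = mv_weights(mu, sigma)
```

The GMV rule tilts toward the lower-variance asset, which is why its accuracy depends only on the corrected variance and correlation inputs.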
To operationalise the Galton corrections, the paper uses five-year rolling windows for each asset to make forecasts for the subsequent year. The sample starts in 1962, which means the first OOS return is in 1967. The authors use Fama-MacBeth regressions to relate the five-year rolling-window forecasts to their OOS realisations. The first forecast is trained using a 10-year window from 1952-1961. The main dataset for optimising portfolios of individual stocks consists of monthly returns for the entire universe of US-listed stocks on the Center for Research in Security Prices (CRSP). In total, around 10mn observations inform the regression results in a single month.
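The rolling design can be sketched with a hypothetical helper that pairs each five-year (60-month) sample estimate with its realisation over the following year; these pairs are what the correction regressions would be fit on. An illustration under our own assumptions, not the paper’s code:

```python
import numpy as np

def rolling_forecast_pairs(monthly_returns, est_window=60, horizon=12):
    """For each non-overlapping forecast date, pair the trailing 60-month
    sample mean (the 'plug-in' forecast) with the realised mean over the
    next 12 months (the OOS outcome the correction is estimated against)."""
    pairs = []
    for t in range(est_window, len(monthly_returns) - horizon + 1, horizon):
        past = monthly_returns[t - est_window:t].mean()
        future = monthly_returns[t:t + horizon].mean()
        pairs.append((past, future))
    return np.array(pairs)

# 84 months of toy data -> two forecast dates (months 60 and 72).
r = np.linspace(0.0, 0.83, 84)
pairs = rolling_forecast_pairs(r)
```

In practice, one such (past, future) pair per asset per date feeds the cross-sectional regressions.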
First, the authors find empirically that historical inputs are indeed on average far from their true values, highlighting the issue of using the plug-in approach. Instead, they find the best estimate for the future correlation between a pair of stocks is somewhere between the past correlation of the same pair of stocks and the mean correlation for all pairs of stocks.
Next, the authors compare the performance of the Galton portfolios against other important optimisation methods studied in the literature. These include (1) a value-weighted portfolio (by market cap); (2) an equal-weighted portfolio (1/N); (3) a plug-in Markowitz portfolio; (4) a Jorion portfolio, which also uses a shrinkage strategy; (5) an Elton and Gruber (EG) portfolio, which shrinks correlations to the global average correlation; (6) a Ledoit and Wolf (LW) portfolio, which is somewhere between EG and Markowitz; (7) a Kan and Zhou (KZ) portfolio; (8) a Tu and Zhou (TZ) portfolio.
The final two mean-variance portfolios optimise in the presence of estimation error. This approach explicitly accounts for the impact of estimation error on expected OOS performance and aims to minimise it by diversifying across funds. The other methods, in essence, try to wash out the estimation error from the inputs first and then optimise as if working with the cleansed true quantities.
Picking the 50 largest US stocks by market cap, the only optimised portfolios that return a positive Sharpe Ratio (SR) are the Galton portfolios and portfolios 1, 2, 7, and 8 above (Table 1). An investor following strategy 1 or 2, which assume nothing can be learned from the sample, does better than all other strategies apart from the Galton-optimised portfolios.
Turnover and Weight Diversification
The authors report measures of portfolio concentration and stability for each method. These are the active share, the turnover of the portfolio, the average minimum and maximum weight, the portfolio concentration (the mean, over the time series, of the standard deviation of weights in the cross-section), the time-series standard deviation of this standard deviation (a measure capturing instability of that concentration), and the sum of negative weights.
First, the value-weighted and equal-weighted portfolios generally have lower turnover than the mean-variance portfolios. Mean-variance portfolios can also take extreme weights (positive and negative) that deviate substantially from the market. This shows that mean-variance-optimised portfolios display ‘exuberant’ behaviour; in one case, the strategy went bankrupt in 6.5% of the months it was invested. The Galton strategies, on the other hand, have lower turnover and therefore more attractive performance after transaction costs.
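Turnover itself is straightforward to measure. A minimal sketch of the common one-way definition (the sum of absolute weight changes across rebalance dates, averaged over time); the paper may additionally account for weight drift between rebalances, which this simplification ignores:

```python
import numpy as np

def average_turnover(weight_history):
    """weight_history: (T, N) array of portfolio weights at each rebalance.
    Returns the mean one-way turnover, sum_i |w[t+1, i] - w[t, i]|,
    averaged over the T-1 rebalance dates."""
    w = np.asarray(weight_history)
    per_period = np.abs(np.diff(w, axis=0)).sum(axis=1)
    return per_period.mean()

# A stable strategy (small weight changes) versus an 'exuberant' one
# with extreme long and short positions (hypothetical numbers).
stable = [[0.50, 0.50], [0.55, 0.45], [0.52, 0.48]]
exuberant = [[1.50, -0.50], [-0.80, 1.80], [2.00, -1.00]]
```

Higher turnover translates directly into larger transaction-cost drag, which is why the extreme-weight strategies fare poorly net of costs.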
It is important in portfolio management that portfolios meet expectations, so risk forecasts must be accurate. For this, the Value-at-Risk (VaR) measure is key. The authors present several measures to compare ex-post performance with ex-ante expectations.
They find that optimised portfolios have systematically large forecast errors for both expected returns and risk. The Markowitz strategy, for example, realises on average 17 times its anticipated standard deviation. The Galton portfolios, on the other hand, are the best at predicting risk: they expect a standard deviation of 16.5% versus an average realisation of 16.1%.
Measuring left-tail risk is exceptionally important. The LW and EG strategies record losses exceeding their 1% VaR 8% and 10.8% of the time, respectively. The Galton methods, on the other hand, have hit rates of around 1% – the only methods in the comparison to achieve this level of accuracy. In some instances, the authors find that the other strategies mislead investors into believing risk is smaller than it really is.
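A VaR hit rate of this kind is easy to audit in code. A minimal sketch, assuming returns and VaR forecasts are supplied as plain arrays, with the 1% VaR expressed as a positive loss threshold (names and numbers are ours); a well-calibrated 1% VaR should be breached in roughly 1% of periods:

```python
import numpy as np

def var_hit_rate(returns, var_forecasts):
    """Fraction of periods in which the realised loss exceeds the forecast
    VaR, i.e. return < -VaR. For a 1% VaR, a hit rate far above 0.01
    means the strategy systematically understated its risk."""
    returns = np.asarray(returns)
    var_forecasts = np.asarray(var_forecasts)
    return float(np.mean(returns < -var_forecasts))

# Four periods with a constant 5% VaR forecast: one breach out of four.
hits = var_hit_rate([-0.06, 0.01, 0.02, -0.01], [0.05, 0.05, 0.05, 0.05])
```

Comparing such hit rates with the nominal VaR level is the calibration check the authors apply across strategies.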
The paper is valuable for institutional investors that optimise portfolios using a mean-variance weighting system. The authors emphasise that the single most important portfolio optimisation improvement investors should use is to correct for means (‘mean shrinking’). After that, shrinking correlations is more important than shrinking variances, but shrinking all three (the Galton correction) significantly improves portfolio expectations around risks and rewards.
Overall, the authors show the Galton corrections provide better predictions of Value-at-Risk, forecast expected returns closer to ex-post performance, and achieve robust OOS performance even after transaction costs. They also provide a linear benchmark for machine learning and AI methods used in nonlinear portfolio optimisation.
Sam van de Schootbrugge is a Macro Research Analyst at Macro Hive, currently completing his PhD in Economics. He has a master’s degree in economic research from the University of Cambridge and has worked in research roles for over 3 years in both the public and private sectors. His research expertise is in international finance, macroeconomics, and fiscal policy.