The portfolio optimization model that we have considered in Example 12.5 does not place any restriction on the composition of the portfolio. In practice, bounds are enforced, e.g., to limit exposure to certain risk factors; for instance, we might wish to limit exposure to emerging markets or to the energy sector. Another practical issue that is worth mentioning concerns the cardinality of a portfolio, i.e., the number of different stock names that we include. If we have a universe of n assets, indexed by i = 1, …, n, enforcing a cardinality constraint means limiting the number of assets with a positive weight in the portfolio. This aims at simplifying portfolio management and the analysis of the related risk, but arguably the best reason for considering cardinality-constrained portfolios is index tracking. In fact, equilibrium models like CAPM, if taken literally, imply that there is no way to systematically outperform the market. Hence, a sensible strategy is to invest in a way that mimics a broad market index. However, doing this might require the inclusion of too many assets in the portfolio, with a possible increase in management and transaction costs, which is really unpleasing in a portfolio just tracking an index, since its main virtue should be a low cost; indeed, since this is a passive management style, there is no reason why a customer should pay high fees, which can only be justified by an active strategy based on thorough financial analysis.
Based on these considerations, we want to build a portfolio tracking a target portfolio with a limited number of assets. This is also called portfolio compression, and the target portfolio can be an index or whatever you like. As usual, we denote the random return of asset i by Ri and its weight by wi ≥ 0, ruling out short selling. We are also given the composition of a benchmark portfolio, expressed by weights . The return of our portfolio is
and the return of the benchmark portfolio is
We want to track the benchmark with a cardinality-constrained tracking portfolio, i.e., a portfolio including at most Cmax assets.
The first thing we need is a way to measure the distance between our portfolio and the benchmark. A trivial distance metric can be defined, based on absolute values of portfolio weight discrepancy:27
By proper model formulation, this metric yields a MILP model. However, this distance metric does not take into account many important facets of the problem. To see this, consider two assets, i1 and i2, with a strong positive correlation, and assume that , . The trivial distance measure above would include the following two terms:
If wi1 = 0.11 and wi2 = 0, how much is the distance from the target? In the limit, if the correlation coefficient is ρi1, i2 = 1, the actual distance is zero, but the above distance misses the point completely. Hence, we might consider correlations and covariances as well, and focus on actual portfolio return. An alternative distance metric is tracking error variance (TEV), defined as
To relate TEV to portfolio weights, let us rewrite its definition and use what we know about linear combinations of random variables:
where σij is the usual covariance between the returns of assets i and j.
In order to express the cardinality constraint, we need to introduce a set of binary variables δi, one for each asset, modeling the inclusion of asset i in the tracking portfolio:
Binary variables δi and continuous variables wi should be linked by a big-M constraint such as
where M is the by now familiar suitably large constant. A natural choice is to set M = 1, as no portfolio weight should exceed 100%. Since we know that performance of branch and bound methods depends on the strength of the continuous relaxation, the model can be improved by tightening the constraint; if we know that, because of policy restrictions, there is an upper bound on the weight of asset i, we may write the constraint as
Straightforward minimization of TEV, subject to a cardinality constraint, is accomplished by the following model:
This model is a mixed-integer quadratic programming problem. Since the covariance matrix is positive semidefinite, the continuous relaxation of binary variables (δi ∈ [0, 1]) yields a convex quadratic program which is solved very efficiently. In fact, recent commercial software packages can tackle the problem; for large-scale problem instances, specific approximate algorithms have also been devised.
Leave a Reply