Factor investing is a well-known investment strategy used mostly by quant funds. Even though the factors are well documented, it is important to distinguish two types of factors:

- Explicit factors: observable characteristics such as momentum, value, size, and quality.
- Implicit factors: statistical factors extracted from the data using techniques such as maximum likelihood estimation or Principal Component Analysis (PCA).

Thanks to their statistical nature, the implicit factors are often easier to calculate. However, we often do not know exactly what they represent, and we need intuition and statistical techniques to assign sensible economic variables to them. For example, in fixed income, the first PCA eigenvector (factor) can be interpreted as the level of interest rates, while the second is the slope of the yield curve. Similarly, in the volatility space, the first eigenvector is associated with the volatility level, while the second is associated with the volatility skew.
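The level/slope interpretation can be illustrated with a minimal sketch. The data below are synthetic (simulated parallel shifts and steepening moves plus noise), not market data, and the function name `pca_factors` is our own; the point is only to show how the loadings of the first two eigenvectors are read.

```python
import numpy as np

def pca_factors(changes):
    """Eigenvalues (descending) and eigenvectors (as columns) of the covariance matrix."""
    cov = np.cov(changes, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Synthetic daily yield changes (in bp) for five maturities; an assumption for illustration.
rng = np.random.default_rng(0)
maturities = np.array([1.0, 2.0, 5.0, 10.0, 30.0])
n_days = 2000
level = rng.normal(0.0, 5.0, size=(n_days, 1))   # parallel shifts
slope = rng.normal(0.0, 2.0, size=(n_days, 1))   # steepening / flattening
tilt = (maturities - maturities.mean()) / maturities.std()
noise = rng.normal(0.0, 0.5, size=(n_days, maturities.size))
changes = level * np.ones(maturities.size) + slope * tilt + noise

eigvals, eigvecs = pca_factors(changes)
pc1, pc2 = eigvecs[:, 0], eigvecs[:, 1]
explained = eigvals / eigvals.sum()

print("PC1 loadings:", np.round(pc1, 2))   # same sign across maturities: "level"
print("PC2 loadings:", np.round(pc2, 2))   # opposite signs at short and long end: "slope"
print("Explained variance:", np.round(explained[:2], 2))
```

Note that the sign of each eigenvector is arbitrary, so only the pattern of the loadings across maturities matters for the interpretation.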

Applying Principal Component Analysis to a basket of stocks and assigning sensible market variables to the eigenvectors is, however, less trivial. The first eigenvector usually represents the market level, but the second and higher ones often carry no clear meaning or are difficult to interpret.

To mitigate this problem, Reference [1] proposed the so-called Hierarchical Principal Component Analysis (HPCA), a variant of the PCA method in which stocks are divided into clusters that are believed to share common features such as an industry sector, a country, or a statistical measure:

*To mitigate this problem and account for hidden risk factors, we adopt a purely statistical technique. This is a simple and still powerful tool that dynamically adapts to changes in market conditions over time, which makes it suitable for managing trading portfolios. Also, it is a parsimonious approach since it does not rely on too many parameters. The user only needs to define the number of clusters, which depends on the number of K eigenvectors, without specifying any other parameters or hyper-parameters*
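To make the idea concrete, here is a minimal sketch of one way an HPCA-style correlation matrix can be built once the clusters are known. It reflects our understanding of the general approach and is not taken from Reference [1]: within-cluster correlations are kept as estimated, while the correlation between two stocks in different clusters is replaced by the product of each stock's correlation with its own cluster factor and the correlation between the two cluster factors. The function names, the synthetic returns, and the cluster labels are all assumptions; in particular, the statistical clustering step described in the quote is not implemented here, and the labels are simply taken as given.

```python
import numpy as np

def first_pc_scores(returns):
    """Time series of the leading principal component of standardized returns."""
    z = (returns - returns.mean(axis=0)) / returns.std(axis=0)
    corr = np.corrcoef(z, rowvar=False)
    _, eigvecs = np.linalg.eigh(corr)
    return z @ eigvecs[:, -1]        # eigh sorts ascending, so the last column leads

def hpca_correlation(returns, labels):
    """Illustrative HPCA-style correlation matrix.

    Assumes at least two clusters and at least two stocks per cluster.
    """
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    corr = np.corrcoef(returns, rowvar=False)

    # One factor per cluster: the first principal component of its members.
    factors = np.column_stack(
        [first_pc_scores(returns[:, labels == c]) for c in clusters]
    )
    factor_corr = np.corrcoef(factors, rowvar=False)

    # beta_i: correlation of stock i with the factor of its own cluster.
    beta = np.empty(returns.shape[1])
    for k, c in enumerate(clusters):
        for i in np.where(labels == c)[0]:
            beta[i] = np.corrcoef(returns[:, i], factors[:, k])[0, 1]

    # Cross-cluster entries: beta_i * beta_j * corr(F_I, F_J).
    pos = np.searchsorted(clusters, labels)
    modified = np.outer(beta, beta) * factor_corr[np.ix_(pos, pos)]
    same_cluster = labels[:, None] == labels[None, :]
    return np.where(same_cluster, corr, modified)

# Synthetic returns: a common market factor plus one factor per cluster.
rng = np.random.default_rng(1)
market = rng.normal(size=(500, 1))
sector_a = rng.normal(size=(500, 1))
sector_b = rng.normal(size=(500, 1))
returns = np.hstack([
    market + sector_a + 0.5 * rng.normal(size=(500, 3)),
    market + sector_b + 0.5 * rng.normal(size=(500, 3)),
])
labels = ["A", "A", "A", "B", "B", "B"]
C = hpca_correlation(returns, labels)
print(np.round(C, 2))
```

The cross-cluster product does not depend on the arbitrary sign of each eigenvector, since flipping a cluster factor flips both the betas of its members and its correlation with the other factors.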

Using a statistical clustering technique, the authors constructed investment portfolios and reported that they outperformed other portfolio construction methods:

*To illustrate an application, we show it in the context of portfolio optimization for the US stock market. We provide evidence that using HPCA statistical-based factor models outperform other classical portfolio construction methodologies such as the shrinkage covariance matrix and the HPCA GICS-based factor models.*

We find that it makes sense to use statistical features to partition stocks into clusters. We believe, however, that traditional PCA can also be used, in conjunction with common sense and intuition, to identify clusters. For example, we were able to use the second eigenvector to divide utility stocks into regulated and unregulated groups. Similar results were also obtained in the fixed income space.
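As a rough illustration of the kind of split we have in mind, the sketch below groups stocks by the sign of their loading on the second eigenvector of the correlation matrix. The returns are synthetic, with a market factor and a second factor that loads with opposite signs on two groups, and the function name `split_by_second_eigenvector` is hypothetical. It does not reproduce our utility-stock analysis.

```python
import numpy as np

def split_by_second_eigenvector(returns):
    """Boolean group membership from the sign of the second-eigenvector loadings."""
    corr = np.corrcoef(returns, rowvar=False)
    _, eigvecs = np.linalg.eigh(corr)     # ascending eigenvalues
    second = eigvecs[:, -2]               # eigenvector of the second-largest eigenvalue
    return second >= 0

# Synthetic returns: six stocks, two hidden groups of three (an assumption).
rng = np.random.default_rng(2)
n_obs = 1000
market = rng.normal(size=(n_obs, 1))
spread = rng.normal(size=(n_obs, 1))
signs = np.array([1, 1, 1, -1, -1, -1])
returns = market + 0.6 * spread * signs + 0.3 * rng.normal(size=(n_obs, 6))

groups = split_by_second_eigenvector(returns)
print(groups)   # first three stocks share one label, last three share the other
```

Because the sign of an eigenvector is arbitrary, which group is labeled `True` carries no meaning; only the partition itself does. In practice, the split still has to be checked against fundamentals before attaching an economic label to it.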

**References**

[1] M. Avellaneda and J. A. Serur, *Hierarchical PCA and Modeling Asset Correlations* (2020). https://ssrn.com/abstract=3903460