Friday, December 20, 2013

Factor analysis based stock classification

Factor analysis is a multivariate statistical method aimed at data reduction and summarization. It can be used to describe the covariance relationships among many variables in terms of a few hidden underlying factors.

Suppose we have a number of correlated variables. Using the correlation matrix, we can group these variables such that the variables within a particular group are highly correlated among themselves, but have relatively small correlations with variables in other groups. This means that each group of variables represents a single underlying construct or factor. These factors can have a fundamental meaning attached to them.

Use of Factor Analysis in trading
Factor analysis is used in trading and portfolio management for several reasons:
  • It is used to identify hidden factors/trends that drive asset returns. These factors will typically have a fundamental meaning (like sector or style) attached to them.
  • It is used to classify assets into groups based on their returns. There is a gamut of trading strategies (like basket long-short) that can be implemented within each of these groups.
  • It gives a clear picture of the major sources of portfolio risk. These risks can be either systematic (common variance) or unsystematic (specific variance) and can be handled accordingly.
Classification of LIX15 stocks
LIX15 is an Indian equity market index consisting of 15 highly liquid stocks traded on the National Stock Exchange. Factor analysis is performed on the returns of these 15 stocks to identify hidden trends. The observation matrix consists of normalized daily returns of these 15 stocks sampled from February to November 2013. A two-factor model is chosen to decompose the data, with the factor loadings determined using the maximum likelihood estimation method. These two factors account for about 60% of the total variance. A VARIMAX rotation is then performed to group stocks based on their loadings; the aim of this rotation is to achieve simple structures that will ideally have a fundamental reasoning behind them. The following is the table of the top 6 stocks with the highest loading on each factor:


Factor I        Factor II
AXISBANK        TATASTEEL
YESBANK         HINDALCO
SBIN            JSWSTEEL
IDFC            JPASSOCIAT
BANKBARODA      RCOM
MARUTI          JINDALSTEL

Looking at the above stock list, we can say that these factors approximately represent different sectoral themes. The first factor is dominated by financial services stocks, while the second factor has a large number of metal stocks.


Factor      Fundamental theme
Factor 1    Financial services stocks
Factor 2    Metal stocks

There are some stocks in each factor that do not concur with the corresponding fundamental interpretation. This is primarily due to sample bias; another reason could be that these fundamental factors indirectly affect the returns of the corresponding stocks. There are also some stocks, such as MCDOWELL-N, TATAMOTORS and CAIRN, that do not have a significant loading on any factor. These stocks do not fall in either sector and hence remain unclassified.
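For reference, below is a minimal sketch of how such a two-factor decomposition (maximum likelihood fit followed by varimax rotation) could be carried out, assuming `returns` is a pandas DataFrame of normalized daily returns with one column per stock. It uses scikit-learn's FactorAnalysis (the varimax rotation option requires scikit-learn 0.24 or later) and is only an illustration, not the exact estimation code behind the results above.

```python
import pandas as pd
from sklearn.decomposition import FactorAnalysis

def two_factor_loadings(returns: pd.DataFrame) -> pd.DataFrame:
    """Fit a two-factor model with varimax rotation and return the loadings."""
    fa = FactorAnalysis(n_components=2, rotation="varimax")
    fa.fit(returns.values)
    # components_ has shape (n_factors, n_stocks); transpose to stocks x factors
    return pd.DataFrame(fa.components_.T, index=returns.columns,
                        columns=["Factor I", "Factor II"])

# loadings = two_factor_loadings(returns)
# Top 6 stocks by absolute loading on each factor:
# print(loadings["Factor I"].abs().nlargest(6))
# print(loadings["Factor II"].abs().nlargest(6))
```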

Classification of BANKNIFTY stocks
BANKNIFTY is the primary banking sector index of India. Similar to the LIX15 analysis, a two-factor decomposition of the twelve BANKNIFTY constituents is performed. About 75% of the total variance is explained by these two factors. The following is the table of the top 6 stocks with the highest loading on each factor.

Factor I        Factor II
AXISBANK        CANBK
ICICIBANK       BANKINDIA
HDFCBANK        UNIONBANK
INDUSINDBK      BANKBARODA
YESBANK         PNB
KOTAKBANK       SBIN

It is clear that Factor 1 corresponds to private sector banks and Factor 2 to public sector banks. Hence, within the banking sector, the most dominant segregation is along public versus private lines.


Factor      Fundamental theme
Factor 1    Private sector banks
Factor 2    Public sector banks

Conclusion:
The prices of stocks are typically correlated. Using factor analysis, we can group the variability in the stock market into categories and view fluctuations in terms of groups rather than individual stocks. Using factor analysis we have been able to conclude that among the LIX15 constituents the major classification is along sectoral lines, while among the BANKNIFTY constituents the classification lies along public versus private ownership lines.


Monday, December 16, 2013

PCA on NIFTY stocks

Principal component analysis can help reduce a complex data set to a lower dimension and highlight hidden structure. It is used to convert a set of observations of correlated variables into a set of values of uncorrelated variables called principal components. These principal components are linear combinations of the actual variables. The linear combinations are chosen in such a manner that the first principal component has the largest possible variance, and each successive principal component has the largest possible variance subject to being orthogonal to the preceding components. Hence the variance explained by the first few components tells us about the strength of the underlying trend. An implicit assumption here is that large variance represents important dynamics of the variables.

PCA on NIFTY stocks
NIFTY is one of the most widely tracked indices of the Indian equity market. It consists of 50 major Indian companies. PCA is used to analyze the correlated returns of these 50 stocks. The observation matrix consists of normalized daily returns of the Nifty 50 stocks sampled from February to November 2013. Below is the scree plot of the resulting principal components:


The first principal component is typically assumed to represent the broad market. The next few are assumed to be the sector/style related factors, and the remaining components represent the idiosyncratic properties of individual stocks. For the given set of NIFTY stocks we can conclude that about 35% of the total variance is due to the broad market factor (systematic risk). About 15% of the variance can be explained by the sector/style related factors, and the remaining 50% corresponds to stock-specific factors (unsystematic risk). Below is the factor correlation plot:


While performing PCA, an important thing to look at is how the variance explained by the principal components varies over time.


Looking at the above plots, we can say that, barring extreme rises in the broad market, stocks tend to fall together and rise independently of each other. This means that the component of market risk is higher on the downside. Also, over the last couple of years, the sector/style factors have become stronger. It seems that the investing paradigm has shifted from timing the market to taking stock/sector level calls.
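As a rough sketch of the core computation used in this post, the following assumes `returns` is a pandas DataFrame of normalized daily returns for the NIFTY constituents (one column per stock); the split of explained variance into market, sector/style and stock-specific buckets follows the interpretation above, and the number of sector/style components is an assumption.

```python
import pandas as pd
from sklearn.decomposition import PCA

def explained_variance_split(returns: pd.DataFrame, n_sector_factors: int = 5):
    """Split PCA explained variance into market, sector/style and specific parts."""
    pca = PCA().fit(returns.values)
    ratios = pca.explained_variance_ratio_      # this is also what the scree plot shows
    market = ratios[0]
    sector_style = ratios[1:1 + n_sector_factors].sum()
    specific = ratios[1 + n_sector_factors:].sum()
    return market, sector_style, specific

# market, sector_style, specific = explained_variance_split(returns)
```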

Thursday, September 26, 2013

Market Impact Cost

Trade cost is a tricky subject. It has a significant impact on strategy performance, and yet it is one of the least understood subjects; it is typically assumed away in the literature. Both underestimation and overestimation of trade cost have detrimental effects. Underestimation can lead to higher churning and thus hurt us in multiple ways, while overestimation can force one to trade too slowly and hence lead to significant market risk.

Components of trade cost:
  • Direct costs such as brokerage, taxes, etc. This component is easy to measure.
  • Indirect costs such as market impact and opportunity cost. Market impact is the cost incurred from the market in order to enter or exit a position. This component is hard to measure.
Why bother with an accurate trade cost model?
  • Trading cost can have a very significant impact on the returns of strategies with a high churning ratio. This means a trade cost model is imperative if you play in the domain of statistical arbitrage or HFT.
  • An accurate impact cost model can help one alter the trading strategy to achieve higher returns. Given a generic trading idea, should I trade on a large cap or a small cap stock universe? The answer would depend not only on the gross returns in both these universes but also on the incurred impact cost (for a given trade size).
  • It also helps one determine the maximum capacity of a particular fund. Beyond a certain size, the fund's returns might fall to less than acceptable levels.
  • It can help us design and evaluate order execution algorithms.



Forces at play:
Market impact cost is the difference between the pre-trade paper price (LTP or mid price, depending on the modeling choice) and the realized trade price. The characteristics of the underlying asset (liquidity, volatility, resiliency, etc.) and of our order (size, execution strategy, etc.) both determine the impact cost. The following observable variables play a major role in determining the impact cost:
  • Relative order size (ROS): The ratio of our order size to the total quantity traded in the period. As it is hard to know how much quantity will be traded in the future, the average of the last n days' traded volume can be used as a proxy. Impact increases with order size, either linearly or at a decreasing rate.
  • Bid/Ask spread (S): The normalized spread of the order book. Impact increases linearly with the spread.
  • Volatility (V): More volatile assets tend to have higher impact cost (because of price chasing and premature order fills). Impact increases with volatility at a decreasing rate.
  • Trading rate: If trades follow one another in quick succession, there will be an increased market impact, as each trade moves the price from where the previous trade left it. This happens because the order book has no time to recover from the impact of previous trades.
  • Trend cost: Selling a falling stock has a much higher impact cost than buying it. This implies that the delay between trade signal generation and trade execution affects short-term momentum strategies a lot more than their counterparts.

A calibrated impact cost model would look something like this:
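One illustrative functional form, consistent with the variables listed above (this is a sketch, not the specific calibrated model; the coefficients $a$, $b$ and exponent $c$, with $0 < c \le 1$, would be fitted from trade-by-trade data):

$MI = a\,S + b\,V\,(ROS)^{c}$

where $MI$ is the market impact in basis points, $S$ the normalized bid/ask spread, $V$ the volatility and $ROS$ the relative order size. Trading-rate and trend effects can be layered on as further adjustments.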

The best way to get an accurate picture of impact cost is to collect impact data on a trade-by-trade basis and then use this data to calibrate the chosen model. There is an easier way to get a rough estimate of the incurred impact cost: by comparing live trade returns with backtest returns (rerunning the backtest over the live period), approximate values can be arrived at. Here one assumes that the real-life effects not taken into account while backtesting (stock alterations, inability to trade at the closing price, etc.) average out to zero.
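A rough sketch of this back-of-the-envelope approach (all variable names are hypothetical): the average daily shortfall of live returns versus backtest returns, scaled by turnover, gives an approximate cost per unit traded.

```python
import numpy as np

def estimated_cost_bps(live_returns, backtest_returns, daily_turnover):
    """Rough per-unit-traded cost estimate from live vs. backtest performance."""
    shortfall = np.asarray(backtest_returns) - np.asarray(live_returns)
    # Attribute the average daily shortfall to trading, per unit of turnover
    return 1e4 * shortfall.mean() / np.mean(daily_turnover)
```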
A word of caution: these impact cost estimates will work only in usual circumstances. For very big orders or extreme levels of market volatility these models can give unrealistic estimates of market impact. So, along with the calibrated model, one must also specify the boundary conditions.

Sector Dispersion

Sector dispersion is the variation of returns between sector indices over time. It is a quick measure of industry-level dislocations. Mathematically, it is the cross-sectional variance of sector returns. A lookback of around 10 to 40 trading days is typically used for calculating sector dispersion.
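A minimal sketch of this measure, assuming `sector_returns` is a pandas DataFrame of daily sector-index returns (one column per sector); the 20-day lookback is just one choice within the 10-40 day range mentioned above.

```python
import pandas as pd

def sector_dispersion(sector_returns: pd.DataFrame, lookback: int = 20) -> pd.Series:
    """Cross-sectional variance of trailing sector returns."""
    window_returns = sector_returns.rolling(lookback).sum()   # trailing return per sector
    return window_returns.var(axis=1)                         # variance across sectors each day
```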

Indicators like index returns, implied volatility and stock dispersion have been widely used to classify the market into different regimes. The use of sector dispersion as a regime indicator is largely unheard of. Sector dispersion measures industry-level deviations and can be high even in tranquil markets. Large values primarily indicate a fundamental shift in the outlook of market participants. It is a subtle yet powerful indicator of changing market dynamics.


Effect on strategy performance:
Sector-ignorant trading strategies are stock trading strategies that do not take sector-level information into account while generating trading signals; Jegadeesh and Titman's medium-term momentum strategy falls into this category. These strategies unknowingly end up with large sector exposures when sector dispersion is high. On days following periods of high sector dispersion, they can show very erratic returns, because the unmodeled sector-level dynamics play a significant role in determining future stock returns. Hence sector dispersion can be used to dynamically alter the exposure of such strategies.


Sector dispersion is also a good predictor of future market volatility. High sector dispersion typically precedes high market volatility.






Friday, September 20, 2013

How much to pay your fund manager?

An individual wants to invest his hard-earned money in a fund, seeking good returns on a long-term basis. There are different kinds of fee structure that he can opt for, including a fixed fee, profit sharing, or a mixture of both. How should he go about evaluating the efficiency of these structures?
A rational client would ideally want his manager to work very hard and do the best he can, while charging no fee at all. Of course, the manager would not be so eager to manage the capital of such a client. Hence the client has to take the manager's perspective into account while negotiating the best fee structure.
By altering the fee structure, the client attempts to maximize his net returns (returns from the investment minus the fee he pays). At the same time, for the fee structure chosen by the client, the manager attempts to maximize his own returns (the fee he receives minus the cost of the effort he puts in).

Mathematically this problem can be formulated as follows:
Client's Perspective:
Find F(R) such that [R - F(R)] is maximized. Here F(R) is the chosen fee structure, which is a function of the returns R, and R depends on the performance of the manager.
Manager's Perspective:
Find E such that [U(F(R(E))) - C(E)] is maximized. Here E is the effort put in by the manager. This effort translates into R, the return on the client's investment. Based on these returns, the client pays the manager a fee F. This fee has some utility U for the manager, and C is the cost of the effort the manager puts in to achieve the returns.
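In compact form, using the notation above, the two problems are:

Client: $\max_{F(\cdot)} \; \big[\, R - F(R) \,\big]$
Manager (given F): $\max_{E} \; \big[\, U(F(R(E))) - C(E) \,\big]$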

The following are sample mappings of these variables.

  • The more the effort put in, the higher the returns on investment. A linear mapping is used here for the sake of simplicity.

  • The larger the fee received by the manager, the greater the utility for him. The utility function of a typical risk-averse agent is used here.

  • The more effort the manager puts in, the higher the cost of effort. Also, as the absolute level of effort increases, it becomes harder and harder to put in additional effort.
Case 1: Fixed fee structure
Here Kfr is the fixed fee percentage; typical values are 0.5-5%. Working through the optimization, the optimal effort for the manager comes out to be zero, and the client's net return would be (-Kfr). In real-life situations, given a fixed fee structure, the manager chooses to do the minimum work required to retain the client. This situation is clearly not favorable for the client.

Case 2: Variable (profit sharing) fee structure
Here Kfr is the profit-sharing percentage; typical values lie between 10% and 50%. In a profit-sharing framework there is a nonzero effort that the manager will put in. This situation is better for both the client and the manager (for certain values of Kfr) when compared with Case 1.

We can also see that the profit-sharing percentage (Kfr) has a significant impact on the net returns of the client. When the profit-sharing percentage is very low (say 5%), the agent does not have enough incentive to work; the effort will be low and so will the net returns for the client. On the other hand, suppose the profit-sharing percentage is 95%. Though the manager's effort increases monotonically with the profit-sharing percentage, it does not do so linearly, so the client ends up paying more for each incremental increase in effort. This means there is an optimal profit-sharing percentage at which the client can make the most out of the agreement.
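The following sketch illustrates Cases 1 and 2 numerically under assumed functional forms matching the sample mappings above (linear returns R(E) = kE, concave utility U(F) = sqrt(F), convex effort cost C(E) = cE^2). All parameter values are hypothetical, not the ones used in the post.

```python
import numpy as np

k, c = 1.0, 0.05                        # returns-per-effort and cost-of-effort coefficients
efforts = np.linspace(0.0, 10.0, 1001)  # candidate effort levels

def manager_payoff(kfr, fixed_fee=0.0, r_lb=0.0):
    """Manager's net payoff U(F) - C(E) across effort levels."""
    returns = k * efforts
    fee = fixed_fee + kfr * np.maximum(returns - r_lb, 0.0)
    return np.sqrt(fee) - c * efforts ** 2

# Case 1: fixed fee only -> payoff is maximized at zero effort
print(efforts[np.argmax(manager_payoff(0.0, fixed_fee=0.02))])   # 0.0
# Case 2: 20% profit share -> optimal effort is strictly positive
print(efforts[np.argmax(manager_payoff(0.20))])                   # ~1.7
```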

Case 3: Fixed + Variable fee structure
This leads to two distinct types of behavior that managers can exhibit. The manager either chooses to do no work and still collects the fixed fee component (Kfr*Rlb), or he puts in nonzero effort so as to benefit from the variable fee component. The actual behavior depends on the exact values of Kfr and Rlb.


Conclusion:
Based on the above analysis the client should keep the following points in mind before deciding on the structure:
  • A fixed + variable fee structure is usually better than a pure fixed fee structure. The profit-sharing component gives the manager an incentive to work harder.
  • While going for a variable fee structure, both very low and very high profit-sharing percentages should be avoided.
  • While going for a mixed structure, ensure that either the returns cutoff (Rlb) for profit sharing is sufficiently low (realizable with good effort) or the profit-sharing percentage (Kfr) is relatively high. Otherwise there would be no incentive for the manager to work.

Thursday, September 19, 2013

Upside potential ratio: A better alternative to Sharpe ratio

An nth order moment of x is defined as $m_n = E[x^n]$.
An nth order central moment is defined as $\mu_n = E[(x - E[x])^n]$.
In terms of the density function $f(x)$, these moments can be expressed as $m_n = \int x^n f(x)\,dx$ and $\mu_n = \int (x - E[x])^n f(x)\,dx$.

Sharpe Ratio:
The Sharpe ratio is a popular measure of risk-adjusted reward. The reward is taken to be the first moment of the excess returns (x - MAR), where x is the series of returns and MAR is the minimum acceptable return. The square root of the second central moment of returns is taken as the risk. Its popularity can be attributed to the simplicity with which it can be calculated.
But one must be aware that these implicit assumptions about the nature of risk and reward can be misleading. The Sharpe ratio treats volatility on either side of the MAR equally: for a return series with a mean of 10%, a return of +110% is penalized as heavily as a return of -90%. Simply put, volatility is a poor measure of risk. Along similar lines, the Sharpe ratio's reward measure also includes returns below the MAR.

Upside potential ratio:
The upside potential ratio helps us eliminate both these concerns through the use of partial moments. An nth order partial moment is a one-sided moment above or below a threshold. An nth order lower partial moment (lpm) for a threshold t is defined as

$\mathrm{lpm}_n(t) = \int_{-\infty}^{t} (t - x)^n f(x)\,dx$

Similarly, an nth order upper partial moment (upm) is defined as

$\mathrm{upm}_n(t) = \int_{t}^{\infty} (x - t)^n f(x)\,dx$
The upside potential ratio is the ratio of the first order upper partial moment to the square root of the second order lower partial moment, both computed about the MAR: $UPR = \mathrm{upm}_1(MAR) / \sqrt{\mathrm{lpm}_2(MAR)}$. Hence it is essentially a measure of upside performance relative to downside risk.
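A minimal sketch of both ratios, assuming `x` is a NumPy array of periodic returns and `mar` is the minimum acceptable return:

```python
import numpy as np

def sharpe_ratio(x, mar=0.0):
    """Mean excess return over its standard deviation."""
    excess = np.asarray(x) - mar
    return excess.mean() / excess.std()

def upside_potential_ratio(x, mar=0.0):
    """First upper partial moment over sqrt of second lower partial moment, about MAR."""
    x = np.asarray(x)
    upm1 = np.maximum(x - mar, 0.0).mean()           # first upper partial moment
    lpm2 = (np.maximum(mar - x, 0.0) ** 2).mean()    # second lower partial moment
    return upm1 / np.sqrt(lpm2)
```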
Comparison:
Let us consider two series A and B, identical in all respects except that B has some positive skew. Clearly B would be preferred over A as the return series of a trading strategy.



Series   Mean     Variance   Skew      MAR   Sharpe Ratio   UPR
A        1.4973   0.9966     -0.0108   0     1.4999         10.0886
B        1.4940   0.9992      0.4088   0     1.4946         19.0381


The Sharpe ratios of A and B are essentially the same, but there is a significant difference in the upside potential ratio, giving us a better resolution of the returns. An important thing to note here is that though the upside potential ratio does not take higher moments into consideration per se, the asymmetry caused by the skew has been accounted for by the partial moments.

A practical use of the upside potential ratio is in strategy selection. Typically, a mean-reverting (MR) strategy has a much higher Sharpe ratio than a momentum (MM) strategy, but MR strategies are more exposed to tail risk due to the absence of a stop loss. Hence, in terms of moments, for the same mean returns:
1) Volatility of MM strategy > Volatility of MR strategy
2) Skew of MM strategy > Skew of MR strategy
This means the Sharpe ratio gives an undue advantage to MR strategies. With the use of the upside potential ratio this bias can be minimized and better strategy-level diversification can be achieved.