K-means clustering is one of the simplest techniques used for unsupervised classification. It partitions n observations into k clusters such that each observation belongs to the cluster with the nearest center. Mathematically, k-means clustering tries to find the set of centers μ = {μ1, ..., μk} that minimizes the following expression, where Si is the set of observations assigned to cluster i:

$$\min_{\mu_1,\dots,\mu_k} \sum_{i=1}^{k} \sum_{x \in S_i} d(x, \mu_i)$$

Here d(x,y) is the distance function. Typical distance functions are squared Euclidean distance, sum of absolute differences and correlation distance (one minus the correlation coefficient). μi is the center (mean or median, depending on the distance function) of the observations in Si.
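As a rough illustration of the minimization described above, here is a minimal Python sketch of the standard iterative (Lloyd-style) procedure with squared Euclidean distance. It is not the code behind the results in this post. The function names, the toy data and the farthest-first choice of starting centers are assumptions made only to keep the example short and deterministic.

```python
def squared_euclidean(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def farthest_first_centers(points, k):
    """Assumed initialization: start at the first point, then repeatedly
    add the point farthest from the centers chosen so far."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(squared_euclidean(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(points, k, n_iter=100):
    """Alternate between assigning observations and updating centers."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each observation joins the cluster with the nearest center.
        new_labels = [
            min(range(k), key=lambda i: squared_euclidean(p, centers[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are unchanged, so the centers would not move
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = mean_vector(members)
    return labels, centers


def objective(points, labels, centers):
    """Sum of d(x, mu_i) over all observations: the quantity being minimized."""
    return sum(squared_euclidean(p, centers[lab]) for p, lab in zip(points, labels))


# Toy data (made up): two well-separated groups in the plane.
points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels, centers = kmeans(points, k=2)
total = objective(points, labels, centers)
print(labels, round(total, 4))
```

The mean is the right center update only for squared Euclidean distance. With sum of absolute differences, the update step would use the component-wise median instead.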
In line with my previous post on factor analysis based stock classification, we will attempt to classify stocks into groups to uncover hidden trends, if any exist.
Classification of LIX15 stocks:
LIX15 is an Indian equity market index that consists of 15 highly liquid stocks traded on the NSE. The observation matrix consists of normalized daily returns of these 15 stocks, sampled from February to November 2013. K-means clustering is applied to the data using the squared Euclidean distance function. Following is the result of a two-cluster classification:
| Cluster 1 | Cluster 2 |
|---|---|
| AXISBANK | CAIRN |
| BANKBARODA | MCDOWELLN |
| HINDALCO | TATAMOTORS |
| IDFC | |
| JINDALSTEL | |
| JPASSOCIAT | |
| JSWSTEEL | |
| MARUTI | |
| RCOM | |
| SBIN | |
| TATASTEEL | |
| YESBANK | |

The result is a pair of clusters of disproportionate size with no obvious interpretation. Interestingly enough, the stocks in cluster 2 are the ones that did not show any significant loading on factors during the factor analysis. Hence, prima facie, k-means has classified the LIX15 constituents into two groups: one that moved with the broad market and another that exhibited heavy idiosyncratic movements during the analysis period. Following is the outcome of a three-cluster classification:
| Cluster 1 | Cluster 2 | Cluster 3 |
|---|---|---|
| CAIRN | AXISBANK | MCDOWELLN |
| HINDALCO | BANKBARODA | TATAMOTORS |
| JINDALSTEL | IDFC | |
| JPASSOCIAT | MARUTI | |
| JSWSTEEL | SBIN | |
| RCOM | YESBANK | |
| TATASTEEL | | |

The clusters roughly correspond to sectoral themes.

| Cluster | Fundamental theme |
|---|---|
| Cluster 1 | Metal stocks |
| Cluster 2 | Financial services stocks |
| Cluster 3 | Erratic/heavily idiosyncratic stocks |
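The observation matrix used in this section (normalized daily returns, one row per stock) could be assembled roughly as in the sketch below. This is an illustration only: the prices are synthetic random walks, the tickers STOCK_A, STOCK_B and STOCK_C are made up, and z-scoring each return series is an assumption, since the exact normalization is not spelled out above.

```python
import random
import statistics


def daily_returns(prices):
    """Simple daily returns from a list of closing prices."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def zscore(values):
    """Rescale a series to zero mean and unit (population) standard deviation.
    Assumed normalization; the post does not state which one was used."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


# Synthetic random-walk prices for three hypothetical tickers (NOT real market data).
rng = random.Random(42)
n_days = 200
prices = {}
for ticker in ["STOCK_A", "STOCK_B", "STOCK_C"]:
    series = [100.0]
    for _ in range(n_days):
        series.append(series[-1] * (1.0 + rng.gauss(0.0, 0.02)))
    prices[ticker] = series

# Observation matrix: one row per stock, one column per trading day.
tickers = sorted(prices)
observations = [zscore(daily_returns(prices[t])) for t in tickers]
print(len(observations), len(observations[0]))
```

Each row of `observations` is one stock, so any k-means routine that clusters rows will group stocks whose normalized return series moved together. Running it with k=2 and then k=3 mirrors the two experiments above.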

Classification of BANKNIFTY stocks:
As with the LIX15 analysis, a two-cluster classification is performed on the BANKNIFTY constituents. Following are the resulting clusters:
| Cluster 1 | Cluster 2 |
|---|---|
| AXISBANK | BANKBARODA |
| HDFCBANK | BANKINDIA |
| ICICIBANK | CANBK |
| INDUSINDBK | PNB |
| KOTAKBANK | SBIN |
| YESBANK | UNIONBANK |

The fundamental interpretation of the resulting clusters is quite clear.
| Cluster | Fundamental theme |
|---|---|
| Cluster 1 | Private sector banks |
| Cluster 2 | Public sector banks |

Conclusion:
Using clustering techniques, we have been able to group stocks. These groupings tend to convey a particular fundamental meaning. Among the LIX15 constituents, the major classification is along sectoral lines. Among the BANKNIFTY constituents, the classification lies along public vs private ownership lines. These conclusions are in line with the ones obtained from the factor analysis based classification of stocks.
Nice blog!
btw... What is the distance function in your analysis? Thanks.
I have used the squared Euclidean distance function for this analysis.