Date of Award


Degree Type


Degree Name

Master of Science in Statistics


Computer Science and Statistics

First Advisor

Natallia Katenka


To maximize returns and diversify a financial portfolio, stock market participants have long been interested in learning how the stock returns of different companies are associated. The five primary goals of this thesis are: (1) to evaluate and infer associations of stock returns between different companies in selected industrial sectors and countries; (2) to identify groups of companies that exhibit the most similar stock market trends; (3) to evaluate changes in associations between companies over the period from 2009 to 2015; (4) to forecast future return movements using selected classification methods; and (5) to explore the relationship between the accuracy of classification of stock return movements and network node properties.

This thesis analyzed daily stock price data collected from a publicly available source, Yahoo Finance, for a sample of eighty-nine selected companies from four industrial sectors and three countries (China, Germany, and the US) over the seven years from 2009 to 2015. Daily prices were converted into returns, which were then used to compute a correlation matrix and a corresponding association network. The resulting network was used to identify clusters of companies that exhibit similar return trends and to evaluate the relationships within and between different industrial sectors. To assess changes in associations between companies during special financial events, dynamic networks were constructed on an annual basis. Four classification methods, namely Linear Discriminant Analysis, Quadratic Discriminant Analysis, k-Nearest Neighbors, and Logistic Regression, were built to predict price movements for all selected companies. The relationships between classification accuracy rates and network properties were evaluated graphically.
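The pipeline from prices to returns to correlations to an association network can be illustrated with a minimal sketch. It is not the thesis's implementation: the use of simple (rather than log) returns, Pearson correlation, the signed cutoff of 0.5 for drawing an edge, the ticker names, and the toy prices are all assumptions made here for illustration only.

```python
import math


def simple_returns(prices):
    """Convert a list of daily prices into simple daily returns."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def association_network(returns_by_ticker, threshold=0.5):
    """Link two companies when their return correlation reaches the threshold.

    The threshold rule is an assumption of this sketch; the abstract does not
    state how edges were inferred from the correlation matrix.
    """
    tickers = sorted(returns_by_ticker)
    neighbors = {t: set() for t in tickers}
    for i, a in enumerate(tickers):
        for b in tickers[i + 1:]:
            if pearson(returns_by_ticker[a], returns_by_ticker[b]) >= threshold:
                neighbors[a].add(b)
                neighbors[b].add(a)
    return neighbors


# Hypothetical toy prices for three made-up tickers (not data from the thesis).
prices = {
    "AAA": [100.0, 102.0, 101.0, 104.0, 103.0],
    "BBB": [50.0, 51.5, 50.5, 52.5, 52.0],
    "CCC": [20.0, 19.5, 20.5, 19.8, 20.6],
}
returns = {ticker: simple_returns(p) for ticker, p in prices.items()}
network = association_network(returns, threshold=0.5)

# Node degree: the number of neighbors each company has in the network.
degree = {ticker: len(nbrs) for ticker, nbrs in network.items()}
```

In this toy example AAA and BBB move together and are linked, while CCC moves against them and remains isolated. The `degree` dictionary is the kind of node property that could be set against per-company classification accuracy.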

The results of the network-based analysis showed that companies that traded in the same stock market and/or belonged to the same industrial sector had significant associations. Specifically, Chinese companies had higher internal correlations in the banking and telecommunication sectors, whereas the US and German companies had stronger associations in the banking and auto manufacturing sectors. Interestingly, the associations among companies became stronger, and more companies tended to be grouped together in the network, during significant financial events and in the early recovery periods. The results of the classification analysis revealed the superior performance of the logistic regression method compared to the other three classification methods, particularly for the Chinese companies. Remarkably, companies that acted as followers and belonged to medium-size clusters with eight to thirteen neighbors in the association network were easier to classify than other companies, which supports a relationship between classification and network-based methods.


