Date of Degree


Document Type


Degree Name





Wim Vijverberg

Committee Members

Chu-Ping Vijverberg

Liuren Wu

Subject Categories

Econometrics | Finance


In an age when "Big Data" is becoming almost a household term, the abundance of information in different forms and representations can greatly help decision-making, whether by a trader betting on a stock or by a policy-maker assessing the potential impact of a proposed regulation. Whereas traditional economic research relies primarily on numerical data, continuous or discrete, a great deal of useful information can be extracted from text data. Such information can power novel identification strategies or offer a different angle on a problem, but the volume of such data and its textual format require additional preprocessing techniques. Recent advances in data science methods make it possible to use this alternative source of information and to complement traditional economic modelling with new insight.

Abundant "Big Data" information also fuels the search for new market predictors. Some areas of research, asset pricing for instance, have produced hundreds of such factors over the past few decades. However, many of these factors are weak and often lose their predictive power altogether after the date of publication. Moreover, the traditional least squares approach breaks down when hundreds of these factors are studied at once, some of which are, in addition, highly correlated. To tame the "zoo" of factors in a given problem, one can consider alternative methods from the data science literature that allow for regularization as well as potential non-linearities in the unobserved model structure.

Regularization can also be useful in building a counterfactual, a technique often used in policy evaluation. In assessing the effect of a piece of legislation on an individual or a firm, a hypothetical world in which the regulation never took place is often constructed as a weighted average of units (other individuals or firms) that were not subject to the regulation. Calculating the weights is an optimization problem whose stability requires a set of constraints. Some of these constraints are at odds with the nature of certain policy evaluation problems. To relax them, especially when the number of unaffected units is large, one can apply regularization methods and derive weights that suit the true nature of the problem. In my dissertation, I assess the potential of machine learning methods in finance and policy evaluation. The results show that these methods are useful additions to the traditional econometric approach.
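
The contrast between constrained and regularized counterfactual weights can be written out as follows. This is only a sketch in assumed notation, none of which appears in the abstract: unit 1 is the treated unit, units j = 2, ..., J+1 are unaffected, T_0 is the last pre-regulation period, and λ and α are tuning parameters. The first problem is the standard synthetic control formulation with non-negative weights that sum to one. The second is one common elastic net relaxation and is not necessarily the exact specification used in the dissertation.

```latex
% Standard synthetic control: constrained least squares on pre-treatment data
\hat{w}^{\mathrm{SC}} = \arg\min_{w} \sum_{t \le T_0}
  \Bigl( Y_{1t} - \sum_{j=2}^{J+1} w_j Y_{jt} \Bigr)^{2}
  \quad \text{s.t.} \quad w_j \ge 0, \quad \sum_{j=2}^{J+1} w_j = 1 .

% Elastic net relaxation: constraints dropped, intercept and penalty added
(\hat{\mu}, \hat{w}^{\mathrm{EN}}) = \arg\min_{\mu, w} \sum_{t \le T_0}
  \Bigl( Y_{1t} - \mu - \sum_{j=2}^{J+1} w_j Y_{jt} \Bigr)^{2}
  + \lambda \Bigl( \alpha \lVert w \rVert_{1}
  + \tfrac{1-\alpha}{2} \lVert w \rVert_{2}^{2} \Bigr) .

% Counterfactual and estimated effect in post-treatment periods
\hat{Y}_{1t}(0) = \hat{\mu} + \sum_{j=2}^{J+1} \hat{w}^{\mathrm{EN}}_j Y_{jt},
\qquad
\hat{\tau}_t = Y_{1t} - \hat{Y}_{1t}(0), \qquad t > T_0 .
```

Dropping the sign and adding-up constraints lets the weights be negative or sum to something other than one, while the penalty keeps the problem well-posed when J is large relative to T_0.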

This dissertation consists of two chapters.

Chapter 1 In this study, I assess the presence of herding and contrarian behavior in the stock market, proxied by StockTwits authors. Traders' private signals about changes in the stock price are approximated by news headlines from Reuters. I find that a market populated by small (retail) traders who are eager to exchange their thoughts on the micro-blog is likely to exhibit herding and contrarian behavior. Moreover, these estimates of social behavior can provide insight into stock price volatility, especially for tech firms, whose products and services are harder to evaluate.

Chapter 2 The Volcker Rule (also known as Section 619 of the Dodd-Frank Act) was put into law in 2010 to regulate the proprietary trading activities of Wall Street banks. To assess the effect of the Rule on banks' revenues and the riskiness of their activity, I apply a synthetic control method in the form of an elastic net regression. The identification strategy is based on text data from 10-K filings. I find that the total gross notional amount of derivative contracts held for purposes other than trading would have been lower in the post-Volcker era had the Volcker Rule not been signed into law, possibly suggesting an increase in the banks' market-making or hedging activity. The bigger banks also show a decrease in the riskiness of their activity and in their trading assets, although this result is statistically significant only for the last few observed months in the testing window. No significant impact of the Rule on profitability or on assets available for sale is found.
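
As a rough illustration of the approach named in Chapter 2, the sketch below fits elastic net weights on pre-treatment data by coordinate descent and uses them to form a counterfactual for the treated unit. Everything in it is an assumption made for illustration: the data are simulated rather than bank data, the function names (`elastic_net_weights`, `simulated_example`), the penalty settings, and the imposed effect of 2.0 are hypothetical, and the dissertation's actual specification and text-based identification step are not reproduced.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t (the L1 part of the elastic net penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_weights(X, y, lam=0.02, alpha=0.5, n_sweeps=200):
    """Coordinate descent for the objective
        (1 / 2n) * sum_t (y[t] - b - sum_j X[t][j] * w[j])**2
        + lam * (alpha * |w|_1 + (1 - alpha) / 2 * |w|_2**2).
    X: n pre-treatment periods (rows) by p control units (columns).
    y: n pre-treatment outcomes of the treated unit.
    Returns (intercept b, list of weights w). No sign or sum constraints.
    """
    n, p = len(X), len(X[0])
    b = sum(y) / n
    w = [0.0] * p
    resid = [y[t] - b for t in range(n)]
    for _ in range(n_sweeps):
        # Unpenalized intercept: absorb the mean of the current residuals.
        shift = sum(resid) / n
        b += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            col = [X[t][j] for t in range(n)]
            # Covariance of column j with the residual that excludes unit j.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, resid)) / n
            scale = sum(c * c for c in col) / n
            new_wj = soft_threshold(rho, lam * alpha) / (scale + lam * (1.0 - alpha))
            delta = new_wj - w[j]
            resid = [r - c * delta for c, r in zip(col, resid)]
            w[j] = new_wj
    return b, w


def simulated_example(seed=0, n_pre=40, n_post=10, n_controls=8, effect=2.0):
    """Hypothetical data: the treated unit tracks controls 0 and 1 only."""
    rng = random.Random(seed)
    n_total = n_pre + n_post
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_controls)] for _ in range(n_total)]
    y = [1.0 + 0.6 * row[0] + 0.4 * row[1] + rng.gauss(0.0, 0.05) for row in X]
    for t in range(n_pre, n_total):
        y[t] += effect  # imposed post-treatment effect
    b, w = elastic_net_weights(X[:n_pre], y[:n_pre])
    # Gap between the observed outcome and the synthetic counterfactual.
    gaps = [y[t] - (b + sum(wj * xj for wj, xj in zip(w, X[t])))
            for t in range(n_pre, n_total)]
    return w, sum(gaps) / n_post


if __name__ == "__main__":
    weights, avg_gap = simulated_example()
    print("weights:", [round(v, 3) for v in weights])
    print("average post-treatment gap:", round(avg_gap, 3))
```

In this simulated setting the penalty pushes the weights on irrelevant control units toward zero, and the average post-treatment gap should land near the imposed effect. With real data, the penalty parameters would normally be chosen by cross-validation on the pre-treatment window rather than fixed in advance.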