Date of Degree


Document Type


Degree Name





Yildiray Yildirim


Linda Allen

Committee Members

Liuren Wu

Johannes Stroebel

Subject Categories

Real Estate


Strategic Behavior, Moral Hazard, Machine Learning, NLP, Race, COVID-19


Strategic default has been the Achilles' heel of academic finance for decades. By definition, whether a default has occurred due to a strategic motive is unobservable. Moreover, a household has only limited avenues for conducting a strategic default. I use the context of commercial mortgages because property value as well as property cash flow co-determine the default decision of these borrowers. I tease out the strategic aspects of default from those emanating from liquidity constraints. Recent advances in Deep Neural Networks (DNN), the advent of big data, and the computational power associated with it have enabled me to disentangle the motive for default. In addition, the agency conflicts of brokers during the origination of a mortgage loan, and the resulting moral hazard, have been documented based on soft information about the borrowers. However, there have been few, if any, papers that retain this soft information about the borrowers post-origination, during the life of the loans. Over the last decade, there has been a plethora of research on the biases towards foreclosures and other adverse outcomes post-securitization. But the soft information about the borrowers obtained by the brokers is lost during the pooling process in securitization, and well-known papers have documented this loss of information during the securitization process, which happens at arm's length from the original lender. I bridge this gap by using novel data on proprietary call transcripts (textual data) between borrowers and servicers. I am also in the process of procuring audio files that can capture the mood, content, and tone of these communications. My dissertation documents the use of machine learning (ML) techniques in commercial and residential real estate to answer long-standing questions that could not previously be answered due to a paucity of data and computational resources.
In the first chapter, I run a horse race of a Deep Neural Network against other ML models and parametric models to provide a new identification strategy that disentangles liquidity-constrained default from incentives for strategic default. The second chapter attempts to answer the most pressing current socio-economic issue in the United States. Specifically, I compute the social, racial, and dollar cost of the CARES Act and find that these ad hoc policies are as expensive as a direct payment of $2,000 to households, if not more so. Finally, in the third chapter, I create a novel framework to ingest quantified, time-varying soft information about borrowers, drawn from call transcript text data, into ML models built on hard information. I alleviate the information asymmetry between borrowers and issuers, increase mortgage market efficiency, and mitigate the conflict of interest between master servicers and special servicers. There is a recent literature on the applications of supervised, unsupervised, and reinforcement learning in mainstream academic finance. But very little work has been done using cutting-edge machine learning methods in the literature on highly illiquid, opaque real estate markets. I take a fresh look at some of the long-debated questions in the literature using these machine learning techniques. I am also able to use the current COVID-19 pandemic as an exogenous shock for robustness checks in most of my current research.

Included in

Real Estate Commons