Dissertations and Theses

Date of Degree


Document Type


Degree Name

Doctor of Philosophy (Ph.D.)


Epidemiology and Biostatistics


C. Mary Schooling

Committee Members

Sheng Li

Katarzyna Wyka

Jamie Geier

Subject Categories

Epidemiology | Public Health


Mendelian randomization, Phenome-wide association study, Rheumatoid arthritis, Crohn's disease, tocilizumab



Rheumatoid arthritis (RA) and Crohn’s disease (CD) are two of a group of immune-mediated inflammatory diseases (IMIDs). These diseases are common, collectively affecting as many as two million Americans and as many as 28 million people worldwide. They have different clinical presentations but may share a common pathogenesis. Overexpression of tumor necrosis factor (TNF) and interleukin-6 (IL-6) has been proposed as a causal factor for IMIDs. These and other IMIDs may co-occur in patients, and it has been shown that patients with one IMID may have an increased risk of another. What is not clear is whether the second IMID is caused by the first.

Mendelian randomization (MR) has emerged as an important tool that facilitates causal inference from observational studies. This study design can avoid confounding, a common problem in epidemiologic studies, because it takes advantage of the random allocation of genetic make-up at conception. MR compares disease status by genetically predicted exposure to obtain unconfounded estimates. MR implementation has been facilitated by the increasing availability of genome-wide association studies (GWAS) that genotype millions of single nucleotide polymorphisms (SNPs).

A recent extension of Mendelian randomization is the phenome-wide association study (PheWAS) of genetic variants that predict response to treatment. In drug target validation and repurposing studies, these variants serve as genetic proxies for a pharmacologic intervention and can identify both its on-target and off-target mechanistic effects. As an example, the minor allele of rs7529229, which is in the IL6R gene and is associated with interleukin-6 (IL-6) concentrations, predicts response to tocilizumab, a pharmacologic treatment approved for RA. A further goal of the current study is to use PheWAS to explore the scope for repurposing tocilizumab and to characterize the off-target effects of this treatment.

The hypothesis that one IMID causes another was addressed by two MR studies, each testing one of the two IMIDs as a cause of the other. Additionally, a PheWAS evaluated whether genetic proxies of therapeutic treatments for RA can identify any new effects.


For the first specific aim, a two-sample Mendelian randomization study was conducted to assess the possible causal effect of RA on CD. The primary analysis used the inverse variance weighted (IVW) method; sensitivity analyses included the weighted median, MR-Egger, and MR-PRESSO (Pleiotropy RESidual Sum and Outlier) methods. Genetic predictors of RA, obtained from a GWAS of 22 studies with participants of European and Asian backgrounds (19,234 cases and 61,565 controls), were applied to summary genetic associations for CD from the International IBD Genetics Consortium (IIBDGC; 17,897 cases and 33,977 controls of European descent).
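The IVW estimate can be illustrated with a minimal sketch. It assumes the common fixed-effect form, in which each SNP's Wald ratio (outcome association divided by exposure association) is weighted by the inverse of its approximate variance. The function name `ivw_estimate` and the per-SNP values are hypothetical and for illustration only; they are not the study's data, and the study's exact implementation may differ.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    Each SNP contributes a Wald ratio (beta_outcome / beta_exposure),
    weighted by beta_exposure**2 / se_outcome**2 (first-order weights).
    Returns (estimate on the log odds scale, standard error).
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)

# Hypothetical SNP associations (log odds scale), not taken from the study.
bx = [0.10, 0.20, 0.15]        # SNP-exposure associations
by = [-0.010, -0.030, -0.012]  # SNP-outcome associations
se = [0.010, 0.020, 0.015]     # standard errors of the SNP-outcome associations

log_or, log_or_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(log_or)
ci_low = math.exp(log_or - 1.96 * log_or_se)
ci_high = math.exp(log_or + 1.96 * log_or_se)
print(f"IVW OR {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An odds ratio below 1 in this sketch would correspond to a protective effect of the genetically predicted exposure on the outcome.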

For the second specific aim, a two-sample Mendelian randomization study was conducted to assess a possible causal effect of CD on RA. The primary analysis used IVW; sensitivity analyses included weighted median, MR-Egger, and MR-PRESSO. Genetic predictors of CD from IIBDGC summary statistics (17,897 cases and 33,977 controls of European descent) were applied to a GWAS of RA comprising 22 studies with participants of European and Asian backgrounds (19,234 cases and 61,565 controls).
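As a rough illustration of one of the sensitivity analyses, MR-Egger can be viewed as a weighted regression of SNP-outcome associations on SNP-exposure associations that, unlike IVW, includes an intercept; a non-zero intercept suggests directional pleiotropy. The sketch below is a simplified version that returns point estimates only. The helper name `mr_egger` and the data are hypothetical, and standard errors are omitted.

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Point estimates for MR-Egger: weighted least squares with an intercept.

    SNPs are first oriented so that every exposure association is positive.
    Weights are 1 / se_outcome**2. Returns (intercept, slope); the slope is
    the causal estimate and the intercept reflects average pleiotropy.
    """
    xs, ys, ws = [], [], []
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        sign = 1.0 if bx >= 0 else -1.0
        xs.append(sign * bx)
        ys.append(sign * by)
        ws.append(1.0 / se ** 2)
    sw = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / sw
    mean_y = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data built so that outcome = 0.005 - 0.2 * exposure after orientation.
bx = [0.10, 0.20, 0.30, -0.15]
by = [-0.015, -0.035, -0.055, 0.025]
se = [0.010, 0.020, 0.015, 0.010]

intercept, slope = mr_egger(bx, by, se)
print(f"Egger intercept {intercept:.3f}, slope {slope:.3f}")
```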

One single nucleotide polymorphism (SNP) predicting response to tocilizumab (TCZ) in RA patients, rs7529229, was identified. For the third specific aim, the PheWAS function in the MR-Base web application was used to identify traits associated with this SNP for consideration as potential drug targets. Because this is an agnostic search, a Bonferroni-corrected p-value threshold of 2.4 × 10⁻⁶ was used. The PheWAS provided effect estimates (reported as beta) and p-values, which enabled evaluation of potential new targets for TCZ.
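The Bonferroni threshold can be reproduced with a short calculation. This sketch assumes a family-wise alpha of 0.05, which is not stated here, divided by the 21,031 traits reported for the PheWAS. The result rows are hypothetical placeholders that show how traits might then be screened by p-value and split by direction of effect; they are not the study's output.

```python
alpha = 0.05       # assumed family-wise error rate (not stated in the abstract)
n_traits = 21_031  # number of traits reported for the PheWAS
threshold = alpha / n_traits
print(f"Bonferroni threshold: {threshold:.1e}")

# Hypothetical PheWAS rows: trait label, effect estimate (beta), p-value.
results = [
    {"trait": "trait A", "beta": -0.03, "p": 1.0e-9},
    {"trait": "trait B", "beta": 0.02, "p": 3.0e-6},  # misses the threshold
    {"trait": "trait C", "beta": None, "p": 1.0e-8},  # missing beta, excluded
    {"trait": "trait D", "beta": 0.04, "p": 5.0e-7},
]

# Keep traits that pass the threshold and have a usable effect estimate.
significant = [r for r in results if r["p"] < threshold and r["beta"] is not None]
inverse = [r["trait"] for r in significant if r["beta"] < 0]
positive = [r["trait"] for r in significant if r["beta"] > 0]
print("inverse:", inverse)
print("positive:", positive)
```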


For specific aim 1, all four analyses showed a protective effect of RA on CD (IVW odds ratio [OR] 0.89, 95% confidence interval [CI] 0.77-1.00, p=0.042; MR-Egger OR 0.71, 95% CI 0.50-0.93, p=0.004; weighted median OR 0.78, 95% CI 0.72-0.84, p=8.4 × 10⁻¹⁵; MR-PRESSO OR 0.92, 95% CI 0.84-1.00, p=0.047).

For specific aim 2, the primary analysis and all three sensitivity analyses showed no causal effect of CD on RA (IVW OR 0.97, 95% CI 0.88-1.07, p=0.59; weighted median OR 0.98, 95% CI 0.94-1.03, p=0.49; MR-Egger OR 0.92, 95% CI 0.67-1.18, p=0.54; MR-PRESSO OR 1.00, 95% CI 0.64-1.36, p=0.90).

For specific aim 3, the SNP rs7529229 predicting response to TCZ in RA was associated with 21,031 traits. Seventeen of these met the Bonferroni-corrected p-value threshold for statistical significance. Of these, three were excluded from consideration: abdominal aortic aneurysm, because the beta was missing, and two entries described as “Blood clot DVT bronchitis emphysema asthma rhinitis eczema allergy diagnosed by doctor: None of the above,” because these did not indicate a trait. Among the other 10 unique traits, four showed an inverse association with the SNP predicting response to TCZ: RA, coronary heart disease (CHD), and two blood counts (red cell distribution width and granulocyte percentage of myeloid white cells). Six traits were positively associated with this SNP, meaning use of TCZ may increase the risk of the following: eczema; asthma; a less specific trait of hay fever, allergic rhinitis, or eczema; tonsillectomy; and two additional blood counts (mean corpuscular hemoglobin and monocyte percentage of white cells).


The first MR study suggests that RA protects against the development of CD. This finding may be due to issues with the data sources or to the infrequency of co-occurrence of the two diseases. A currently unknown biological mechanism may also underlie this effect.

The second MR study suggests that CD does not cause RA. Co-occurrence of CD and RA is fairly uncommon and, when it does occur, may be due to shared causes of both conditions, such as overexpression of TNF-α or IL-6.

The PheWAS did not yield any new targets for tocilizumab. As rs7529229 predicts response to TCZ in RA patients, an inverse association between the SNP and RA was expected. Other studies have already evaluated the inverse association of this SNP with CHD, suggesting CHD as a potential target for TCZ. The positive association with eczema was expected, as this was reported in the clinical trial program for TCZ and in other PheWAS of SNPs associated with IL-6 levels. The positive association with asthma was somewhat unexpected, as some studies have suggested a potential role for IL-6 inhibition in asthma, although a recent PheWAS reported a similar result. The impact of TCZ on various blood counts is difficult to interpret.

