Dissertations, Theses, and Capstone Projects
Date of Degree
9-2023
Document Type
Dissertation
Degree Name
Ph.D.
Program
Biology
Advisor
Lei Xie
Committee Members
Weigang Qiu
Eugenia G. Giannopoulou
Shaneen Singh
Subhash Sinha
Subject Categories
Bioinformatics
Keywords
drug repurposing, gene expression, data mining
Abstract
The conventional drug discovery process, which follows the "one disease, one target, one drug" paradigm, is expensive and time-consuming, and it has a high failure rate for multigenic complex diseases. An alternative approach is to repurpose an existing drug that is already used to treat other medical conditions. Drug repurposing is considered promising because it accelerates drug discovery and lowers overall cost and risk.
Drug-perturbed gene expression profiles are powerful phenotypic readouts of biological systems, and they have been widely used in drug repurposing studies. However, the existing drug-perturbed gene expression datasets are extremely noisy, and the profiling covers only selected cell lines and compounds, which limits their application to drug repurposing and compound screening. This thesis addresses those challenges: I have developed several novel computational methods and demonstrated their power in discovering novel therapeutics for multiple diseases.
First, we have designed a Bayesian signature detection pipeline to process raw data from L1000 assays into robust z-scores. The pipeline produced drug signatures for in silico drug screening and repurposing with excellent accuracy and robustness. Based on these drug signatures, we have developed a phenotypic screening pipeline to repurpose Ibudilast and MK-2206 for Alzheimer's Disease.
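To make the idea of signature-based screening concrete, the following is a minimal sketch and not the Bayesian pipeline described in this thesis. It assumes a simple robust z-score (median and MAD) as a stand-in for the signature step, and it assumes cosine similarity as the reversal score between a disease signature and each drug signature. All function names and input values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import median


def robust_zscores(values):
    """Scale raw measurements to robust z-scores: (x - median) / (1.4826 * MAD).

    Assumption: this is a generic robust scaling, not the thesis's Bayesian method.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        raise ValueError("MAD is zero; values are too uniform to scale")
    return [(v - med) / scale for v in values]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length signatures."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_reversal(disease_signature, drug_signatures):
    """Rank drugs so that the strongest reversal (most negative score) comes first."""
    scores = {
        name: cosine_similarity(disease_signature, signature)
        for name, signature in drug_signatures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Toy values over three hypothetical genes; not real L1000 data.
    disease = [1.0, -1.0, 2.0]
    drugs = {"drug_a": [-1.0, 1.0, -2.0], "drug_b": [1.0, -1.0, 2.0]}
    for name, score in rank_by_reversal(disease, drugs):
        print(name, round(score, 3))
```

In this sketch a drug whose signature is anti-correlated with the disease signature ranks first, which reflects the general logic of expression-based repurposing: a candidate is of interest when it pushes gene expression in the opposite direction from the disease.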
Second, we have employed machine learning models to predict gene expression patterns perturbed by new chemicals in new cell types. The predicted gene expression profiles are used for drug repurposing for COVID-19, pancreatic cancer, and Alzheimer's Disease, without prior experimental data. This method greatly expands the domain of in silico drug screening and phenotype-based drug repurposing.
Third, I have applied a knowledge graph model to drug repurposing. The model integrates information from diverse biomedical databases, covering genes, drugs, phenotypes, and patients. The knowledge graph embeddings provide representations of biological entities and knowledge, which help uncover relationships between drugs and diseases.
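As an illustration of how knowledge graph embeddings can be used to rank drug candidates, the sketch below scores (drug, treats, disease) triples with a TransE-style distance. This abstract does not name the embedding model used, so TransE is an assumption made here for illustration. The entity names, the `treats` relation, and the two-dimensional vectors are hypothetical toy values, not learned embeddings.

```python
from math import sqrt


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative Euclidean norm of (head + relation - tail).

    Scores closer to zero mean the triple is more plausible under the embedding.
    """
    return -sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_drugs(drug_vectors, treats_vector, disease_vector):
    """Rank candidate drugs for one disease, most plausible first."""
    scored = [
        (name, transe_score(vector, treats_vector, disease_vector))
        for name, vector in drug_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hand-set toy vectors (hypothetical); a real model would learn these from the graph.
    treats = [1.0, 0.0]
    disease = [1.0, 1.0]
    drugs = {"drug_a": [0.0, 1.0], "drug_b": [2.0, 2.0]}
    for name, score in rank_drugs(drugs, treats, disease):
        print(name, round(score, 3))
```

The design choice in this sketch is that a relation acts as a translation in embedding space, so a drug is ranked highly when its vector plus the relation vector lands near the disease vector.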
Recommended Citation
Qiu, Yue, "Drug Repurposing Using Gene Expression Data Mining" (2023). CUNY Academic Works.
https://academicworks.cuny.edu/gc_etds/5430