Date of Award
2021
Document Type
Thesis (Master's)
Department or Program
Department of Computer Science
First Advisor
Chris Bailey-Kellogg
Second Advisor
Gevorg Grigoryan
Third Advisor
Saeed Hassanpour
Abstract
Computational methods for predicting the binding interfaces between antigens and antibodies (epitopes and paratopes) are faster and cheaper than traditional experimental structure determination. A sufficiently reliable computational predictor that scales to the large sets of available antibody sequence data could thus inform and expedite many biomedical pursuits, such as understanding immune responses to vaccination and natural infection and developing better drugs and vaccines. However, current state-of-the-art predictors produce discontiguous predictions, e.g., placing the epitope in many different spots on an antigen, even though in reality an epitope typically comprises a single localized region. We seek to produce contiguous predicted epitopes by accounting for long-range spatial relationships between residues. To that end, we build a novel Graph Convolution Network (GCN) that performs graph convolutions at multiple resolutions so as to represent and constrain long-range spatial dependencies. In evaluation on a standard epitope prediction benchmark, the multi-resolution approach yields a significant boost over a previous state-of-the-art GCN predictor: half of the test cases increase in AUC-PR by an average of 0.15, while the other half decrease by only 0.05. We further introduce a clustering algorithm that takes advantage of the contiguity yielded by our model, grouping the raw predictions into a small set of discrete potential epitopes. We show that for 73% of test cases, the top 3 clusters include one that covers most of the actual epitope, demonstrating the utility of contiguous predictions for guiding experimental methods by yielding a small set of reasonable hypotheses for further investigation.
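The abstract does not say how the clustering step works, so the following is only a minimal illustrative sketch of the general idea of grouping per-residue predictions into a few spatially contiguous candidate epitopes. It is not the algorithm from the thesis. The function name `cluster_predictions`, the score threshold, the distance cutoff, and the ranking by summed score are all assumptions made for illustration.

```python
from math import dist


def cluster_predictions(coords, scores, threshold=0.5, cutoff=10.0, top_k=3):
    """Group high-scoring residues into spatially contiguous clusters.

    Hypothetical sketch, not the thesis's method:
    coords    -- one (x, y, z) position per residue
    scores    -- one predicted epitope score per residue
    threshold -- assumed minimum score for a residue to be kept
    cutoff    -- assumed maximum distance for two residues to be linked
    top_k     -- number of highest-ranked clusters to return
    """
    # Keep only residues whose predicted score passes the threshold.
    unvisited = {i for i, s in enumerate(scores) if s >= threshold}
    clusters = []

    # Each connected component of the "within cutoff" graph is one cluster.
    while unvisited:
        seed = unvisited.pop()
        stack, members = [seed], [seed]
        while stack:
            i = stack.pop()
            near = [j for j in unvisited if dist(coords[i], coords[j]) <= cutoff]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
            members.extend(near)
        clusters.append(sorted(members))

    # Rank clusters by total predicted score (an assumed ranking criterion).
    clusters.sort(key=lambda c: sum(scores[i] for i in c), reverse=True)
    return clusters[:top_k]
```

Called with per-residue coordinates and scores, the function returns up to `top_k` lists of residue indices, each list standing for one candidate epitope that could be passed on for experimental follow-up.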
Recommended Citation
Oh, Lisa, "A Multi-Resolution Graph Convolution Network for Contiguous Epitope Prediction" (2021). Dartmouth College Master’s Theses. 46.
https://digitalcommons.dartmouth.edu/masters_theses/46
Included in
Bioinformatics Commons, Biology Commons, Computer Sciences Commons, Data Science Commons