Date of Award

Spring 3-14-2024

Document Type

Thesis (Ph.D.)

Department or Program

Quantitative Biomedical Sciences

First Advisor

Saeed Hassanpour

Abstract

Accurate prediction of patient outcomes is crucial for shared clinical decision-making, treatment planning, and patients' psychological adjustment. Histopathological features of cancer, including tumor size, lymph node involvement, and metastasis, are commonly incorporated into survival prediction models, underscoring the prognostic value of whole slide images (WSIs). Concurrently, studies have highlighted the significance of omics data, such as transcriptomics, in providing valuable insights into cancer prognosis.

Emerging deep learning methods have brought new opportunities to biomedical informatics. Despite a growing body of studies applying deep learning to prognosis prediction from WSIs, results vary, primarily because of the extensive size of WSIs and the limited availability of labeled datasets. For omics data, high dimensionality poses a widely known computational challenge, and whole transcriptome sequencing is cost-prohibitive in clinical settings. Additionally, while various fusion approaches have been developed for prediction tasks involving multiple modalities, their effectiveness varies by context, so the appropriate method must be chosen carefully for each task.

In this thesis, we present comprehensive deep learning frameworks to extract prognostic information from clinical data, WSIs, and omics data, leveraging recent methodological breakthroughs. We developed a self-supervised vision transformer model (MaskHIT) to analyze WSIs and automatically delineate their regions of interest, without explicit annotations, for prognosis prediction. We also pre-trained a BERT-like model on gene expression information (GexBERT) to represent genetic biomarkers and evaluated its utility in assisting survival prediction when only limited data are available or when data are missing. Finally, using data collected from the New Hampshire Colonoscopy Registry (NHCR), we combined MaskHIT-derived feature representations with demographic and clinical information and explored multiple fusion techniques to predict patients' risk five years after the baseline.

This thesis contributes to the advancement of prognostic prediction in cancer through innovative deep learning approaches. By addressing challenges in WSI analysis and leveraging gene expression associations, the research enhances the accuracy and applicability of prognosis prediction models, offering valuable insights for improving patient outcome prediction in oncology.

Available for download on Wednesday, May 14, 2025