Document Type
Technical Report
Publication Date
Spring 2025
Technical Report Number
TR2025-1004
Faculty Approver
Soroush Vosoughi
Abstract
As digital pathology becomes increasingly widespread, it is critical to develop machine learning solutions that can make use of the resulting image data. While other image modalities have seen rapid growth in available methods, the same has not been true for histopathology images. This is likely in part because histopathology whole slide images possess unique characteristics that prevent existing methods from being applied as-is.
In this thesis, we identify and propose solutions to three open problems with histopathology images: (1) large raw image size (up to 150,000×150,000 pixels), (2) low class-positivity (a low ratio of positive to negative patches), and (3) limited image availability, with existing images having weak or no labels. We address the large raw image size problem by designing a knowledge distillation-based approach that significantly reduces computational cost with only a modest decrease in classification performance. The computational cost reductions are substantial enough to enable real-time use in clinical scenarios. For the low class-positivity issue, we develop a custom view generation approach for self-supervised representation learning. This view generation approach takes advantage of the low class-positivity to increase the number of possible view pairings and produce better classification outcomes. Lastly, we present an image generation approach that uses existing pairs of images and spatial transcriptomics data to generate synthetic histopathology patches. We demonstrate that these generated patches are clinically useful through evaluations including nuclei distribution quantification and downstream tasks.
Dartmouth Digital Commons Citation
DiPalma, Joseph, "Deep Learning for Fine-Grained Digital Histopathology Image Analysis" (2025). Computer Science Technical Report TR2025-1004. https://digitalcommons.dartmouth.edu/cs_tr/386
