Date of Award

Spring 5-31-2023

Document Type

Thesis (Undergraduate)


Computer Science

First Advisor

Inas Khayal


Improving patient-centered care requires accurate documentation of care preferences, a crucial aspect that is often underrepresented in administrative data. Most studies examine care documentation in specific patient populations rather than in the more appropriately broad population of 'seriously ill' patients. This paper addresses this gap by leveraging transformer-based machine learning models, which improve on traditional keyword-based search methods in identifying care preference documentation.

To capture a broad spectrum of seriously ill patients, we matched decedent patients to non-decedent counterparts using propensity score matching, accounting for important variables such as age, gender, primary diagnoses, and comorbidities. We trained and fine-tuned Bio_ClinicalBERT and ClinicalLongformer models on a large dataset of patient discharge summaries from last-visit admissions and admissions within 6 months of the last visit. By concentrating on key textual components within these summaries, we enhanced the signal-to-noise ratio, which in turn allowed the models to capture contextually nuanced mentions of care preference documentation that traditional keyword-based search methods miss.
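As an illustration of the matching step, the following is a minimal sketch of greedy 1:1 nearest-neighbor matching on propensity scores. It is not the thesis's implementation: the function name `match_nearest`, the caliper of 0.05, the patient ids, and the scores are all hypothetical, and the scores are assumed to have been estimated beforehand (for example, by a model fitted on age, gender, primary diagnoses, and comorbidities).

```python
def match_nearest(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping a patient id to a propensity score.
    Returns a dict {treated_id: control_id}; treated patients with no
    control within the caliper are left unmatched.
    """
    available = dict(controls)  # copy so the caller's dict is not mutated
    pairs = {}
    # Visit treated patients in score order so the result is reproducible.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control; ties are broken by id.
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        if abs(available[c_id] - t_score) <= caliper:
            pairs[t_id] = c_id
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical propensity scores (assumed, not taken from the thesis).
decedents = {"d1": 0.82, "d2": 0.40, "d3": 0.10}
non_decedents = {"n1": 0.80, "n2": 0.43, "n3": 0.45, "n4": 0.30}
pairs = match_nearest(decedents, non_decedents)
```

In this toy example `d1` and `d2` each receive a counterpart, while `d3` stays unmatched because no control lies within the caliper. Dropping such patients trades sample size for closer balance between the two groups.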

These models demonstrated high sensitivity and specificity relative to industry-standard keyword-search methods and proved adept at interpreting complex clinical concepts, particularly in the multi-label text classification task of identifying care preference subdomains such as goals-of-care conversations, code status clarification, referral to a palliative care specialist, and hospice care. This study makes a strong case for continued domain-specific pre-training of language models, particularly in the biomedical domain. Our findings contribute to enhancing end-of-life communication and aligning treatment with patients' care objectives, and they pave the way for future research in this promising domain, with potential implications for improving patient care quality.
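To make the comparison concrete, the sketch below shows what a keyword-based multi-label baseline and the two reported metrics might look like. The keyword lists, label names, and example note are hypothetical and are not taken from the thesis; only the definitions of sensitivity (TP / (TP + FN)) and specificity (TN / (TN + FP)) are standard.

```python
import re

# Hypothetical keyword lists, one per care preference subdomain.
KEYWORDS = {
    "goals_of_care": [r"goals of care", r"family meeting"],
    "code_status": [r"\bDNR\b", r"\bDNI\b", r"code status", r"full code"],
    "palliative_care": [r"palliative"],
    "hospice": [r"hospice"],
}
PATTERNS = {
    label: re.compile("|".join(terms), re.IGNORECASE)
    for label, terms in KEYWORDS.items()
}


def keyword_labels(text):
    """Return the set of subdomain labels whose keywords occur in the text."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}


def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for one binary label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical discharge-summary fragment.
note = "Patient is DNR/DNI. Family wishes to focus on comfort; hospice referral placed."
labels = keyword_labels(note)

# Hypothetical gold labels and predictions for a single subdomain.
sens, spec = sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Here the keyword search tags the note with `code_status` and `hospice` but not `goals_of_care`, even though "focus on comfort" arguably describes a goal of care. Mentions of this kind, phrased without any listed keyword, are the ones a contextual model is meant to recover.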

