Date of Award
Spring 5-31-2023
Document Type
Thesis (Undergraduate)
Department
Computer Science
First Advisor
Inas Khayal
Abstract
Improving patient-centered care requires accurate documentation of care preferences, a crucial aspect often underrepresented in administrative data. Most studies examine care preference documentation in specific patient populations rather than in the broader, more appropriate population of "seriously ill" patients. This paper addresses this gap by leveraging transformer-based machine learning models, which improve on traditional keyword-based search methods in identifying care preference documentation.
To capture a broad spectrum of seriously ill patients, we matched decedent patients to non-decedent counterparts using propensity score matching, accounting for important variables such as age, gender, primary diagnoses, and comorbidities. We trained and fine-tuned Bio_ClinicalBERT and ClinicalLongformer models on a large dataset of patient discharge summaries from last-visit admissions and admissions within 6 months of the last visit. By concentrating on key textual components within these summaries, we enhanced the signal-to-noise ratio, which in turn allowed the models to capture contextually nuanced mentions of care preference documentation missed by traditional keyword-based search methods.
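The matching step described above can be illustrated with a minimal sketch. It assumes propensity scores have already been estimated (for example, by a logistic regression on age, gender, diagnoses, and comorbidities) and uses greedy 1:1 nearest-neighbor matching with a caliper. The abstract does not state which matching algorithm or caliper the thesis used, so the function `greedy_match`, the `caliper` value, and all patient IDs and scores below are hypothetical.

```python
from typing import Dict, List, Tuple


def greedy_match(
    treated: Dict[str, float],
    control: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on precomputed propensity scores.

    `treated` and `control` map a patient ID to an estimated propensity score.
    Each control is used at most once; a treated patient whose nearest
    remaining control is farther away than `caliper` is left unmatched.
    """
    available = dict(control)  # copy so the caller's dict is not modified
    pairs: List[Tuple[str, str]] = []
    # Iterate in sorted ID order so the result is reproducible.
    for t_id, t_score in sorted(treated.items()):
        if not available:
            break
        # Nearest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores, for illustration only (not data from the thesis).
decedents = {"d1": 0.81, "d2": 0.40, "d3": 0.10}
non_decedents = {"n1": 0.79, "n2": 0.43, "n3": 0.55, "n4": 0.95}
matches = greedy_match(decedents, non_decedents)
print(matches)
```

In this toy example, `d3` stays unmatched because no remaining non-decedent falls within the caliper. Greedy matching is only one of several options; optimal matching or matching with replacement would pair patients differently.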
These models demonstrated high sensitivity and specificity relative to industry-standard keyword-search methods and proved adept at interpreting complex clinical concepts, particularly in the multi-label text classification task of identifying care preference subdomains such as goals-of-care conversations, code status clarification, referral to a palliative care specialist, and hospice care. This study serves as a strong argument for the continued need for domain-specific pre-training of language models, particularly in the biomedical domain. Our findings not only contribute to enhancing end-of-life communication and aligning treatment with patients' care objectives but also pave the way for future research in this promising domain, with potential implications for improving patient care quality.
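To make the comparison concrete, the sketch below shows how a keyword-search baseline for the four subdomains could be scored with per-label sensitivity and specificity. The keyword lists, example notes, and gold labels are all hypothetical and are not taken from the thesis; the point is only to show how a note that implies a goals-of-care discussion without using a listed phrase lowers the baseline's sensitivity.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical keyword lists; the thesis's actual search terms are not given here.
KEYWORDS: Dict[str, List[str]] = {
    "goals_of_care": ["goals of care", "family meeting"],
    "code_status": ["dnr", "full code", "code status"],
    "palliative_care": ["palliative care"],
    "hospice": ["hospice"],
}


def keyword_labels(text: str) -> Set[str]:
    """Return every label whose keyword list has a case-insensitive substring hit."""
    lowered = text.lower()
    return {
        label
        for label, terms in KEYWORDS.items()
        if any(term in lowered for term in terms)
    }


def sensitivity_specificity(
    gold: List[Set[str]], pred: List[Set[str]], label: str
) -> Tuple[float, float]:
    """Per-label sensitivity (TP / (TP + FN)) and specificity (TN / (TN + FP))."""
    tp = fn = tn = fp = 0
    for g, p in zip(gold, pred):
        if label in g:
            if label in p:
                tp += 1
            else:
                fn += 1
        elif label in p:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical notes and gold annotations, for illustration only.
notes = [
    "Family meeting held; patient confirmed DNR status.",
    "Patient wishes to focus on comfort and go home with support.",
    "Referred to hospice after palliative care consult.",
    "Routine follow-up, no changes.",
]
gold = [
    {"goals_of_care", "code_status"},
    {"goals_of_care"},  # implied by context, but no listed keyword appears
    {"hospice", "palliative_care"},
    set(),
]
pred = [keyword_labels(note) for note in notes]

for label in KEYWORDS:
    print(label, sensitivity_specificity(gold, pred, label))
```

The same scoring function could be applied to the thresholded outputs of a fine-tuned multi-label classifier, which would allow a like-for-like comparison on each subdomain.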
Recommended Citation
Arora, Saksham, "Utilizing Natural Language Processing for Automated Clinical Text Review: Identification of Care Preference Documentation in Patients’ Discharge Summaries" (2023). Computer Science Senior Theses. 18.
https://digitalcommons.dartmouth.edu/cs_senior_theses/18