Computer Science Senior Theses

Utilizing Natural Language Processing for Automated Clinical Text Review: Identification of Care Preference Documentation in Patients’ Discharge Summaries

Saksham Arora, Dartmouth College

Date of Award

Spring 5-31-2023

Document Type

Thesis (Undergraduate)

Department

Computer Science

First Advisor

Inas Khayal

Abstract

Improving patient-centered care necessitates accurate documentation of care preferences, a crucial aspect often underrepresented in administrative data. Most studies apply care documentation to specific patient populations rather than to the more appropriately broad population of 'seriously ill' patients. This paper addresses this gap by leveraging transformer-based machine learning models, which exhibit an improvement over traditional keyword-based search methods in identifying care preference documentation.

To capture a broad spectrum of seriously ill patients, we matched decedent patients to non-decedent counterparts using propensity score matching, accounting for important variables such as age, gender, primary diagnoses and comorbidities. We trained and fine-tuned Bio_ClinicalBERT and ClinicalLongformer models on a large dataset of patient discharge summaries from last-visit admissions and from admissions within 6 months of the last visit. By concentrating on key textual components within these summaries, we were able to enhance the signal-to-noise ratio, which in turn led to the capture of contextually nuanced mentions of care preference documentation that traditional keyword-based search methods missed.
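As a rough illustration of the matching step, the sketch below pairs each decedent with the closest unused non-decedent by propensity score. It is a minimal sketch under stated assumptions, not the thesis's procedure: the scores are assumed to have been estimated already (for example, by a logistic regression on the matching variables), and the function name `greedy_match`, the caliper of 0.05, the greedy 1:1 strategy, and all patient IDs and scores are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def greedy_match(
    treated: Dict[str, float],
    controls: Dict[str, float],
    caliper: Optional[float] = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on precomputed propensity scores.

    `treated` and `controls` map a patient ID to a propensity score.
    A treated patient is left unmatched when the closest remaining
    control is farther away than `caliper`.
    """
    available = dict(controls)  # controls not yet used in a pair
    pairs: List[Tuple[str, str]] = []
    # Iterate in sorted ID order so the result is reproducible.
    for t_id, t_score in sorted(treated.items()):
        if not available:
            break
        # Closest remaining control; ties are broken by control ID.
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable match for this patient
        pairs.append((t_id, c_id))
        del available[c_id]  # match without replacement
    return pairs


# Made-up scores for illustration only.
decedents = {"d1": 0.81, "d2": 0.40, "d3": 0.10}
non_decedents = {"n1": 0.79, "n2": 0.42, "n3": 0.95, "n4": 0.30}
pairs = greedy_match(decedents, non_decedents)
print(pairs)
```

In this toy example `d3` stays unmatched because its nearest remaining control lies outside the caliper. Greedy matching is order-dependent; optimal matching or matching with replacement are common alternatives, and the abstract does not say which variant was used.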

These models demonstrated high sensitivity and specificity compared to industry-standard keyword-search methods and proved adept at interpreting complex clinical concepts, particularly in the multi-label text classification task of identifying care preference subdomains such as goals-of-care conversations, code status clarification, referral to a palliative care specialist, and hospice care. This study serves as a strong argument for the continued need for domain-specific pre-training of language models, particularly in the biomedical domain. Our findings not only contribute to enhancing end-of-life communication and aligning treatment with patients' care objectives but also pave the way for future research in this promising domain, with potential implications for improving patient care quality.
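To make the comparison concrete, the following sketch shows one way a keyword-search baseline could assign the four subdomain labels and how per-label sensitivity and specificity could be computed against annotated labels. Everything specific here is an assumption for illustration: the keyword patterns, the label names, the example notes, and the gold labels are invented and are not taken from the thesis or its data.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical keyword patterns for each care preference subdomain.
KEYWORDS: Dict[str, List[str]] = {
    "goals_of_care": [r"goals of care", r"family meeting"],
    "code_status": [r"\bdnr\b", r"\bdni\b", r"code status", r"full code"],
    "palliative_care": [r"palliative"],
    "hospice": [r"hospice"],
}


def keyword_labels(text: str) -> Set[str]:
    """Return every label whose keyword patterns appear in the note."""
    lowered = text.lower()
    return {
        label
        for label, patterns in KEYWORDS.items()
        if any(re.search(p, lowered) for p in patterns)
    }


def sensitivity_specificity(
    gold: List[Set[str]], pred: List[Set[str]], label: str
) -> Tuple[float, float]:
    """Per-label sensitivity and specificity for multi-label predictions."""
    tp = fn = tn = fp = 0
    for g, p in zip(gold, pred):
        if label in g:
            if label in p:
                tp += 1
            else:
                fn += 1
        else:
            if label in p:
                fp += 1
            else:
                tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Invented notes and gold labels, for illustration only.
notes = [
    "Code status changed to DNR/DNI after discussion with family.",
    "Patient wishes to focus on comfort; family aware.",
    "Discharged home with physical therapy follow-up.",
    "No hospice referral is indicated at this time.",
]
gold = [{"code_status"}, {"goals_of_care"}, set(), set()]
pred = [keyword_labels(n) for n in notes]

for name in KEYWORDS:
    print(name, sensitivity_specificity(gold, pred, name))
```

The second note expresses a care goal without any listed keyword, and the fourth mentions hospice only in a negated context, so the keyword baseline misses one and falsely flags the other. These are the kinds of contextually nuanced mentions the abstract says keyword search handles poorly.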

Recommended Citation

Arora, Saksham, "Utilizing Natural Language Processing for Automated Clinical Text Review: Identification of Care Preference Documentation in Patients’ Discharge Summaries" (2023). Computer Science Senior Theses. 18.
https://digitalcommons.dartmouth.edu/cs_senior_theses/18

Included in

Bioinformatics Commons, Computer Sciences Commons
