Date of Award
Spring 2022
Document Type
Thesis (Undergraduate)
Department
Computer Science
First Advisor
Sarah Masud Preum
Abstract
This work explores entity-based sentiment analysis for textual health advice using deep learning. We fine-tuned a pretrained BERT model to classify sentiment toward entities in five predetermined categories (food, medicine, disease, exercise, and vitality) as positive, negative, or neutral. The experiments used an original annotated medical dataset from Dartmouth College’s Persist Lab. To tailor the data to entity-based sentiment analysis, we explored data transformation techniques for generating effective training examples. The experiments showed that the wide variety and complexity of terms in the medicine and disease categories posed difficulty for the BERT model, which we mitigated with masking techniques. We also demonstrated that our model struggles to learn neutral sentiment when instances of all labels are balanced (in both training and testing). Our system achieved an overall F1 score of 0.739 when the experiment was run with masked text on a balanced dataset.
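The abstract does not say how the masking was carried out. One plausible reading is that each entity mention is replaced with a placeholder naming its category, so the model sees a small fixed vocabulary instead of many rare medicine and disease terms. The sketch below illustrates that idea only; the function `mask_entities`, the `[CATEGORY]` placeholder format, and the example sentence are assumptions for illustration, not the thesis's implementation.

```python
import re


def mask_entities(text, entities):
    """Replace each entity mention with a placeholder naming its category.

    `entities` maps a surface term (e.g. "ibuprofen") to a category
    (e.g. "medicine"). The function and the [CATEGORY] placeholder format
    are hypothetical, chosen here only to illustrate the idea.
    """
    if not entities:
        return text
    # Try longer terms first so "heart disease" wins over "disease".
    terms = sorted(entities, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        flags=re.IGNORECASE,
    )
    # Case-insensitive lookup from matched term to its category.
    lookup = {term.lower(): cat for term, cat in entities.items()}
    return pattern.sub(
        lambda m: "[" + lookup[m.group(1).lower()].upper() + "]", text
    )


masked = mask_entities(
    "Take ibuprofen to relieve migraine pain.",
    {"ibuprofen": "medicine", "migraine": "disease"},
)
print(masked)  # Take [MEDICINE] to relieve [DISEASE] pain.
```

Under this assumption, the masked sentence rather than the original would be passed to the BERT tokenizer, with the sentiment label left unchanged.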
Recommended Citation
Chung, Dae Lim, "Entity Based Sentiment Analysis for Textual Health Advice" (2022). Computer Science Senior Theses. 17.
https://digitalcommons.dartmouth.edu/cs_senior_theses/17