Date of Award

Winter 12-13-2023

Document Type

Thesis (Ph.D.)

Department or Program

Computer Science

First Advisor

Dan Rockmore

Abstract

Transformer models have achieved remarkable success across a wide variety of domains, spanning not only a multitude of tasks within natural language processing but also computer vision, speech, and reinforcement learning. This success is largely attributed to the self-attention mechanism, particularly its ability to scale in performance as the number of parameters grows. Extensive effort has gone into studying the major linguistic properties these models learn over the course of pretraining. However, the role of certain finer linguistic phenomena present in language, and their utilization by Transformers, has received far less scientific inquiry. One such phenomenon is the phonetic aspect of language, which we theorize plays an important part in understanding its usage and is a significant facet of the stylistic fingerprint of any given text. In this work, we address this property of English by focusing on poetic properties as expressed in the "model organism" of limericks; that is, we study the Transformers' awareness of and reliance on poetic concepts using limericks as a case study. To this end, we introduce a dataset of limericks, a suite of tasks designed to study the comprehension of rhyming based on this dataset, and a range of algorithmic filters that identify limericks among limerick-like texts. Treating Transformers as a black box, we use these tools to analyze the presence of poetic information in the decision-making process of these models during text generation. Lastly, we present a phonetic training regimen, which we hope will shed light on factors that enhance the ability of Transformers to utilize rhyming in different contexts.

Original Citation

@phdthesis{abdibayev2023enhancing,
  title={Probing and enhancing the reliance of Transformer models on poetic information},
  author={Abdibayev, Almas},
  year={2023},
  school={Dartmouth College}
}
