Date of Award
Spring 5-15-2024
Document Type
Thesis (Master's)
Department or Program
Computer Science
First Advisor
Soroush Vosoughi
Second Advisor
Peter Chin
Third Advisor
SouYoung Jin
Abstract
Human natural language communication frequently relies on extra-linguistic information to fill in gaps in the linguistic signal left by semantic underspecification, or the omission of details that can be inferred from prior knowledge or other modalities. Underspecification is particularly common in conversations between acquaintances, since these interlocutors share context. Underspecification is a key and beneficial feature of natural language that improves efficiency, although it can cause communication to fail if it is not resolved correctly. For language models to communicate effectively and in a human-like fashion, they must learn how to recognize and utilize underspecified language. This thesis argues that proficiency with underspecification and ambiguity is a critical component of a pragmatic approach to machine learning systems. It then explores recent work with the goal of clarifying the current state of research into the impact of these phenomena in machine learning. Finally, it suggests potential strategies to make progress on quantifying and enhancing machine comprehension of underspecification, offering directions for future research to improve the communicative fluency of models.
Recommended Citation
Gottesman, Zachary S., "Towards Machine Proficiency With Semantic Underspecification" (2024). Dartmouth College Master’s Theses. 144.
https://digitalcommons.dartmouth.edu/masters_theses/144