Date of Award
Spring 5-28-2024
Document Type
Thesis (Undergraduate)
Department
Computer Science
First Advisor
Alberto Quattrini Li
Second Advisor
Jeremy Manning
Abstract
Humans excel at understanding the thoughts and intentions of others (theory of mind) and leverage this ability to learn and adapt in social environments. However, replicating this capability in artificial agents remains a challenge. This paper explores the gap between the fast, efficient learning often achieved by Reinforcement Learning (RL) algorithms and the interpretability and adaptability desired in agents that interact with humans. We propose a novel approach that integrates an inference network within existing RL frameworks, allowing agents to reason about the beliefs of others (nested reasoning) while learning optimal actions. Our method leverages approximate solutions to the I-POMDP framework, known for its ability to model complex multi-agent interactions, by adopting neural function approximators to reuse computation across the different beliefs an agent could hold. We explore the feasibility of our method across environments testing prosocial, adversarial, and agent-agnostic relationships, and show that training policy networks on probabilistic inferences yields better performance than traditional model-free methods. This approach offers several advantages, namely improved generalization to unseen tasks and the ability to trace an agent's behavior back to its evolving understanding of others' goals. This research paves the way for more interpretable and adaptable intelligent agents with applications in various real-world scenarios.
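To make the architecture described above concrete, the sketch below pairs a belief-inference network with a policy network that conditions on the inferred belief. This is a minimal illustration in PyTorch of the general idea, not the thesis implementation; the class names (BeliefInferenceNet, BeliefConditionedPolicy), dimensions, and usage are hypothetical assumptions.

    # Hypothetical sketch: an inference network estimates a belief over
    # another agent's possible goals, and a policy network conditions on
    # that belief when producing action logits. Names and sizes are
    # illustrative assumptions, not the authors' actual architecture.
    import torch
    import torch.nn as nn

    class BeliefInferenceNet(nn.Module):
        """Maps an observation to a distribution over the other agent's goals."""
        def __init__(self, obs_dim: int, num_goals: int, hidden: int = 64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, num_goals)
            )

        def forward(self, obs: torch.Tensor) -> torch.Tensor:
            # Softmax yields a probabilistic belief over the goal set.
            return torch.softmax(self.net(obs), dim=-1)

    class BeliefConditionedPolicy(nn.Module):
        """Chooses actions given the observation and the inferred belief."""
        def __init__(self, obs_dim: int, num_goals: int, num_actions: int, hidden: int = 64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + num_goals, hidden), nn.ReLU(), nn.Linear(hidden, num_actions)
            )

        def forward(self, obs: torch.Tensor, belief: torch.Tensor) -> torch.Tensor:
            # Concatenating the belief lets the policy adapt to what it
            # currently infers about the other agent's goal.
            return self.net(torch.cat([obs, belief], dim=-1))

    # Usage: infer a belief from the observation, then act on both jointly.
    obs = torch.randn(1, 8)
    infer = BeliefInferenceNet(obs_dim=8, num_goals=3)
    policy = BeliefConditionedPolicy(obs_dim=8, num_goals=3, num_actions=4)
    logits = policy(obs, infer(obs))
    print(logits.shape)  # torch.Size([1, 4])

Feeding the policy an explicit belief distribution, rather than a raw observation alone, is what would let an agent's behavior be traced back to its evolving estimate of others' goals.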
Recommended Citation
Jha, Kunal; Manning, Jeremy R.; and Li, Alberto Quattrini, "RIPL: Recursive Inference for Policy Learning" (2024). Computer Science Senior Theses. 48.
https://digitalcommons.dartmouth.edu/cs_senior_theses/48