Date of Award

Spring 5-28-2024

Document Type

Thesis (Undergraduate)

Department

Computer Science

First Advisor

Alberto Quattrini Li

Second Advisor

Jeremy Manning

Abstract

Humans excel at understanding the thoughts and intentions of others (theory of mind) and leverage this ability to learn and adapt in social environments. However, replicating this capability in artificial agents remains a challenge. This paper explores the gap between the fast, efficient learning often achieved by Reinforcement Learning (RL) algorithms and the interpretability and adaptability desired in agents interacting with humans. We propose a novel approach that integrates an inference network within existing RL frameworks, allowing agents to reason about the beliefs of others (nested reasoning) while learning optimal actions. Our method leverages approximate solutions to the I-POMDP framework, known for its ability to model complex multi-agent interactions, by adopting neural function accelerators for reusable computation across the different beliefs an agent could hold. We explore the feasibility of our method across environments testing prosocial, adversarial, and agent-agnostic relationships, and show how training policy networks on probabilistic inferences results in better performance than traditional model-free methods. This approach offers several advantages, namely improved generalization to unseen tasks and the ability to trace an agent's behavior back to its evolving understanding of others' goals. This research paves the way for more interpretable and adaptable intelligent agents with applications in various real-world scenarios.
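
The abstract describes an architecture in which an inference network estimates what another agent believes or wants, and a policy network conditions its actions on that inferred belief. The following is only a minimal illustrative sketch of that general idea, not the thesis's actual I-POMDP approximation; the module names, network sizes, goal count, and two-network split are all assumptions made for illustration.

```python
# Minimal sketch (illustrative only, not the thesis implementation):
# an inference network turns the other agent's observed behavior into a
# belief over its goals, and a policy network conditions on that belief.
import torch
import torch.nn as nn

class GoalInferenceNet(nn.Module):
    """Maps an observation history to a distribution over the other agent's goals."""
    def __init__(self, obs_dim: int, hidden_dim: int, num_goals: int):
        super().__init__()
        self.encoder = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_goals)

    def forward(self, obs_history: torch.Tensor) -> torch.Tensor:
        _, h = self.encoder(obs_history)             # summarize observed behavior
        return torch.softmax(self.head(h[-1]), dim=-1)  # belief over goals

class BeliefConditionedPolicy(nn.Module):
    """Chooses actions given the agent's own observation and its belief about the other agent."""
    def __init__(self, obs_dim: int, num_goals: int, num_actions: int, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + num_goals, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_actions),
        )

    def forward(self, obs: torch.Tensor, belief: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, belief], dim=-1))  # action logits

# Illustrative usage with random data (dimensions are arbitrary assumptions).
infer = GoalInferenceNet(obs_dim=8, hidden_dim=32, num_goals=3)
policy = BeliefConditionedPolicy(obs_dim=8, num_goals=3, num_actions=4)
history = torch.randn(1, 10, 8)      # 10 observed steps of the other agent
belief = infer(history)              # e.g. tensor([[0.2, 0.5, 0.3]])
logits = policy(torch.randn(1, 8), belief)
```

Because the policy receives an explicit belief vector, its decisions can be traced back to the inferred goals of the other agent, which is the interpretability property the abstract highlights.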
