Author ORCID Identifier

https://orcid.org/0009-0008-5167-9048

Date of Award

Spring 5-20-2026

Document Type

Thesis (Master's)

Department or Program

Computer Science

First Advisor

Soroush Vosoughi

Abstract

Vision Language Models (VLMs) have become a popular tool for everyday use in many domains. However, these models exhibit many flaws, including poor visual acuity, weak geometric reasoning, and internalized biases. In this work, we focus on improving model performance on visual acuity and geometric reasoning, using tasks in which the VLM is asked, for example, to check whether two circles are touching or to count the number of shapes in an image. We contribute a schema-based framework for improving model performance on these tasks. This framework comprises three key components: (1) a proof-of-concept pipeline adaptable for inference, (2) a Reinforcement Learning framework designed to optimize the schemas, and (3) a model-based verification and refinement pipeline that iteratively improves schema quality. We evaluate this pipeline across closed-source and open-source models, showing significant performance boosts as well as indications of the concept’s efficacy. Our code is available at https://github.com/carlosguealv/vlms_symb_project.

Recommended Citation

Guerrero Alvarez, Carlos, "MAKING VLMS LESS BLIND: A NEURO-SYMBOLIC APPROACH" (2026). Dartmouth College Master’s Theses. 311.
https://digitalcommons.dartmouth.edu/masters_theses/311

Dartmouth College Master’s Theses

MAKING VLMS LESS BLIND: A NEURO-SYMBOLIC APPROACH

