Date of Award

Spring 6-1-2021

Document Type

Thesis (Undergraduate)

Department or Program

Department of Computer Science

First Advisor

Daniel Rockmore

Second Advisor

Allen Riddell

Abstract

The migration of datasets online has created a near-infinite inventory for big-name retailers such as Amazon and Netflix, giving rise to recommendation systems that help users navigate the massive catalog. It has also allowed retailers to stock far less popular, uncommon items that would not appear in a traditional brick-and-mortar setting because of the cost of storage. Indeed, previous work has highlighted the profit potential that lies in the so-called "long tail" of niche, unpopular items. Unfortunately, because data on this subset of the inventory is limited, recommendation systems often struggle to make useful suggestions within the long tail, leaving them prone to popularity bias.

Our work explores the approaches that recommendation systems typically employ and evaluates the performance of each on various subsets of the Netflix Prize data in order to determine where each approach performs best. We survey collaborative filtering approaches, content-based filtering approaches, and hybrid mechanisms that combine the two. We analyze their behavior on the most popular items, the least popular items, and a composite of the two subsets, and we judge their performance by the quality of the clusters they produce.
