Date of Award

Spring 5-14-2021

Document Type

Thesis (Master's)


Department of Computer Science

First Advisor

Sebastiaan Joosten

Second Advisor

Douglas McIlroy

Third Advisor

Sean Smith


This paper describes a programming tool, DoRP, that detects the differences between two LLVM programs. In particular, we wish to enable a comparison of the optimized and unoptimized output of GHC (Glasgow Haskell Compiler) at the LLVM stage. Because LLVM is a low-level language, programs written in it are often large and hard to read. Moreover, a generated LLVM program is not only large but also has randomly named variables and randomly ordered functions, which increases its complexity. We therefore designed DoRP to help users quickly understand the relationship between two LLVM programs.

This tool is designed to compare two machine-generated LLVM programs. It performs parameterized matching to find matching statements, with or without substitution of variable names, regardless of semantically insignificant reordering of these statements. Our method parses LLVM programs into a Haskell AST (abstract syntax tree) datatype, uses these ASTs to build a graph, and finally finds the matched statements using matching theory. DoRP can also reorder the input programs to improve readability. The output is shown in a spreadsheet, in which users can directly see the structural similarity of the two programs and adjust the output format. This paper discusses the method we used, our experiments and results, and future work.
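
To make the idea of parameterized matching concrete, the following is a minimal Python sketch, not DoRP's implementation (which is written in Haskell and works on ASTs and a graph). It assumes that statements are plain strings, that any token starting with `%` or `@` is a renameable variable, and that a simple greedy search is enough for illustration. The names `tokenize`, `match_statement`, and `match_program` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A renaming is kept as two maps (A -> B and B -> A) so it stays one-to-one.
Renaming = Tuple[Dict[str, str], Dict[str, str]]


def tokenize(statement: str) -> List[str]:
    """Split a statement on whitespace, treating commas as separate tokens."""
    return statement.replace(",", " , ").split()


def is_variable(token: str) -> bool:
    """Assumption: LLVM-style %local and @global names are renameable."""
    return token.startswith("%") or token.startswith("@")


def match_statement(a: List[str], b: List[str],
                    renaming: Renaming) -> Optional[Renaming]:
    """Return the extended renaming if `a` equals `b` up to a consistent,
    one-to-one renaming of variables; return None otherwise."""
    if len(a) != len(b):
        return None
    fwd, bwd = dict(renaming[0]), dict(renaming[1])  # work on copies
    for x, y in zip(a, b):
        if is_variable(x) and is_variable(y):
            # setdefault records a new pairing or returns the existing one.
            if fwd.setdefault(x, y) != y or bwd.setdefault(y, x) != x:
                return None
        elif x != y:
            return None
    return fwd, bwd


def match_program(prog_a: List[str], prog_b: List[str]):
    """Greedily pair each statement of prog_a with the first unused
    statement of prog_b that matches under the renaming built so far."""
    renaming: Renaming = ({}, {})
    used = set()
    pairs = []
    for i, a in enumerate(prog_a):
        for j, b in enumerate(prog_b):
            if j in used:
                continue
            extended = match_statement(tokenize(a), tokenize(b), renaming)
            if extended is not None:
                renaming = extended
                used.add(j)
                pairs.append((i, j))
                break
    return pairs, renaming[0]


# Two tiny made-up programs: same computation, renamed and reordered.
prog_a = ["%1 = add i32 %a, %b", "%2 = mul i32 %1, %a"]
prog_b = ["%y = mul i32 %x, %p", "%x = add i32 %p, %q"]
pairs, renaming = match_program(prog_a, prog_b)
```

Here `pairs` records which statement indices correspond, and `renaming` records the variable substitution that makes them equal. The greedy search depends on statement order and can miss a valid pairing that a graph-based matching would find, so it only illustrates the matching condition itself.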

Available for download on Saturday, June 11, 2022