Date of Award

5-1-2011

Document Type

Thesis (Undergraduate)

Department or Program

Department of Computer Science

First Advisor

Bill McKeeman

Abstract

Dynamic languages provide new challenges to traditional static analysis techniques, leaving most errors to be detected at runtime and making many properties of code difficult to infer. Ruby code usually takes advantage of both dynamic typing and metaprogramming to produce elegant yet difficult-to-analyze programs. Function eval() and its variants, which usually foil static analysis, are used frequently as a primitive runtime macro system. The goal of this thesis is to answer the question: What useful information about real-world Ruby programs can be determined statically with a high degree of accuracy? Two observations lead to a number of statically-discoverable errors and properties in parseable Ruby programs. The first is that many interesting properties of a program can be discovered through traditional static analysis techniques despite the presence of dynamic typing. The second is that most metaprogramming occurs when the program files are loaded and not during the execution of the "main program." Traditional techniques, such as flow analysis and Static Single Assignment transformations, aid extraction of program invariants, including both explicitly programmed constants and those implicitly defined by Ruby's semantics. A meaningful, well-defined distinction between load time and run time in Ruby is developed and addresses the second observation. This distinction allows us to statically discern properties of a Ruby program despite many idioms that require dynamic evaluation of code. Lastly, gradual typing through optional annotations improves the quality of error discovery and other statically-inferred properties.

Comments

Originally posted in the Dartmouth College Computer Science Technical Report Series, number TR2011-686.
