Author ORCID Identifier

0000-0002-9211-6224

Date of Award

2025

Document Type

Thesis (Ph.D.)

Department or Program

Computer Science

First Advisor

Christophe Hauser

Second Advisor

Sergey Bratus

Third Advisor

Stefan Nagy

Abstract

Fuzzing is one of the most effective techniques for improving software reliability and security. By automatically generating diverse inputs and executing programs at scale, it exposes numerous implementation flaws and design weaknesses. This dissertation enhances the effectiveness and efficiency of fuzzing through three complementary components.

First, applying fuzzing to software libraries requires high-quality harnesses that invoke APIs with valid sequences and parameters, yet manual harness development is labor-intensive and error-prone. This work proposes a framework that automatically generates fuzzing harnesses by extracting real-world API usage patterns from external source code. The underlying static analysis efficiently recovers API interactions at scale without sacrificing precision, enabling the synthesis of effective harnesses for previously untested library functions.

Second, traditional fuzzing relies on coverage-based guidance to explore a program’s input space. While effective for general code exploration, such metrics overlook deeper semantic behaviors. This dissertation introduces a fine-grained coverage guidance technique that captures program semantics beyond simple code coverage, facilitating the detection of complex logical and semantic issues that conventional fuzzers may miss.

Third, modern systems depend on standardized specifications for interoperability. However, real-world implementations often diverge from these standards, introducing subtle inconsistencies and security risks. To detect such semantic deviations, this component presents a differential testing framework that systematically compares independent implementations by generating inputs targeting potential divergence points derived from grammar-based specifications. The framework uncovered multiple previously unknown parsing vulnerabilities in widely used open-source software, demonstrating its effectiveness in identifying semantic inconsistencies beyond conventional fuzzing capabilities.
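
The comparison step described above can be illustrated with a minimal sketch. It is not the dissertation's framework: the toy grammar, the two parsers (`strict_parse`, `lenient_parse`), and the choice of whitespace padding as a divergence point are all hypothetical, picked only to show how grammar-derived inputs can expose a disagreement between two implementations of the same specification.

```python
import itertools

# Hypothetical toy grammar for a signed decimal integer:
#   number := sign? digit+
GRAMMAR = {
    "sign": ["", "+", "-"],
    "digits": ["0", "7", "42", "007"],
}

# Hypothetical divergence point: optional whitespace around the number,
# which the grammar above does not allow but some parsers tolerate.
PADDINGS = ["", " ", "\t"]


def strict_parse(text):
    """Implementation A: accepts exactly sign? digit+ and nothing else."""
    body = text[1:] if text[:1] in ("+", "-") else text
    if not body or not all(c in "0123456789" for c in body):
        return None  # reject anything outside the grammar
    value = int(body)
    return -value if text[0] == "-" else value


def lenient_parse(text):
    """Implementation B: delegates to int(), which also strips whitespace."""
    try:
        return int(text)
    except ValueError:
        return None


def generate_inputs():
    """Yield grammar-derived strings, padded at the divergence points."""
    for sign, digits, lead, trail in itertools.product(
        GRAMMAR["sign"], GRAMMAR["digits"], PADDINGS, PADDINGS
    ):
        yield f"{lead}{sign}{digits}{trail}"


def find_divergences(impl_a, impl_b, inputs):
    """Return (input, result_a, result_b) for every disagreement."""
    return [
        (text, impl_a(text), impl_b(text))
        for text in inputs
        if impl_a(text) != impl_b(text)
    ]


divergences = find_divergences(strict_parse, lenient_parse, generate_inputs())
```

In this sketch the two parsers agree on every unpadded input and disagree on every padded one, so each entry in `divergences` pairs a rejected input from the strict parser with an accepted value from the lenient one. A real differential tester would replace the toy parsers with independent implementations of the same standard and derive the divergence points from the specification's grammar.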

Together, these contributions extend fuzzing beyond standard coverage-based exploration toward deeper semantic understanding and specification conformance. By integrating automated harness generation, fine-grained semantic guidance, and differential analysis, this work advances the precision and applicability of fuzz testing, ultimately improving the robustness and security of modern software systems.

Available for download on Wednesday, January 06, 2027

Included in

Cybersecurity Commons
