Date of Award

Spring 6-9-2024

Document Type

Thesis (Undergraduate)

Department

Computer Science

First Advisor

Sean Smith

Abstract

File type identification is a vital step in automated file processing, especially in malware detection. The challenges of file type identification, and the evasion techniques that exploit them, were pointed out over a decade ago. We show that this remains the case: file type identification implementations are still fragile, especially for files with ambiguous file types. We present a novel antivirus bypass technique that uses crafted tar archives to evade all detection on VirusTotal and by numerous antivirus products: BitDefender, F-Secure, Kaspersky, Panda Dome, Trend Micro, Quick Heal, IKARUS, and Avira. These crafted files evade detection by tricking file type identification implementations, yet they can still be unpacked on end-host machines using GNU tar or 7-Zip. We show that these file type-masquerading archives are also mislabeled by popular file type identifiers. We present a survey of publicly available file type identification tools and shared file signature databases. Finally, we discuss countermeasures against this evasion technique based on detecting files with ambiguous file types.
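The countermeasure named in the abstract, detecting files with ambiguous file types, can be illustrated with a minimal sketch. It is not the thesis's implementation and does not reproduce its crafted archives. It assumes a small, hypothetical signature table (`SIGNATURES`) and two hypothetical helpers (`matching_types`, `is_ambiguous`). The magic values themselves are the commonly documented ones, such as `ustar` at offset 257 of a tar header.

```python
# Hypothetical signature table for illustration: (type name, byte offset, magic bytes).
# Real identifiers use far larger databases and more elaborate rules.
SIGNATURES = [
    ("tar", 257, b"ustar"),
    ("zip", 0, b"PK\x03\x04"),
    ("pdf", 0, b"%PDF-"),
    ("gzip", 0, b"\x1f\x8b"),
    ("elf", 0, b"\x7fELF"),
]


def matching_types(data: bytes) -> list[str]:
    """Return every type whose magic bytes appear at the expected offset."""
    return [
        name
        for name, offset, magic in SIGNATURES
        if data[offset:offset + len(magic)] == magic
    ]


def is_ambiguous(data: bytes) -> bool:
    """Flag data that matches more than one signature instead of picking a winner."""
    return len(matching_types(data)) > 1


# A 512-byte block carrying only the tar magic at offset 257.
plain_tar = bytearray(512)
plain_tar[257:262] = b"ustar"

# The same block with PDF magic at offset 0. In a tar header the first
# 100 bytes hold the member name, so arbitrary bytes can legally appear there.
dual_match = bytearray(plain_tar)
dual_match[0:5] = b"%PDF-"
```

An identifier that stops at the first matching signature would label `dual_match` by whichever rule it happens to check first. Reporting all matches lets a scanner treat the file as suspicious, or hand it to every relevant parser, rather than trusting a single guess.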
