McCormick Events | Northwestern Engineering

Zheng Yu PhD Final Defense March 4, 2026: Toward Practical Vulnerability Repair

Department of Computer Science (CS)

10:30 AM

EVENT DETAILS

The increasing scale and complexity of modern software have led to a surge in security vulnerabilities, outpacing the ability of developers to provide timely repairs. While Large Language Models (LLMs) have catalyzed advancements in Automated Vulnerability Repair (AVR), existing approaches often struggle in real-world settings due to an over-reliance on idealized assumptions, such as perfect fault localization and patch validation methodologies. This dissertation addresses these gaps by developing autonomous, LLM-based systems designed to navigate the end-to-end complexities of the repair process, including 0-day vulnerability mitigation, legacy system maintenance, and rigorous correctness evaluation.

To improve the practicality of AVR, we first introduce PatchAgent, an autonomous agent that integrates fault localization, patch generation, and patch validation. Using a language server and interaction optimization, it mimics human reasoning to repair vulnerabilities triggered by proof-of-concept (PoC) inputs, achieving a 90% success rate on real-world datasets. Extending these capabilities to 1-day vulnerabilities, we present PortGPT, a system designed to automate patch backporting for large-scale projects like the Linux kernel. By autonomously navigating Git history and refining patches based on compiler feedback, PortGPT demonstrates high efficacy in migrating security fixes to older software branches, with several generated patches successfully merged into the Linux mainline.

Finally, to address the lack of reliable evaluation standards in the field, we present PVBench, a benchmark designed for the rigorous assessment of patch correctness. Our analysis reveals that existing AVR systems frequently produce "overfitted" patches that pass basic tests but fail to capture the original developer's intent or the true root cause of the vulnerability. By introducing PoC+ tests that encode complex semantic requirements, we demonstrate that current success rates are often significantly overestimated. Collectively, this work provides a comprehensive framework for building and evaluating AVR systems that are robust, autonomous, and applicable to the demands of real-world software maintenance.

TIME Wednesday, March 4, 2026, 10:30 AM - 1:00 PM

CONTACT Wynante R Charles wynante.charles@northwestern.edu

CALENDAR Department of Computer Science (CS)