
LLM vs LLM: Fault Localization

C++ · Python · Gemma-3-12B · PEFT · QLoRA · Synthetic Bug Datasets

Automated debugging with dual LLMs for bug injection and fault localization.

This project explores a novel framework in which two large language models (LLMs) collaborate for automated fault localization. One LLM acts as a fault injector, generating synthetic buggy C++ programs, while the other acts as a debugger, fine-tuned to pinpoint the location of the injected bugs. Inspired by generative adversarial setups, the two models form an iterative loop intended to become a self-improving debugging ecosystem.
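A minimal sketch of that loop is shown below. The two model calls are stubbed out with placeholder functions for illustration; in the actual prototype they would be prompted calls to the injector and debugger LLMs, with real prompt templates, output parsing, and scoring that this summary does not cover.

```python
# Sketch of the injector/debugger loop. Both "LLM" calls are stubs here;
# in the real pipeline they would be served by the Gemma-3-12B models.

import random

CLEAN_PROGRAMS = [
    "int add(int a, int b) { return a + b; }",
    # ... more clean C++ snippets would go here ...
]

def inject_fault(clean_code: str) -> tuple[str, int]:
    """Fault-injector role: return a buggy variant plus the ground-truth bug line.
    Stubbed with a trivial mutation for illustration."""
    lines = clean_code.splitlines() or [clean_code]
    bug_line = random.randrange(len(lines))
    lines[bug_line] = lines[bug_line].replace("+", "-")  # placeholder mutation
    return "\n".join(lines), bug_line

def localize_fault(buggy_code: str) -> int:
    """Debugger role: predict the line number of the injected bug.
    Stubbed with a random guess for illustration."""
    return random.randrange(max(1, len(buggy_code.splitlines())))

def run_round(programs: list[str]) -> float:
    """One round of the loop: inject faults, localize them, measure accuracy."""
    hits = 0
    for clean in programs:
        buggy, truth = inject_fault(clean)
        pred = localize_fault(buggy)
        hits += int(pred == truth)
    return hits / len(programs)

if __name__ == "__main__":
    print(f"fault-localization accuracy: {run_round(CLEAN_PROGRAMS):.2%}")
```

In the full framework, the buggy programs produced in each round (5000 in the prototype) become the fine-tuning dataset for the debugger, and the debugger's accuracy feeds back into how the next batch of faults is generated.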

We implemented a prototype in which Gemma-3-12B generated 5000 buggy programs, and a 12B-parameter debugger model was fine-tuned on them with PEFT and QLoRA. Preliminary results showed improved localization accuracy and reduced training loss, though challenges emerged around limited compute, long training times, and the realism of the synthetic faults. The project validates the approach against prior literature and outlines future directions such as incorporating real-bug benchmarks (Defects4J, CodeNet), optimizing the training pipeline, and strengthening the LLM-vs-LLM feedback loop.
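As a rough illustration of the fine-tuning setup, the sketch below loads a Gemma-3-12B checkpoint in 4-bit and attaches LoRA adapters via Hugging Face `transformers` and `peft`. The model id, LoRA rank, and target modules are assumptions for illustration, not the exact values used in the prototype.

```python
# QLoRA fine-tuning sketch: 4-bit quantized base model + trainable LoRA adapters.
# Model id and hyperparameters below are illustrative assumptions.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL_ID = "google/gemma-3-12b-it"  # assumed checkpoint name

# Quantize the frozen base weights to 4-bit NF4 (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# Depending on the checkpoint, a multimodal loading class may be required instead.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters to the attention projections (PEFT).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trained
```

This setup keeps the 12B base weights frozen in 4-bit precision and trains only the low-rank adapters, which is what made fine-tuning feasible under the limited-resource constraints noted above.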

With adequate resources, this LLM-vs-LLM framework has the potential to significantly advance automated debugging and accelerate software development.

2025 — Built by Darsh Jain