SWE-agent: an AI agent that resolves 12.5% of real GitHub issues on SWE-bench
In one sentence: Princeton presents SWE-agent, an agent with a purpose-built Agent-Computer Interface (ACI) that resolves 12.5% of real GitHub issues on SWE-bench, roughly 6x to 12x more than previous systems.
Fixing a bug in a real GitHub repository is complex work: you need to understand existing code, navigate files, modify the right spot, run tests, and verify the bug is actually fixed. Until now, no AI agent could do this reliably.
SWE-agent introduces the Agent-Computer Interface (ACI), an interface designed for AI agents rather than humans, which simplifies navigating code, opening files, and making precise edits to snippets.
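To make the idea concrete, here is a minimal sketch of what an agent-facing interface could look like. It is not SWE-agent's actual command set: the class `ToyACI`, its method names, the in-memory file store, and the window size of 3 lines are all assumptions chosen for illustration. It only mirrors the three capabilities named above: navigation (search), file opening (a small numbered window), and snippet editing (replacing a line range).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToyACI:
    """Hypothetical in-memory stand-in for an agent-computer interface."""

    files: Dict[str, List[str]]      # path -> list of lines (no newlines)
    window: int = 3                  # lines shown per view (assumed value)
    current: Optional[str] = None    # path of the open file, if any
    top: int = 0                     # 0-based index of first visible line

    def search(self, term: str) -> List[str]:
        """Navigation: list every 'path:line: text' hit across all files."""
        return [
            f"{path}:{i}: {text}"
            for path, lines in sorted(self.files.items())
            for i, text in enumerate(lines, start=1)
            if term in text
        ]

    def open(self, path: str, line: int = 1) -> str:
        """File opening: show a small window starting at a 1-based line."""
        if path not in self.files:
            return f"error: no such file: {path}"
        self.current = path
        last = len(self.files[path]) - 1
        self.top = max(0, min(line - 1, last))
        return self.view()

    def view(self) -> str:
        """Render the visible window with explicit line numbers."""
        if self.current is None:
            return "error: no file open"
        lines = self.files[self.current]
        end = min(self.top + self.window, len(lines))
        body = [f"{i + 1}: {lines[i]}" for i in range(self.top, end)]
        header = f"[{self.current}: lines {self.top + 1}-{end} of {len(lines)}]"
        return "\n".join([header] + body)

    def edit(self, start: int, end: int, replacement: List[str]) -> str:
        """Snippet editing: replace 1-based inclusive lines start..end."""
        if self.current is None:
            return "error: no file open"
        lines = self.files[self.current]
        if not (1 <= start <= end <= len(lines)):
            return f"error: invalid range {start}-{end}"
        lines[start - 1:end] = replacement
        return self.view()


# Usage: find the suspicious line, open the file, then fix line 2.
aci = ToyACI(files={"calc.py": ["def add(a, b):", "    return a - b", "", "print(add(2, 3))"]})
print(aci.search("return"))
print(aci.open("calc.py"))
print(aci.edit(2, 2, ["    return a + b"]))
```

The design choice worth noting is that every command returns short, structured text (a header plus numbered lines, or a one-line error) instead of raw terminal output, so the agent always knows which file and which lines it is looking at before it edits.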
With this dedicated interface, SWE-agent resolves 12.5% of the real issues in the SWE-bench benchmark, compared to 1-2% for previous systems. That is not yet enough to replace a developer, but it is a major leap and the beginning of the "coding agent" category.
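The "6x to 12x" figure follows directly from the two rates quoted above. The short check below works it out; the function name `improvement_range` is an illustrative assumption, and the inputs are simply the 12.5% and 1-2% figures from the text.

```python
def improvement_range(new_rate: float, old_low: float, old_high: float) -> tuple:
    """Return (smallest, largest) improvement factor of new_rate over an old range."""
    # Dividing by the higher old rate gives the smaller factor, and vice versa.
    return new_rate / old_high, new_rate / old_low


low, high = improvement_range(12.5, 1.0, 2.0)
print(f"{low:.2f}x to {high:.2f}x")  # prints "6.25x to 12.50x"
```

Rounded down, that gives the 6x to 12x range quoted in the summary.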
Companies
Princeton University
Tools
SWE-agent, GPT-4, SWE-bench