Cognition AI, a new startup, has introduced a state-of-the-art artificial intelligence (AI) model.
The company has released Devin, which it describes as the world’s first fully autonomous AI software engineer. Devin resolved a far larger share of issues on the SWE-bench coding benchmark than a previously reported model, which suggests it can carry out complex engineering tasks on its own.
Devin’s Capabilities
Devin can plan and execute complex engineering tasks requiring thousands of decisions. Devin can recall relevant context at every step, learn over time, and fix mistakes.
According to Cognition AI, Devin can code autonomously and also fine-tune its own AI models. While Devin shares certain similarities with Copilot, the developer tool from GitHub and Microsoft, Cognition AI says Devin goes a notch further.
What Can Devin Do?
- Learn how to use unfamiliar technologies
- Build and deploy apps end-to-end
- Autonomously find and fix bugs
- Train and fine-tune its own AI models
Devin’s Performance
Devin’s capability was evaluated on the SWE-bench benchmark. Devin was presented with GitHub issues drawn from real-world open-source projects. The AI tool successfully resolved 13.86% of the issues end-to-end. A previous model registered 1.96% unassisted and 4.80% assisted, so Cognition AI’s tool represents a substantial improvement.
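
To put those figures in perspective, here is a minimal Python sketch that compares the reported resolution rates. The percentages come from the article; the variable names and the `compare` helper are illustrative only and are not part of SWE-bench or any Cognition AI tooling.

```python
# Resolution rates reported in the article, in percent.
devin = 13.86
previous_unassisted = 1.96
previous_assisted = 4.80


def compare(new: float, old: float) -> tuple[float, float]:
    """Return (percentage-point gain, ratio of new to old)."""
    return new - old, new / old


for label, old in [("unassisted", previous_unassisted),
                   ("assisted", previous_assisted)]:
    gain, ratio = compare(devin, old)
    print(f"vs previous model ({label}): +{gain:.2f} points, {ratio:.1f}x")
```

By this arithmetic, Devin's rate is roughly seven times the previous unassisted figure and nearly three times the assisted one, although in absolute terms it still leaves most issues unresolved.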