
[Figure: ARC-AGI-2 Public Eval: Agentica vs. CoT. Score (%) against Cost Per Task ($, ticks from $0.1 to $100). Points shown: Agentica Opus 4.6 (120k) High, Opus 4.6 (120k) High, Agentica GPT 5.2 (XHigh), GPT 5.2 (XHigh)*, Agentica Opus 4.5, Opus 4.5 (32k). Legend: Agentica, OpenAI, Anthropic; one point is labeled SOTA.]
FEBRUARY 10, 2026

SotA ARC-AGI-2 Results with REPL Agents

Exploiting the reasoning capabilities of code mode agents and RLMs with the Agentica framework.

Our implementation scores 85.28% with Opus 4.6 (120k) High and raises the scores of GPT 5.2 (XHigh) and Opus 4.5 by 10 and 20 percentage points, respectively. The agent is 350 lines of Python and uses the Agentica framework.
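
For readers new to the idea, here is a minimal sketch of what a REPL-style agent loop can look like in general: a model proposes code, the code runs in a persistent namespace, and the captured output is fed back for the next turn. This is not the Agentica API and not the arcgentica code. The names `run_snippet`, `repl_agent`, and `scripted_model` are hypothetical, and the "model" is a scripted stand-in rather than an LLM call.

```python
import contextlib
import io
from typing import Callable


def run_snippet(code: str, namespace: dict) -> str:
    """Run code in a persistent namespace; return captured stdout plus any error."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
    except Exception as exc:
        return f"{buffer.getvalue()}{type(exc).__name__}: {exc}"
    return buffer.getvalue()


def repl_agent(propose: Callable[[list[str]], str], max_turns: int = 5) -> dict:
    """Alternate between asking for code and executing it until `answer` is set."""
    namespace: dict = {}
    transcript: list[str] = []  # one captured output per executed snippet
    for _ in range(max_turns):
        code = propose(transcript)
        transcript.append(run_snippet(code, namespace))
        if "answer" in namespace:  # assumed convention: the agent stores its result here
            break
    return namespace


def scripted_model(transcript: list[str]) -> str:
    """Hypothetical stand-in for an LLM: returns a fixed snippet per turn."""
    steps = [
        "grid = [[1, 0], [0, 1]]\nprint(len(grid))",   # inspect the input
        "answer = [row[::-1] for row in grid]",        # commit to a transformation
    ]
    return steps[len(transcript)]


result = repl_agent(scripted_model)
print(result["answer"])
```

Because the namespace persists between turns, a variable defined in one snippet (`grid` here) is still available in the next, which is what distinguishes this loop from running each snippet in isolation.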

Check out the code on GitHub: symbolica-ai/arcgentica