Blog

Figure: ARC-AGI-3: Score vs Cost. Axes: Score (%) against Cost ($). Points: Gemini 3.1 Pro (Preview), Grok 4.20 (Beta Reasoning), GPT-5.4 (High), Opus 4.6 (Max), Agentica Opus 4.6 (High), with a SOTA marker.
March 25, 2026 · Research

From 0% to 36% on Day 1 of ARC-AGI-3

Achieving 36% on ARC-AGI-3 using the Agentica framework.

Our implementation achieves a score of 36.08% with the Agentica SDK on the ARC-AGI-3 public evaluation set, outperforming base model CoT baselines of 0.2% (Opus 4.6 Max) and 0.3% (GPT 5.4 High).

Check out the code on GitHub: symbolica-ai/ARC-AGI-3-Agents
March 19, 2026 · Publication

Introducing Agentica: Agents by Anyone, for Everyone

Agentica Beta is live. A builder for long-running AI agents. Describe a task in plain English, connect your tools, and deploy an agent that keeps working in the background.

February 19, 2026 · Publication

Runtime as Context: How Agentica SDK Agents Reason Over Data

Introducing a new paradigm for LLMs that leverages runtime context to enhance reasoning and decision-making.

Figure: ARC-AGI-2 Public Eval: Agentica vs. CoT. Axes: Score (%) against Cost Per Task ($). Points: Agentica Opus 4.6 (120k) High, Opus 4.6 (120k) High, Agentica GPT 5.2 (XHigh), GPT 5.2 (XHigh)*, Agentica Opus 4.5, Opus 4.5 (32k), with a SOTA marker. Legend: Agentica, OpenAI, Anthropic.
February 10, 2026 · Research

SotA ARC-AGI-2 Results with REPL Agents

Exploiting the reasoning capabilities of code mode agents and RLMs with the Agentica framework.

Our implementation achieves a score of 85.28% with Opus 4.6 (120k) High and increases the scores of GPT 5.2 (XHigh) and Opus 4.5 by 10 and 20 percentage points respectively.

Check out the code on GitHub: symbolica-ai/arcgentica
December 9, 2025 · Publication

Beyond Code Mode: The Agentica SDK

Build agents that interact with runtime objects through code.
