Shadow Demo · 10M Parameters

Reason smarter,
not larger.

ARA Cognitives builds AI that solves hard problems through iterative self-modification rather than brute-force scale. The Shadow Demo is a 10M-parameter model designed to rival billion-parameter baselines on mathematical reasoning.

10M Parameters
~775M Effective capacity
1.8× Cheaper per problem
40/60 Core / workspace split
Iterative Reasoning Loop · ARA Architecture
[Figure: the iterative reasoning loop]
Reasoning core · θ_R · frozen · 4M params. Propose: selects the next operation o_t from 25+ primitives, checks coherence, and detects knowledge gaps.
Plastic workspace · θ_P · live · 6M params. Update: takes a gradient step on L_coherence; resistance scores track which neurons to protect.
Coherence check · Cons(s_t) · Prog(t). The candidate state apply(o_t, s_{t-1}) is checked: on pass, the loop advances to s_{t+1}; on fail, it backtracks. The next iteration is t+1.

A frozen core.
A live workspace.

ARA partitions every weight matrix into two structurally distinct zones. The 40% reasoning core is permanently frozen after pre-training — it encodes the invariant logic of mathematical thought. The 60% plastic workspace updates via gradient descent at every inference step, building temporary circuits tuned to the specific problem at hand.

Frozen reasoning core — θ_R (40%)

4M parameters. Pre-trained on symbolic solution traces, then permanently frozen. Encodes 25+ mathematical primitives: distribute, substitute, differentiate, factorise, and more. Proposes operations and runs the coherence checker.

Adaptive plastic workspace — θ_P (60%)

6M parameters. Updated by gradient descent on L_coherence at every micro-iteration — during inference, not training. Holds intermediate state, absorbs retrieved knowledge, and self-organises into a three-tier resistance hierarchy.
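The core/workspace split described above can be pictured as a mask over a weight vector. The following is a minimal sketch, not ARA's implementation: the helper names `split_mask` and `masked_step`, the choice to freeze the first 40% of entries, and the learning rate are all assumptions made for illustration.

```python
from typing import List

def split_mask(n_weights: int, core_fraction: float = 0.4) -> List[bool]:
    """True marks a frozen core weight, False a plastic workspace weight.

    Assumption: the core occupies the first `core_fraction` of the entries.
    """
    n_core = round(n_weights * core_fraction)
    return [i < n_core for i in range(n_weights)]

def masked_step(weights: List[float], grads: List[float],
                frozen: List[bool], lr: float = 0.1) -> List[float]:
    """Apply one gradient step to plastic weights only; frozen weights pass through."""
    return [w if f else w - lr * g for w, g, f in zip(weights, grads, frozen)]

weights = [1.0] * 10
grads = [0.5] * 10
frozen = split_mask(len(weights))
updated = masked_step(weights, grads, frozen)

n_core = sum(frozen)
n_plastic = len(frozen) - n_core
print(n_core, n_plastic)  # 4 6
print(updated)
```

Only the last six entries change after the step, which mirrors the 40/60 split at toy scale.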

Elastic weight consolidation (EWC)

Per-neuron resistance scores R_i track which plastic weights have proven useful. Successful neurons are consolidated; neurons responsible for failed steps soften, enabling backtracking without forgetting hard-won progress.
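One way to read the resistance mechanism above is as a per-weight damping factor on the gradient step. The sketch below is an assumption-laden illustration, not the page's actual update rule: the `(1 - R_i)` scaling, the `gain` and `decay` constants, and both function names are hypothetical.

```python
from typing import List

def resistance_step(weights: List[float], grads: List[float],
                    resistance: List[float], lr: float = 0.1) -> List[float]:
    """Scale each gradient step by (1 - R_i): high resistance means a small change."""
    return [w - lr * (1.0 - r) * g for w, g, r in zip(weights, grads, resistance)]

def update_resistance(resistance: List[float], active: List[bool],
                      step_passed: bool, gain: float = 0.2,
                      decay: float = 0.5) -> List[float]:
    """Consolidate neurons active in a passing step; soften those active in a failed one."""
    out = []
    for r, a in zip(resistance, active):
        if not a:
            out.append(r)                               # inactive: unchanged
        elif step_passed:
            out.append(min(1.0, r + gain * (1.0 - r)))  # move toward 1
        else:
            out.append(r * decay)                       # soften toward 0
    return out

resistance = [0.0, 0.5, 0.9]
active = [True, True, False]

after_pass = update_resistance(resistance, active, step_passed=True)
after_fail = update_resistance(after_pass, active, step_passed=False)
stepped = resistance_step([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], resistance)
print(after_pass, after_fail, stepped)
```

A fully resistant weight (R_i = 1) would not move at all under this rule, while a softened weight regains its freedom to change during backtracking.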

Progressive knowledge absorption

When the core detects a knowledge gap, it issues structured library queries at increasing depth — from domain identification through edge-case retrieval. Retrieved content is absorbed via K_absorb=3–5 gradient steps into the live workspace.
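The absorption step above amounts to a handful of gradient steps that pull the workspace toward retrieved content. As a rough sketch only: the squared-error loss, the vector representation of retrieved content, the learning rate, and the names `absorb` and `sq_dist` are stand-ins chosen for illustration, since the page does not specify them.

```python
from typing import List

K_ABSORB = 3  # the text gives a range of 3 to 5 gradient steps

def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def absorb(workspace: List[float], retrieved: List[float],
           k_steps: int = K_ABSORB, lr: float = 0.3) -> List[float]:
    """Take k gradient steps on the stand-in loss 0.5 * ||workspace - retrieved||^2."""
    w = list(workspace)
    for _ in range(k_steps):
        grad = [wi - ri for wi, ri in zip(w, retrieved)]  # gradient of the loss
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

workspace = [0.0, 0.0]
retrieved = [1.0, -2.0]  # hypothetical encoding of a retrieved library entry
after = absorb(workspace, retrieved)

before_dist = sq_dist(workspace, retrieved)
after_dist = sq_dist(after, retrieved)
print(before_dist, after_dist)
```

With this stand-in loss, each step shrinks the remaining gap by a factor of (1 - lr), so a few steps move the workspace most of the way without overwriting it completely.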

Smaller model.
Deeper reasoning.

The iterative architecture gives ARA structural advantages that raw parameter counts cannot buy. More iterations on harder problems. No depth limit. Knowledge decoupled from memory.

🎯 ≥75% GSM8K target accuracy

Within 7 points of billion-parameter baselines, at a fraction of the compute cost.

📐 ≥40% MATH benchmark target

Competitive across all MATH categories. Projected to exceed 1B baselines on Level 5 hard problems.

1.8× FLOP efficiency

570B FLOPs for 1,000 micro-iterations vs 1,030B FLOPs for a single 1B model forward pass.
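The 1.8× figure follows directly from the two FLOP counts quoted above. A short check, using only those quoted numbers (the variable names are illustrative):

```python
ARA_FLOPS = 570e9        # quoted total for 1,000 micro-iterations
BASELINE_FLOPS = 1030e9  # quoted figure for the 1B baseline
MICRO_ITERS = 1_000

ratio = BASELINE_FLOPS / ARA_FLOPS        # efficiency factor
per_iter = ARA_FLOPS / MICRO_ITERS        # implied cost of one micro-iteration

print(f"{ratio:.2f}x")     # 1.81x
print(f"{per_iter:.2e}")   # 5.70e+08
```

The same numbers imply roughly 570M FLOPs per micro-iteration.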

📦 120 MB Total memory footprint

Fits on any modern GPU or Apple Silicon. No datacenter required for inference.

♾️ No limit Reasoning chain depth

A standard transformer applies a fixed stack of L layers per forward pass. ARA keeps iterating until the problem is solved or the iteration budget runs out.

🔭 ~775M Effective parameter equivalents

N_eff = N × T^0.5 × (1 − f_facts)⁻¹. 10M parameters punch at 775M equivalent capacity.

The iterative
reasoning engine.

ARA solves problems through a structured loop — propose, check, update, adapt — until the solution converges or the budget is exhausted. Every step is mathematically grounded.

01 — PROPOSE

Operation proposal

The frozen reasoning core proposes the next logical operation based on the current problem state and its pre-trained operation library.

θ_R · frozen
02 — EXECUTE

State update

The proposed operation is applied to the current state, producing a candidate new state with updated symbolic assertions and variable bindings.

Symbolic engine
03 — CHECK

Coherence verification

The frozen coherence checker validates consistency and measures progress toward the goal. Failed checks trigger backtracking with resistance softening.

Cons(s_t) · Prog(t)
04 — ADAPT

Workspace update

The plastic weights update via gradient descent on L_coherence. Active neurons gain resistance; failed neurons soften. The workspace self-organises.

θ_P · EWC
05 — ABSORB

Knowledge retrieval

When the core detects a knowledge gap, it queries an external library and absorbs the retrieved content through mini inference-time fine-tuning.

Progressive deepening
06 — CONVERGE

Solution delivery

The loop terminates when all subgoals are satisfied, the budget is exhausted, or confidence falls below a threshold. The result is returned with a full step trace.

h(s_t, s_goal) = 0
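The control flow of the steps above (propose, execute, check, converge) can be sketched on a toy integer problem. This is an illustration under heavy assumptions, not ARA's engine: the three primitives in `OPS`, the distance heuristic `h`, and the `solve` function are hypothetical, and the ADAPT and ABSORB steps are omitted because they depend on the plastic workspace.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy primitives standing in for the operation library.
OPS: Dict[str, Callable[[int], int]] = {
    "add_1": lambda x: x + 1,
    "double": lambda x: x * 2,
    "sub_1": lambda x: x - 1,
}

def h(state: int, goal: int) -> int:
    """Distance to the goal; 0 means the goal is reached."""
    return abs(goal - state)

def solve(start: int, goal: int, budget: int = 100) -> Tuple[int, List[str], bool]:
    """Run the loop until h == 0, no step passes the check, or the budget runs out."""
    state: int = start
    trace: List[str] = []
    for _ in range(budget):
        if h(state, goal) == 0:  # CONVERGE
            break
        # PROPOSE + EXECUTE: apply every primitive and keep the best candidate.
        name, candidate = min(
            ((n, op(state)) for n, op in OPS.items()),
            key=lambda pair: h(pair[1], goal),
        )
        # CHECK: accept the candidate only if it makes progress.
        if h(candidate, goal) >= h(state, goal):
            break
        state = candidate
        trace.append(name)
    return state, trace, h(state, goal) == 0

final, steps, solved = solve(3, 14)
print(final, steps, solved)  # 14 ['double', 'double', 'add_1', 'add_1'] True
```

The returned trace plays the role of the step trace: it records which operation was accepted at each iteration, and a small budget simply yields an unsolved result rather than an error.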

Every claim,
quantified.

ARA is not a metaphor. Every architectural decision — the 40/60 split, the coherence loss, the adaptive resistance — is derived from first principles and rendered as an equation.

ara_scaling.py
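# ARA Scaling Identity:
#   N_eff = N_ARA * T^beta * (1 - f_facts)^-1

# Shadow Demo values
N_ARA = 10_000_000   # 10M params
T = 1_000            # micro-iters
beta = 0.5           # iter exponent
f_facts = 0.6        # fact fraction

# Python uses ** for exponentiation (^ is bitwise XOR).
N_eff = N_ARA * T**beta / (1 - f_facts)
print(f"{N_eff:,.0f}")   # about 7.9e8 with these inputs; the page quotes ~775M equiv.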

Built by researchers,
for the real world.

ARA Cognitives was founded to answer a single question: can rigorous mathematical architecture make AI genuinely more capable — without requiring more power?

Founder · Kelvin
Engineering Lead · Systems & Infrastructure
ML Research · Training & Evaluation
Applied Science · Benchmarks & Products

We're inviting builders who believe AI efficiency matters as much as capability.


Ready to reason
differently?

The Shadow Demo is in active development. We're working with a small group of research partners and enterprise pilots. Apply to join the early access programme.