Shadow Demo · 10M Parameters

Reason smarter,
not larger.

ARA Cognitives builds AI that solves hard problems through iterative self-modification rather than brute-force scale: a 10M-parameter model designed to rival billion-parameter baselines on mathematical reasoning.


10M Parameters

~775M Effective capacity

1.8× Cheaper per problem

40/60 Core / workspace split

Iterative Reasoning Loop · ARA Architecture

The architecture

A frozen core.
A live workspace.

ARA partitions every weight matrix into two structurally distinct zones. The 40% reasoning core is permanently frozen after pre-training — it encodes the invariant logic of mathematical thought. The 60% plastic workspace updates via gradient descent at every inference step, building temporary circuits tuned to the specific problem at hand.

Frozen reasoning core — θ_R (40%)

4M parameters. Pre-trained on symbolic solution traces, then permanently frozen. Encodes 25+ mathematical primitives: distribute, substitute, differentiate, factorise, and more. Proposes operations and runs the coherence checker.

Adaptive plastic workspace — θ_P (60%)

6M parameters. Updated by gradient descent on L_coherence at every micro-iteration — during inference, not training. Holds intermediate state, absorbs retrieved knowledge, and self-organises into a three-tier resistance hierarchy.
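The core/workspace split can be pictured as a boolean mask over each weight matrix, with gradient steps applied only where the mask marks a weight as plastic. The sketch below is a minimal illustration, not ARA's implementation: the contiguous column layout, the helper names `core_mask` and `masked_step`, and the learning rate are all assumptions made for the example.

```python
def core_mask(rows, cols, core_fraction=0.4):
    """True marks a frozen core entry, False a plastic one.

    Assumption: the core occupies the first 40% of columns. The page only
    says the two zones are structurally distinct, not how they are laid out.
    """
    n_core_cols = round(core_fraction * cols)
    return [[c < n_core_cols for c in range(cols)] for _ in range(rows)]


def masked_step(weights, grads, frozen, lr=0.1):
    """One gradient-descent step that leaves frozen entries untouched."""
    return [
        [w if f else w - lr * g for w, g, f in zip(w_row, g_row, f_row)]
        for w_row, g_row, f_row in zip(weights, grads, frozen)
    ]


rows, cols = 5, 10
frozen = core_mask(rows, cols)
weights = [[1.0] * cols for _ in range(rows)]
grads = [[0.5] * cols for _ in range(rows)]
updated = masked_step(weights, grads, frozen)

n_frozen = sum(f for row in frozen for f in row)
n_changed = sum(
    u != w
    for u_row, w_row in zip(updated, weights)
    for u, w in zip(u_row, w_row)
)
print(n_frozen, n_changed)  # 20 30
```

Of the 50 toy weights, 20 (40%) stay fixed and 30 (60%) move, mirroring the 4M / 6M split at small scale.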

Elastic weight consolidation (EWC)

Per-neuron resistance scores R_i track which plastic weights have proven useful. Successful neurons are consolidated; neurons responsible for failed steps soften, enabling backtracking without forgetting hard-won progress.
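One way to read the resistance mechanism: each plastic neuron carries a score R_i that rises after successful steps and falls after failed ones, and a higher score shrinks that neuron's update. The sketch below assumes a simple additive gain, a multiplicative decay, and a learning rate scaled by 1 / (1 + R_i). These rules and the names `update_resistance` and `effective_lr` are illustrative assumptions, and they differ from the Fisher-information penalty used in standard EWC.

```python
def update_resistance(resistance, active, success, gain=0.5, decay=0.5):
    """Consolidate neurons active in a successful step; soften them after a failure.

    gain and decay are hypothetical hyperparameters chosen for illustration.
    """
    new_scores = []
    for r, is_active in zip(resistance, active):
        if not is_active:
            new_scores.append(r)          # untouched neurons keep their score
        elif success:
            new_scores.append(r + gain)   # consolidate
        else:
            new_scores.append(r * decay)  # soften to allow backtracking
    return new_scores


def effective_lr(base_lr, resistance):
    """Higher resistance means a smaller step for that neuron."""
    return [base_lr / (1.0 + r) for r in resistance]


R = [0.0, 0.0, 0.0]
R = update_resistance(R, active=[True, True, False], success=True)
R = update_resistance(R, active=[True, False, False], success=True)
R = update_resistance(R, active=[False, True, False], success=False)
print(R)  # [1.0, 0.25, 0.0]
rates = effective_lr(0.1, R)
```

The first neuron, useful twice, now moves at half the base rate; the second, implicated in a failed step, has been softened and can be rewritten more freely.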

Progressive knowledge absorption

When the core detects a knowledge gap, it issues structured library queries at increasing depth — from domain identification through edge-case retrieval. Retrieved content is absorbed via K_absorb=3–5 gradient steps into the live workspace.
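Progressive absorption can be sketched as a loop that queries a library at depth 1, 2, 3, and so on, stopping once the gap is closed, with each retrieved item pulled into the workspace by a few gradient steps. Everything concrete below is a stand-in: the `LIBRARY` contents, the toy `encode` function, the squared-error objective, and the `gap_closed` callback are assumptions for illustration, since the page does not specify them. Only the K_absorb = 3 to 5 step count comes from the text.

```python
LIBRARY = {  # hypothetical library, keyed by query depth
    1: "domain: calculus",
    2: "rule: d/dx x^n = n*x^(n-1)",
    3: "edge case: n = 0 gives derivative 0",
}


def encode(text, dim=4):
    """Toy deterministic encoding of retrieved text into a small vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def absorb(workspace, target, k_absorb=3, lr=0.5):
    """Run k_absorb gradient steps on 0.5 * ||workspace - target||^2."""
    for _ in range(k_absorb):
        workspace = [w - lr * (w - t) for w, t in zip(workspace, target)]
    return workspace


def progressive_absorb(workspace, gap_closed, max_depth=3, k_absorb=3):
    """Query deeper levels until gap_closed(depth) is true, absorbing each result."""
    trace = []
    for depth in range(1, max_depth + 1):
        item = LIBRARY[depth]
        workspace = absorb(workspace, encode(item, len(workspace)), k_absorb)
        trace.append((depth, item))
        if gap_closed(depth):
            break
    return workspace, trace


ws, trace = progressive_absorb([0.0] * 4, gap_closed=lambda d: d >= 2)
depths = [d for d, _ in trace]
print(depths)  # [1, 2]
```

With a step size of 0.5, three gradient steps close seven eighths of the distance to the retrieved target, so a small K_absorb already moves the workspace most of the way.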

Performance

Smaller model.
Deeper reasoning.

The iterative architecture gives ARA structural advantages that raw parameter counts cannot buy. More iterations on harder problems. No depth limit. Knowledge decoupled from memory.

🎯 ≥75% GSM8K target accuracy

Within 7 points of billion-parameter baselines, at roughly 55% of the compute cost per problem (570B vs 1,030B FLOPs).

📐 ≥40% MATH benchmark target

Targeting competitive accuracy across all MATH categories, and projected to exceed 1B-parameter baselines on Level 5 problems.

⚡ 1.8× FLOP efficiency

570B FLOPs for 1,000 micro-iterations vs 1,030B FLOPs for a single 1B model forward pass.
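The 1.8× figure follows directly from the two quoted FLOP counts. The short calculation below reproduces it and adds two derived numbers, under the assumption that every micro-iteration costs the same (the page does not state this).

```python
ARA_TOTAL_FLOPS = 570e9    # quoted: 1,000 micro-iterations
BASELINE_FLOPS = 1030e9    # quoted: one forward pass of a 1B model
MICRO_ITERATIONS = 1_000

ratio = BASELINE_FLOPS / ARA_TOTAL_FLOPS

# Assumption: cost is spread evenly across micro-iterations.
flops_per_iteration = ARA_TOTAL_FLOPS / MICRO_ITERATIONS
break_even_iterations = BASELINE_FLOPS / flops_per_iteration

print(f"efficiency ratio: {ratio:.2f}x")
print(f"FLOPs per micro-iteration: {flops_per_iteration:.3g}")
print(f"break-even budget: {break_even_iterations:.0f} micro-iterations")
```

Under that assumption, ARA would need to run about 1,800 micro-iterations on a problem before its cost matched the quoted baseline figure.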

📦 120 MB Total memory footprint

Fits on any modern GPU or Apple Silicon. No datacenter required for inference.

♾️ No limit Reasoning chain depth

A standard transformer applies a fixed stack of L layers per forward pass. ARA keeps iterating until the problem converges or the iteration budget runs out.

🔭 ~775M Effective parameter equivalents

N_eff = N_ARA × T^β × (1 − f_facts)⁻¹, with β = 0.5. 10M parameters are projected to perform at roughly 775M-equivalent capacity.

How it works

The iterative
reasoning engine.

ARA solves problems through a structured loop (propose, execute, check, adapt, and absorb new knowledge when needed) that runs until the solution converges or the budget is exhausted. Every step is mathematically grounded.

01 — PROPOSE

Operation proposal

The frozen reasoning core proposes the next logical operation based on the current problem state and its pre-trained operation library.

θ_R · frozen

02 — EXECUTE

State update

The proposed operation is applied to the current state, producing a candidate new state with updated symbolic assertions and variable bindings.

Symbolic engine

03 — CHECK

Coherence verification

The frozen coherence checker validates consistency and measures progress toward the goal. Failed checks trigger backtracking with resistance softening.

Cons(s_t) · Prog(t)

04 — ADAPT

Workspace update

The plastic weights update via gradient descent on L_coherence. Active neurons gain resistance; failed neurons soften. The workspace self-organises.

θ_P · EWC

05 — ABSORB

Knowledge retrieval

When the core detects a knowledge gap, it queries an external library and absorbs the retrieved content through mini inference-time fine-tuning.

Progressive deepening

06 — CONVERGE

Solution delivery

The loop terminates when all subgoals are satisfied, the budget is exhausted, or confidence falls below threshold. The result is returned with a full step trace.

h(s_t, s_goal) = 0
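The six steps above can be sketched on a toy problem: solving a·x + b = c with a two-operation library. This is an illustration of the control flow only. The operation names, the heuristic `remaining`, and the coherence rule in `check` are assumptions, and the plastic-workspace update and knowledge retrieval (steps 04 and 05) are omitted because they need a learned model.

```python
from fractions import Fraction


def propose(state):
    """Stand-in for the frozen core: pick the next operation from a fixed library."""
    a, b, _ = state
    if b != 0:
        return "subtract_constant"
    if a != 1:
        return "divide_by_coefficient"
    return None


def execute(state, op):
    """Apply the operation to a*x + b = c (a must be nonzero)."""
    a, b, c = state
    if op == "subtract_constant":
        return (a, Fraction(0), c - b)
    if op == "divide_by_coefficient":
        return (Fraction(1), b / a, c / a)
    raise ValueError(f"unknown operation: {op}")


def solution(state):
    a, b, c = state
    return (c - b) / a


def remaining(state):
    """Toy heuristic h(s_t, s_goal): number of operations still needed."""
    a, b, _ = state
    return int(b != 0) + int(a != 1)


def check(old, new):
    """Coherence: the solution is preserved (Cons) and the heuristic fell (Prog)."""
    return solution(old) == solution(new) and remaining(new) < remaining(old)


def solve(a, b, c, budget=10):
    state = (Fraction(a), Fraction(b), Fraction(c))
    trace = [state]
    for _ in range(budget):
        if remaining(state) == 0:  # converge: h == 0
            break
        candidate = execute(state, propose(state))
        if not check(state, candidate):
            break  # a full system would backtrack here
        state = candidate
        trace.append(state)
    return (state[2] if remaining(state) == 0 else None), trace


x, trace = solve(3, 4, 19)
print(x, len(trace))  # 5 3
```

The returned trace holds every intermediate state, which is the toy counterpart of the full step trace delivered at convergence.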

The mathematics

Every claim,
quantified.

ARA is not a metaphor. Every architectural decision — the 40/60 split, the coherence loss, the adaptive resistance — is derived from first principles and rendered as an equation.

        # ara_scaling.py
        # ARA Scaling Identity: N_eff = N_ARA * T**beta / (1 - f_facts)

        # Shadow Demo values
        N_ARA   = 10_000_000   # 10M params
        T       = 1_000        # micro-iterations
        beta    = 0.5          # iteration exponent
        f_facts = 0.6          # fact fraction

        N_eff = N_ARA * T**beta / (1 - f_facts)
        # Evaluates to ≈ 790.6M; the ~775M headline matches T**beta rounded to 31.
        print(f"{N_eff:,.0f}")
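To see how the identity behaves away from the Shadow Demo settings, it helps to wrap it in a function and also invert it for T. The sketch below does both; the function names `n_eff` and `iterations_for` are illustrative, and the defaults simply reuse the values listed above.

```python
def n_eff(n_params, iterations, beta=0.5, f_facts=0.6):
    """Effective parameter count: N * T**beta / (1 - f_facts)."""
    return n_params * iterations ** beta / (1.0 - f_facts)


def iterations_for(target_n_eff, n_params, beta=0.5, f_facts=0.6):
    """Invert the identity: micro-iterations needed to reach a target N_eff."""
    return (target_n_eff * (1.0 - f_facts) / n_params) ** (1.0 / beta)


demo = n_eff(10_000_000, 1_000)
t_for_775m = iterations_for(775_000_000, 10_000_000)

print(f"N_eff at T=1,000: {demo / 1e6:.1f}M")
print(f"T for 775M: {t_for_775m:.0f}")
for t in (100, 1_000, 10_000):
    print(t, f"{n_eff(10_000_000, t) / 1e6:.0f}M")
```

With β = 0.5, multiplying the iteration budget by 100 raises N_eff by a factor of 10. Evaluated directly, T = 1,000 gives about 790.6M, while the quoted ~775M corresponds to roughly 961 micro-iterations.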

The team

Built by researchers,
for the real world.

ARA Cognitives was founded to answer a single question: can rigorous mathematical architecture make AI genuinely more capable — without requiring more power?

Founder · Kelvin

Engineering Lead · Systems & Infrastructure

ML Research · Training & Evaluation

Applied Science · Benchmarks & Products

We're inviting builders who believe AI efficiency matters as much as capability.
