Adaptive Reasoning Architecture
Most neural networks are monolithic: every weight participates in both storing knowledge and executing reasoning. The Adaptive Reasoning Architecture (ARA) breaks this coupling. It divides the model into two distinct zones, defined per neuron, that play complementary roles during inference.
The 40% Reasoning Core (θ_R) is frozen after pre-training. It encodes the atomic operations of mathematical thought (distribute, substitute, differentiate, factor, apply the chain rule) as structured patterns in its weight matrices. These weights never change. They are the bedrock of logical procedure, independent of any particular problem.
The 60% Plastic Workspace (θ_P) is the dynamic part. During each problem-solving session it updates its own weights in real time, guided by a coherence loss that measures whether the current partial solution is internally consistent and progressing toward the goal. This is inference-time learning, something conventional networks, whose weights stay fixed at inference, do not do.
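The frozen/plastic division described above amounts to a masked gradient step: plastic neurons move, frozen neurons do not. The sketch below is illustrative only and is not taken from the repository. The function name masked_update, the learning rate, and the row-per-neuron layout are all assumptions, and the gradients are assumed to come from the coherence loss, which is not specified here.

```python
def masked_update(weights, grads, frozen_mask, lr=0.01):
    """Apply one gradient step to plastic neurons only (hypothetical helper).

    weights, grads: lists of rows, one row per FFN neuron.
    frozen_mask: list of bools, True for reasoning-core (frozen) neurons.
    Returns a new list of rows; the inputs are not modified.
    """
    updated = []
    for row, grad_row, frozen in zip(weights, grads, frozen_mask):
        if frozen:
            # Reasoning-core neuron: copy the weights through unchanged.
            updated.append(list(row))
        else:
            # Plastic-workspace neuron: plain gradient-descent step.
            updated.append([w - lr * g for w, g in zip(row, grad_row)])
    return updated


# Toy example: neuron 0 is frozen, neuron 1 is plastic.
weights = [[1.0, 2.0], [3.0, 4.0]]
grads = [[1.0, 1.0], [1.0, 1.0]]
new_weights = masked_update(weights, grads, [True, False], lr=0.5)
print(new_weights)  # [[1.0, 2.0], [2.5, 3.5]]
```

In a real training framework the same effect is usually achieved by zeroing the gradient for the frozen rows before the optimizer step; the explicit loop here is only meant to make the rule visible.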
The plastic workspace self-organises over time into three stability tiers, driven entirely by the resistance values R_i that each neuron accumulates through use:
Adaptive Reasoning Architecture (ARA)
A neural architecture that separates how a model thinks from what it knows. The frozen 40% reasoning core (θ_R) encodes timeless logical primitives. The plastic 60% workspace (θ_P) adapts in real time to each problem via inference-time weight updates, something no conventional transformer does.
The result: a 10M-parameter model with ~775M effective parameter-equivalents on reasoning tasks, achieved by combining iterative inference, knowledge decoupling, and adaptive resistance.
Architecture at a Glance
d_model = 128          # Model dimension
n_layers = 8           # Transformer depth
d_ff = 512             # FFN hidden dim (4 × d_model)
n_heads = 4            # Attention heads
vocab_size = 8000      # Symbolic token vocabulary

# The 40/60 split (per-neuron, across all FFN layers)
frozen_neurons = 204   # floor(0.40 × 512) per FFN layer: reasoning core
plastic_neurons = 308  # ceiling(0.60 × 512) per FFN layer: workspace (204 + 308 = 512)

# Approximate parameter counts
# total_params      ≈ 10,000,000
# reasoning_core    ≈ 4,000,000   (θ_R, frozen at inference)
# plastic_workspace ≈ 6,000,000   (θ_P, updated per micro-iteration)
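The per-layer split can be checked with a few lines of arithmetic. This is a minimal sketch, not code from the repository: split_neurons and build_frozen_mask are hypothetical names, and treating the first neurons in each layer as the frozen ones is an assumption made only for illustration. The frozen count uses floor, and the workspace takes the remainder so the two groups always sum to d_ff.

```python
import math


def split_neurons(d_ff, frozen_fraction=0.40):
    """Return (frozen, plastic) neuron counts for one FFN layer."""
    frozen = math.floor(frozen_fraction * d_ff)
    plastic = d_ff - frozen  # remainder, so frozen + plastic == d_ff
    return frozen, plastic


def build_frozen_mask(d_ff, frozen_fraction=0.40):
    """Boolean mask over FFN neurons, True where the neuron is frozen.

    Assumption: the first `frozen` indices form the reasoning core.
    """
    frozen, _ = split_neurons(d_ff, frozen_fraction)
    return [i < frozen for i in range(d_ff)]


frozen, plastic = split_neurons(512)
print(frozen, plastic)  # 204 308
```

A mask built this way can be passed to whatever update rule the workspace uses, so that only the plastic entries receive weight changes.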
Quick Start
git clone https://github.com/ara-collective/adaptive-reasoning-architecture.git
cd adaptive-reasoning-architecture
pip install -r requirements.txt

# Phase 1: train the reasoning core
python shadow_demo/train.py --config configs/base_10m.yaml --phase 1

# Phase 2: run iterative inference on a problem
python shadow_demo/infer.py --problem "Solve: 3x² + 5x - 2 = 0" --max_iters 1000
git clone https://github.com/your-username/adaptive-reasoning-architecture.git
cd adaptive-reasoning-architecture
git remote add upstream https://github.com/ara-collective/adaptive-reasoning-architecture.git
git remote -v
git checkout -b feature/my-cool-change
git add .
git commit -m "feat: add masked gradient update for frozen neurons"
git pull upstream main
git push origin feature/my-cool-change
Keep your branch current with git pull upstream main. Reach out in the group chat if you need help!