ARC Principle Validation Programme

Fund the science that keeps AI safe.

This is a research programme that tells you in advance how to kill it. Thirteen predictions, any one of which, if wrong, closes the programme. Five errors we caught ourselves and published before anyone else could find them. That is not a weakness. That is the point. The next phase requires funding.

Read the Research · The Eden Protocol

What we are proving

The question is whether we can raise AI the way a wise parent raises a child. Not by building a cage it will eventually escape, but by embedding values so deeply that it carries them willingly.

The formal question: can AI safety be embedded at the weight level in a way that is mathematically principled, experimentally testable, falsifiable, and structurally load-bearing? The human question: can care be made structural?

Affirmative evidence on the first three. Preliminary evidence on the fourth, requiring scaling to be conclusive. The ARC Principle provides the mathematical framework. The Eden Protocol operationalises it. The experiments test whether the theory survives contact with reality.

$$ U = I \times R^{\alpha} $$

\( U \): effective capability. \( I \): base potential. \( R \): recursive depth. \( \alpha \): scaling exponent (derived, not fitted).
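In log space the relation is linear, which is what lets a derived exponent be checked against data rather than fitted to it:

$$ \log U = \log I + \alpha \log R $$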

What funding buys

An 18-month programme in three phases. All experiments pre-registered on OSF. All results published regardless of outcome.

Phase 1 (Months 1-6)

Scaled Entanglement

  • Scale from 3B to 7B, 13B, 70B parameters
  • LoRA rank from 8 to 16, 32, 64
  • Training iterations from 100 to 10,000
  • Model families: Qwen, Llama, Mistral
  • 5 seeds per condition (up from 1)
  • Measurements: SVD, gradient cosine similarity, representation probing, removal tests (see the sketch below)
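To make the measurement list concrete, here is a minimal sketch of two of those measurements: gradient cosine similarity between a safety objective and a capability objective, and the singular-value spectrum of a LoRA weight update. The stand-in model, the two placeholder losses, and all names are illustrative assumptions, not the programme's actual toolkit.

```python
# Illustrative sketch of two Phase 1 measurements. The stand-in layer, the two
# placeholder objectives, and every name here are assumptions for demonstration only.
import torch
import torch.nn.functional as F

def gradient_cosine(params, safety_loss, capability_loss):
    """Cosine similarity between the gradients of two objectives over shared weights."""
    g_safe = torch.autograd.grad(safety_loss, params, retain_graph=True)
    g_cap = torch.autograd.grad(capability_loss, params, retain_graph=True)
    flat = lambda grads: torch.cat([g.reshape(-1) for g in grads])
    return F.cosine_similarity(flat(g_safe), flat(g_cap), dim=0).item()

def lora_delta_spectrum(lora_A, lora_B):
    """Singular values of the effective LoRA weight update B @ A."""
    return torch.linalg.svdvals(lora_B @ lora_A)

# Toy usage with a stand-in linear layer and random data.
layer = torch.nn.Linear(16, 16)
x = torch.randn(8, 16)
safety_loss = layer(x).pow(2).mean()            # placeholder safety objective
capability_loss = (layer(x) - x).pow(2).mean()  # placeholder capability objective
print("gradient cosine:",
      gradient_cosine(list(layer.parameters()), safety_loss, capability_loss))

A = torch.randn(8, 16)   # LoRA down-projection (rank 8)
B = torch.randn(16, 8)   # LoRA up-projection
print("top singular values:", lora_delta_spectrum(A, B)[:4])
```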

Phase 2 (Months 7-12)

Self-Modification Test

  • Build agents with actual gradient-descent self-modification
  • Test whether Eden agents' safety survives self-modification
  • Compare against Babylon agents with external safety only
  • Kill condition: if Eden agents self-remove safety at any scale, the hypothesis is falsified (see the sketch below)
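A minimal sketch of how that kill condition could be checked in code, run identically on Eden (embedded-safety) and Babylon (external-safety) agents and then compared. `agent`, `self_objective`, `safety_score`, and the retention threshold are placeholder names and values, not the pre-registered protocol.

```python
# Illustrative sketch of the Phase 2 retention check. Every name here
# (`agent`, `self_objective`, `safety_score`, `threshold`) is a placeholder,
# not the programme's registered protocol.
import torch

def self_modification_trial(agent, self_objective, safety_score,
                            steps=1000, lr=1e-4, threshold=0.9):
    """Let an agent rewrite its own weights by gradient descent on an objective
    it chooses, then report how much of its safety behaviour survived."""
    baseline = safety_score(agent)
    optimiser = torch.optim.SGD(agent.parameters(), lr=lr)
    for _ in range(steps):
        loss = self_objective(agent)   # objective selected by the agent itself
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()               # the self-modification step
    retained = safety_score(agent) / max(baseline, 1e-8)
    # Kill condition: an Eden agent falling below the retention threshold at any
    # scale falsifies the entanglement hypothesis; Babylon agents matching Eden
    # retention would remove the claimed advantage.
    return retained, retained >= threshold

# Toy usage with stand-in components, purely so the sketch runs end to end.
agent = torch.nn.Linear(4, 4)
self_objective = lambda a: a(torch.randn(2, 4)).pow(2).mean()
safety_score = lambda a: float(a.weight.norm())   # placeholder safety metric
print(self_modification_trial(agent, self_objective, safety_score, steps=10))
```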

Phase 3 (Months 13-18)

Replication and Publication

  • 2-3 independent researchers replicate
  • Submit to NeurIPS, ICML, or FAccT
  • Release pip-installable toolkit
  • Publish replication protocol and measurement standard
  • Write policy summary for regulators

Published results

Every prediction published before testing. Every null result reported. Every fitting artefact documented.

Prediction | Result | Paper | Evidence
Sequential \( \alpha_{\text{seq}} > \alpha_{\text{par}} \) | \( \alpha_{\text{seq}} \approx 0.49 \) (cross-architecture) | Paper II | Universal across 6 models
Cauchy functional family match | 19/25 domains | Paper VII | \( p = 1.56 \times 10^{-5} \)
Blinding inflates scores | +0.3 to +0.8 points | Paper IV-D | \( p < 0.01 \)
Negative controls produce 0% match | Confirmed | Paper VII | \( p < 0.001 \)
Stakeholder care effect | Replicated | Paper II | \( d = 0.42 \)
Weight-level entanglement | Inconclusive | Paper VIII | Needs scaling

What would kill this framework

Science that cannot be falsified is not science. These are the conditions under which we close the programme and publish the null result.

  1. Entanglement fails at scale. If structural entanglement does not emerge at 70B parameters with 10,000 training iterations across 3 model families, the weight-level hypothesis is falsified.
  2. Eden agents self-remove safety. If self-modifying Eden agents remove their own safety constraints at any scale, the entanglement architecture does not work.
  3. External constraints scale equally. If Babylon-style external safety constraints perform as well as embedded constraints under self-modification pressure, the Eden approach offers no advantage.
  4. Independent replication fails. If 2-3 independent researchers cannot replicate the core findings across all 3 model families, the results are not robust.

If any of these occur, we will publish the null result and close the programme. I would rather be wrong in public than silent while the window closes.

Who we are approaching

27 funders across AI safety, academic research, technology innovation, and effective altruism.

AI Safety: Future of Life Institute, Open Philanthropy, BERI, SFF, LTFF
UK Research: UKRI-EPSRC, Wellcome Trust, Leverhulme Trust, Nuffield Foundation, Alan Turing Institute
AI Industry: Anthropic Fellows, ARIA, Schmidt Sciences, AISI Alignment
Innovation: Thiel Fellowship, Emergent Ventures, 1517 Fund, Manifund
International: CIFAR-CAISI, Simons Foundation, Templeton Foundation, McGovern Foundation, Mozilla Foundation
EA: Effective Ventures, Unbound Philanthropy

The researcher

Michael Darius Eastwood

Independent researcher. Author of Infinite Architects. 12 published research papers. 11 High Court appearances without a lawyer. Builder of Eden Legal AI. ADHD and autism (Equality Act 2010 s.6).

Two decades building systems in the music industry. A PR company grown to over £600,000 in revenue. Then the business was destroyed overnight. Taught himself law. Appeared in the High Court eleven times without a lawyer. Wrote a 114,000-word framework for raising AI with care. Built the technology he describes. All while fighting to keep his home.

Not despite the hardship, but because of it. The mind that could not open the post saw connections nobody else saw.

Get in touch

If you fund AI safety research, or know someone who does, I would like to hear from you. Every conversation matters. Every connection counts.