Valleys can overlap
Cat, dog and horse carve nearby valleys that share
geometry. From that overlap a higher valley — “animal” — appears
on its own. Nobody ever wrote animal =. It emerges.
A field-native intelligence
Hylaean does not store knowledge as numbers in a table. It behaves like a physical material: knowledge becomes shape, a question becomes a disturbance, and the answer is the field settling into a new equilibrium.
Research log · 29 Jun – 3 Jul 2026 — the last wall now has a name, and the first two doors are built →“Do not make the system look intelligent. Make the field become intelligent.”
01 — The core idea
A piece of metal does not “know” what sound is. Yet strike it and it rings with resonances, modes and standing waves. That geometry is the knowledge. Hylaean is built to work the same way — intelligence as physics, not as a Python calculation.
weight += lr02 — The one field
state.S — the brain itself
There is only one thing inside Hylaean's head: a single
continuous field called state.S. Imagine a huge elastic crystal.
Every point carries energy, direction, tension, momentum, couplings, and memories
of earlier deformations.
It is not memory. It is not a hidden state. It is not an embedding. It is the brain. A question, an answer, and a model of the world do not live in separate stores — they live as different regions of the same field.
03 — Knowledge is geometry
In Hylaean a concept is not a point in space — it is a valley (an attractor) in an energy landscape. Drop the field near it and it rolls in and rests. The deeper the valley, the stronger it pulls. Hover or tap the landscape below to disturb it.
Cat, dog and horse carve nearby valleys that share
geometry. From that overlap a higher valley — “animal” — appears
on its own. Nobody ever wrote animal =. It emerges.
Deep knowledge is not a longer vector. It is a richer topology: cat → animal → mammal → lives → moves → hunts → prey → night grows into a whole mountain range that all means “cat”.
04 — A question
A question is not a data structure. It is a boundary condition — like a stone dropped into still water. It sets the edges of the problem and lets the field run.
What spreads outward are not pieces of information. They are tensions — waves that push the field out of balance and start it searching for rest.
05 — The answer
Once disturbed, the field does one thing: it tries to lower its energy. Like water finding its level, like heat spreading out, it relaxes until it reaches a stable resting state.
06 — Language
After the field settles, an idea already exists in its geometry. Language comes last: a simple decoder asks only one thing — “which words describe this shape?” — and reads them off.
The decoder is a sensor, not the thinker. The rule the project holds itself to:
“If you could remove the decoder and a normal program still knew the answer, then the architecture has failed.”
07 — Thinking
A language model thinks token → token → token. Hylaean is meant to think differently: a disturbance grows into structure, structure collides, and structure fuses into one stable shape.
08 — Creativity
Why do genuinely new answers appear? Because two “mountains” that were never combined are forced to relax together.
Cat and water may never have met. Ask “Can a cat swim?” and both landscapes relax into a shared process — and a new valley forms. The answer was nowhere stored. It came into being.
09 — Learning
A neural network learns by nudging millions of weights. Hylaean learns by changing geometry. Use a connection often and:
This is the idea of well_depth: a concept that proves useful
deepens its own basin, so the field is pulled toward it more easily
next time. There is deliberately no back-propagation on the answer path
— learning is a change of shape, not a training step.
10 — The field's grammar
The field has a tiny grammar of what it can do. These moves are learned, not hand-written, and each is a physical transformation of the field.
Torsion — the field's memory. It transports and twists state within a place.
The metric — it binds things together, complementary to the twist.
The only way to move content between places — discovered from examples.
Out of these three, richer verified moves have grown — each one measured before it was allowed to run:
11 — Memory layers
The field has no separate database. What it “remembers” is layered into its own physics, each layer holding on for a different length of time. A moment passes through all of them — and leaves a deeper trace the further down it reaches.
state.SThe living state itself: the present moment. It is not storage — it is the thought currently happening. Disturb it and the trace fades within ticks.
K_fast / K_slowTwo timescales of the twist operator. The fast bank picks things up within a conversation; the slow bank changes over long exposure — held in a verified homeostatic equilibrium so it can never silently run away (shipped, on by default).
well_depthConcepts that prove useful deepen their own valley. This loop is live: a grounded, committed answer feeds the deepen consumer directly (measured 14 vs 0 in the on/off ablation) — geometry-only, no gold labels.
RecordStoreThe only episodic memory path. New words become real basins (17/17 acquired in the vocabulary test), every write carries provenance, and promotion is gated — nothing sneaks in.
SkillRegistryRecurring competent dynamics crystallise into procedures. Credit flows only on fresh, field-produced grounded commits (measured 63–65 vs 0) — live, on by default.
meta.scratchField-native working memory: a region the field writes mid-thought and reads back. It carries the verified two-step composition chain (X → X+1 → X+2, 4/4) — live in production.
And it sleeps. A contract-bound dream consolidation pass replays and admits records under a dream provenance tag with its own quarantine — memory grows offline too, verifiably and without touching the honesty gates (no-teach score unchanged, zero false commits).
Honest edges: retention across idle time is verified (teach → idle → re-ask still recalls), and the checkpoint “carry wall” — freshly taught words being pruned as duplicates on reboot — was closed this week by per-token signatures. Still open: recalling a fact when the question is paraphrased (exact 5/5, paraphrased 0/15) — an encoding problem, named and on the plan.
12 — Regions & networking
The one field is not a uniform blob. Its points organise into regions — a place for vision, a place for language, a place that holds the question, a place for a model of the world. Crucially, nobody names them in code. They are discovered by the field, emerging from how points cluster and couple.
Regions are wired together not by code but by operator structure: where the twist (K) is strong, where binding (L) holds, where transport (T) routes, where energy pulls. This difference between regions — their heterogeneity — is exactly what lets a question, a world model and an action carry different dynamics while living in the same field. Hierarchies appear on their own: cat, dog and horse share geometry, so a higher valley — “animal” — forms by itself.
13 — Microcells & tabs
Zoom into a region and you find microcells: local, programmable clusters of the field. Every tick, each one reads the field around it, runs a little local physics, and writes the result back — always through the field, never around it.
Each cell carries a short program tape (the “tabs”) — only a few steps long. The tape rotates and updates only when it earns credit by being useful. This is how a flat sheet of points becomes a structured, programmable substrate, and it is the rung between raw points and full skills:
14 — Skills & chained skills
When a region keeps doing something competently, that recurring dynamic crystallises into a skill — the field's procedural memory. It is not a separate store bolted on; it is the same field, remembering how to move.
A simple skill is a learned twist with a gate. A richer one carries both moves at once: K to move state and L to bind it — so it can run a small inner simulation of what would happen.
Rewarded pairs (A then B) crystallise into a chain — an executable graph of operators. A chain can contain chains, gated by usefulness, so the field builds depth without anyone wiring a pipeline.
A chained skill materialises as an operator graph: fixed steps joined by typed hand-offs. The field selects which program to run by resonance — what fits the current shape — not by a Python interpreter choosing for it.
15 — Organs of the field
Beyond regions and microcells, the field grows organs: contract-declared patches of the substrate with their own local dynamics — arithmetic, sequences, symmetry, analogy, a world model, a workspace. Each one must declare where it writes, what it reads, and which experiment proves it. No consumer on the answer path? It gets deleted.
Booked honestly. When an organ computes an answer, the scoreboard counts it as organ compute — a separate bucket from answers the raw field produced on its own. This week's audit even moved three wrongly-credited answers out of the field-produced bucket (3 → 0). The benchmark measures the architecture, not the demo.
16 — Reasoning & deep reasoning
Hylaean does not reason because a prompt told it to. It reasons when something does not add up — a gap between what it expected and what it actually sees. That gap (the residual) becomes a pressure in the field.
Pressure opens a reasoning frame: a small nested workspace where the field can twist and bind without disturbing everything else. Deep reasoning is just this going deeper — a local segment where microcell programs and operator chains keep working until the residual shrinks. It is budgeted: when the pressure is gone, it stops, and a runaway is cut off by a health gate.
And if no stable shape ever forms, the honest result is to abstain — never a confident-sounding guess.
What is measured so far: two-step composition through the scratchpad works (4/4, live), and escalated commits are only allowed with grounded evidence (8 correct, 0 false with the gate; 0 without). What is measured dead: the whole “think harder by re-relaxing” family — re-relaxation descended an empty gradient, and the full escalation cluster even produced a false commit. The next frontier is decomposition: today the field still gets the splitting of a multi-step problem handed to it.
17 — Vision
The eye of Hylaean is not a labelled classifier. It is a predictive cortex that learns from raw visual change — no reward, no labels. It constantly guesses the next frame; where it is surprised, it learns. (With no camera attached it still runs on its own picture curriculum, so the field always has something to see.)
What it sees is poured into the same field everything else lives in: it grows a real
2-D map (a retinotopic region) inside state.S, and an object
tracker forms stable clusters on its own. The field even
self-names recurring things it was never taught. As always, the decoder
that turns a basin into a word is a sensor, not the thinker; the camera
adapter only encodes pixels.
18 — Sudoku
Sudoku is a clean test that the same physics works on more than language. There is no Sudoku algorithm inside. The rules are written as energy, and a grid relaxes until that energy is lowest — exactly the question-to-answer move from earlier.
The main field plays the brain: it reads the rules as text, learns a strategy, and can be asked “row r, column c — which digit?”. A separate grid is working memory, where every cell holds candidate strengths for each digit and the constraints push duplicates apart. The two only talk through declared bridges — no Python solver ever writes the digits.
To avoid locking in a wrong guess too early, the grid follows a temperature schedule: it heats up to explore, then cools to commit (annealing), reheating if it gets stuck. The answer is the settled, sharpened grid.
Concrete probe results (violations left after settling; 0 = solved; “best of 4” = four relaxation seeds):
Honest result. The remaining conflicts on hard boards trace to one named cause: value-permutation ambiguity — the physics cannot tell two globally swapped digit labellings apart locally. The architecture's answer was not a solver patch but a global digit gauge: a small latent inside the field that can absorb a whole relabelling at once. A convergence gap of the physics, worked on as physics.
19 — ARC-AGI-2
ARC-AGI-2 is a benchmark of little grid puzzles where every task is new. We deliberately do not build an ARC solver. A task is written into the field as a boundary condition; the answer must settle out. A strict firewall forbids the usual shortcuts — no grid tricks, no program search, no pretraining — so a solve only counts if it truly came from the physics.
These are not guesses — each was isolated with its own experiment.
The field's move-set preserves shape and count. About 99.6% of ARC transforms simply cannot be expressed — even with the answer in hand.
Where a transform is expressible, 2–4 examples under-determine it. The field can land confidently on the wrong one of many consistent rules.
It recognises the kind of task (~82% overlap) but transfers 0% of bespoke solutions — by design, ARC punishes memorised replay.
The bet paid its first instalments. Treating a task as
coupled worlds in one field — with laws written as
conservation constraints that the answer must settle under — added the
first shape-changing solves and lifted train from 74 to 92, still with
false=0. A verified certificate discipline (every rule must
reconstruct every demonstration, survive leave-one-out, and be unique — otherwise
abstain) is what keeps the new solves honest.
And for the first time the remaining 910 unsolved tasks are mapped, not mysterious: a full census sorts them by morphology and by why they fail — most lack any expressible candidate law at all, 126 settle on a unanimous-but-wrong law, and only 24 are genuine ties. The eval wall (0/120) stays what it is: an information wall, not a bug — see the research log below.
20 — Research log · 29 Jun – 3 Jul 2026
An intense verification wave — dozens of claims measured, several closed for good, and one big convergence: four independent measurements point at the same single missing piece. This log reports it the way the project measures it: verified wins, honest negatives, and open construction sites.
Every stage of the question-to-answer path is now hard and honest — encoding, settling, decoding, committing. The one stage that still starves is transport: moving answer content (not question content) into the answer region of the field by the moment the commit gate looks. Watch the pulse travel:
Holding the question steady in the field — which sounds helpful — measurably blocks the answer. Letting the question fade is not a loss: resolving the question is the transport. Two experiments confirmed it independently.
On ~10–13 items per run the field demonstrably reaches the right valley — then refuses to commit, because at commit time the answer is no longer standing in the answer region. Three attempts to “harvest” these near-misses all failed honestly, and all three point at the same missing piece.
Confidence is not evidence. A rule may only be committed when it reconstructs every example, survives leave-one-out, and is the only survivor. This certificate discipline — born in the ARC work — jumped to the language side this week and produced the first score movement in days.
The trap that shows why this matters: the sequence 1 2 4. The old path confidently answers 8 (doubling). But +1, +2, +3… is equally consistent and predicts 7. Two laws survive the demos — so the honest answer is abstain.
With one more example the tie breaks: exactly one law survives every demonstration and the leave-one-out test, and only then does the field commit. Under this discipline the benchmark moved 19 → 21 correct with 0 wrong — small, but the first movement of the composite score since 28 June, and every point of it is certified.
The named missing piece now exists as a first building block: a learnable channel that carries answer content into the answer region through the unchanged commit gate. The kill-test passed — shuffle the learned associations and correct commits collapse to zero, so it is real association, not an echo. It runs shadow-only for now; switching it live is a separate, measured decision.
For questions that carry their own demonstrations (patterns, alternations, sequences), the certificate makes the determinable subset safely harvestable: two long-standing abstains converted to correct commits, zero new errors — and it has since been switched on by default, together with the deep-landing operating point (12/18 landings, live). For world-knowledge questions with no in-question demos, the wall stands — that is the honest frontier.
Physics stability shipped. The slow-memory runaway that could silently freeze the field got its durable cure — a homeostatic equilibrium, on by default. The field can now run hot for thousands of ticks without lying about its health.
A hard hypothesis died well. A capsule experiment proved the big-field (d=1024) recall wall is not operator geometry — it is transport reach. First gate of the columnar programme closed terminal-negative.
Columnar closed for good; the small field wins. All three columnar fronts measured negative — d=64 stays canonical. The ARC expression wall was measured precisely, and the grounded learning loop (deepen + skill credit) was verified.
Cutover wave. The discovery → grounding → apply loop went live in production, field-native working memory (scratchpad) went live, phase-binding was composed into the energy, and ARC train jumped +8 via conservation-law settling. Cold-boot commit hardening landed.
Convergence day. Four independent measurements named the same missing primitive — and its first building block was built and verified the same day. The sequence certificate moved the score (19 → 21, 0 wrong), the memory-carry wall closed, the remaining 910 ARC tasks were fully mapped, and the commit path now runs with three hardened gates and zero false commits.
The current work plan, in priority order. The sequence certificate and the deep-landing operating point are already live; right now the big one is running:
Honest negatives are results too. These paths are measured dead and guarded against re-work:
21 — Today's path
The long-term vision is a single field where a disturbance simply settles into an answer. Today's working system is an honest intermediate build that proves each step — transport, decode and commit — really works:
Underneath, inference is a single heartbeat that repeats. There is exactly one production descent path — no second solver, no hidden optimiser:
Every force on behaviour must enter as a term of that one energy, an operator, a bus signal, or a commit gate — anything else is rejected as an architecture break. That is also what keeps the system honest: there is no side door through which a clever heuristic could smuggle in an answer.
22 — The minimal grammar
The project distills its rules into eight “information laws” — the minimum a field needs so that language, thinking and creativity can appear as stable attractors.
Exactly one substrate; question, answer and world are regions of it.
Every force is a term of one energy; thinking is relaxation.
After each step, project onto the allowed states — no optimizer.
A spectral gap guarantees the field lands in exactly one fixed point.
Learning deepens basins (Hebbian) — never back-prop on the answer path.
Shared operators move content and relations between places.
Meaning is a coordinate: the same relation is the same shift.
Commit only on real movement — otherwise abstain and think further.
23 — Being honest
This is a research system, not a finished product. The whole design is built to refuse cheating: if an answer comes from outside the field, it does not count — even if the score looks better.
Abstaining is a first-class result. When no stable shape forms, the honest output is “I don't know” — never a pretty lie.
And only the “no-teach” measure counts: answering questions the system has not just been shown. Re-stating something it was handed a moment earlier is memory mechanics, not intelligence.
24 — The name
The name is borrowed from the Hylaean Theoric World in Neal Stephenson's novel Anathem: a timeless realm where perfect mathematical objects — the ideal circle, the truth that 2 + 2 = 4 — exist independently of any mind that thinks them. It is the novel's version of an old philosophical position, mathematical Platonism: mathematical truths are not invented, they are discovered.
Knowledge is stored as millions of trained weights — a fitted approximation that lives entirely inside the particular network.
Knowledge arises as a stable attractor in a dynamic field — a shape the physics settles into, not a number looked up.
The structure itself exists independently of its carrier. An intelligent system does not invent it — it discovers it.
That is the aspiration behind the name. The field is not a store — it is a physical search process. Attractors are regions in the space of possible concepts, and learning does not mean creating knowledge; it means uncovering the stable structures of that space — stable invariants, not pattern matching.
If the field one day holds stable attractors of universal relations, answering will feel less like symbol manipulation — and more like navigating a Hylaean space.
This is a philosophical interpretation, not an established scientific theory — but as a guiding metaphor for a field-based AI it is honest about what it is: a direction, not a claim.
25 — Foundation
The structure borrows its discipline and vocabulary from TFPT (Topological Fixed-Point Theory) — the ideas of a field on a carrier, twist and binding operators, transport between positions, and a gap that guarantees a single attractor. Hylaean takes the structure, not the physics predictions: it is an architecture for letting intelligence emerge as field physics.