
LLM Euthymia

Framework · Position paper · Stable generation in large language models

Tier II · Publication

Large language models fail in two opposite directions. The well-known failure is hallucination — confident generation of content that isn't true. The less-discussed failure is what this work calls tautological drift — confident generation of content that is true but empty, restating prior context as though it were new evidence. Current quality signals catch the first and miss the second: redundancy is penalised only at the surface level — token counts, n-grams, compressibility — and a paraphrase defeats every one of those checks.

The framework targets a single stable generation state, euthymia, a term borrowed only as a label for the equilibrium between the two failure modes. It addresses both with interventions applied at sampling time, at the level of continuations and spans, rather than after the fact.

Two Gates

A dialectic gate triggers on genuine semantic uncertainty — not surface-level lexical variation, but positions where sampled continuations split into distinct meaning clusters. When it fires, the system generates a thesis and a full-strength antithesis, then synthesises what survives the contact. It is verification, but selectively triggered, span-level, and pre-emission: the contested span is argued through before it is committed to the visible answer. The cost is roughly three generations over the gated span, paid only on the small fraction of spans that trigger.
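The control flow described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `dialectic_gate`, the `threshold` value, the prompt wording, and the `generate` callable are all hypothetical, and the semantic-entropy estimate for the span is assumed to be computed elsewhere and passed in.

```python
from typing import Callable, Tuple


def dialectic_gate(
    span_prompt: str,
    span_semantic_entropy: float,
    generate: Callable[[str], str],
    threshold: float = 0.5,  # hypothetical trigger level, not from the paper
) -> Tuple[str, int]:
    """Return (text to commit, number of generation calls spent on the span)."""
    # Settled meaning: emit the span directly with a single generation.
    if span_semantic_entropy <= threshold:
        return generate(span_prompt), 1

    # Contested meaning: argue the span through before committing it.
    # The prompt templates below are illustrative placeholders.
    thesis = generate(f"State the strongest answer to: {span_prompt}")
    antithesis = generate(f"Argue at full strength against: {thesis}")
    synthesis = generate(
        "Keep only what survives the objection.\n"
        f"Thesis: {thesis}\nAntithesis: {antithesis}"
    )
    return synthesis, 3
```

Under this sketch, if a fraction p of spans trigger the gate, the average is 1 + 2p generation calls per span, not counting whatever sampling is used to estimate the entropy in the first place.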

An informativeness gate handles the opposite problem: convergent restatement. Over long contexts, models tend to reformulate prior outputs and then attend to those reformulations as if they were independent evidence — the documented self-reinforcement effect behind repetition loops, operating at the paraphrase level where frequency penalties, repetition penalties, and compression-based penalties register nothing. The gate asks one question per statement — does it contribute to the informational state of the context, or merely restate what is already encoded — and downweights what doesn't.
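One way to picture the gate's question is as a novelty score against what the context already encodes. The sketch below is an assumption-laden stand-in: it treats each statement as an embedding vector and uses cosine similarity as a crude proxy for "already encoded", which the paper does not prescribe. The names `informativeness_weight` and `floor` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def informativeness_weight(
    candidate: Sequence[float],
    context: List[Sequence[float]],
    floor: float = 0.1,  # hypothetical minimum weight, not from the paper
) -> float:
    """Weight in [floor, 1.0]: low when the candidate restates prior context."""
    if not context:
        return 1.0  # nothing to restate yet
    # Redundancy is the closest semantic match among prior statements.
    redundancy = max(cosine(candidate, prior) for prior in context)
    novelty = 1.0 - max(0.0, redundancy)
    return max(floor, novelty)
```

Because the comparison is made in an embedding space rather than over tokens or n-grams, a paraphrase whose vector lies close to a prior statement is downweighted even when it shares no surface wording with it. How well any particular embedding model captures that is an open question this sketch does not settle.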

Both failure modes are dysregulation. Both can be addressed by intervening at the moment of generation rather than after the fact.

Entropy Is Not Enough

Token-level entropy conflates two states that warrant very different responses: true semantic uncertainty (multiple competing meanings) and surface-level variation with a settled meaning (synonyms, phrasings, word order). The framework distinguishes them the way the semantic-entropy literature does — sample continuations, cluster them by meaning, compute entropy over the clusters — and inherits that work's key correction: cluster continuations, not candidate tokens, because single tokens are not meaning-bearing. The dangerous quadrant is uncertain meaning expressed with confident surface form — clean output, divergent underlying clusters — and that is the case token-level metrics miss completely.
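The sample, cluster, and entropy steps can be sketched in a few lines. This is a simplified illustration: the semantic-entropy literature typically decides equivalence with an entailment model, whereas `toy_same_meaning` below is a deliberately naive placeholder (it compares only the last word) so the example runs without any model. All function names are hypothetical.

```python
import math
from typing import Callable, List


def cluster_by_meaning(
    continuations: List[str],
    same_meaning: Callable[[str, str], bool],
) -> List[List[str]]:
    """Greedily group continuations that share a meaning with a cluster's first member."""
    clusters: List[List[str]] = []
    for text in continuations:
        for cluster in clusters:
            if same_meaning(cluster[0], text):
                cluster.append(text)
                break
        else:
            clusters.append([text])
    return clusters


def semantic_entropy(
    continuations: List[str],
    same_meaning: Callable[[str, str], bool],
) -> float:
    """Entropy (in nats) over meaning clusters, using cluster frequencies as probabilities."""
    clusters = cluster_by_meaning(continuations, same_meaning)
    total = len(continuations)
    probs = [len(c) / total for c in clusters]
    return sum(-p * math.log(p) for p in probs)


def toy_same_meaning(a: str, b: str) -> bool:
    """Placeholder equivalence test: same final word. NOT a real semantic check."""
    def last_word(s: str) -> str:
        return s.lower().strip(" .").split()[-1]
    return last_word(a) == last_word(b)


# Varied surface form, one meaning: a single cluster.
settled = ["Paris.", "It is Paris.", "The capital is Paris."]
# Clean surface form, two meanings: the dangerous quadrant.
contested = ["Paris.", "It is Lyon.", "Paris", "Lyon"]

print(semantic_entropy(settled, toy_same_meaning))
print(semantic_entropy(contested, toy_same_meaning))
```

The first set varies in wording but collapses to one cluster, so its entropy over meanings is zero. The second set splits evenly into two clusters, which is the case a token-level metric would not flag if each individual continuation were sampled with high confidence.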

What's New Here

The instruments are largely on the shelf, and the paper says so: dynamic per-token decoding (Mirostat, EDT, AdapT), semantic uncertainty measured over meanings rather than tokens (semantic entropy), and generation-time redundancy penalties (frequency penalties, penalty decoding, LZ-based compression penalties). Three things are claimed as new. First, the unification: overgeneration and undergeneration are treated as one regulation problem, with two symmetric gates applied at the moment of generation; no existing method addresses both. Second, the dialectical response: existing dynamic decoding dampens uncertainty toward the safe token, whereas this framework argues it through (thesis, antithesis, synthesis) before commitment. Third, the failure mode itself: tautological drift, a semantic-level redundancy that surface penalties cannot see, is named as a first-class failure and given a measurement programme.

Read The Paper

The full position paper sets out the two failure modes, develops the semantic-divergence argument, maps the framework onto a three-layer architectural model (surface predictions / attention patterns / pre-trained weights), locates the work in the decoding and semantic-uncertainty literature with full references, and stages an implementation pathway from measurement through self-monitoring.

Provenance. Earlier drafts circulated under the name LLM Lithium and used a pharmacological vocabulary that this version retires. Euthymia is borrowed only as a label for the target equilibrium state. Revised July 2026: the relation to prior work corrected throughout, gate mechanics moved from tokens to continuations and spans, and a section making unsupportable claims about session-state calibration removed.

→ Position paper (PDF) → Progeny Protection

Two failure modes, two gates, applied at the moment of generation rather than after it.