Framework · Position paper · Stable generation in large language models
Tier II · Publication
Large language models fail in two opposite directions. The well-known failure is hallucination — confident generation of content that isn't true. The less-discussed failure is what this work calls tautological drift — confident generation of content that is true but empty, restating prior context as though it were new evidence. Current quality signals catch the first and miss the second, because they're built to detect incorrectness, not uninformativeness.
The framework targets a single stable generation state — euthymia, borrowed only as a label for the equilibrium between the two failure modes — and addresses both with per-token interventions applied at sampling time rather than after the fact.
A dialectic gate triggers on tokens of genuine semantic uncertainty — not surface-level lexical variation, but distributions whose top candidates split into distinct meaning clusters. When it fires, the system generates a thesis and a full-strength antithesis, then synthesises what survives the contact. The cost is roughly two extra passes on the small subset of tokens where it matters; the gain is grounding earned before the answer is committed, rather than checked afterward.
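The control flow of the gate can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `dominance` threshold, the dictionary of per-cluster probability mass, and the three callables (`thesis_fn`, `antithesis_fn`, `synthesis_fn`) are hypothetical stand-ins for model passes that the summary does not specify.

```python
def dialectic_gate(cluster_mass, thesis_fn, antithesis_fn, synthesis_fn,
                   dominance=0.7):
    """Fire only when no single meaning cluster dominates the distribution.

    cluster_mass: dict mapping a meaning-cluster label to its probability mass.
    thesis_fn, antithesis_fn, synthesis_fn: hypothetical stand-ins for model
    passes; `dominance` is an assumed threshold, not a value from the paper.
    """
    if not cluster_mass:
        raise ValueError("cluster_mass must contain at least one cluster")

    # Rank meaning clusters from most to least probable.
    ranked = sorted(cluster_mass.items(), key=lambda kv: kv[1], reverse=True)
    top_label, top_mass = ranked[0]

    # Settled meaning: one ordinary pass, gate stays closed.
    if len(ranked) < 2 or top_mass >= dominance:
        return {"fired": False, "output": thesis_fn(top_label)}

    # Contested meaning: argue the leading reading and its strongest rival,
    # then keep what survives. This adds two passes beyond the ordinary one.
    rival_label = ranked[1][0]
    thesis = thesis_fn(top_label)
    antithesis = antithesis_fn(rival_label)
    return {"fired": True, "output": synthesis_fn(thesis, antithesis)}
```

In this sketch a closed gate costs one call and an open gate costs three, which matches the "roughly two extra passes" estimate for the tokens where the gate fires.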
An informativeness gate handles the opposite problem: convergent restatement. Over long contexts, models tend to reformulate prior outputs and then attend to those reformulations as if they were independent evidence. Each restatement inflates the attention weight on the underlying concept, narrows the next-token distribution further around it, and erodes the calibration the conversation had built up. The gate asks one question per token — does it contribute to the informational state of the context, or merely restate what is already encoded — and downweights what doesn't.
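One way to picture the per-token downweighting is the sketch below. It assumes that "already encoded in the context" can be approximated by embedding similarity between a candidate token and concepts stated earlier; that proxy, the `threshold` and `penalty` values, and the function name `informativeness_gate` are illustrative assumptions rather than details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def informativeness_gate(probs, candidate_embs, context_embs,
                         threshold=0.9, penalty=0.25):
    """Downweight candidates that closely restate something already in context.

    threshold and penalty are assumed values chosen for illustration.
    Returns a renormalised probability list in the original candidate order.
    """
    weighted = []
    for p, emb in zip(probs, candidate_embs):
        # Redundancy = closest match to anything the context already encodes.
        redundancy = max((cosine(emb, c) for c in context_embs), default=0.0)
        weighted.append(p * penalty if redundancy >= threshold else p)

    total = sum(weighted)
    if total == 0:
        return list(probs)
    return [w / total for w in weighted]


# Toy usage: the first candidate repeats a concept the context already holds.
adjusted = informativeness_gate(
    probs=[0.6, 0.4],
    candidate_embs=[[1.0, 0.0], [0.0, 1.0]],
    context_embs=[[1.0, 0.0]],
)
```

In the toy case the restating candidate starts as the favourite and ends with less mass than the novel one. With an empty context the distribution is returned unchanged.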
Both failure modes are forms of dysregulation, and both can be addressed by intervening at the moment of generation rather than after the fact.
Token-level entropy conflates two states that warrant very different responses: true semantic uncertainty (multiple competing meanings) and surface-level variation with a settled meaning (synonyms, phrasings, word order). The framework distinguishes them by clustering top candidate tokens by semantic similarity before computing entropy. The dangerous quadrant is uncertain meaning expressed with confident surface form — clean output, divergent underlying clusters — and that is the case standard entropy metrics miss completely.
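The cluster-then-measure step can be illustrated with a small sketch. It assumes each top candidate token comes with an embedding vector and uses a simple greedy clustering on cosine similarity; the clustering method, the `threshold` value, and all function names are assumptions made for illustration, since the summary does not fix them.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def cluster_candidates(embeddings, threshold=0.8):
    """Greedy clustering (an assumed method): a candidate joins the first
    cluster whose seed it resembles, otherwise it starts a new cluster."""
    seeds, labels = [], []
    for emb in embeddings:
        for idx, seed in enumerate(seeds):
            if cosine(emb, seed) >= threshold:
                labels.append(idx)
                break
        else:
            seeds.append(emb)
            labels.append(len(seeds) - 1)
    return labels


def token_and_semantic_entropy(probs, embeddings, threshold=0.8):
    """Return (entropy over tokens, entropy over meaning clusters)."""
    total = sum(probs)
    probs = [p / total for p in probs]
    labels = cluster_candidates(embeddings, threshold)
    mass = [0.0] * (max(labels) + 1)
    for p, label in zip(probs, labels):
        mass[label] += p  # pool probability within each meaning cluster
    return entropy(probs), entropy(mass)


uniform = [0.25, 0.25, 0.25, 0.25]
# Four near-synonyms: varied surface form, one meaning.
synonyms = [[1.0, 0.0], [1.0, 0.05], [1.0, 0.1], [1.0, 0.02]]
# Two pairs pointing in different directions: two competing meanings.
competing = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]

syn_token_h, syn_sem_h = token_and_semantic_entropy(uniform, synonyms)
cmp_token_h, cmp_sem_h = token_and_semantic_entropy(uniform, competing)
```

Both toy distributions have the same token-level entropy of 2 bits, yet the synonym case collapses to a single cluster with zero entropy over meanings, while the competing case retains 1 bit. That separation is what a token-level measure alone cannot provide.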
The framework differs from existing methods in three ways. First, regulation is dynamic and per-token rather than fixed in the sampling parameters: the same model receives different treatment from one token to the next, in response to local entropy. Second, the intervention at high-entropy tokens is dialectical rather than a dampening of the distribution or a post-hoc review; thesis and antithesis run at the moment of generation, not as a chain-of-thought review afterwards. Third, the framework names a failure mode, tautological drift, that nucleus sampling, top-k, self-consistency, chain-of-thought verification, and constitutional methods all miss, because the output it produces is neither false nor incoherent. It is empty, and emptiness is invisible to systems built to catch wrongness.
The full position paper sets out the two failure modes, develops the semantic-divergence argument, maps the framework onto a three-layer architectural model (surface predictions / attention patterns / pre-trained weights), describes session-dynamics implications for long-context calibration, differentiates the work from existing methods, and stages an implementation pathway from measurement through self-monitoring.