Look at the current observability stack and two layers stand out, sitting one on top of the other. At the bottom is the threshold layer. It decides what is true. Error rate crossed a line. A connection wasn’t closed. The ninety-ninth percentile exceeded its budget. At the top is the language layer. It decides how to say what the bottom layer found. It takes a true thing and renders it as a fluent sentence. Stack those two and you get a system that detects with a comparator and speaks with a transformer. The detection is decades old, while the speech is brand new. The missing seam between them is precisely where understanding was supposed to reside.
Two Ends, Not One Axis
It is tempting to picture these as two points on a single line, with simple rules at one end and deep learning at the other, and to imagine the field has been traveling steadily from the first toward the second. The picture is wrong. They are not two positions on one axis. They are two different jobs bolted together.
The threshold layer answers a question about truth value. Is this number out of range. The language layer answers a question about expression. How do I phrase this finding. One produces a verdict; the other produces prose. Neither produces an understanding of the situation, and the reason is structural rather than incidental.
The comparator cannot understand the situation, because it holds no model of one. It holds a boundary value and a current reading, and it reports their relation. There is nothing in a comparator that knows what the system is for, what depends on what, or what the crossing of its line implies for anything else. It knows that a line was crossed.
The language layer is not asked to understand the situation either. It is pointed at the comparator’s output and instructed to phrase it well. The skill it brings is the skill of saying. It was trained to produce fluent, plausible, well-shaped text, and that is what it does with the verdict the comparator hands it. Reasoning about what produced the verdict is outside the job. The fluency of the result is what makes this easy to miss.
A sentence that reads as though it were written by someone who understood the incident can be produced by a system that understood nothing about the incident and a great deal about sentences.
So the middle is not weak. The middle is absent. The architecture has a hole where meaning was meant to go, and the eloquence of the top layer sits directly over the hole, hiding it.
A Lost Connection to the Past

The sharpest evidence that the middle was skipped rather than solved is that the field once had a primitive version of it, twenty years ago, and did not build on it.
Consider an old check: did you close the JDBC connection when the transaction ended? It’s insignificant compared to a sophisticated language model, but it performs a unique function. It’s aware of an entity (the connection) and its lifecycle (open, then close). It recognizes an ordering constraint: a close must occur before the transaction ends. This check isn’t a threshold on a scalar; it’s a concise statement about an entity’s behavior over time and a detection of deviations from the expected pattern. It has an entity, an expectation about its conduct, and a temporal relation the expectation depends on. Compare it to a typical line from a modern insights panel: error rate twelve times above yesterday, a pod at eighty-nine percent of its memory limit. Those are scalar crossings. The twenty-year-old transaction check is more situated.
So the field had, two decades back, the beginnings of a grammar for situations: entities with expected behavior, related over time. The deep-learning era arrived and left that grammar where it found it. It kept the flat comparators at the bottom and spent the entire windfall of new capability on the top, on narration. It added eloquence to the bottom layer’s findings and added no comprehension to them. The transaction check and the language model coexist in the same product, and the check is doing the more situated work while the model gets the credit for intelligence.
What the Middle Is Made Of
The middle is the layer that holds a model of expected behavior and registers departures from it as meaningful. Not a threshold on a metric but a model of what the system is supposed to be doing, against which what the system is actually doing can be read as conforming, straining, or violating. The JDBC check is an expectation about conduct, and a detection of the expectation being broken. The middle layer is that instance, far more generalized and made central.

Generalized, the model of expected behavior is a web of commitments. Each component commits to certain conduct — what it will do, what it depends on, what ordering and timing it guarantees to those that rely on it. The system holds this web, and it reads what is actually circulating against it, continuously, as part of running. In that setting a departure is a different kind of event from a threshold crossing. A threshold crossing says a value passed a line. A departure in the middle layer says a component is no longer keeping a commitment that something else relies on.
This is where two bodies of thought meet, and the meeting supplies the middle layer with both halves of what it needs. Promise Theory gives it a vocabulary. The unit of the middle layer is not a metric; it is a promise. A situation, in those terms, is a configuration of promises being kept and broken, and a fault is a promise that can no longer be honored.
That reframes the entire bottom of the stack: the question is no longer whether a number is in range, but whether the commitments the system is built from are holding.
Semiotics provides the missing piece. A broken promise serves as a sign, carrying inherent significance at the moment it occurs, independent of any external interpretation. This aspect is overlooked by both the comparator and the narrator. The comparator presents a mere value, a number devoid of its own importance. The narrator constructs a sentence, assigning significance retrospectively, from an external perspective, by a reader of records. In contrast, the middle layer generates something neither of them does. That something is an interpreted sign. It forms a judgment that this matter holds significance and its meaning, within the context of the events, as part of the process that produced it.
Today’s observability stack is missing a crucial function: it should emit and interpret signs that indicate departures from commitments made in the present moment. When a promise is broken, the system that made it perceives the breach at the moment it occurs. The sign itself should carry the meaning of the breach, including the threats it poses and the consequences that depend on the failed commitment.
Three Layers

The honest picture of the field, then, has three layers, and the field has built one and a third of them. The bottom layer asks whether a value is out of range. It is built. It has barely changed in twenty years. The top layer asks how to phrase the finding. It is newly and genuinely powerful, and it is doing nearly all of the visible work. The middle layer asks what the system is supposed to be doing, and what a given departure means for what depends on it. It was never filled.
It was never filled because filling it has a cost the industry declined to pay. A middle layer requires the system to carry a model of itself (its commitments, its dependencies, its expected conduct) and to read against that model while it runs.
The industry chose a different arrangement. It chose to carry telemetry instead of a model, to ship the records outward, and to reconstruct meaning after the fact, from outside, by a reader of the records. For most of that history the reader was a human with a console. Now the reader is a language model with a console. The arrangement is unchanged.
The reconstruction has simply become more fluent.
What This Asks For
Narration is useful because, ultimately, a finding needs to be communicated to someone. The claim here is that there’s a layer that’s been overlooked between them, and that this missing layer holds the meaning. It’s the significant part that gives things significance. The difference that makes the difference. What connects communication at different levels.
The deep-learning era was spent on saying rather than on understanding.
The capability that could have gone toward a model of the system’s own behavior went toward a model of language about the system’s behavior. These are not the same achievement. A system that understands its situation perceives a broken commitment as it breaks and knows what the breach is for. A system that narrates its telemetry waits for a comparator to flag a value, then composes a sentence about it. The first carries meaning. The second carries prose about records.
The middle is where the situation lives. It sits between the threshold and the sentence, and it is made of commitments held and read, of promises kept and broken, of signs that carry their own significance at the point and moment they occur. The field built the floor and the ceiling and left the room between them empty. Filling it is the work still to be done.
