Cognitive System: Technological Leverage Eras: Patterns of Transformation
Node 4Measuring the Right Thing
The internet era gave us a single metric for value: engagement. The AI era does not give us a single replacement. It gives us two — one for each mode of cognitive work. The error most AI products are making is not optimising for the wrong metric. It is failing to ask which metric applies before they optimise at all.
What engagement actually measured
In the distribution era, engagement was a reasonable approximation of value. The logic was coherent: if users voluntarily spent time on a platform, the platform was delivering something worth their attention. Time-on-app, session depth, daily active users, notification open rates — these metrics tracked well enough with value creation that two decades of product thinking organised itself around them.
But engagement was always a proxy, never the underlying reality. And two inconvenient truths were already visible within the internet era itself. First, a user scrolling endlessly through a content feed was often not finding value — she was failing to find it. High engagement frequently masked product failure dressed as success. Second, the mechanism through which engagement was sustained was often friction, not delight — unclear navigation, buried content, dark patterns that extended sessions without extending satisfaction.
The internet era's engagement metrics measured attention captured. They said nothing about whether that attention was exchanged for something worth having. This ambiguity was manageable when distribution was the product. It becomes structurally problematic when cognition is the product — because cognitive work is not uniform, and the value of engagement depends entirely on what kind of cognitive work is being performed.
Cognition is not a single activity
Part III of this series introduces the input certainty framework in full. But its core implication for measurement is simple: cognitive work exists on a spectrum defined by how clearly the problem is specified before work begins.
At one end, the problem is open. The mandate is to discover, frame, and structure something that does not yet exist — a market thesis, a platform strategy, a product concept. This is generative discovery. At the other end, the problem is closed. The mandate is to convert a well-specified brief into a precise, high-quality output as efficiently as possible — a compliance review, a trade execution memo, a structured decision package. This is deterministic execution.
These are not just different tasks. They have fundamentally different value structures — which means they require fundamentally different measurement frameworks. Applying a single metric logic across both produces a product that is miscalibrated for at least one of them, and often both.
The iterative process is the work product in its formative state. A venture investor constructing a sector thesis, a chief strategy officer stress-testing a platform shift, a management consultant mapping an undefined problem space — these professionals are paid to think through problems without clean edges.
For them, AI is a thought multiplier. Depth of engagement is directionally positive — provided it produces genuine novelty, surfaces non-obvious options, or constructs frames that did not previously exist. The qualifying condition is output quality, not session length.
The problem is specified before engagement begins. A capital markets analyst with a structured mandate, a risk officer reviewing against a compliance framework, an operations manager processing a defined brief — these users are not paying for dialogue. They are paying for the output.
For them, AI is a compression engine. Prolonged engagement signals failure: the system demanded cognitive labour from the user instead of absorbing it. Every additional prompt is a cost, not a feature.
The inversion — scoped correctly
In deterministic execution, the internet-era metric logic inverts completely. Where the distribution era rewarded more — more time, more sessions, more depth — the AI compression mode rewards less. Fewer prompts, shorter sessions, faster time to a decision-ready artifact. The user's goal is not to engage with the system. It is to disengage from it, output in hand, and go execute in the real world.
The clearest articulation of this is the McKinsey dynamic. A major strategy engagement does not ask the client to participate in the analytical process. The consultants absorb the complexity, run the cognitive work internally, and return a clean set of prioritised action directives. The client's engagement with the deliverable is minimal by design — and that minimal engagement is precisely what the client is paying for. The AI equivalent of McKinsey is not a brilliant conversationalist. It is a system that does the same: ingests messy inputs, processes silently, returns executable outputs with minimum demand on the user's time.
In generative discovery, no such inversion applies. The internet era's instinct — that depth of engagement correlates with depth of value — is not wrong here. It simply requires qualification. Engagement must be paired with output quality to be meaningful. A long session that produces a genuinely novel strategic frame is leverage. A long session that cycles through obvious options and confirms pre-existing assumptions is exploration without discovery — and it is a product failure regardless of how much time the user enjoyed spending.
North Star metrics across all three columns
The table below places the internet era baseline alongside both AI-era modes. Reading across a row reveals how the same metric dimension shifts — sometimes inverting, sometimes qualifying, occasionally holding constant — as the underlying value proposition changes.
The willingness-to-pay test
Each mode has a distinct commercial logic rooted in what the user is actually purchasing.
In generative discovery, the user is buying cognitive leverage — the ability to think further, faster, and across more dimensions than unaided reasoning allows. The value is personal and compounds: better thinking at the exploration stage produces better decisions downstream. Power users in this mode will pay meaningfully because the delta between assisted and unassisted output is large and attributable. They are paying for the quality of what the tool helps them produce.
In deterministic execution, the user is buying time recovery — the return of hours that would otherwise be consumed by cognitive grind on a well-specified problem. Institutional buyers in this mode calibrate willingness-to-pay against a straightforward calculation: cost of the tool versus cost of the analyst-hours it replaces or accelerates. The commercial case breaks sharply the moment the tool requires significant analyst time to operate — because at that point, the net time saving approaches zero and the value proposition collapses.
In both modes, the same underlying test applies: is the AI doing the work, or is the user doing the work to operate the AI? Where the answer is the latter, the pricing case is structurally weak regardless of how sophisticated the underlying model is. Cognition leverage is only leverage when the machine absorbs the burden. When it redistributes the burden back to the user under a different name, it is not a new era — it is an expensive interface on top of the old one.