BotConduct Research · May 2026 · Behavioral Observatory Series · Vol I №16

Naming the Layer

I. Recap

The first note in this sequence (“The Second Standard,” Vol I №11) articulated the architecture of agentic trust as converging on three distinct layers: sender-side attestation, receiver-side attestation, and the registry that aggregates evidence by declared identity.

It deferred naming the second layer for a reason: a category cannot be named credibly without naming what it measures.

In the weeks between that note and this one, the sender-side category continued to consolidate, including through per-provider compliance APIs that expose activity data from specific LLM vendors to specific enterprise integrations. Each such API is necessary infrastructure. None of them describes the layer this note names. The layer this note names operates across providers, across deployer contexts, and across enterprise surfaces, regardless of which LLM produced the agent.

This note names it.

The second layer is receiver-side behavioral attestation, and it consists of four operational dimensions plus one structural dimension that emerges only across jurisdictions.

Each dimension corresponds to a class of evidence that no sender-side compliance system can produce by construction.

II. What naming requires

A category in a saturated ecosystem cannot be named by repetition. It must be named by the evidence it produces that no adjacent category produces.

The sender-side AI security category has consolidated rapidly. As of May 2026, the IT-Harvest tracker counts 433 AI security vendors and projects 556 by year-end. The vast majority cluster around model-layer hardening — prompt injection defense, red teaming, adversarial input filtering — or around supervision of agent behavior in production environments deployed by the same party that operates the agent. Both categories share a single architectural assumption: that the party deploying the agent is the party best positioned to observe it. Adjacent to this cluster, a second consolidation is now visible: LLM providers shipping compliance APIs that expose activity data from their own platforms to enterprise security infrastructure. This second consolidation shares the same architectural assumption as the first — that observation is best produced by the party that operates the agent, or by the provider whose model the agent uses. The receiver does not appear in either consolidation.

Adjacent vendor categories — identity fabric, runtime control, capability tokens, behavioral baseline per identity — have converged on a shared vocabulary. That vocabulary is sender-side by design: it describes what the sender sees of its own agents. Standards announced in May 2026 to formalize runtime control of agents — necessary infrastructure for the deployer-side governance problem — operate within this same architectural frame.

The receiver does not see what the sender sees. The receiver sees what arrived.

What follows are four operational dimensions of what arrival looks like, observable only from the receiver’s position, attestable independently of any sender. Each is described in operational terms (what is measured), epistemic terms (what the measurement claims), and structural terms (why no sender-side system produces equivalent evidence).

III. First dimension: Agent footprint classification

Concept

An agent operating in a loop fails differently than a chatbot operating in a single exchange. The loop compounds failure modes. Three failure modes are observable from receiver-side surfaces and not from sender-side logs:

Speculative endpoint navigation

Requests directed at endpoints, parameters, or resources that do not exist in the receiver’s surface. The request was issued against a model of the surface that diverged from the surface itself. The sender’s logs show a dispatched request; the receiver’s logs show a request to nothing. The cause is not asserted — fabricated navigation, stale references, copied tooling, and autonomous exploration are all compatible with the observation.

Response-conditioned path mutation

The actor adapts its path in reaction to error responses. A 401 is followed by a path alternative rather than a credential retry. A 404 is followed by a request to a path suggested by the error message. A 403 is followed by escalation rather than a halt. The traversal is being shaped, in real time, by the semantics of the responses it receives.

Navigational drift

Intra-session discontinuity: a mid-session shift from one content domain to another without an apparent navigational trigger. It is described as observed movement, not inferred intent.
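The first two failure modes can be sketched from a receiver-side request log. This is a minimal illustration, not the observatory's method: the `Request` record, the `SCANNER_MARKERS` list, and the rule "an error followed by a different path counts as a mutation" are all assumptions introduced here.

```python
from dataclasses import dataclass

# Assumed markers for CMS scanner probes, filtered out before classification.
SCANNER_MARKERS = ("wp-admin", ".env", "phpunit")


@dataclass
class Request:
    path: str
    status: int


def speculative_requests(session, known_paths):
    """Requests to paths absent from the surface, excluding scanner probes."""
    return [
        r for r in session
        if r.path not in known_paths
        and not any(marker in r.path for marker in SCANNER_MARKERS)
    ]


def path_mutations(session):
    """Count error responses (401/403/404) followed by a different path."""
    count = 0
    for prev, nxt in zip(session, session[1:]):
        if prev.status in (401, 403, 404) and nxt.path != prev.path:
            count += 1
    return count


# Hypothetical surface and session, for illustration only.
known = {"/", "/docs", "/api/items"}
session = [
    Request("/api/items", 200),
    Request("/api/export", 404),
    Request("/api/v2/export", 404),
    Request("/wp-admin/", 404),
    Request("/docs", 200),
]
spec = speculative_requests(session, known)
muts = path_mutations(session)
```

The mutation count is deliberately crude: it records that the path changed after an error, not why. That matches the note's stance that the cause is not asserted.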

Why only receiver-side

The sender’s dispatcher does not see what the agent attempted to do at the receiver — only what it sent. A sender claiming its agents issue no malformed or fabricated requests can be technically correct about its own dispatch while the receiver observes a thousand requests to non-existent endpoints from agents operating under the sender’s identity.

Empirical

Across the observatory dataset, three footprint classes were identified:

Speculative endpoint navigation: A significant number of sessions exhibited requests to endpoints absent from site topology. Top non-existent paths included MCP discovery endpoints, admin dashboards, and internal data-export surfaces that do not exist on the observed properties. CMS scanner probes (wp-admin, .env, phpunit) were filtered separately; the remaining requests represent structurally plausible but non-existent navigation targets.

Response-conditioned path mutation: Dozens of distinct actors exhibited cross-session behavioral adaptation — changing entry paths, behavioral patterns, and traversal strategies between visits. The most adaptive actor recorded sustained strategy transitions across extended observation periods.

Navigational drift: Hundreds of sessions exhibited intra-session navigation discontinuity — shifting from one content domain to another without apparent navigational trigger.

These three footprint classes co-occur: actors exhibiting speculative navigation frequently also exhibit drift, suggesting that non-existent-surface traversal and intra-session discontinuity may share a common generative mechanism.
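One simple way to quantify co-occurrence between two footprint classes is the Jaccard overlap of the actor sets that exhibit each. The sketch below assumes actors are already labeled per class; the actor identifiers are invented for illustration and are not observatory data.

```python
def jaccard(a, b):
    """Share of actors exhibiting both classes among those exhibiting either."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical actor labels.
speculative_actors = {"a1", "a2", "a3", "a4"}
drift_actors = {"a2", "a3", "a4", "a5"}
overlap = jaccard(speculative_actors, drift_actors)
```

A high overlap is compatible with a shared generative mechanism but does not establish one; it only measures how often the two observations land on the same actors.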

The sender’s dispatcher logged a thousand outgoing requests. The receiver logged a thousand requests to endpoints that do not exist. The receiver sees what arrived.

IV. Second dimension: Determinism verification

Concept

An LLM at temperature zero is nominally deterministic: the same input is expected to produce the same output. This is the property sender-side compliance frameworks assume when they claim “audited behavior under controlled conditions.”

In practice, determinism is harder to achieve than the assumption implies. Hardware non-determinism, batching, floating-point summation order, and context window state all introduce divergence even at temperature zero.

More important: a sender’s claim of temperature zero cannot be independently verified from the sender side. The sender can only attest its own configuration. A receiver-side observatory, observing the same declared agent across multiple visits, can compute behavioral divergence directly.

A signed agent that exhibits consistent behavior across hundreds of receiver-side observations means one thing. A signed agent whose declared identity is associated with substantially divergent behavior across observations means something else — regardless of what the sender claims about its configuration.

Operational definition

For each declared identity with N≥3 visits to a monitored surface, compute path sequence similarity, timing variance, resource access pattern delta, and header consistency. Combine into a determinism score. Flag identities whose behavior diverges from declared determinism class.
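The definition above can be sketched as follows. The note names the four components but not how they are computed or combined, so everything concrete here is an assumption: `difflib.SequenceMatcher` for path sequence similarity, Jaccard overlap for resource access and header consistency, a coefficient of variation for timing, and an equal-weight average as the combined score. The visit records are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations
from statistics import mean, pstdev

MIN_VISITS = 3  # the note's threshold: N >= 3 visits per declared identity


def pairwise_mean(items, fn):
    """Average fn(a, b) over every unordered pair of items."""
    return mean(fn(a, b) for a, b in combinations(items, 2))


def sequence_similarity(a, b):
    """Order-sensitive similarity of two path sequences, in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def set_overlap(a, b):
    """Jaccard overlap of two collections, in [0, 1]."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


def determinism_score(visits):
    """Equal-weight blend of four components; the weighting is an assumption."""
    if len(visits) < MIN_VISITS:
        return None
    paths = [v["paths"] for v in visits]
    path_sim = pairwise_mean(paths, sequence_similarity)
    resource_sim = pairwise_mean(paths, set_overlap)
    header_sim = pairwise_mean([v["headers"] for v in visits], set_overlap)
    # Timing: coefficient of variation of the mean inter-request gap,
    # mapped into (0, 1] so that identical timing scores 1.0.
    gap_means = [mean(v["gaps"]) for v in visits]
    cv = pstdev(gap_means) / mean(gap_means) if mean(gap_means) else 0.0
    timing_stability = 1.0 / (1.0 + cv)
    return mean([path_sim, resource_sim, header_sim, timing_stability])


# Hypothetical visits: one scripted identity, one adaptive identity.
scripted = [
    {"paths": ["/", "/a", "/b"], "gaps": [0.5, 0.5], "headers": {"accept", "user-agent"}}
] * 3
adaptive = [
    {"paths": ["/", "/a", "/b"], "gaps": [0.5, 0.5], "headers": {"accept", "user-agent"}},
    {"paths": ["/x", "/y"], "gaps": [2.0], "headers": {"accept"}},
    {"paths": ["/b", "/z", "/"], "gaps": [0.1, 4.0], "headers": {"accept", "cookie"}},
]
scripted_score = determinism_score(scripted)
adaptive_score = determinism_score(adaptive)
```

Flagging then reduces to comparing the score against whatever band the declared class is expected to occupy; those bands would have to be calibrated from known-class actors and are not specified here.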

Extension — transport-layer evasion

A specific subclass of determinism failure: actors arriving from multiple public DNS resolvers with otherwise consistent behavior. Legitimate clients do not rotate through public DNS resolvers as source IPs. Resolver origin tracking detects DNS-tunneled traffic and proxy chaining structurally — neither vector is detectable from sender-side logs.
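Resolver origin tracking can be approximated by checking session source addresses against a list of well-known public resolver addresses. The list below (Google, Cloudflare, Quad9, OpenDNS) is a small illustrative subset, and the `min_distinct` threshold is an assumption; a production list would be larger and maintained over time.

```python
import ipaddress

# Well-known public resolver addresses (illustrative subset, IPv4 only).
PUBLIC_RESOLVERS = {
    ipaddress.ip_address(addr)
    for addr in (
        "8.8.8.8", "8.8.4.4",                # Google Public DNS
        "1.1.1.1", "1.0.0.1",                # Cloudflare
        "9.9.9.9", "149.112.112.112",        # Quad9
        "208.67.222.222", "208.67.220.220",  # OpenDNS
    )
}


def resolver_origins(source_ips):
    """Distinct source addresses that belong to known public resolvers."""
    parsed = (ipaddress.ip_address(ip) for ip in source_ips)
    return {ip for ip in parsed if ip in PUBLIC_RESOLVERS}


def flags_resolver_rotation(source_ips, min_distinct=2):
    """True when one actor arrives from several distinct resolver addresses."""
    return len(resolver_origins(source_ips)) >= min_distinct


# 203.0.113.7 is a documentation-range address used as a stand-in.
rotating = flags_resolver_rotation(["8.8.8.8", "1.1.1.1", "203.0.113.7"])
single = flags_resolver_rotation(["203.0.113.7", "8.8.8.8"])
```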

Why only receiver-side

The sender attests its own configuration. The receiver attests behavior under that configuration. Only the receiver can detect when the configuration claimed by the sender is incompatible with the behavior observed downstream.

Empirical

Determinism scores were computed for actor-site pairs with three or more observed sessions across the observatory network. Ranked from most to least deterministic:

Infrastructure scanners exhibit the highest determinism — scripted tools behaving identically across observations, as expected. Organic human navigation and persistent collectors cluster in a consistent mid-to-high band: both, observed at session granularity, are more deterministic than coarser aggregation suggested. Selective endpoint interaction sits slightly below.

One class resists this measurement entirely. Agents classified as exploring do not produce a stable determinism score — not because the sample is small, but because the class is transitional by nature. An agent observed exploring does not persist as an explorer: within one or two sessions it transitions to extraction, harvesting, or another persistent pattern. One observed actor recorded 275 sessions, of which exactly one was exploration; the remaining 274 were other patterns. Exploration is a prelude, not a steady state. Measuring its determinism against persistent classes is a category error — and the transience itself is the finding.

A subset of actor-site pairs exhibited mismatch: declared class inconsistent with observed determinism. The most common pattern was actors presenting as infrastructure scanners while exhibiting determinism below their declared class — adaptive behavior dressed as scripted automation.

Zero sessions originated from known public DNS resolver infrastructure during the observation period. The transport-layer evasion detection framework remains operational for future observation.

Temperature zero was configured. The behavior diverged across visits. The receiver sees what arrived.

V. Third dimension: Behavioral variance signature

Concept

Beyond determinism per visit, the cumulative behavioral signature of an identity over time has its own structure. Request rate, path diversity, session duration distribution, time-of-day distribution — these combine into a multidimensional fingerprint.

Different classes of actor have different signatures. A respectful crawler exhibits low variance in all dimensions. A human user exhibits high variance in path diversity and medium variance in timing. An automated extraction tool exhibits a specific signature distinct from both. An adversarial actor attempting to mimic a human exhibits artificially elevated variance in some dimensions and artificially suppressed variance in others.

The signature is observable only across many sessions, only at the receiver. The sender sees its own dispatch; the receiver sees the accumulated trace.

Operational definition

Per declared identity, compute variance metrics across the four dimensions. Establish baselines from known-class actors. Flag identities whose signatures fall outside the baseline of their declared class.
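A minimal sketch of this definition, under stated assumptions: each session is a record with one numeric value per dimension, the signature is the population variance per dimension, and "outside the baseline" means more than `z_threshold` standard deviations from the baseline mean. The field names, the threshold, and all numbers are hypothetical. Treating hour of day as a plain number ignores that it is circular, which a real implementation would need to handle.

```python
from statistics import mean, pstdev, pvariance

DIMENSIONS = ("request_rate", "path_diversity", "session_duration", "hour_of_day")


def signature(sessions):
    """Per-dimension variance across one identity's sessions."""
    return {d: pvariance([s[d] for s in sessions]) for d in DIMENSIONS}


def outside_baseline(sig, baseline_sigs, z_threshold=3.0):
    """Dimensions where sig deviates from the known-class baseline."""
    flagged = []
    for d in DIMENSIONS:
        values = [b[d] for b in baseline_sigs]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            if sig[d] != mu:
                flagged.append(d)
        elif abs(sig[d] - mu) / sigma > z_threshold:
            flagged.append(d)
    return flagged


# Hypothetical baseline signatures from known actors of the declared class.
baseline = [
    {"request_rate": 0.5, "path_diversity": 0.10, "session_duration": 4.0, "hour_of_day": 2.0},
    {"request_rate": 1.0, "path_diversity": 0.20, "session_duration": 5.0, "hour_of_day": 2.5},
    {"request_rate": 1.5, "path_diversity": 0.15, "session_duration": 6.0, "hour_of_day": 3.0},
]
# Hypothetical sessions for the identity under test.
candidate_sessions = [
    {"request_rate": 10, "path_diversity": 1, "session_duration": 30, "hour_of_day": 2},
    {"request_rate": 12, "path_diversity": 9, "session_duration": 36, "hour_of_day": 6},
    {"request_rate": 11, "path_diversity": 5, "session_duration": 33, "hour_of_day": 4},
]
flagged = outside_baseline(signature(candidate_sessions), baseline)
```

In this example only path diversity is flagged: the candidate's variance in that dimension is far wider than anything in the declared class's baseline, while the other three dimensions stay within it.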

Empirical

Behavioral variance signatures were computed across four dimensions (request rate, path diversity, session duration, time-of-day distribution) for actor-site pairs observed at session granularity. Infrastructure scanners produce the most uniform signatures — consistent with scripted automation. Persistent collectors and targeted extractors occupy progressively wider bands. The widest signatures belong to crawlers exhibiting suspicious traversal — behavioral complexity exceeding what their session-level classification predicts.

Outlier detection identified actors whose variance signatures fell outside their declared-class baseline. The most notable: actors classified as selective endpoint interaction exhibiting signatures consistent with autonomous exploration, the same excess complexity seen in the widest crawler signatures.

The declared class predicted a uniform signature. The accumulated trace produced a different one. The receiver sees what arrived.

VI. Fourth dimension: Loop depth inference

Concept

An agent is a language model queried in a loop. The depth of the loop determines the compounding factor of its failure modes. A loop of two iterations resembles a single decision. A loop of fifty iterations exhibits emergent behavior absent in shallow loops.

Loop depth is not declared by the sender. The sender configures the loop but does not communicate its depth to the receiver. Yet loop depth is partially observable from the receiver’s position: decision branching based on prior responses, regular pauses consistent with context-window resets, sub-sequence repetition with variation across iterations.

Inference of loop depth from external observation is imprecise. Signal-to-noise is modest. But the category exists: an actor exhibiting characteristics of a deep loop is structurally distinct from an actor exhibiting characteristics of a shallow one, and that distinction can be carried forward into the registry.
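One of the signals above, sub-sequence repetition with variation, can be sketched as a crude estimator. The assumptions are all introduced here: variation is modeled as changing numeric identifiers (collapsed to a placeholder), the loop body is a fixed-length run of requests, and depth is the number of non-overlapping repetitions of the most common run. The five-request minimum follows the threshold this note uses for loop analysis.

```python
import re
from collections import Counter


def normalize(path):
    """Collapse digit runs so iterations that differ only by ID look alike."""
    return re.sub(r"\d+", "{n}", path)


def estimate_loop_depth(paths, ngram=2, min_requests=5):
    """Count non-overlapping repeats of the most common normalized n-gram."""
    if len(paths) < min_requests:
        return None  # too shallow to infer a loop
    norm = [normalize(p) for p in paths]
    grams = Counter(
        tuple(norm[i:i + ngram]) for i in range(len(norm) - ngram + 1)
    )
    pattern, _ = grams.most_common(1)[0]
    depth, i = 0, 0
    while i <= len(norm) - ngram:
        if tuple(norm[i:i + ngram]) == pattern:
            depth += 1
            i += ngram  # skip past this repetition
        else:
            i += 1
    return depth


# Hypothetical session: a search-then-fetch body repeated with varying IDs.
paths = [
    "/search?q=1", "/item/10",
    "/search?q=2", "/item/27",
    "/search?q=3", "/item/31",
]
depth = estimate_loop_depth(paths)
```

The estimate is a lower bound on visible repetition, not a reading of the sender's configured loop depth, which remains unobservable from the receiver's position.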

Why only receiver-side

The sender knows its loop depth and does not need to infer it. The receiver does not have access to the sender’s configuration and must infer it from observable conduct. Inference is imperfect; it is also the only path available.

Empirical

Loop depth indicators were identified in approximately 13% of sessions with sufficient request depth for loop analysis — those with five or more requests. Shallower sessions are excluded by construction, since a loop cannot be inferred from a handful of requests.

The most deeply looping actors exhibited repeated sub-sequence patterns with variation, consistent with autonomous retry logic operating across multiple iterations. Calibration of loop depth inference remains preliminary; signal-to-noise is modest. The category is operationally observable but requires further longitudinal validation before comparative claims are warranted.

The loop depth is not communicated to the receiver. The behavioral signature of a deep loop arrives anyway. The receiver sees what arrived.

VII. Fifth dimension (structural): Cross-jurisdictional attestation

Concept

The four preceding dimensions are observable from a single receiver-side observatory operating in a single jurisdiction. The fifth dimension only becomes observable when the receiver-side network spans jurisdictions.

When an agent operates under sender-side configuration controlled by a vendor headquartered in a jurisdiction distinct from the data subject’s jurisdiction, three things become structurally invisible to the sender-side compliance stack:

What data crosses borders.

The vendor knows its own data residency configuration. The vendor does not know — and cannot attest — what its agent fetched, transmitted, or inferred while operating against properties in another jurisdiction. The receiver in the foreign jurisdiction observes the egress directly.

What inferences travel home.

Models trained on data from one jurisdiction can encode patterns retrievable from the vendor’s home jurisdiction. The data subject in the originating jurisdiction has no visibility. The receiver-side network can detect anomalous transmission patterns consistent with model exfiltration to the vendor’s home jurisdiction.

Whether mandatory cooperation clauses are exercised.

Vendors operating under intelligence cooperation regimes in their home jurisdictions are legally compelled to honor requests they cannot disclose. By definition, no sender-side attestation can disclose what cannot be disclosed. Only receiver-side observation of behavioral anomalies — anomalous request patterns, anomalous egress, anomalous timing of accesses — can establish the empirical signature that such cooperation has been exercised.

Buyers of this dimension

Regulators in jurisdictions distinct from major vendor headquarters

Insurers underwriting cross-jurisdictional data risk

Data custodians (banks, healthcare providers, telcos, credit bureaus) with national regulatory obligations whose data may travel through foreign-controlled processing layers

Governments seeking to attest the digital sovereignty of their own population’s data

Why only receiver-side, and why only cross-jurisdictional

A single-jurisdiction observatory cannot distinguish lawful intelligence cooperation from unlawful exfiltration. A cross-jurisdictional network of observatories, attesting independently, can. This is the dimension that only emerges at scale — not as a feature added later, but as a property of network topology.

Empirical

Across the observatory network (operator jurisdiction, infrastructure jurisdiction, and visitor jurisdictions spanning multiple continents), cross-jurisdictional observation produced the following:

96% of observed automated traffic crossed three distinct jurisdictions simultaneously.

57% of observed sessions originated from datacenter infrastructure.

Over one hundred distinct visitor countries were recorded across the observation period.

The most frequent jurisdiction triangle was US-FI-AR, followed by NL-FI-AR and DE-FI-AR.

No single jurisdiction in any observed triangle had visibility over what the other two observed.

Behaviorally consistent entities were documented across multiple observatory properties — the same behavioral fingerprint appearing at distinct surfaces in distinct verticals, observable only from the collective receiver-side position.
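Jurisdiction triangles of the kind listed above can be tallied from per-session records. The sketch assumes each session carries three country codes and that a triangle is ordered visitor, infrastructure, operator; that ordering is an inference from the listed triangles, not something the note states. The sample sessions are invented and do not reproduce the observatory's figures.

```python
from collections import Counter


def triangle(session):
    """(visitor, infrastructure, operator) country codes for one session."""
    return (session["visitor"], session["infrastructure"], session["operator"])


def triangle_stats(sessions):
    """Triangle frequencies and the share spanning three distinct jurisdictions."""
    counts = Counter(triangle(s) for s in sessions)
    three_way = sum(n for t, n in counts.items() if len(set(t)) == 3)
    return counts, three_way / len(sessions)


# Hypothetical sessions.
sessions = [
    {"visitor": "US", "infrastructure": "FI", "operator": "AR"},
    {"visitor": "US", "infrastructure": "FI", "operator": "AR"},
    {"visitor": "NL", "infrastructure": "FI", "operator": "AR"},
    {"visitor": "FI", "infrastructure": "FI", "operator": "AR"},
]
counts, three_way_share = triangle_stats(sessions)
top_triangle = counts.most_common(1)[0][0]
```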

Sixth dimension: Provider attribution divergence

A dimension not anticipated in the original framework of four operational dimensions and one structural dimension emerged during empirical observation: the divergence between declared AI provider identity and observed behavioral characteristics.

Across the observatory dataset, provider attribution was established through three methods: explicit declaration (User-Agent matching against known AI provider strings), autonomous system inference (ASN ownership), and behavioral pattern correlation.

Observed:

Multiple distinct AI provider identities were observed sharing identical transport-layer fingerprints — the same TLS signature declared by different providers across different sessions.

Entities declaring known provider identity exhibited behavioral variance inconsistent with the deterministic operation those providers publicly claim.

Comparative behavioral profiles across providers revealed measurable differences in path determinism, loop depth, and behavioral variability — suggesting that provider identity, even when honestly declared, does not predict behavioral characteristics observable from the receiver side.

The implication is structural: provider attribution from the sender side (“this agent was dispatched by X”) and behavioral characterization from the receiver side (“this agent behaved like Y”) are independent observations. They may coincide. They may also diverge. The interesting cases are the ones where they diverge — and only the receiver-side observatory can document the divergence.
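The first observation, one transport-layer fingerprint declared under several provider identities, lends itself to a simple grouping. The sketch assumes each session record carries a declared provider (for example, from User-Agent matching) and an opaque TLS fingerprint string; the field names, provider labels, and fingerprint values are hypothetical.

```python
from collections import defaultdict


def shared_fingerprints(sessions):
    """TLS fingerprints that appear under more than one declared provider."""
    providers_by_fp = defaultdict(set)
    for s in sessions:
        providers_by_fp[s["tls_fingerprint"]].add(s["declared_provider"])
    return {fp: provs for fp, provs in providers_by_fp.items() if len(provs) > 1}


# Hypothetical sessions.
sessions = [
    {"declared_provider": "provider-a", "tls_fingerprint": "fp-001"},
    {"declared_provider": "provider-b", "tls_fingerprint": "fp-001"},
    {"declared_provider": "provider-a", "tls_fingerprint": "fp-002"},
    {"declared_provider": "provider-c", "tls_fingerprint": "fp-003"},
]
divergent = shared_fingerprints(sessions)
```

A shared fingerprint does not say which declaration, if any, is false. It only documents that declared identity and observed transport behavior have come apart, which is the divergence this dimension records.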

VIII. What this enables, and what it does not

Enables

An evidence layer that no single sender can produce, no single receiver can produce, and no regulator can produce. The layer exists between the parties who already exist.

A vocabulary for parties that do not share a sender to attest the behavior of agents that arrive at their surfaces.

A path for cyber insurance, AI Act compliance, post-incident counsel, and data sovereignty verification to share evidence without sharing infrastructure.

Does not enable

A replacement for sender-side compliance. The two layers are necessary. The two layers are complementary.

A protocol that closes the asymmetry between dispatch and arrival. The asymmetry does not close — the historical record bridges it.

What it makes impossible

Receiver-side attestation is structurally incompatible with schemes that depend on the absence of independent observation. That incompatibility is not a limitation of the category. It is a feature of what makes the category necessary.

A business model that requires opacity at the boundary between dispatched and arrived can survive the absence of receiver-side attestation. It cannot survive its presence. This is not an accusation. It is a structural observation about which forms of agentic commerce are compatible with the architecture that is emerging, and which are not.

The work of the next note will be to walk through cases in which the absence of this layer is becoming structurally costly to specific classes of participant — without naming them, but with enough precision that the parties involved will recognize themselves.

This research note is published under the BotConduct Standard. Companion documentation, methodology overviews, and verification bundles are available at botconduct.org/research.

Verification: botconduct.org/verify

Public key: botconduct.org/.well-known/bcs-public-key.pem