Coordination Ledger Options for Composable Architecture

Status & scope

Status: DRAFT — needs design completion (advisory options analysis; no decision selected yet)
Layer: cross-engine architecture
Date: 2026-04-30
Companion specs: fusion-adi-integration, wire-message-families
Purpose: Frame the drift / audit / state-discipline problem and lay out the options for a coordination layer. This document presents alternatives with tradeoffs; it does not select among them.

Executive summary

Composable architectures pay a recurring audit-quality tax: silent schema drift between services, diffuse audit trails that make cross-service workflow reconstruction forensic rather than mechanical, and undefined workflow states that emerge from message timing instead of declared invariants. This document frames the problem, surveys the modern coordination/audit frameworks (Temporal, Restate, Confluent + Schema Registry, NATS JetStream, EventStoreDB, immudb, AWS QLDB, and others), and lays out four viable paths for the coordination layer — each presented with honest tradeoffs against Axonis-specific constraints (federation invariants, edge deployability, Python-first, existing UDS/RabbitMQ infrastructure). It does not recommend a path. §6 is the decision-criteria table; §9 is the open questions; §7b is a worked end-to-end example showing how all the primitives compose. Reading time: 15–20 minutes; decision conversation: 30–45 minutes.

1. Problem statement

A composable service architecture optimises for service-team independence. The cost is three system-level pathologies that emerge as the system grows:

Pathology	Definition	How it surfaces
Schema drift	Two services interpret the same concept differently as either evolves	Silent contract breaks; consumers fail in production after a producer update
Audit diffusion	No single ordered record of what happened across services	Cross-service workflow reconstruction is forensic, not mechanical
Undefined workflow states	Cross-service workflows reach states the system never declared	Workflows go off the rails in operation; no invariants are enforceable between services

Constraints any solution must respect for Axonis specifically:

Federation invariants — data stays at the source; cross-organisational audit is via signed evidence, not shared infrastructure
Edge deployability — must be runnable on a Lookup-Light Edge Node (CL-04 Profile 1)
Existing infrastructure — Xanadu/RabbitMQ as the federation transport, UDS/Elastic as the search-shaped object store
Python-first — services are predominantly Python; non-Python additions create operational asymmetry

2. Workflow / state-machine engines surveyed

System	Schema enforcement	Audit log	State machine	Federation	Footprint	Notes
Temporal	partial (SDK-typed activities)	strong (workflow history)	strong (workflows ARE state machines)	single-cluster	heavy (Cassandra/PG backend, broker tier)	Production at Uber, Snap, DoorDash, Coinbase. Workflow code IS the spec.
Restate	strong (typed handlers)	strong (event-sourced)	strong (durable execution)	single-cluster	medium (single binary)	Newer (2024 GA), opinionated, well-engineered.
Camunda 8 / Zeebe	medium (BPMN typing)	strong	strong (BPMN-modelled)	single-cluster	heavy (JVM, brokers)	Mature; visually-designed workflows.
AWS Step Functions	medium (JSON Schema)	strong	strong	none — AWS-only	managed	Excellent in AWS; vendor lock.
Apache Airflow / Prefect	weak	medium	DAG (not state machine)	single-cluster	medium-heavy	Wrong shape — batch DAGs, not transactional workflow.
Akka / Microsoft Orleans	weak	weak	actor mailboxes	yes (cluster sharding)	medium	Different paradigm — message-passing actors, not coordination.
Dapr Workflow	partial (Durable Task Framework under the hood)	strong	strong	partial (sidecar pattern crosses boundaries)	medium (Dapr runtime per service)	Modular building blocks; brings Dapr surface as a whole.

3. Append-only / ledger backing stores surveyed

System	Native append-only	Cryptographic verification	Schema enforcement	Python ecosystem	Federation fit	Footprint
Elasticsearch (current UDS choice)	by convention	none	optional (mappings)	strong	partial (cross-cluster replication)	heavy (JVM, sharded cluster)
AWS QLDB	yes, native	strong (Merkle-tree, signed)	weak (Amazon Ion)	medium	none (AWS-only)	managed; AWS announced end of support for 2025-07-31
immudb	yes, native	strong (Merkle, key-value-style)	weak	strong (Python client)	partial	small (single Go binary)
EventStoreDB	yes, native	weak	medium	medium	weak	medium (.NET on Linux)
Trillian / Sigstore Rekor	yes, native	strong (transparency log; powers Certificate Transparency)	weak	medium	strong (purpose-built for distributed audit)	medium
PostgreSQL insert-only + triggers	by convention	by convention (hash chain via trigger)	strong (typed cols + JSONB)	very strong	partial	medium
SQLite + WAL, insert-only	by convention	by convention (hash chain via library)	strong (typed cols + JSON)	trivial — stdlib	per-node, replicate via outbox	tiny — single file
Plain JSONL with hash chain	yes, by construction	strong (one-line-per-record hash chain in pure Python)	enforced by Python validator	trivial	trivial — file replication	none beyond filesystem
Confluent Kafka + Schema Registry	yes (compacted topics ≠ append-only; raw topics yes)	weak	very strong	strong	medium (MirrorMaker)	heavy (KRaft/Zookeeper, brokers, registry)
NATS JetStream + KV	yes	weak	optional	strong	strong (leaf nodes for federation)	small
TigerBeetle	yes	strong (financial ledger)	very narrow (financial transfer schema only)	partial	weak	small but inflexible

4. Schema registry options surveyed

System	Maturity	Python	Drift enforcement	Coupling
Confluent Schema Registry	very mature	yes	strong	tight to Kafka
Apicurio	mature	yes	strong	broker-agnostic
AWS Glue Schema Registry	mature	yes	strong	AWS-only
In-tree JSON Schema files + `jsonschema` library	trivial	trivial	strong (validate at write time)	none — pure Python

5. Four paths forward, each viable

The surveys reveal that no single product solves all three pathologies for federated systems. Four distinct paths are coherent, each making different tradeoffs.

Path A — Pure-Python primitives in axonis-core (no new services)

Shape

Three libraries added to axonis-core:

Schema registry as a directory of versioned JSON Schema files plus a Python validator
Hash-chained coordination ledger backed by JSONL or SQLite (configurable)
Declarative state machine library (data-defined; state derives from ledger)

All pure Python; ~1,000 lines per primitive; no external service dependencies; runs on any Python 3.11+ environment.
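As an illustration of the second primitive, the following is a minimal sketch of a hash-chained ledger using only the standard library. The function names (record_hash, append_event, verify_chain, to_jsonl) and the choice to hash a key-sorted JSON serialisation are assumptions made for this sketch, not a settled axonis-core design.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Hash the canonical JSON form of a record, excluding its own this_hash."""
    body = {k: v for k, v in record.items() if k != "this_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_event(chain: list, event: dict) -> dict:
    """Link a new event to the current head of the chain and append it."""
    prev_hash = chain[-1]["this_hash"] if chain else None
    record = {**event, "prev_hash": prev_hash}
    record["this_hash"] = record_hash(record)
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Return True only if every link and every record hash checks out."""
    prev_hash = None
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["this_hash"] != record_hash(record):
            return False
        prev_hash = record["this_hash"]
    return True


def to_jsonl(chain: list) -> str:
    """Serialise the chain as one JSON object per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in chain)


chain = []
append_event(chain, {"event_id": "evt_001", "schema_id": "composition.requested.v1"})
append_event(chain, {"event_id": "evt_002", "schema_id": "coordination.lease_claimed.v1"})
assert verify_chain(chain)

chain[0]["schema_id"] = "tampered"  # any edit to an earlier record is detected
assert not verify_chain(chain)
```

The sketch detects tampering after the fact; it does not prevent a writer from bypassing the library, which is the disciplined-write caveat listed in the tradeoffs.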

Tradeoffs

✅ Zero operational additions; no Kafka/Temporal/Confluent stack
✅ Edge-deployable — runs on Lookup-Light Edge Nodes
✅ Federation-native — each participant runs its own ledger; cross-participant replication via existing RabbitMQ outbox
✅ All artefacts are JSON / JSONL — auditable with cat and jq
⚠️ Maturity is what we ship — no community ecosystem to lean on
⚠️ Ledger throughput is bounded by single-file or single-SQLite-connection; high-volume nodes may need rotation strategy
⚠️ State-machine library is in-house — semantic precedents (BPMN, statecharts) require reimplementation
⚠️ Audit-quality only as good as the disciplined-write pattern (writers must use the library; bypass is possible)

Implementation surface

Phase 1 (week 1-2): schema registry + 6-8 starter coordination schemas
Phase 2 (week 2-3): hash-chained ledger with JSONL + SQLite backends
Phase 3 (week 3-4): state-machine library + reference workflow
Phase 4 (week 4-6): wire to existing VRS-screening pipeline; ledger-driven replay
Phase 5 (week 6+): cross-node replication via RabbitMQ outbox

Path B — Adopt Temporal (or Restate) + Confluent Schema Registry

Shape

Run Temporal Cluster (workflow + audit) alongside existing services. Run Confluent Schema Registry alongside RabbitMQ for typed event contracts. Services adopt Temporal SDK for workflows and use Schema Registry for cross-service event contracts.

Tradeoffs

✅ Battle-tested at scale — Uber, Snap, DoorDash use Temporal; Confluent is the de facto schema-registry standard
✅ Mature SDKs in many languages — Temporal's Python SDK is well-engineered
✅ Workflow-as-code semantics are well-understood; less in-house design risk
⚠️ Operational footprint — Temporal requires Cassandra or PostgreSQL backend, broker tier, frontend service; Confluent requires registry deployment
⚠️ Federation gap — Temporal is single-cluster by design; cross-cluster federation requires running multiple Temporal Clusters with separate workflow namespaces (no native cross-cluster workflow concept)
⚠️ Edge deployment problematic — Temporal Cluster is far too heavy for a Lookup-Light Edge Node
⚠️ Two new services to operate; new SLAs to maintain
⚠️ Vendor velocity matters — Temporal is well-funded; Restate is younger

Variants

B1: Temporal + Confluent (heavyweight, mature)
B2: Restate + Apicurio (lighter, younger, less proven)
B3: Step Functions + Glue Registry (managed; AWS-only; vendor lock)

Path C — NATS JetStream as coordination substrate, schema files in-tree

Shape

Adopt NATS JetStream as the coordination layer alongside existing RabbitMQ. NATS subjects act as schema namespaces; KV bucket holds composition state. Schema discipline via in-tree JSON Schema files validated at publish time. State machine library similar to Path A.

Tradeoffs

✅ Federation-native — leaf nodes designed for organisation-spanning topologies
✅ Lighter footprint than Kafka — single Go binary; well-instrumented
✅ Combines pub/sub + work queues + KV in one runtime; closest to a "Linda tuple space" feel
⚠️ Two messaging substrates to operate — RabbitMQ (Xanadu) AND NATS, OR migrate Xanadu off RabbitMQ
⚠️ NATS doesn't ship a workflow primitive — state machine still in-house
⚠️ Cryptographic chain for audit is not native; would need to be added
⚠️ Operational learning curve — NATS conventions differ from RabbitMQ

Path D — Layer on existing UDS/Elastic with disciplined append-only conventions

Shape

Treat the existing UDS as the coordination ledger. Add per-event signed-hash chaining as a UDS object property. Schema registry as files in axonis-core. State machine library similar to Path A.

Tradeoffs

✅ Zero new infrastructure — uses what's already deployed
✅ Searchable — Elastic indices are query-rich
⚠️ Elastic is not append-only by construction — append discipline is convention, not enforcement
⚠️ Tamper detection requires post-hoc verification; cannot prevent tampering at the storage layer
⚠️ Heavy footprint — Elastic stays heavy; not Edge-deployable
⚠️ Federation across organisations via Elastic cross-cluster replication is operationally complex

6. Decision criteria

Questions whose answers narrow the choice:

Question	If answer is X, prefer	If answer is Y, prefer
Must coordination work on a 100MB Edge Node?	Path A or C	Path B excluded
Must we operate the coordination layer ourselves?	Path A (least op cost)	Path D OK; B costly
Is federation across organisations a load-bearing requirement?	Path A or C	Path B partial; D weak
Do we need named workflow primitives that already exist?	Path B (Temporal)	Path A (build small library)
What's our tolerance for new services?	Path A or D	Path B (2-3 new services)
What's our tolerance for in-house code?	Path B (least in-house)	Path A (most in-house)
Is rapid time-to-first-running-system the priority?	Path A (4-6 weeks pure Python)	Path B (similar wall-clock but heavier ops)
Is mature community ecosystem important?	Path B	Path A (we ship the maturity)
Is Elastic-the-ledger durable enough for regulator-grade audit?	If yes, Path D	If no, Paths A, B, C

7. Cross-cutting design points (apply to whichever path)

These elements are required regardless of which path is chosen.

7.1 Schema versioning rule

Convention namespace.entity.vN with rules:
- Backward-compatible (additive) changes increment minor version
- Breaking changes require new major version + migration plan
- The validator refuses to write events whose schema_id is not in the registry
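A rough sketch of the refuse-to-write rule follows. The in-memory REGISTRY, the exception names, and the required-field check are hypothetical stand-ins; a real validator would presumably load versioned JSON Schema files and validate against them.

```python
import re

# Matches the namespace.entity.vN convention, e.g. coordination.lease_claimed.v1
SCHEMA_ID = re.compile(r"^[a-z_]+(\.[a-z_]+)+\.v(\d+)$")

# Hypothetical registry: schema_id -> required top-level fields.
REGISTRY = {
    "coordination.lease_claimed.v1": {
        "schema_id", "composition_id", "actor", "auth", "payload",
    },
}


class UnknownSchema(ValueError):
    """The event names a schema_id that the registry does not contain."""


class InvalidEvent(ValueError):
    """The event is malformed for its declared schema."""


def validate_for_write(event: dict) -> None:
    """Raise before the write if the event cannot be accepted."""
    schema_id = event.get("schema_id", "")
    if not SCHEMA_ID.match(schema_id):
        raise InvalidEvent(f"malformed schema_id: {schema_id!r}")
    if schema_id not in REGISTRY:
        raise UnknownSchema(schema_id)
    missing = REGISTRY[schema_id] - set(event)
    if missing:
        raise InvalidEvent(f"missing fields: {sorted(missing)}")
```

With this shape, a producer that starts emitting coordination.lease_claimed.v2 before the schema is registered fails at write time rather than breaking a consumer later.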

7.2 Causal predecessor field

Every cross-service event carries a causal_predecessor field — the event ID this one logically follows. Enables replay and audit reconstruction regardless of which storage path is chosen.
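To make the reconstruction claim concrete, here is a small sketch that walks causal_predecessor links back from any event to the root. The function name causal_chain and the in-memory list of events are assumptions for illustration only.

```python
def causal_chain(events: list, event_id: str) -> list:
    """Return event IDs from the root event down to event_id, in causal order."""
    by_id = {event["event_id"]: event for event in events}
    chain = []
    seen = set()
    current = event_id
    while current is not None:
        if current in seen:
            raise ValueError(f"causal cycle detected at {current}")
        if current not in by_id:
            raise KeyError(f"missing predecessor {current}")
        seen.add(current)
        chain.append(current)
        current = by_id[current]["causal_predecessor"]
    chain.reverse()  # collected leaf-first; return root-first
    return chain
```

A missing predecessor or a cycle is surfaced as an error, which turns a gap in the audit trail into a detectable condition.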

7.3 Composition ID

A workflow instance has a unique composition_id. All events for that workflow carry it. The state machine queries the ledger by composition_id to derive current state.
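A sketch of deriving state from the ledger is shown below. The transition table is a hypothetical fragment using state and schema names from the worked example in §7b; derive_state and UndeclaredTransition are names invented for this illustration.

```python
# Hypothetical transition table: (current_state, schema_id) -> next_state.
TRANSITIONS = {
    (None, "composition.requested.v1"): "requested",
    ("requested", "coordination.lease_claimed.v1"): "lease_open",
    ("lease_open", "coordination.lens_run_started.v1"): "in_progress",
}


class UndeclaredTransition(Exception):
    """An event would move the workflow into a state nobody declared."""


def derive_state(ledger: list, composition_id: str):
    """Fold one composition's events, in ledger order, into its current state."""
    state = None
    for event in ledger:
        if event["composition_id"] != composition_id:
            continue  # other workflows share the ledger
        key = (state, event["schema_id"])
        if key not in TRANSITIONS:
            raise UndeclaredTransition(f"{event['schema_id']} not allowed from {state}")
        state = TRANSITIONS[key]
    return state
```

Because the function keeps no state of its own, running it twice over the same ledger yields the same answer, which is the property the replay claim depends on.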

7.4 Replay semantics

The audit/replay claim requires that re-running events in their stored order yields identical state. This is structural in Paths A, B, C; conventional in Path D.

7.5 Federation outbox

Whichever path is chosen, cross-organisational replication is an outbox pattern over Xanadu/RabbitMQ. The producing node packages event slices (with verification metadata); the receiving node imports and verifies. The path choice affects what verification is possible.
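For a hash-chained ledger, the package-and-verify step might look like the sketch below. The envelope fields (anchor_hash, head_hash), the function names, and the hashing scheme are assumptions for illustration; transport over Xanadu/RabbitMQ is out of scope here.

```python
import hashlib
import json
from typing import Optional


def record_hash(record: dict) -> str:
    """Hash the canonical JSON form of a record, excluding its own this_hash."""
    body = {k: v for k, v in record.items() if k != "this_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def package_slice(chain: list, start: int) -> dict:
    """Producer side: bundle events from index start with verification metadata."""
    events = chain[start:]
    return {
        "anchor_hash": events[0]["prev_hash"],  # hash the slice must attach to
        "head_hash": events[-1]["this_hash"],   # expected hash of the last event
        "events": events,
    }


def import_slice(local_head: Optional[str], envelope: dict) -> str:
    """Receiver side: verify the slice and return the new local head hash."""
    if envelope["anchor_hash"] != local_head:
        raise ValueError("slice does not attach to the local head")
    prev_hash = local_head
    for record in envelope["events"]:
        if record["prev_hash"] != prev_hash:
            raise ValueError(f"broken link at {record['event_id']}")
        if record["this_hash"] != record_hash(record):
            raise ValueError(f"hash mismatch at {record['event_id']}")
        prev_hash = record["this_hash"]
    if prev_hash != envelope["head_hash"]:
        raise ValueError("head hash mismatch")
    return prev_hash
```

The receiver needs only the hash of the last event it already holds; it never needs access to the producer's storage, which is in keeping with the federation invariant.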

7a. Authorization, identity, and lease primitives

The coordination ledger composes with — does not replace — Axonis's existing authorization model. Per Invariant 1, UDS is the sole ABAC authority. The coordination layer records what was attempted, who attempted it, and what UDS decided; it does not make authorization decisions.

This section describes how four orthogonal concerns (agents, leasing, JWT-bearing actions, ABAC/RBAC) fit on top of any of the four paths.

7a.1 Agents as first-class actors

Agents (LLM-driven workers, autonomous task workers, or any non-human actor) emit coordination events identically to services. The schema registry includes a base coordination.actor.v1 schema describing actor metadata, which agent-specific schemas extend.

Required actor fields on every coordination event:

actor.type           : enum(service, agent, human)
actor.id             : str (canonical identifier)
actor.version        : semver (for replay determinism)

Agent-specific extension fields:

agent.model_id          : str (e.g. "claude-opus-4-7")
agent.prompt_hash       : SHA-256 of the prompt template + context
agent.temperature       : float
agent.seed              : int (when set; null otherwise)
agent.response_hash     : SHA-256 of the agent's output
agent.tool_call_chain   : list of tool invocations within the action

Replay determinism for agent events: the ledger record includes agent.response_hash. On replay, the cached response identified by that hash is re-used; the model is not re-invoked. This preserves replay determinism without requiring agent runtimes to be deterministic.
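One way to read this is as a content-addressed response cache. In the sketch below, ResponseCache, agent_response, and the invoke_model callable are hypothetical names; where cached responses are actually stored is not specified in this document.

```python
import hashlib


def sha256_tag(text: str) -> str:
    """Format a SHA-256 digest the way the ledger examples do."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


class ResponseCache:
    """Hypothetical content-addressed store for agent outputs."""

    def __init__(self):
        self._by_hash = {}

    def put(self, response: str) -> str:
        tag = sha256_tag(response)
        self._by_hash[tag] = response
        return tag

    def get(self, tag: str) -> str:
        response = self._by_hash[tag]
        if sha256_tag(response) != tag:
            raise ValueError("cached response does not match recorded hash")
        return response


def agent_response(event: dict, cache: ResponseCache, invoke_model=None, replaying=False) -> str:
    """Call the model on first execution; serve the recorded output on replay."""
    if replaying:
        return cache.get(event["agent"]["response_hash"])
    response = invoke_model(event)
    event["agent"]["response_hash"] = cache.put(response)
    return response
```

On replay the model callable is never touched, so a non-deterministic model cannot change the replayed outcome.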

Invariant 6 alignment ("AI assists, humans attest"): the state machine declares source constraints on transitions. Transitions to terminal states (attested, closed) are restricted to actor.type: human. Agents may propose; only humans freeze.
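The source constraint can be sketched as a check inside the transition function. The table below is a hypothetical fragment, and advance is a name made up for this example.

```python
TERMINAL_STATES = {"attested", "closed"}

# Hypothetical fragment of a transition table: (state, schema_id) -> next state.
TRANSITIONS = {
    ("results_available", "coordination.lease_released.v1"): "lease_released",
    ("lease_released", "composition.closed.v1"): "closed",
}


def advance(state: str, event: dict) -> str:
    """Apply one event, refusing non-human actors on terminal transitions."""
    target = TRANSITIONS.get((state, event["schema_id"]))
    if target is None:
        raise ValueError(f"{event['schema_id']} is not declared from {state}")
    if target in TERMINAL_STATES and event["actor"]["type"] != "human":
        raise PermissionError(f"only a human may move a workflow to {target}")
    return target
```

An agent emitting composition.closed.v1 is rejected by the state machine even if its event is otherwise schema-valid.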

7a.2 Leasing — explicit lifecycle as state-machine transitions

Leasing maps directly to ledger-recorded state transitions. No separate lease store; current lease state is derived from the ledger.

Standard lease lifecycle:

States:        unclaimed → leased → in_progress → complete
                                ↓
                          lease_expired → unclaimed (retry)

Transitions:
  unclaimed → leased
    on_event:  coordination.lease_claimed.v1
    payload:   { lease_holder, jwt_jti, lease_window_seconds, claimed_at }

  leased → in_progress
    on_event:  coordination.work_started.v1

  leased → lease_expired
    on_event:  coordination.lease_expired.v1
    triggered: by timer service when (claimed_at + window) < now
                  AND no work_completed event observed

  lease_expired → unclaimed
    on_event:  coordination.lease_released.v1

Properties:
- Lease state derives from the ledger; there is no separate lease table
- Lease expiry is itself a recorded event (durable, audit-quality)
- Concurrent-claim prevention via state machine refusal

Path-specific lease implementation:
- Path A (pure Python): small timer worker reads ledger, emits expiry events (~100 LOC)
- Path B (Temporal/Restate): native; Temporal's durable workflow timers are the canonical lease mechanism
- Path C (NATS JetStream): KV bucket TTL provides the timer natively
- Path D (UDS/Elastic): scheduled job sweeps for stale leases
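For Path A, the core of the timer worker could be as small as the sketch below. It scans the ledger and returns the expiry events that should be emitted. The schema ID coordination.work_completed.v1 and the shape of the emitted expiry event are assumptions; only the trigger condition comes from the transition definition above.

```python
from datetime import datetime, timedelta, timezone

CLAIMED = "coordination.lease_claimed.v1"
EXPIRED = "coordination.lease_expired.v1"
COMPLETED = "coordination.work_completed.v1"  # assumed schema id


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp with a trailing Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def expiry_events(ledger: list, now: datetime) -> list:
    """Return lease_expired events for claims whose window has elapsed."""
    open_claims = {}  # composition_id -> most recent unsettled claim
    for event in ledger:
        cid = event["composition_id"]
        if event["schema_id"] == CLAIMED:
            open_claims[cid] = event
        elif event["schema_id"] in (COMPLETED, EXPIRED):
            open_claims.pop(cid, None)  # completed or already expired

    expired = []
    for cid, claim in open_claims.items():
        payload = claim["payload"]
        window = timedelta(seconds=payload["lease_window_seconds"])
        if parse_ts(payload["claimed_at"]) + window < now:
            expired.append({
                "schema_id": EXPIRED,
                "composition_id": cid,
                "causal_predecessor": claim["event_id"],
                "payload": {"lease_holder": payload["lease_holder"]},
            })
    return expired


now = datetime(2026, 4, 30, 14, 31, tzinfo=timezone.utc)  # example clock reading
```

Treating an existing lease_expired event as settling the claim keeps the worker idempotent: a second sweep over the same ledger does not emit a duplicate expiry.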

7a.3 JWT-bearing actions

Coordination events carry identity context as a required header field, enforced by schema registry:

{
  "schema_id": "coordination.lease_claimed.v1",
  "actor": {
    "type": "agent",
    "id": "agent_vrs_screener_01",
    "version": "1.4.2"
  },
  "auth": {
    "jwt_jti": "abc123...",
    "subject": "user_smith@firm.example",
    "scopes": ["fusion_operator", "lens_authoring"],
    "issuer": "idp.firm.example",
    "issued_at": "2026-04-30T10:00:00Z",
    "expires_at": "2026-04-30T11:00:00Z",
    "delegation_chain": ["service_acct_axonis", "user_smith@firm.example"]
  },
  "payload": { ... }
}

What this gives:
- Audit-grade record of who acted under which credential at what time
- Replay can verify the JWT was valid at the original time, not at replay time (since tokens expire)
- Repudiation defence: chained hash + signed JWT-issuance metadata makes "I didn't do that" claims structurally verifiable

JWT verification location: at the service boundary, not at the coordination-layer write. The ledger records the JWT context; it does not validate the JWT signature itself. Validation happens once at the service that consumes the request; the verified claims are then carried in the event.
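The validity-at-original-time check can be sketched against the recorded fields alone. The function name below is hypothetical, and the event timestamp field ts_utc is taken from the worked example in §7b; no signature is checked, in keeping with the division of responsibility described above.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp with a trailing Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def jwt_was_valid_at_event_time(event: dict) -> bool:
    """Compare the token window with the event time, not the current time."""
    auth = event["auth"]
    acted_at = parse_ts(event["ts_utc"])
    return parse_ts(auth["issued_at"]) <= acted_at < parse_ts(auth["expires_at"])


event = {
    "ts_utc": "2026-04-30T14:00:00.000Z",
    "auth": {
        "issued_at": "2026-04-30T13:55:00Z",
        "expires_at": "2026-04-30T15:55:00Z",
    },
}
assert jwt_was_valid_at_event_time(event)
```

A replay run months later still passes this check, because the wall clock at replay time never enters the comparison.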

7a.4 ABAC and RBAC integration

The coordination ledger does not enforce authorization. UDS does (Invariant 1). The flow:

1. Actor presents request + JWT to a service
2. Service verifies JWT signature and basic claims
3. Service calls UDS for ABAC evaluation
   - subject attributes (from JWT claims + UDS profile lookup)
   - requested action (from event schema_id)
   - resource attributes (from event payload)
   - environmental attributes (time, geographic, classification level)
4a. UDS returns ALLOW → service emits the coordination event; ledger records it
4b. UDS returns DENY → service emits coordination.access_denied.v1 → ledger records the denial
5. State machine validates the event against current state; advances or rejects
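Steps 3 to 4b can be sketched as a single service-side function. Everything named here is a placeholder: uds_evaluate stands in for whatever client call UDS exposes, claims is the output of step 2, ledger is any object with an append method, and denied_schema_id is an invented field. The audit_denials flag is an assumption that lets a deployment choose whether denials are recorded.

```python
def handle_request(event: dict, claims: dict, uds_evaluate, ledger, audit_denials: bool = True) -> str:
    """Ask UDS for a decision, then record the outcome on the ledger."""
    decision = uds_evaluate(
        subject=claims,              # from the verified JWT (step 2)
        action=event["schema_id"],   # requested action
        resource=event["payload"],   # resource attributes
    )
    if decision == "ALLOW":
        ledger.append({**event, "abac_decision": {"outcome": "ALLOW"}})
    elif audit_denials:
        ledger.append({
            "schema_id": "coordination.access_denied.v1",
            "composition_id": event["composition_id"],
            "denied_schema_id": event["schema_id"],  # hypothetical field
            "actor": event["actor"],
            "abac_decision": {"outcome": "DENY"},
        })
    return decision
```

The function never computes a decision itself; it only relays and records what UDS returned.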

Two patterns supported:

Deny-by-omission — when ABAC denies and audit of denials is not regulator-required, no domain event is emitted. The ledger only records denials when explicit auditing is required.

Authorization-as-context — the auth block in each event lets downstream services re-verify claims without re-querying UDS. Reduces UDS load; verifies independently.

Schema-side enforcement of authorization context:
- Each schema declares required_scopes in registry metadata (not event payload)
- Validator at write time checks the actor's JWT contains required scopes
- Mismatch is wire-layer rejection (drift prevention extends to access control)
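A sketch of the write-time scope check follows. The REQUIRED_SCOPES mapping is illustrative registry metadata with scope names borrowed from the example events; check_scopes is a hypothetical name.

```python
# Hypothetical registry metadata: schema_id -> scopes the actor's JWT must carry.
REQUIRED_SCOPES = {
    "coordination.lease_claimed.v1": {"lease_claim"},
    "coordination.evidence_block_emitted.v1": {"evidence_emit"},
}


def check_scopes(event: dict) -> None:
    """Reject the write when the recorded JWT scopes miss a required scope."""
    required = REQUIRED_SCOPES.get(event["schema_id"], set())
    missing = required - set(event["auth"]["scopes"])
    if missing:
        raise PermissionError(
            f"{event['schema_id']} requires scopes {sorted(missing)}"
        )
```

This check works on the claims already carried in the event; it complements the UDS decision and does not replace it.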

7a.5 SSO topology — single vs federated

Axonis deployments use one of two SSO topologies. The coordination layer supports both, with no path-specific differences.

Single SSO (one identity provider issues all JWTs):

All services share one trusted issuer (customer's Okta, Azure AD, Auth0, or Axonis-hosted IdP)
JWT verification config is uniform across services
Cross-service claims propagate without re-issuance — original JWT is forwarded in event headers
Long-running workflows: token expiry is handled by either (a) refresh against the same IdP, or (b) workflow pause until human re-auth
This is the typical configuration for single-firm deployments (Citi alone, Disney alone)

Federated SSO (multiple IdPs across organisational boundaries):

Each participating organisation runs its own IdP
Trust is established between IdPs via one of:
    OIDC federation (one IdP trusts another's tokens directly)
    SAML federation
    mTLS-bound JWTs (the certificate binds the JWT to the issuer infrastructure)
    Per-edge issuance (federation hub issues short-lived JWTs scoped to a single composition)
Coordination event auth.issuer field disambiguates which IdP issued the JWT
Cross-org events go through the federation outbox (§7.5); receiver verifies the JWT signature against the trusted IdP's public key
Long-running cross-org workflows: each organisation's portion uses its own IdP's JWT; the workflow ledger records the chain
This is the typical configuration for multi-firm scenarios (VRS + regulated firm; Disney + Hulu + ESPN; defence partners)

JWT freshness during cross-org workflows:
- Each organisation's contribution to a composition uses a JWT issued by that organisation's IdP
- The composition's auth.delegation_chain records the originating user + each acting service account
- If a JWT expires mid-workflow, the next event in that org's portion requires either re-auth (human-driven) or a refreshed service-account JWT (system-driven). Either way the transition is recorded.

7a.6 What the coordination layer does and does not do for authorization

Does:
- Record full identity context on every event
- Enforce required scopes via schema registry (drift prevention)
- Preserve replay-quality audit including who-did-what-under-which-credential
- Support both single and federated SSO without path-specific changes

Does not:
- Validate JWT signatures (service boundary's job)
- Make ABAC decisions (UDS's job per Invariant 1)
- Establish cross-organisational trust (IdP federation configuration)
- Refresh tokens or manage credential lifecycle (auth-domain responsibility)
- Replace RBAC primitives (RBAC layered on top of ABAC at the service tier)

7a.7 Open questions specific to authorization integration

These are in addition to §9's path-selection questions:

Single SSO or federated SSO for the first reference implementation? (Determines whether auth.issuer field disambiguation is exercised in P0.)
Are denied-access events themselves required to be auditable, or is deny-by-omission acceptable? (Affects whether coordination.access_denied.v1 is emitted by default.)
Is the auth.delegation_chain field load-bearing for any current customer scenario, or is it future-proofing? (Affects required-vs-optional in the schema.)
Token-refresh strategy for long workflows: human re-auth, service-account substitution, or pause-and-resume? (Doesn't affect path choice, but affects state-machine design.)
Capability-based delegation (narrow JWT scopes for agent actions) — required now or later? (Compatible with all paths; design decision.)

7b. Worked example — agent-driven VRS screening with federated SSO

This walks through one composition end-to-end, showing how the primitives compose. Setting: Firm A (Citi UK) is screening a customer against the VRS register. Federated SSO — Citi's IdP issues human user JWTs; VRS Ltd's IdP issues participant tokens. An agent (agent_vrs_screener_01) drives the screening lifecycle. Composition state advances through 6 declared states; 8 coordination events land on the ledger.

Assumptions for the example:
- Path A (pure-Python, JSONL ledger) for concreteness; same shape applies to other paths
- Composition ID: cmp_01HZQ4XY2N7K8R3V5W6P9T2QSL
- Schema registry contains every schema referenced
- Hash chain uses SHA-256

7b.1 Sequence

State machine          Event                                          Actor
─────────────────      ─────────────────────────────────────────      ─────────────────
requested              composition.requested.v1                       human (Citi user)
                       ↓
lease_open             coordination.lease_claimed.v1                  agent (claims work)
                       ↓
in_progress            coordination.lens_run_started.v1               agent
                       ↓
                       coordination.psi_round_completed.v1            firm + VRS (federated)
                       ↓
                       coordination.lens_run_completed.v1             agent
                       ↓
evidence_emitted       coordination.evidence_block_emitted.v1         agent
                       ↓
results_available      coordination.matches_published.v1              service (firm Beacon)
                       ↓
lease_released         coordination.lease_released.v1                 agent
                       ↓
closed                 composition.closed.v1                          human (Citi user) — attests

7b.2 Event 1 — composition.requested.v1 (human-initiated, Firm SSO)

{
  "event_id": "evt_01HZQ4XY2N7K8R3V5W6P9T2QSL_001",
  "schema_id": "composition.requested.v1",
  "composition_id": "cmp_01HZQ4XY2N7K8R3V5W6P9T2QSL",
  "ts_utc": "2026-04-30T14:00:00.000Z",
  "causal_predecessor": null,
  "actor": {
    "type": "human",
    "id": "user_smith@citi.example",
    "version": null
  },
  "auth": {
    "jwt_jti": "jti_8f3a2b1c4d5e6f70",
    "subject": "user_smith@citi.example",
    "scopes": ["fusion_operator", "vrs_screening_request"],
    "issuer": "idp.citi.example",
    "issued_at": "2026-04-30T13:55:00Z",
    "expires_at": "2026-04-30T15:55:00Z",
    "delegation_chain": ["user_smith@citi.example"]
  },
  "payload": {
    "customer_internal_ref": "FIRM-CUST-789012",
    "lens_id": "vrs_alerts_v2_equivalent",
    "lens_version": "2.0.0",
    "screening_purpose": "fca_consumer_duty_vulnerability_check"
  },
  "abac_decision": {
    "outcome": "ALLOW",
    "evaluated_at": "2026-04-30T14:00:00.012Z",
    "uds_eval_id": "ueval_4xZ8m"
  },
  "prev_hash": null,
  "this_hash": "sha256:9c2a8b...e4f1"
}

7b.3 Event 2 — coordination.lease_claimed.v1 (agent picks up work)

{
  "event_id": "evt_01HZQ4XY2N7K8R3V5W6P9T2QSL_002",
  "schema_id": "coordination.lease_claimed.v1",
  "composition_id": "cmp_01HZQ4XY2N7K8R3V5W6P9T2QSL",
  "ts_utc": "2026-04-30T14:00:00.250Z",
  "causal_predecessor": "evt_01HZQ4XY2N7K8R3V5W6P9T2QSL_001",
  "actor": {
    "type": "agent",
    "id": "agent_vrs_screener_01",
    "version": "1.4.2"
  },
  "agent": {
    "model_id": "claude-opus-4-7",
    "prompt_hash": "sha256:1f4e6c...a9b3",
    "temperature": 0.0,
    "seed": 42,
    "response_hash": "sha256:7a3d8f...c5e1",
    "tool_call_chain": ["claim_screening_task"]
  },
  "auth": {
    "jwt_jti": "jti_a1b2c3d4e5f60718",
    "subject": "service_acct_axonis_screener",
    "scopes": ["fusion_operator", "lease_claim"],
    "issuer": "idp.axonis.internal",
    "issued_at": "2026-04-30T14:00:00Z",
    "expires_at": "2026-04-30T15:00:00Z",
    "delegation_chain": [
      "service_acct_axonis_screener",
      "user_smith@citi.example"
    ]
  },
  "payload": {
    "lease_holder": "agent_vrs_screener_01",
    "lease_window_seconds": 1800,
    "claimed_at": "2026-04-30T14:00:00.250Z",
    "lease_expires_at": "2026-04-30T14:30:00.250Z"
  },
  "abac_decision": {
    "outcome": "ALLOW",
    "evaluated_at": "2026-04-30T14:00:00.255Z",
    "uds_eval_id": "ueval_4xZ8n"
  },
  "prev_hash": "sha256:9c2a8b...e4f1",
  "this_hash": "sha256:b73e1f...8d9c"
}

Notes on this event:
- Agent identity is rich: actor.id + agent.model_id + agent.prompt_hash + agent.response_hash together pin the action for replay
- auth.delegation_chain records that the service account is acting on behalf of the original Citi user; the full chain is preserved for repudiation defence
- abac_decision.uds_eval_id is the UDS authorization-evaluation receipt: UDS made the decision; the ledger records it
- Lease state derives from this event; there is no separate lease store

7b.4 Event 4 — coordination.psi_round_completed.v1 (federated, two issuers in play)

{
  "event_id": "evt_01HZQ4XY2N7K8R3V5W6P9T2QSL_004",
  "schema_id": "coordination.psi_round_completed.v1",
  "composition_id": "cmp_01HZQ4XY2N7K8R3V5W6P9T2QSL",
  "ts_utc": "2026-04-30T14:00:38.412Z",
  "causal_predecessor": "evt_01HZQ4XY2N7K8R3V5W6P9T2QSL_003",
  "actor": {
    "type": "service",
    "id": "parallax.fusion_pipeline",
    "version": "1.13.0"
  },
  "auth": {
    "jwt_jti": "jti_a1b2c3d4e5f60718",
    "subject": "service_acct_axonis_screener",
    "scopes": ["fusion_operator", "psi_participant"],
    "issuer": "idp.axonis.internal",
    "delegation_chain": [
      "service_acct_axonis_screener",
      "user_smith@citi.example"
    ]
  },
  "payload": {
    "psi_protocol": "dh-rfc3526-group14",
    "rounds": 2,
    "node_a_id": "vrs_register_node",
    "node_a_issuer": "idp.vrs.example",
    "node_a_jwt_jti": "jti_vrs_71e2f8c4d6b09a35",
    "node_b_id": "firm_node_citi_uk",
    "node_b_issuer": "idp.citi.example",
    "node_b_jwt_jti": "jti_8f3a2b1c4d5e6f70",
    "set_a_size": 5000,
    "set_b_size": 100010,
    "intersection_size": 4995,
    "raw_records_transmitted": 0
  },
  "abac_decision": {
    "outcome": "ALLOW",
    "evaluated_at": "2026-04-30T14:00:38.418Z",
    "uds_eval_id": "ueval_4xZ8q"
  },
  "prev_hash": "sha256:c891fe...2a4b",
  "this_hash": "sha256:e2b740...6f3e"
}

Notes on this event:
- Federated SSO is visible in payload.node_a_issuer (VRS IdP) vs payload.node_b_issuer (Citi IdP): two participants, two issuers
- Each side's JWT JTI is recorded for replay / repudiation defence
- payload.raw_records_transmitted: 0 is a structural assertion of the privacy invariant, verifiable on replay
- This event's actor is a service (the fusion pipeline orchestrator), distinct from the agent that claimed the lease

7b.5 Event 6 — coordination.evidence_block_emitted.v1 (frozen, signed evidence)

{
  "event_id": "evt_01HZQ4XY2N7K8R3V5W6P9T2QSL_006",
  "schema_id": "coordination.evidence_block_emitted.v1",
  "composition_id": "cmp_01HZQ4XY2N7K8R3V5W6P9T2QSL",
  "ts_utc": "2026-04-30T14:00:39.115Z",
  "causal_predecessor": "evt_01HZQ4XY2N7K8R3V5W6P9T2QSL_005",
  "actor": {
    "type": "agent",
    "id": "agent_vrs_screener_01",
    "version": "1.4.2"
  },
  "agent": {
    "model_id": "claude-opus-4-7",
    "prompt_hash": "sha256:bcd421...f817",
    "temperature": 0.0,
    "seed": 42,
    "response_hash": "sha256:e9a0c7...4d12",
    "tool_call_chain": ["emit_evidence_block"]
  },
  "auth": {
    "jwt_jti": "jti_a1b2c3d4e5f60718",
    "subject": "service_acct_axonis_screener",
    "scopes": ["fusion_operator", "evidence_emit"],
    "issuer": "idp.axonis.internal",
    "delegation_chain": [
      "service_acct_axonis_screener",
      "user_smith@citi.example"
    ]
  },
  "payload": {
    "block_id": "blk_01HZQ4XY...evidence",
    "lens_id": "vrs_alerts_v2_equivalent",
    "lens_version": "2.0.0",
    "query_hash": "sha256:7f1e3a...9c5b",
    "result_hash": "sha256:4d8b62...e1a0",
    "match_count": 4995,
    "false_positives": 9,
    "false_negatives": 1,
    "f1": 0.999,
    "frozen_at": "2026-04-30T14:00:39.115Z",
    "view_hint": {
      "component": "cluster_card",
      "layout_type": "evidence_panel",
      "primary_field": "match_status"
    }
  },
  "abac_decision": {
    "outcome": "ALLOW",
    "evaluated_at": "2026-04-30T14:00:39.120Z",
    "uds_eval_id": "ueval_4xZ8s"
  },
  "prev_hash": "sha256:1aff85...cb20",
  "this_hash": "sha256:48d3e7...f192"
}

Notes on this event:
- payload.query_hash and payload.result_hash are SPEC-33 evidence-block fields; the coordination ledger carries them by reference
- payload.view_hint is the SPEC-33 ViewHint contract embedded in the event payload (Beacon dispatcher reads this to pick the renderer)
- The block is frozen: frozen_at is set; prev_hash + this_hash make tampering detectable

7b.6 Event 8 — composition.closed.v1 (human attestation, terminal state)

{
  "event_id": "evt_01HZQ4XY2N7K8R3V5W6P9T2QSL_008",
  "schema_id": "composition.closed.v1",
  "composition_id": "cmp_01HZQ4XY2N7K8R3V5W6P9T2QSL",
  "ts_utc": "2026-04-30T14:08:22.840Z",
  "causal_predecessor": "evt_01HZQ4XY2N7K8R3V5W6P9T2QSL_007",
  "actor": {
    "type": "human",
    "id": "user_smith@citi.example",
    "version": null
  },
  "auth": {
    "jwt_jti": "jti_8f3a2b1c4d5e6f70",
    "subject": "user_smith@citi.example",
    "scopes": ["edition_attest", "vrs_screening_review"],
    "issuer": "idp.citi.example",
    "delegation_chain": ["user_smith@citi.example"]
  },
  "payload": {
    "decision": "matches_attested_for_action",
    "attested_at": "2026-04-30T14:08:22.840Z",
    "edition_id": "edition_01HZQ4XY...att",
    "evidence_block_ref": "blk_01HZQ4XY...evidence"
  },
  "abac_decision": {
    "outcome": "ALLOW",
    "evaluated_at": "2026-04-30T14:08:22.845Z",
    "uds_eval_id": "ueval_4xZ8u"
  },
  "prev_hash": "sha256:7c3e0a...d4b8",
  "this_hash": "sha256:a195fe...3c2d"
}

Notes on this event:
- The terminal state closed is reached only by a human-initiated transition; Inv 6 is enforced by state-machine schema metadata declaring transitions[*].source: human for transitions into terminal states
- The attestation references the evidence block by ID, so the chain of custody is mechanical: composition → events → evidence_block → frozen result hash
- The user's JWT is the same jti_8f3a2b1c4d5e6f70 as in Event 1 (still within the 2-hour validity window); no token refresh was needed for this short workflow
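A minimal sketch of how the source: human rule on terminal transitions might be enforced. It is not the axonis_core.statemachine API. The Transition dataclass, the three-state workflow, the state names, and the evidence_block_emitted.v1 schema id are hypothetical placeholders; the real workflow declares six states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    from_state: str
    schema_id: str
    to_state: str
    source: str  # who may trigger it: "human", "agent", or "any"


# Hypothetical three-state workflow, reduced for illustration.
TRANSITIONS = [
    Transition("start", "composition.requested.v1", "requested", "human"),
    Transition("requested", "evidence_block_emitted.v1", "evidence_ready", "any"),
    Transition("evidence_ready", "composition.closed.v1", "closed", "human"),
]
TERMINAL_STATES = {"closed"}


def check_declaration(transitions, terminal_states):
    """Reject a workflow definition that lets a non-human actor enter a terminal state."""
    for t in transitions:
        if t.to_state in terminal_states and t.source != "human":
            raise ValueError(f"non-human transition into terminal state: {t}")


def replay_state(events, transitions=TRANSITIONS, initial="start"):
    """Fold (schema_id, actor_type) pairs through the declared transitions."""
    table = {(t.from_state, t.schema_id): t for t in transitions}
    state = initial
    for schema_id, actor_type in events:
        t = table.get((state, schema_id))
        if t is None:
            raise ValueError(f"undeclared transition from {state!r} on {schema_id!r}")
        if t.source != "any" and t.source != actor_type:
            raise ValueError(f"{schema_id!r} requires a {t.source} actor, got {actor_type!r}")
        state = t.to_state
    return state


check_declaration(TRANSITIONS, TERMINAL_STATES)

valid_path = [
    ("composition.requested.v1", "human"),
    ("evidence_block_emitted.v1", "agent"),
    ("composition.closed.v1", "human"),
]
assert replay_state(valid_path) == "closed"
```

The rule is checked twice in this sketch: once when the workflow is declared, and again on replay, so an agent-emitted composition.closed.v1 would be rejected rather than silently accepted.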

7b.7 What replay of this composition produces

from axonis_core.ledger import Ledger

ledger = Ledger.open("~/axonis/ledger/composition_events.jsonl")

# Verify chain integrity
ledger.verify()                                        # raises if any prev_hash mismatch

# Replay this composition
events = list(ledger.replay("cmp_01HZQ4XY2N7K8R3V5W6P9T2QSL"))
assert len(events) == 8
assert events[0].schema_id == "composition.requested.v1"
assert events[-1].schema_id == "composition.closed.v1"

# Verify state machine arrived at terminal state by valid path
from axonis_core.statemachine import vrs_screening_workflow
final_state = vrs_screening_workflow.replay_state(events)
assert final_state == "closed"

# Verify privacy invariant
psi_event = next(e for e in events if e.schema_id == "coordination.psi_round_completed.v1")
assert psi_event.payload["raw_records_transmitted"] == 0

# Verify federated SSO chain
human_events = [e for e in events if e.actor.type == "human"]
assert all(e.auth.issuer == "idp.citi.example" for e in human_events)
assert psi_event.payload["node_a_issuer"] == "idp.vrs.example"
assert psi_event.payload["node_b_issuer"] == "idp.citi.example"

# Verify all ABAC decisions were ALLOW
assert all(e.abac_decision["outcome"] == "ALLOW" for e in events)

Each assertion turns a regulator-defensible claim into a mechanical check: chain integrity and replay, state-machine validity, the privacy invariant, federated SSO posture, and ABAC outcomes. The same eight events supply audit evidence for GDPR Art. 30 (ROPA), Art. 32 (security of processing), and Art. 35 (DPIA replay), and for FCA Consumer Duty PRIN 2A audit obligations.
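Replay also depends on the causal_predecessor links carried by each event. The sketch below shows one way an auditor's script might rebuild event order from those links alone. It assumes a single linear chain per composition and a root event whose causal_predecessor is null; neither assumption is confirmed by this document, and order_by_causality is a hypothetical helper, not part of axonis_core.

```python
def order_by_causality(events):
    """Return events in causal order, assuming one linear chain per composition."""
    by_predecessor = {}
    for event in events:
        pred = event["causal_predecessor"]
        if pred in by_predecessor:
            raise ValueError(f"two events claim predecessor {pred!r}")
        by_predecessor[pred] = event

    ordered = []
    current = by_predecessor.get(None)  # root event has no predecessor (assumed)
    while current is not None:
        ordered.append(current)
        current = by_predecessor.get(current["event_id"])

    if len(ordered) != len(events):
        raise ValueError("events do not form one unbroken causal chain")
    return ordered


# Illustrative events with shortened, hypothetical IDs, supplied out of order.
shuffled = [
    {"event_id": "evt_003", "causal_predecessor": "evt_002"},
    {"event_id": "evt_001", "causal_predecessor": None},
    {"event_id": "evt_002", "causal_predecessor": "evt_001"},
]
ordered_ids = [e["event_id"] for e in order_by_causality(shuffled)]
assert ordered_ids == ["evt_001", "evt_002", "evt_003"]
```

A missing or forked link raises instead of yielding a partial order, which keeps reconstruction mechanical rather than forensic.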

7b.8 What this example demonstrates about each primitive

Primitive	Demonstrated by
Schema registry (drift)	Every event has `schema_id`; validator enforces shape at write
Hash-chained ledger (audit)	`prev_hash` + `this_hash` chain; `ledger.verify()` is mechanical
State machine (no off-rails)	6 states × declared transitions; `replay_state()` arrives at `closed` only via valid path
Agents as actors (Inv 6)	Agent emits propose-events; only human emits `composition.closed.v1`
Lease lifecycle	`lease_claimed` → `lease_released` events with explicit window
JWT actions	Every event has `auth` block with JWT JTI, scopes, delegation chain
ABAC integration	Every event records `abac_decision` from UDS; ledger doesn't decide
Federated SSO	Two issuers visible in payload (`idp.citi.example` + `idp.vrs.example`)
ViewHint (SPEC-33)	Embedded in `evidence_block_emitted` payload
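The first row, validation at write time, can be illustrated with a minimal in-memory registry. This is a sketch under stated assumptions: REGISTRY, SchemaError, and validate_payload are hypothetical names, and a real registry would likely use a schema language rather than a dict of Python types. The field names are taken from the composition.closed.v1 payload in Event 8.

```python
# Hypothetical in-memory registry: schema_id -> {payload field: expected Python type}.
REGISTRY = {
    "composition.closed.v1": {
        "decision": str,
        "attested_at": str,
        "edition_id": str,
        "evidence_block_ref": str,
    },
}


class SchemaError(ValueError):
    """Raised when a payload does not match its registered schema."""


def validate_payload(schema_id: str, payload: dict) -> None:
    """Reject unregistered schemas, missing or unexpected fields, and wrong types."""
    schema = REGISTRY.get(schema_id)
    if schema is None:
        raise SchemaError(f"unregistered schema_id: {schema_id}")
    missing = sorted(set(schema) - set(payload))
    unexpected = sorted(set(payload) - set(schema))
    if missing or unexpected:
        raise SchemaError(f"{schema_id}: missing={missing} unexpected={unexpected}")
    for field, expected_type in schema.items():
        if not isinstance(payload[field], expected_type):
            raise SchemaError(f"{schema_id}: {field} must be {expected_type.__name__}")


# The Event 8 payload passes; elided values are kept exactly as they appear above.
closed_payload = {
    "decision": "matches_attested_for_action",
    "attested_at": "2026-04-30T14:08:22.840Z",
    "edition_id": "edition_01HZQ4XY...att",
    "evidence_block_ref": "blk_01HZQ4XY...evidence",
}
validate_payload("composition.closed.v1", closed_payload)
```

A producer that renames evidence_block_ref would fail here, at write time, instead of breaking a consumer in production, which is the drift case from the problem statement.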

8. Cost / time / risk summary

Path	Implementation cost	Operational cost	Federation risk	Maturity risk
A Pure Python in axonis-core	medium (4-6 weeks focused)	low (no new services)	low (federation-native)	medium (in-house code)
B Temporal + Confluent	medium-high (similar coding + op setup)	high (2-3 new services)	high (single-cluster pattern)	low (proven ecosystems)
C NATS JetStream + in-tree schemas	medium-high	medium (one new service)	low (NATS federation-native)	medium (newer ecosystem at scale)
D Existing UDS/Elastic	low (mostly conventions + library)	none added	medium (cross-cluster Elastic ops)	medium (audit-quality contention)

9. Open questions for the team to resolve before deciding

What is the regulator-grade audit standard the customer requires? (Determines whether tamper-evidence at the storage layer is mandatory; if it is, Path D is excluded.)
Is the coordination layer expected to run on a Lookup-Light Edge Node, or only on full-Lens Edge Nodes? (If it must run on Lookup-Light Edge Nodes, Path B is excluded.)
What's the team's tolerance for adopting a second messaging substrate (NATS) alongside RabbitMQ? (Determines viability of Path C.)
What's the team's tolerance for in-house libraries vs adopted frameworks? (Affects the A vs B trade-off.)
How critical is workflow-authoring tooling (UI, BPMN modeller) for this iteration vs later? (B brings these; A defers.)
What customer scenario will be the first reference implementation: VRS screening, multi-INT cross-cue, or Disney 5-way? (Affects which workflow patterns the state machine needs to handle.)
Is there a target date by which the coordination layer must be live in front of a customer? (Affects implementation-speed weighting.)

10. What this document does not do

Does not recommend a path
Does not assume one of the paths is already partially built
Does not assess which engineering team members would own which path
Does not score the paths against a fixed weighting of decision criteria
Does not address the gap between SPEC-34 (this document) and what the platform actually has today

These are the conversations to have after the path is chosen.

11. Glossary

Term	Meaning
Composition	A workflow instance — one customer screening, one fusion run, one investigation
Composition event	A typed, schema-registered, durable record of one cross-service interaction within a composition
Coordination ledger	The append-only store containing composition events, in whichever backing form a chosen path uses
Schema registry	The mechanism (file-based, service-based, or vendor-product) that enforces typed event contracts
State machine	The declarative definition of valid composition workflows
Causal predecessor	The event ID this event logically follows; enables replay and audit reconstruction
Federation outbox	The pattern by which cross-organisational replication of events is staged at the producer and verified at the receiver

Depends on: component.parallax.fusion-adi-integration, component.parallax.wire-message-families

Realizes: product.fusion