Signal Queue Contract

Status & scope

  • Status: DRAFT
  • Layer: cross-service contract (Cortex, Parallax, Prism, Sentinel)
  • Date: 2026-05-06
  • Track: 2 (Operational Capability) — see themis/docs/STRATEGIC-TRACKS.md.
  • Scope: Cross-service contract for signal ingest, lifecycle, routing, and outbound dispatch across Cortex, Parallax, Prism (axonis-lens), Sentinel, and any future signal producers/consumers.

This spec defines a contract, not an implementation. It captures the capability requirements that close the gaps surfaced by the Signal-Flow Ground Truth review (May 2026). An implementation spec for the queue itself sits below this one.

Why this exists

The signal-flow ground truth review surfaced fragmentation: five existing artifacts across four repos (parallax, developers-environment, cortex, axonis-lens) disagree on lifecycle authority, payload enforcement, dedup semantics, and ownership of the streaming-path promotion gate. Specifically:

  • Cortex ships a Signal lifecycle (SIGNAL_TRANSITIONS state machine in cortex/cortex/tools/signal.py:47-54) reachable via MCP tools/call.
  • Parallax does not yet emit signals; wiring is spec-only (TASK-C01).
  • Prism (axonis-lens) has a Signal struct (lens/models/signal.py:8-21) and threshold evaluation but is not wired to DES (TASK-C02-lens).
  • Sentinel is spec-only (developers-environment/specs/SPEC-PLATFORM-07-ALERTING-SERVICE.md); no implementation exists.
  • The REST entry path (fedai-rest /userspace/signal/{uid}) may bypass the SIGNAL_TRANSITIONS state machine that MCP enforces, which would be an audit-grade divergence.
  • No deduplication, suppression, or backpressure exists anywhere in the signal flow.

The Pattern Recognition Roadmap (themis/docs/PATTERN-RECOGNITION-ROADMAP.md) Phase II commitments depend on signal-flow capabilities that today are partially or not implemented. This spec defines those capabilities so the wiring can land coherently.

This is not a new design from scratch. It is a unification spec: it pulls together what's specified across SPEC-14, SPEC-PLATFORM-07, DES_Signal_Lifecycle, TASK-C01, TASK-C02-lens and adds capability requirements that close the named gaps.


References

| Doc | Owns |
| --- | --- |
| parallax/specs/SPEC-14-SIGNAL-PAYLOAD.md (v2, ratified 2026-03-14) | Universal signal payload schema (correlation / policy / observations / display / context sections) |
| developers-environment/specs/SPEC-PLATFORM-07-ALERTING-SERVICE.md | Sentinel service design (AlertEvent / Notification / Subscriber routing) |
| cortex/docs/spec/DES_Signal_Lifecycle_Specification.md | Cortex Signal state machine + status_history audit semantics |
| parallax/tasks/TASK-C01-signal-generation.md | Parallax → DES Signal wiring spec |
| axonis-lens/docs/future/tasks/TASK-C02-lens-signal-generation.md | Prism → DES Signal wiring spec |
| themis/docs/PATTERN-RECOGNITION-ROADMAP.md | Long-term capability roadmap (Track 3); consumer of this spec |
| themis/docs/ANALYST-WORKFLOW-REQUIREMENTS.md | Analyst-edit requirements (EREQ/UREQ); extends signal lifecycle |
| themis/docs/STRATEGIC-TRACKS.md | Routing rule for which tier this spec serves (Track 2) |

Capability requirements

Each CREQ names the capability, what it closes, what evidence anchors it (gap from the May 2026 review), and acceptance criteria. CREQs do not prescribe implementation.

CREQ-1 — Universal payload conformance

Capability: Every signal — regardless of producer (Cortex internal, Parallax fusion, Prism threshold, external webhook, analyst-manual) — must conform to the SPEC-14 v2 payload structure on entry to the queue. Non-conforming signals are rejected at the entry boundary.

Closes: "Convention not contract" gap. SPEC-14 is the documented schema; today nothing enforces conformance — producers populate it or not.

Acceptance criteria:

  • Schema validation runs at every entry point (MCP, REST, internal producer) and rejects non-conforming payloads with structured error
  • Validator references the canonical SPEC-14 schema artifact, not a duplicate
  • Validation error includes which field/section is non-conforming and why
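As a rough illustration of the "structured error" criterion, the sketch below checks only that the five SPEC-14 v2 sections named in References are present. The names `REQUIRED_SECTIONS`, `ValidationIssue`, and `validate_payload` are hypothetical, and a real validator would load the canonical SPEC-14 schema artifact rather than hard-code section names.

```python
from dataclasses import dataclass

# Section names come from the SPEC-14 v2 summary in References. Everything
# else here is an illustrative assumption, not the canonical schema.
REQUIRED_SECTIONS = ("correlation", "policy", "observations", "display", "context")


@dataclass(frozen=True)
class ValidationIssue:
    section: str  # which section is non-conforming
    reason: str   # why it is non-conforming


def validate_payload(payload: dict) -> list[ValidationIssue]:
    """Return every conformance issue; an empty list means the payload is accepted."""
    issues = []
    for section in REQUIRED_SECTIONS:
        if section not in payload:
            issues.append(ValidationIssue(section, "missing section"))
        elif payload[section] is None:
            issues.append(ValidationIssue(section, "section is null"))
    return issues
```

Returning all issues at once, instead of failing on the first, is one way to let a producer fix a payload in a single round trip.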

CREQ-2 — Lifecycle state machine enforcement uniform across entry paths

Capability: Signal lifecycle transitions (new → acknowledged → investigating → resolved | dismissed) are enforced identically whether the request enters via MCP tools/call, REST POST /userspace/signal/{uid}, internal producer push, or any future ingest path.

Closes: Audit-grade bug. Today MCP set_signal_status validates against SIGNAL_TRANSITIONS (signal.py:624-722); REST POST /userspace/signal/{uid} may bypass this validation by writing directly to the userspace object.

Acceptance criteria:

  • A test fixture forces an invalid transition (e.g., resolved → investigating) via REST and confirms rejection with the same error semantics MCP produces
  • State machine validation lives in one server-side place, not duplicated across handlers
  • Invalid transitions are logged as audit events regardless of entry path
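The "one server-side place" idea can be sketched as a single validator that every handler delegates to. The transition table below is an assumption reconstructed from the arrow notation in this CREQ; the actual SIGNAL_TRANSITIONS table in Cortex may permit different edges. The handler names are hypothetical stand-ins for the MCP and REST entry points.

```python
# Hypothetical table derived from "new -> acknowledged -> investigating ->
# resolved | dismissed"; the real SIGNAL_TRANSITIONS may differ.
TRANSITIONS = {
    "new": {"acknowledged"},
    "acknowledged": {"investigating"},
    "investigating": {"resolved", "dismissed"},
    "resolved": set(),
    "dismissed": set(),
}


class InvalidTransition(Exception):
    """Raised with the same message for every entry path."""


def apply_transition(current: str, target: str, entry_path: str, audit_log: list) -> str:
    """The single validator. Invalid attempts are audited before being rejected."""
    if target not in TRANSITIONS.get(current, set()):
        audit_log.append(
            {"event": "invalid_transition", "from": current, "to": target, "entry_path": entry_path}
        )
        raise InvalidTransition(f"{current} -> {target} is not an allowed transition")
    return target


# Entry-path handlers stay thin: they only tag the path and delegate.
def mcp_set_signal_status(current: str, target: str, audit_log: list) -> str:
    return apply_transition(current, target, "mcp", audit_log)


def rest_post_signal(current: str, target: str, audit_log: list) -> str:
    return apply_transition(current, target, "rest", audit_log)
```

Because both handlers call `apply_transition`, the fixture described in the acceptance criteria reduces to asserting that the two paths raise the same error for the same invalid transition.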

CREQ-3 — Producer registration / discovery

Capability: All signal producers (Parallax, Prism, Cortex internal, external webhook, analyst-manual) register with the queue through one declared path. The queue knows: producer identity, expected signal pattern type (CREQ-4), classification level, expected rate envelope.

Closes: Today Parallax/Prism wiring is spec-only (TASK-C01, TASK-C02-lens). Cortex internal signals are implicit. External webhooks have no registration model. There is no inventory of "who can produce signals."

Acceptance criteria:

  • Producer registration is declarative (config or registration-time API)
  • Unregistered producers are rejected at ingest
  • Producer metadata is queryable (audit + capacity-planning)
  • Producer identity is preserved through the queue and stamped on every signal's provenance chain
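A minimal in-memory sketch of a producer registry follows, assuming a registration-time API. The `Producer` fields mirror the four items this CREQ says the queue knows (identity, pattern type, classification, rate envelope); the class names, the field names, and the choice of `PermissionError` are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Producer:
    producer_id: str
    pattern_kind: str            # expected signal pattern type (CREQ-4)
    classification: str          # classification level
    max_signals_per_minute: int  # expected rate envelope (hypothetical unit)


class ProducerRegistry:
    def __init__(self) -> None:
        self._producers: dict[str, Producer] = {}

    def register(self, producer: Producer) -> None:
        if producer.producer_id in self._producers:
            raise ValueError(f"producer already registered: {producer.producer_id}")
        self._producers[producer.producer_id] = producer

    def admit(self, producer_id: str, signal: dict) -> dict:
        """Reject unregistered producers; stamp identity on the provenance chain."""
        if producer_id not in self._producers:
            raise PermissionError(f"unregistered producer: {producer_id}")
        provenance = {**signal.get("provenance", {}), "producer_id": producer_id}
        return {**signal, "provenance": provenance}

    def list_producers(self) -> list[Producer]:
        """Queryable metadata for audit and capacity planning."""
        return sorted(self._producers.values(), key=lambda p: p.producer_id)
```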

CREQ-4 — Per-pattern-type routing

Capability: The queue understands the per-pattern signal-flow profiles (see § Per-pattern-type profiles below) and routes accordingly. A smurfing-shape signal does not get the same lifecycle as a left-of-launch chain-stage signal.

Closes: Today the lifecycle is one-size-fits-all. It was designed for analyst-created or system-computed single events with an explicit lifecycle, and it does not handle bursty (smurfing), competing (chain), continuous-noisy (anomaly), or rare-persistent (composition) shapes.

Acceptance criteria:

  • Signal carries a pattern_kind field (or equivalent) that the queue uses for routing decisions
  • Routing rules are declarative per pattern kind (which dedup config, which suppression rules, which lifecycle ownership)
  • Default routing handles the existing single-event shape so nothing today breaks

CREQ-5 — Deduplication contract

Capability: Identity-key extraction + time-window collapse for signals that represent repeated observation of the same underlying state. The queue collapses N duplicate events into one signal with n_occurrences count, preserving first-seen and last-seen timestamps. Duplicate definition is per-pattern-kind (CREQ-4).

Closes: Today there is no dedup logic anywhere in the signal flow. related_signals[] field exists in the schema but no rules govern it. Smurfing, anomaly, and stage-A-of-chain signals will flood the queue without this.

Acceptance criteria:

  • Per-pattern dedup rules declared in queue config (identity key, time window, collapse policy)
  • Collapsed signals preserve full evidence chain (each contributing event traceable)
  • Dedup applied before lifecycle state machine, so duplicates do not create independent state
  • Producer can override dedup with explicit "this is genuinely distinct" flag (escape hatch)
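The identity-key plus time-window collapse can be sketched as follows. This is one possible reading, not the contract: it assumes events arrive in timestamp order, measures the window from the last-seen event (a sliding window), and uses hypothetical field names (`event_id`, `timestamp`, `force_distinct`).

```python
from dataclasses import dataclass, field


@dataclass
class CollapsedSignal:
    identity_key: tuple
    first_seen: float
    last_seen: float
    n_occurrences: int = 1
    evidence: list = field(default_factory=list)  # ids of every contributing event


class Deduplicator:
    def __init__(self, key_fields: tuple, window_seconds: float) -> None:
        self.key_fields = key_fields          # identity key, per pattern kind
        self.window_seconds = window_seconds  # collapse window
        self._open: dict[tuple, CollapsedSignal] = {}

    def ingest(self, event: dict) -> tuple[CollapsedSignal, bool]:
        """Return (signal, is_new). Duplicates update the existing signal in place."""
        key = tuple(event[f] for f in self.key_fields)
        ts = event["timestamp"]
        existing = self._open.get(key)
        distinct = event.get("force_distinct", False)  # producer escape hatch
        if existing is not None and not distinct and ts - existing.last_seen <= self.window_seconds:
            existing.last_seen = ts
            existing.n_occurrences += 1
            existing.evidence.append(event["event_id"])
            return existing, False
        signal = CollapsedSignal(key, ts, ts, 1, [event["event_id"]])
        self._open[key] = signal
        return signal, True
```

Only signals returned with `is_new == True` would proceed to the lifecycle state machine, which keeps duplicates from creating independent state.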

CREQ-6 — Suppression contract

Capability: Related-signal correlation: "if a signal is open / under investigation for entity X with cause Y, suppress new lower-severity signals for the same (entity, cause) until the open signal is resolved." Rules are declarative.

Closes: Today there is no suppression. The schema hints at it through signal.metadata.resolved_by_insight and resolved_by_edition, but no rules use those fields.

Acceptance criteria:

  • Suppression rules declared per pattern kind (suppression key, suppression duration, escalation threshold)
  • Suppressed signals are recorded (not silently dropped) with reference to the suppressing signal
  • Operator can override suppression for a specific case (escape hatch)
  • Suppression can re-open a previously resolved signal if a higher-severity related signal arrives
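The quoted rule in this CREQ can be expressed as a small decision function that always returns a record, so a suppressed signal is never silently dropped. The severity ordering, the set of states treated as "open", and the field names are assumptions made for illustration; duration, escalation, and operator override are omitted.

```python
# Hypothetical severity ladder and open-state set; neither is defined by this spec.
SEVERITY = {"low": 0, "medium": 1, "high": 2}
OPEN_STATES = {"new", "acknowledged", "investigating"}


def decide(new_signal: dict, open_signals: list) -> dict:
    """Return a decision record that names the suppressing signal, if any."""
    key = (new_signal["entity"], new_signal["cause"])
    for candidate in open_signals:
        same_key = (candidate["entity"], candidate["cause"]) == key
        still_open = candidate["status"] in OPEN_STATES
        lower = SEVERITY[new_signal["severity"]] < SEVERITY[candidate["severity"]]
        if same_key and still_open and lower:
            return {
                "action": "suppressed",
                "signal_id": new_signal["id"],
                "suppressed_by": candidate["id"],
            }
    return {"action": "admitted", "signal_id": new_signal["id"], "suppressed_by": None}
```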

CREQ-7 — Hypothesis-state container

Capability: Storage and lifecycle for competing hypothesis branches — the data structure that supports MHT for left-of-launch causal chains and any future multi-hypothesis pattern types. State per branch: which signature, which stages observed, which lead-time prediction, which prune evidence.

Closes: Multi-hypothesis tracking has no current home. The May 2026 ground truth review found Sentinel was supposed to provide this; Sentinel is spec-only.

Acceptance criteria:

  • Hypothesis branches first-class (not a hidden detail of one consumer)
  • Pruning is an audited event (which branch pruned, by what evidence, when)
  • Multiple competing branches for one underlying observation set are queryable
  • Hypothesis state survives restart (durable, not in-memory only)

Scope note: This CREQ defines the container. The MHT algorithm is Phase III work per the Pattern Recognition Roadmap §2.5 (deprioritized based on toy-era data showing small effect). The container exists so MHT can land later without re-architecting.

CREQ-8 — Backpressure feedback to producers

Capability: The queue signals to producers when downstream consumers are saturated. Producers respond by throttling, batching, or dropping with priority awareness.

Closes: Today producers have no feedback loop. When Parallax/Prism eventually emit signals, an over-eager producer can swamp the lifecycle layer. Anomaly-pattern (Pattern 4) is the canonical concern: noisy, high-rate.

Acceptance criteria:

  • Backpressure protocol documented (advisory headers, hard rate limits, or both)
  • Producer registration declares maximum acceptable lag tolerance
  • Backpressure events are themselves observable (queue depth, oldest unprocessed age)

CREQ-9 — Provenance preservation through queue

Capability: Every signal's provenance — correlation_id (Parallax), policy_id + lens_id (Prism), observation_id[] (source data refs), producer_id (CREQ-3) — is preserved end-to-end through the queue. No queue operation drops provenance fields.

Closes: Today provenance is convention per SPEC-14 schema. Producers can populate it or not; queue layer does not enforce that what enters with provenance leaves with provenance.

Acceptance criteria:

  • Provenance fields enumerated explicitly in queue contract
  • Queue operations (dedup collapse, suppression, MHT branch merge) defined to preserve or aggregate provenance, never drop
  • Audit query: "give me the full provenance chain for signal X" returns deterministic walk back to source observations
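"Preserve or aggregate, never drop" could look like the merge below, which a collapse operation might call when folding several signals into one. The field tuple follows the list in this CREQ; representing every merged field as an ordered, de-duplicated list is an assumption, as is the `provenance` key on each signal.

```python
# Field names follow the CREQ-9 capability statement; the merged shape is hypothetical.
PROVENANCE_FIELDS = ("correlation_id", "policy_id", "lens_id", "observation_id", "producer_id")


def merge_provenance(signals: list) -> dict:
    """Aggregate provenance across signals, keeping first-seen order and dropping nothing."""
    merged = {}
    for name in PROVENANCE_FIELDS:
        values = []
        for signal in signals:
            raw = signal["provenance"].get(name)
            items = raw if isinstance(raw, list) else [raw]
            for item in items:
                if item is not None and item not in values:
                    values.append(item)
        merged[name] = values
    return merged
```

Iterating in input order keeps the result deterministic for a given input sequence, which matters for the replay guarantee in CREQ-14.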

CREQ-10 — Dead-letter and retry semantics on outbound

Capability: Outbound signal dispatch (effects → external systems, Beacon Investigation hand-off, downstream queue forwarding) has explicit retry policy per destination. Failed dispatches enter a dead-letter store with structured error, retry count, and resolution path.

Closes: Today cortex/cortex/effects/dispatcher.py:13-61 is fire-and-forget. Failures are caught and logged but not retried. There is no DLQ.

Acceptance criteria:

  • Per-destination retry policy declared (max retries, backoff curve, give-up threshold)
  • Dead-letter store queryable by operator
  • Operator can manually retry, ignore, or escalate dead-lettered dispatches
  • Retries themselves are audited events
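A sketch of per-destination retry with a dead-letter store, assuming exponential backoff as the backoff curve. `RetryPolicy`, `dispatch_with_retry`, and the record shapes are hypothetical; the existing dispatcher in cortex/cortex/effects/dispatcher.py is not modeled here, and the lists stand in for a durable store and audit ledger.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_retries: int      # give-up threshold
    base_delay_s: float
    factor: float = 2.0   # exponential backoff curve (an assumption)

    def delay(self, attempt: int) -> float:
        return self.base_delay_s * self.factor ** attempt


def dispatch_with_retry(send, payload, policy, dead_letters, audit, sleep=time.sleep) -> bool:
    """Try send(payload) up to max_retries + 1 times; dead-letter on final failure."""
    last_error = None
    for attempt in range(policy.max_retries + 1):
        try:
            send(payload)
            audit.append({"event": "dispatch_ok", "attempt": attempt})
            return True
        except Exception as exc:  # a real dispatcher would narrow this
            last_error = repr(exc)
            audit.append({"event": "dispatch_failed", "attempt": attempt, "error": last_error})
            if attempt < policy.max_retries:
                sleep(policy.delay(attempt))
    dead_letters.append(
        {"payload": payload, "error": last_error, "retry_count": policy.max_retries}
    )
    return False
```

Injecting `sleep` keeps the function testable without real delays; every attempt, successful or not, lands in the audit list.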

CREQ-11 — Rate limiting / quota per producer per pattern kind

Capability: Each (producer, pattern kind) pair has a configurable rate envelope. Exceeding the rate triggers backpressure (CREQ-8) or rejection depending on policy.

Closes: Today no rate limit. A misconfigured Parallax run (or a malicious external webhook) could flood the lifecycle layer.

Acceptance criteria:

  • Rate envelope declared at producer registration
  • Rate violation is an observable event (metric + log)
  • Rate-limited signals have a defined disposition (rejected with retry-after header, queued with degraded SLA, dropped with audit log)
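One common way to express a rate envelope is a token bucket per (producer, pattern kind) pair. The sketch below is an assumption about mechanism, not something this spec prescribes. Time is passed in explicitly rather than read from a clock, which keeps the limiter compatible with the replay guarantees in CREQ-14. All names are hypothetical.

```python
class TokenBucket:
    def __init__(self, rate_per_s: float, burst: int) -> None:
        self.rate = rate_per_s
        self.burst = burst
        self.tokens = float(burst)
        self.updated = None  # timestamp of the previous call

    def allow(self, now: float) -> bool:
        if self.updated is not None:
            refill = (now - self.updated) * self.rate
            self.tokens = min(float(self.burst), self.tokens + refill)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    def __init__(self, envelopes: dict) -> None:
        # envelopes: {(producer_id, pattern_kind): (rate_per_s, burst)}
        self._buckets = {key: TokenBucket(*env) for key, env in envelopes.items()}
        self.violations = []  # observable record of every rate violation

    def admit(self, producer_id: str, pattern_kind: str, now: float) -> bool:
        bucket = self._buckets[(producer_id, pattern_kind)]  # KeyError if undeclared
        if bucket.allow(now):
            return True
        self.violations.append(
            {"producer_id": producer_id, "pattern_kind": pattern_kind, "at": now}
        )
        return False
```

What happens after `admit` returns `False` (reject, queue with degraded SLA, or drop with audit log) is the disposition policy the acceptance criteria leave to configuration.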

CREQ-12 — Audit trail integration with Cortex insight events

Capability: Significant queue events (signal create, state transition, dedup collapse, suppression, dispatch attempt, retry, DLQ entry, hypothesis branch prune) are recorded in the existing Cortex insight event ledger format so they appear in the same audit chain analysts already trust.

Closes: Today queue-side events would be invisible to existing audit tooling. Cortex's _append_event() model (tools/insight.py:140-182) is the established pattern; queue events should conform.

Acceptance criteria:

  • Queue events use the Cortex event-ledger schema (or an equivalent that interoperates)
  • Audit query: "show me everything that happened to this signal" returns unified events from both Cortex and queue
  • Append-only by construction (Invariant 2)

CREQ-13 — ABAC enforcement at queue boundary

Capability: Suppressed fields (per themis/docs/ANALYST-WORKFLOW-REQUIREMENTS.md EREQ-6 + the lens YAML policy_envelope.field_suppression) never cross the queue boundary. Cluster classification derives as the highest of any contributing observation's classification. Caller without the right clearance receives the visible subset (and is informed members are hidden).

Closes: Tier A #5 ABAC boundary test (May 2026 in themis) verified suppressed fields don't appear in engine records. Same discipline must extend to queue layer when signals from clusters carry classification.

Acceptance criteria:

  • ABAC filter at queue ingest and at queue read
  • Per-signal classification stamped at producer-side; queue does not allow downgrade
  • Audit log when a caller is given a filtered subset (so analysts know data was withheld)
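The "highest of any contributing observation" rule and the filtered-subset behavior can be sketched as below. The classification ladder is a placeholder; the platform's actual levels and the ABAC policy model are not defined in this spec, and field-level suppression is not shown.

```python
# Hypothetical ladder, lowest to highest; substitute the platform's real levels.
LEVELS = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP_SECRET"]


def cluster_classification(observations: list) -> str:
    """A cluster is classified at the highest level of any contributing observation."""
    return max((o["classification"] for o in observations), key=LEVELS.index)


def visible_subset(observations: list, clearance: str) -> dict:
    """Return what the caller may see, plus a count so they know members are hidden."""
    limit = LEVELS.index(clearance)
    visible = [o for o in observations if LEVELS.index(o["classification"]) <= limit]
    return {"observations": visible, "hidden_count": len(observations) - len(visible)}
```

A non-zero `hidden_count` is the point at which the acceptance criteria would also require an audit-log entry.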

CREQ-14 — Determinism + replay guarantees

Capability: Given an identical input event sequence + identical configuration, the queue produces identical signal lifecycle outcomes (state transitions, dedup outcomes, suppression decisions, dispatch ordering). Determinism is testable via canonical hashing of the queue's output sequence.

Closes: Tier A #1 determinism replay (May 2026 in themis) verified parallax engine determinism via canonical cluster-partition hashing. The queue layer should compose with that — same fixture + same lens + same queue config → byte-identical end-to-end.

Acceptance criteria:

  • Replay test: same input event sequence produces identical SHA-canonical output sequence
  • Non-determinism sources documented (e.g., external clock if used) and either eliminated or made explicit
  • Replay determinism is a CI gate, not an aspirational claim
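Canonical hashing of an output sequence can be done with the standard library alone. The sketch assumes outputs are JSON-serializable dicts and picks SHA-256; the spec says only "SHA-canonical", so the exact digest and serialization rules are assumptions.

```python
import hashlib
import json


def canonical_hash(outputs: list) -> str:
    """Hash a queue output sequence: key order is normalized, sequence order is not."""
    blob = json.dumps(outputs, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

A replay test would run the same fixture twice and assert the two digests are equal. Because list order is preserved, a change in dispatch ordering changes the hash, while incidental dict key order does not.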


Per-pattern-type signal-flow profiles

The four pattern types named in PATTERN-RECOGNITION-ROADMAP.md, together with today's single-event shape, produce five distinct signal-flow profiles. The queue must handle each. CREQ-4 routes by pattern_kind; the profiles below define what each route configuration looks like.

Profile A — single_event (today's existing shape)

  • Volume: low to moderate, event-driven
  • Lifecycle: full SIGNAL_TRANSITIONS; persistent until explicitly resolved
  • Dedup: none required (each event genuinely distinct)
  • Suppression: none
  • MHT: not applicable
  • Producers: Cortex internal, analyst-manual, external webhook, MCP tool
  • Notes: This is the shape Cortex ships today. The contract preserves it as the default route.

Profile B — temporal_sequence (smurfing — Pattern 1)

  • Volume: bursty, high rate per entity within the detection window
  • Lifecycle: signal fires only on full sequence match; no persistent state for partial sequences
  • Dedup: strong — collapse multiple matches of the same sequence on the same originator within the rolling window
  • Suppression: suppress lower-severity related sequences while a high-severity one is open for the entity
  • MHT: not applicable
  • Producers: AML detection lens (sequence matcher) consuming confirmed transaction clusters
  • Notes: Without dedup, smurfing detectors flood the queue. Profile must be tunable per jurisdiction (BSA $10K vs EU AMLD).

Profile C — composition_template (SAM battery — Pattern 2)

  • Volume: rare (a real composition match is a rare event)
  • Lifecycle: persistent — composition match is durable evidence, lives until resolved or refuted
  • Dedup: moderate — collapse continued observations of the same composition cluster as the same signal
  • Suppression: none typically (rare matches don't compete)
  • MHT: not applicable
  • Producers: subgraph-isomorphism matcher consuming entity clusters with role-typed classes
  • Notes: Full provenance + classification handling load-bearing. Customer threat templates drive the matcher; signals carry references to the template_id they matched.

Profile D — causal_chain (left-of-launch — Pattern 3)

  • Volume: medium — partial chain matches fire before full chain confirms
  • Lifecycle: stage-by-stage; signal advances through chain states; full state machine
  • Dedup: moderate — collapse repeated stage-A observations into one stage-A signal
  • Suppression: per-chain — competing chain hypotheses for the same observation set are alive simultaneously; pruning is the suppression event
  • MHT: load-bearing — CREQ-7 hypothesis-state container must handle this
  • Producers: chain-state-machine matcher consuming staged indicator events
  • Notes: Lead-time prediction with uncertainty bands flows through the signal payload. The most architecturally novel of the four; Phase II/III commitment.

Profile E — baseline_deviation (anomaly — Pattern 4)

  • Volume: continuous high noise — many threshold-near events
  • Lifecycle: advisory by default; not all anomalies require investigation
  • Dedup: strong — collapse continuous deviation into trend signals; surface only when significance threshold met
  • Suppression: moderate — once a deviation is acknowledged, suppress further-of-the-same until baseline returns to normal
  • MHT: not applicable
  • Producers: statistical-deviation engine consuming clusters compared against PoL baseline (federated learning, FATE/Titan integration)
  • Notes: The most rate-sensitive profile. Without dedup + suppression, this floods the system.
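The five profiles above can be summarized as a declarative routing table keyed by pattern_kind. The dedup, suppression, and MHT values are transcribed from the profile bullets; the table layout, the key names, and the choice to reject unknown kinds (rather than default them) are illustrative assumptions.

```python
# Values transcribed from Profiles A-E; the structure itself is hypothetical.
ROUTES = {
    "single_event":         {"dedup": "none",     "suppression": "none",      "mht": False},
    "temporal_sequence":    {"dedup": "strong",   "suppression": "severity",  "mht": False},
    "composition_template": {"dedup": "moderate", "suppression": "none",      "mht": False},
    "causal_chain":         {"dedup": "moderate", "suppression": "per-chain", "mht": True},
    "baseline_deviation":   {"dedup": "strong",   "suppression": "moderate",  "mht": False},
}

DEFAULT_KIND = "single_event"  # Profile A preserves today's behavior


def route_for(signal: dict) -> dict:
    """Signals without pattern_kind take the default route; unknown kinds are rejected."""
    kind = signal.get("pattern_kind", DEFAULT_KIND)
    if kind not in ROUTES:
        raise ValueError(f"unknown pattern_kind: {kind}")
    return ROUTES[kind]
```

Defaulting a missing field to `single_event` is what keeps existing Cortex signals working unchanged once routing is introduced.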

Dependency tree

SPEC-14 v2 payload schema (ratified)
  ↓
CREQ-1 universal payload conformance       ← can ship without SPEC-11
CREQ-2 state machine uniform across paths  ← can ship without SPEC-11
CREQ-3 producer registration               ← can ship without SPEC-11
CREQ-4 per-pattern routing                 ← can ship without SPEC-11
CREQ-12 audit trail integration            ← extends existing Cortex events; no new dep
CREQ-13 ABAC enforcement                   ← extends existing platform ABAC
CREQ-14 determinism + replay               ← extends Tier A #1; CI work

SPEC-11 persistent correlation registry (THE GATE — outside this spec)
  ↓
CREQ-5 deduplication                       ← needs persistence to survive restarts
CREQ-6 suppression                         ← needs persistence
CREQ-7 hypothesis state container          ← needs persistence
CREQ-8 backpressure                        ← can be in-memory; durability optional
CREQ-9 provenance preservation             ← extends existing schema; durability via SPEC-11
CREQ-10 DLQ + retry                        ← needs persistence
CREQ-11 rate limit                         ← can be in-memory; persistence optional

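Seven CREQs are pre-storage-decision; Phase II.A counts eight because it also ships CREQ-9 in declared-only form. Of the seven CREQs listed under SPEC-11 (persistent correlation registry, itself currently DRAFT), five (CREQ-5, 6, 7, 9, 10) need persistence, while CREQ-8 and CREQ-11 can run in memory. The phasing recommendation below splits accordingly.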


Phased shipping recommendation

Phase II.A — pre-SPEC-11 wins (ships immediately)

| CREQ | Effort | Why first |
| --- | --- | --- |
| CREQ-1 universal payload conformance | small | Closes "convention not contract" without engine work |
| CREQ-2 state machine uniform across paths | small-medium | Fixes the audit-grade REST/MCP divergence bug |
| CREQ-3 producer registration | medium | Unblocks Parallax/Prism wiring (TASK-C01, C02) |
| CREQ-4 per-pattern routing | small | Schema-level addition; routing rules can default to Profile A |
| CREQ-9 provenance preservation (declared, not enforced) | small | Documents the chain even before it's audited |
| CREQ-12 audit trail integration | medium | Reuses existing Cortex event ledger |
| CREQ-13 ABAC enforcement at queue boundary | small | Extends existing pattern from Tier A #5 |
| CREQ-14 determinism + replay | medium | CI gate; extends Tier A #1 to streaming layer |

Phase II.A end state: Universal payload contract + uniform state machine + producer registration + per-pattern routing + audit trail + ABAC + determinism. Sentinel-the-component still doesn't exist, but its load-bearing behaviors (lifecycle, ABAC, audit) are coherent across Cortex / Parallax / Prism / external entry paths.

Phase II.B — post-SPEC-11

| CREQ | Why second |
| --- | --- |
| CREQ-5 deduplication | Bursty signal handling; gates Pattern 1 + Pattern 4 |
| CREQ-6 suppression | Related-alert mute; gates Pattern 4 viability |
| CREQ-9 provenance preservation (enforced) | Move from declared to enforced once persistence layer can hold the chain |
| CREQ-10 dead-letter + retry | Operational reliability for effects dispatch |
| CREQ-11 rate limit | Anomaly-pattern flood control |

Phase II.B end state: Pattern Recognition Layers 2-3 (matcher + baseline) become operationally viable. The SDA proposal's Phase II commitments around a governed multi-source fusion environment become technically defensible.

Phase III — full pattern recognition

| CREQ | Why third |
| --- | --- |
| CREQ-7 hypothesis state container | Required for Pattern 3 (causal chain) MHT |
| CREQ-8 backpressure full implementation | Required for federated learning baseline (Pattern 4) at scale |

Phase III end state: All four pattern types operational. The Pattern Recognition Roadmap's full architecture (Layers 0-5) lands.


Critical principles (do not violate)

  1. One state machine, one place. Lifecycle transitions are validated in exactly one server-side location. MCP, REST, internal producers, and any future entry path call the same validator. The current divergence (REST may bypass MCP's SIGNAL_TRANSITIONS check) is a bug that blocks audit-grade claims.

  2. Provenance is preserved or the operation fails. Queue operations that drop provenance fields (correlation_id, policy_id, lens_id, observation_id[], producer_id, as enumerated in CREQ-9) are forbidden. If a future operation cannot preserve provenance, it is not added to the queue contract.

  3. Dedup and suppression are auditable. Collapsing N events into one signal is an event itself, recorded with which events were collapsed, by which rule, when. Silent collapse is forbidden.

  4. Append-only events. Per Invariant 2 across the platform: queue events are append-only. Revocation is a new event, not deletion or in-place mutation.

  5. Determinism + replay compose with attestation log. Same input sequence + same config + same attestation event log → byte-identical output. This composes with Tier A #1 (parallax determinism) and the future analyst-edit determinism (per ANALYST-WORKFLOW-REQUIREMENTS.md). The composition is testable.

  6. Humans attest, queue routes. The queue does not make decisions that bypass the 7-Invariant rule "humans attest, AI assists." Auto-resolution of signals, auto-pruning of MHT branches without audit chain, auto-overwrite of suppression rules — all forbidden.

  7. Backpressure is observable. The queue does not silently degrade. Rate violations, dead-letter entries, dedup collapses, suppression decisions, hypothesis pruning are all metrics + audit events. Operators see what the queue is doing.


What this spec does NOT do

  • Does not design the queue implementation. Choice of message broker (Kafka, RabbitMQ, in-process, etc.), storage layer, retry mechanism, deployment topology — all out of scope. Implementation spec sits below this one.
  • Does not commit to MHT implementation. CREQ-7 names the container; the algorithm is Phase III per the Pattern Recognition Roadmap §2.5 (deprioritized based on toy-era evidence).
  • Does not subsume Sentinel. Sentinel's design spec (SPEC-PLATFORM-07) is a consumer of this contract, not replaced by it. Sentinel may BE the implementation of the queue, or sit on top of it — design-time decision.
  • Does not break existing Cortex shipping behavior. Profile A (single_event) preserves today's behavior. Existing create_signal / set_signal_status / etc. continue to work; the spec captures their contract and extends it.
  • Does not specify federated learning runtime. CREQ-8 + Pattern 4 (Profile E) reference PoL baseline learning; the FATE/Titan integration is its own track.
  • Does not address cross-domain signaling (CDS, cross-classification). Mentioned in Pattern Recognition Roadmap as future work; out of scope here.

Open questions to resolve before implementation

  1. Where does the persistent state live physically? Co-located with parallax fusion engine? Separate microservice? This is the SPEC-11 storage decision in concrete form. CREQs 5/6/7/9/10 all gate on this.

  2. Is the queue the implementation of Sentinel, or does Sentinel sit on top? Architectural decision affects who owns CREQs 6/7. Sentinel's existing spec (SPEC-PLATFORM-07) needs to be reconciled with this contract.

  3. What is the rate-envelope policy framework? CREQ-11 declares the capability; the rules (defaults per pattern kind, per-customer overrides, escalation paths) need design.

  4. What is the conflict-resolution policy for suppression overrides? When an analyst manually creates a signal that the suppression rule would have suppressed, what happens? Mirror of EREQ-2 conflicts in the analyst-edit context. Customer-policy specific.

  5. How does this interact with cross-domain solution (CDS) / classified signal handling? SPEC-PLATFORM-07 hints at multi-classification; not addressed here. Worth a follow-on spec.

  6. What is the bulk-operations API shape? Real ops tempo includes bulk reject ("reject all signals from sensor X between T1 and T2 — known calibration fault"). Bulk dedup. Bulk replay. Ergonomics of these matter operationally.

  7. What is the interaction with the Beacon Investigation surface? Existing Beacon flow consumes signals via Cortex. With the queue inserted, does Beacon read from the queue, from Cortex, from both? Architectural decision.

These are real design questions, not afterthoughts. Worth a dedicated design session before committing engineering on Phase II.B.


Closing note

This spec inherits its credibility from the May 2026 Signal-Flow Ground Truth review. Every CREQ traces to a named gap with file/line citations in that review. No CREQ is added on intuition. When in doubt about whether a capability belongs here, ask: did the ground truth review name a specific gap that this would close? If yes, add it. If no, leave it for the implementation spec.

Maintainer: cross-component (Cortex, Parallax, Prism, Sentinel teams). Single review point quarterly or on substantive signal-flow change in any of those services.


Depends on: component.parallax.correlation-persistence, component.parallax.signal-payload

Realizes: product.fusion