Fusion Output, Persistent Correlation & Blocking Acceleration

Status & scope

  • Stage: DRAFT — Architecture Decision
  • Milestone: M3 (Federation Integration)

Problem

component.parallax.fusion-binding defines how data flows INTO the fusion engine. This spec defines what comes OUT and where it lives.

Currently, fusion produces match results and entity clusters as ephemeral outputs — they exist in memory during a fusion run and disappear afterward. This creates three problems:

  1. Redundant work. Every fusion run re-correlates everything from scratch, even pairs that were confirmed last week. At 700 nodes × 100K records, this wastes hours of compute.

  2. No institutional memory. The system doesn't remember that "Record A at Barclays IS Record B at HSBC." Each run discovers the same truth independently. Human attestations (confirmed matches, rejected false positives) are lost between runs.

  3. No blocking acceleration. New records must be compared against the entire candidate space. If we already know that records A, B, and C are the same entity, a new record D that matches A's blocking key should immediately be compared against {A, B, C} — not against the full population.

Decision

Fusion outputs are written back to dataspace as Correlation Records — lightweight reference objects that link records across federates without copying field values. These persist between runs and serve as blocking accelerators for subsequent fusion.

A Correlation Record is NOT a merged entity. It does not copy or merge field values from contributing records. It is a reference graph: "these records refer to the same real-world entity, with this confidence, based on this evidence." Raw data stays at each federate. The correlation record lives in UDS as metadata.

This is architecturally cleaner than creating golden records because:

  • Sovereignty preserved. No field values are copied. Each federate owns its data. The correlation is metadata about relationships between records, not a duplicate of the records themselves.
  • Append-only. New evidence strengthens or weakens correlations. Old correlations are never deleted; they decay or are superseded by newer runs.
  • Auditable. Every correlation traces back to a specific fusion run, lens version, binding version, and scored pair evidence.

Architecture

Entity Lifecycle

                    ┌─────────────────────────────────────────┐
                    │            RAW OBSERVATION               │
                    │  A single record at a single federate    │
                    │  correlation_status: UNRESOLVED          │
                    │  (default — has not been through fusion) │
                    └────────────────┬────────────────────────┘
                                     │
                              FUSION RUN
                                     │
                    ┌────────────────┴────────────────────────┐
                    │                                          │
            ┌───────▼────────┐                    ┌───────────▼──────────┐
            │   INDEPENDENT   │                    │    CORRELATED        │
            │ Fusion confirmed │                    │ Fusion confirmed     │
            │ this record is   │                    │ this record matches  │
            │ unique — no      │                    │ one or more records  │
            │ matches found    │                    │ at other federates   │
            │                  │                    │                      │
            │ resolved_to: ∅   │                    │ resolved_to: CR-xxx  │
            └──────────────────┘                    └──────────────────────┘

Three states, not two. The important distinction is between UNRESOLVED (never been through fusion) and INDEPENDENT (been through fusion, confirmed unique). This matters because:

  • UNRESOLVED records need full blocking and scoring
  • INDEPENDENT records can skip full scoring on re-run unless their data changed
  • CORRELATED records have a Correlation Record that accelerates future blocking

Correlation Record

┌─────────────────────────────────────────────────────────┐
│  CORRELATION RECORD  (CR-2026-03-vrs-00427)             │
│                                                          │
│  entity_type: "customer"     (from lens.target_model)    │
│  lens_id: "vrs_vulnerability_v1"                         │
│  confidence: 0.87            (aggregate across pairs)    │
│  status: CONFIRMED           (human-attested)            │
│                                                          │
│  contributing_records:                                    │
│  ┌──────────────────────────────────────────────────┐   │
│  │ federate: barclays.axonis.ai                     │   │
│  │ record_ref: "cust_uk_00234"                      │   │
│  │ blocking_keys: ["smith_john_1985", "SW1A_1AA"]   │   │
│  │ last_scored: "2026-03-01T09:00:00Z"              │   │
│  │ pair_score: 0.91 (vs HSBC record)                │   │
│  └──────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────┐   │
│  │ federate: hsbc.axonis.ai                         │   │
│  │ record_ref: "client_ret_08821"                   │   │
│  │ blocking_keys: ["smith_john_1985", "SW1A_1AA"]   │   │
│  │ last_scored: "2026-03-01T09:00:00Z"              │   │
│  │ pair_score: 0.91 (vs Barclays record)            │   │
│  └──────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────┐   │
│  │ federate: lloyds.axonis.ai                       │   │
│  │ record_ref: "acct_holder_55102"                  │   │
│  │ blocking_keys: ["smith_john_1985", "SW1A_2BX"]   │   │
│  │ last_scored: "2026-03-07T14:00:00Z"              │   │
│  │ pair_score: 0.84 (vs Barclays record)            │   │
│  └──────────────────────────────────────────────────┘   │
│                                                          │
│  source_lineage:                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │ fusion_run: "run_vrs_2026_03_01_0900"            │   │
│  │ lens_version: "1.0.0"                            │   │
│  │ binding_versions: {barclays: "v3", hsbc: "v2"}   │   │
│  │ scoring_evidence_block: "blk_9f8e7d..."          │   │
│  │ accuracy: +2 (believed correct)                  │   │
│  │ credibility: +2 (usually credible)               │   │
│  └──────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────┐   │
│  │ fusion_run: "run_vrs_2026_03_07_1400"            │   │
│  │ lens_version: "1.0.0"                            │   │
│  │ added: lloyds record (new contributing record)   │   │
│  │ scoring_evidence_block: "blk_3a4b5c..."          │   │
│  │ accuracy: +2                                     │   │
│  │ credibility: +2                                  │   │
│  └──────────────────────────────────────────────────┘   │
│                                                          │
│  attestation:                                            │
│  ┌──────────────────────────────────────────────────┐   │
│  │ attested_by: "analyst@vrs.gov.uk"                │   │
│  │ attested_at: "2026-03-02T10:30:00Z"              │   │
│  │ decision: "CONFIRMED"                            │   │
│  │ edition_id: "ed_vrs_00427_confirmed"             │   │
│  │ notes: "Verified via passport token + DOB match" │   │
│  └──────────────────────────────────────────────────┘   │
│                                                          │
│  created_at: "2026-03-01T09:00:00Z"                     │
│  updated_at: "2026-03-07T14:00:00Z"                     │
│  observation_count: 3                                    │
│  time_first_observation: "2026-03-01T09:00:00Z"         │
│  time_most_recent_observation: "2026-03-07T14:00:00Z"   │
└─────────────────────────────────────────────────────────┘

What a Correlation Record is NOT

  • NOT a golden record. It does not contain full_name: "John Smith" or dob: "1985-03-15". Field values stay at the federate.
  • NOT an InsightRoot. InsightRoots are created when something needs human attention. Most correlations are routine and don't warrant investigation.
  • NOT a copy of source data. It holds record references, blocking keys, and scores. If you need the actual field values, you query the federate via UDS (respecting ABAC).
  • NOT permanent. Correlations have confidence that decays over time. Unrefreshed correlations weaken. Invalidated correlations (schema change, lens change) are superseded.

Correlation Status on Records

Every record in UDS gains a lightweight correlation status. This is metadata, not a schema change — it's stored alongside the record's existing UDS metadata.

Record metadata (existing):
  uds.model: "customer"
  uds.source: "retail_kyc"
  uds.federate: "barclays.axonis.ai"

Correlation metadata (NEW):
  uds.correlation_status: "CORRELATED"    # UNRESOLVED | INDEPENDENT | CORRELATED
  uds.correlation_refs: ["CR-2026-03-vrs-00427"]  # Correlation Record IDs
  uds.last_resolved: "2026-03-07T14:00:00Z"       # When last fusion run touched this
  uds.resolution_lens: "vrs_vulnerability_v1"      # Which lens resolved this

Multiple lenses can correlate the same record for different purposes. A customer record might be CORRELATED under the VRS lens and INDEPENDENT under the AML lens. The correlation_refs array holds one CR per lens that has correlated the record.

Blocking Acceleration

The Three-Tier Blocking Strategy

When a fusion run starts, the blocker checks correlation state before generating candidates:

TIER 1: CORRELATION CACHE (cheapest)
    For each record with correlation_status == CORRELATED:
        Load its Correlation Record
        The CR's blocking_keys are pre-computed
        If a new record's blocking key matches a CR's blocking keys:
            → Candidate pair = new record vs. all contributing records in the CR
            → Skip full blocking for this pair
        Cost: O(1) lookup per record

TIER 2: INDEPENDENT SKIP (cheap)
    For records with correlation_status == INDEPENDENT:
        If the record has NOT changed since last_resolved:
            → Skip scoring for this record (it was already confirmed unique)
        If the record HAS changed:
            → Demote to UNRESOLVED, include in full blocking
        Cost: O(1) check per record

TIER 3: FULL BLOCKING (normal)
    For records with correlation_status == UNRESOLVED:
        → Standard blocking: compute blocking keys, generate candidates
        → This is the existing blocker behavior (MA-01, MA-02)
        Cost: O(N) per blocking key computation
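
The tier selection above can be sketched as a small routing function. This is a minimal illustration, not the blocker's actual code: the names CorrelationStatus and select_tier are hypothetical, and the "has the record changed" check is assumed to be computed by the caller (for example by comparing the record's update time with uds.last_resolved). The source does not say what happens to a CORRELATED record whose data changed, so the sketch simply keeps it in Tier 1.

```python
from enum import Enum


class CorrelationStatus(str, Enum):
    """The three record states from the Entity Lifecycle section."""
    UNRESOLVED = "UNRESOLVED"
    INDEPENDENT = "INDEPENDENT"
    CORRELATED = "CORRELATED"


def select_tier(status: str, changed_since_last_resolved: bool) -> int:
    """Return the blocking tier (1, 2, or 3) for one record.

    Hypothetical helper: the caller decides whether the record changed.
    """
    parsed = CorrelationStatus(status)  # raises ValueError on an unknown status
    if parsed is CorrelationStatus.CORRELATED:
        return 1  # Tier 1: correlation cache
    if parsed is CorrelationStatus.INDEPENDENT and not changed_since_last_resolved:
        return 2  # Tier 2: independent skip
    # UNRESOLVED, or INDEPENDENT with changed data (demoted to UNRESOLVED)
    return 3  # Tier 3: full blocking


print(select_tier("INDEPENDENT", changed_since_last_resolved=True))  # 3
```

Returning a tier number keeps the routing decision separate from the work each tier performs, which makes the demotion rule for changed INDEPENDENT records easy to test on its own.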

Impact on Fusion Run Time

For a stable federation (few new records, few changes):

Tier              | Records                   | Work per Record | Total
Correlation cache | ~60% (already correlated) | O(1) lookup     | Minimal
Independent skip  | ~30% (confirmed unique)   | O(1) check      | Minimal
Full blocking     | ~10% (new/changed)        | O(N) blocking   | Standard

Result: with this mix, about 90% of records skip full blocking on re-runs. The fusion run does heavy lifting only on new arrivals and changed records.

For an initial run (no prior correlations):

Tier              | Records | Work per Record | Total
Correlation cache | 0%      | -               | -
Independent skip  | 0%      | -               | -
Full blocking     | 100%    | O(N) blocking   | Full cost

First run is full cost. Every subsequent run gets faster as the correlation cache builds up.

Incremental Fusion

When new records arrive, the fusion engine runs incrementally:

1. NEW RECORDS arrive at a federate (ingest, scheduled refresh, signal)

2. CLASSIFY each new record:
   → Compute blocking keys using lens blocking spec
   → Check blocking keys against Correlation Record index
   → If match found: CANDIDATE against existing CR
   → If no match: CANDIDATE against all UNRESOLVED records (full blocking)

3. SCORE candidates:
   → For CR matches: score new record against one representative per CR
     (pick the contributing record with highest individual score)
   → For full blocking matches: standard pairwise scoring

4. UPDATE correlations:
   → New record joins existing CR: add to contributing_records, update scores
   → New record forms new CR: create CR with two contributing records
   → New record is unique: mark as INDEPENDENT

5. WRITE BACK to dataspace:
   → Updated/new Correlation Records
   → Updated correlation_status on affected records
   → Evidence blocks for all scoring decisions
   → Audit trail (fusion run record)
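
Steps 2 and 3 above can be sketched as follows. This is an illustrative outline under stated assumptions: the index is modeled as a plain dict from blocking key to CR IDs, contributing records are modeled as dicts, and the function names classify_new_record and pick_representative are hypothetical. The blocking key "N1_9GU" is made up for the example.

```python
def classify_new_record(
    blocking_keys: list[str],
    cr_index: dict[str, list[str]],
) -> tuple[str, list[str]]:
    """Step 2: return ("CR", matching CR ids) or ("FULL", [])."""
    hits: list[str] = []
    for key in blocking_keys:
        for cr_id in cr_index.get(key, []):
            if cr_id not in hits:  # keep first-seen order, no duplicates
                hits.append(cr_id)
    return ("CR", hits) if hits else ("FULL", [])


def pick_representative(contributing_records: list[dict]) -> dict:
    """Step 3: the contributing record with the highest pair_score.

    Ties resolve to the first record in the list.
    """
    return max(contributing_records, key=lambda rec: rec["pair_score"])


cr_index = {"smith_john_1985": ["CR-2026-03-vrs-00427"]}
members = [
    {"record_ref": "cust_uk_00234", "pair_score": 0.91},
    {"record_ref": "client_ret_08821", "pair_score": 0.91},
    {"record_ref": "acct_holder_55102", "pair_score": 0.84},
]

route, cr_ids = classify_new_record(["smith_john_1985", "N1_9GU"], cr_index)
print(route, cr_ids)                                # CR ['CR-2026-03-vrs-00427']
print(pick_representative(members)["record_ref"])   # cust_uk_00234
```

Scoring against one representative keeps the cost per CR constant regardless of how many contributing records the CR has accumulated.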

Source Lineage

Every Correlation Record carries a lineage chain — the full history of how and why it was created or modified. Each entry in the lineage answers: who ran what, with what evidence, and how confident are we?

Lineage Entry

lineage_entry:
  fusion_run_id: "run_vrs_2026_03_07_1400"
  timestamp: "2026-03-07T14:00:00Z"
  action: "RECORD_ADDED"    # CREATED | RECORD_ADDED | RECORD_REMOVED | SCORE_UPDATED | ATTESTED | INVALIDATED | DECAYED
  lens_id: "vrs_vulnerability_v1"
  lens_version: "1.0.0"
  binding_versions:
    barclays.axonis.ai: "vrs_v1__barclays__2026_03"
    hsbc.axonis.ai: "vrs_v1__hsbc__2026_03"
    lloyds.axonis.ai: "vrs_v1__lloyds__2026_03"
  evidence_block_id: "blk_3a4b5c..."
  accuracy: 2          # -3 to +4 (how well do we know this?)
  credibility: 2        # -2 to +3 (how much can we believe this?)
  details:
    added_record: "lloyds.axonis.ai:acct_holder_55102"
    pair_score: 0.84
    scoring_method: "weighted_aggregate"

Accuracy and Credibility Scores

Each lineage entry carries two quality indicators:

Accuracy (-3 to +4): How well do we know this correlation is correct?

  • -3: Known to be wrong (human rejected)
  • -2: Improbable (contradictory evidence)
  • -1: Doubtful (low fusion score, edge case)
  • 0: Unable to judge (first run, no human review)
  • +1: Possibly correct (fusion score above threshold but marginal)
  • +2: Believed correct (strong fusion score, multiple supporting signals)
  • +3: Confirmed (human attested)
  • +4: Strongly confirmed (multiple human attestations or deterministic match)

Credibility (-2 to +3): How much can we trust the sources?

  • -2: Hostile or deceptive (known bad data source)
  • -1: Generally not credible (source has history of bad data)
  • 0: Unable to judge (new source, no track record)
  • +1: Fairly credible (source passed basic quality checks)
  • +2: Usually credible (established source, good track record)
  • +3: Highly credible (authoritative source, e.g., passport office)

These aggregate across lineage entries. The CR's overall accuracy/credibility is the weighted average of its lineage entries, with more recent entries weighted higher.
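
The spec does not fix the weighting scheme, so the following is only one possible reading of "more recent entries weighted higher". It assumes linearly increasing weights (the oldest entry gets weight 1, the next 2, and so on) and rounds to the nearest integer because the schema stores accuracy and credibility as integers. The function name aggregate_quality is hypothetical.

```python
def aggregate_quality(scores: list[int]) -> int:
    """Recency-weighted average of per-entry scores, oldest first.

    Assumed weighting: the i-th entry (1-based) gets weight i.
    """
    if not scores:
        return 0  # "unable to judge" on both scales
    weights = range(1, len(scores) + 1)
    weighted_sum = sum(w * s for w, s in zip(weights, scores))
    # Python's round() rounds exact halves to the nearest even integer.
    return round(weighted_sum / sum(weights))


# Three lineage entries with accuracy 0, +2, +3 (oldest first):
# (1*0 + 2*2 + 3*3) / (1 + 2 + 3) = 13 / 6, which rounds to 2.
print(aggregate_quality([0, 2, 3]))  # 2
```

A different decay shape (for example exponential weights tied to the lens half-life) would fit the same interface.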

Lineage Actions

Action         | When                                               | Effect on CR
CREATED        | First fusion run finds match                       | New CR with 2 contributing records
RECORD_ADDED   | Subsequent run finds new match to existing CR      | Contributing records list grows
RECORD_REMOVED | Contributing record deleted at source              | Record removed from CR, confidence recalculated
SCORE_UPDATED  | Re-run produces different score                    | Pair scores updated, confidence recalculated
ATTESTED       | Human confirms or rejects                          | Status changes to CONFIRMED or REJECTED, accuracy set to +3 or -3
INVALIDATED    | Lens version change, binding change, schema change | CR marked stale, must be re-evaluated on next run
DECAYED        | Time passes without re-verification                | Confidence decreases per decay schedule

Correlation Record Schema

UDS Object Type

object_type: correlation_record
subtype: fusion_correlation

fields:
  # Identity
  correlation_id: string        # Unique ID (CR-{year}-{month}-{lens_short}-{seq})
  entity_type: string           # From lens.target_model (e.g., "customer")
  lens_id: string               # Which lens produced this
  lens_version: string          # Lens version at time of creation

  # State
  status: enum                  # PROPOSED | CONFIRMED | REJECTED | STALE | DECAYED
  confidence: float             # 0.0 - 1.0, aggregate across pair scores
  accuracy: integer             # -3 to +4, aggregate across lineage
  credibility: integer          # -2 to +3, aggregate across lineage

  # Contributing records (references only — no field values)
  contributing_records:
    - federate_id: string       # Which federate owns this record
      record_ref: string        # Record identifier at that federate
      blocking_keys: [string]   # Pre-computed blocking keys for this record
      pair_score: float         # Score of this record vs. the cluster
      last_scored: datetime     # When this record was last scored
      binding_id: string        # Which binding was used

  # Temporal
  observation_count: integer    # How many times fusion has confirmed this
  time_first_observation: datetime
  time_most_recent_observation: datetime
  created_at: datetime
  updated_at: datetime

  # Lineage (append-only)
  source_lineage: [lineage_entry]

  # Attestation (optional — only if human has reviewed)
  attestation:
    attested_by: string
    attested_at: datetime
    decision: enum              # CONFIRMED | REJECTED | DEFERRED
    edition_id: string          # Links to the Edition object
    notes: string

  # Blocking acceleration
  composite_blocking_keys: [string]  # Union of all contributing records' blocking keys
  blocking_key_hash: string          # Hash of composite keys for fast lookup
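
The two blocking acceleration fields could be derived roughly as follows. This is a sketch under assumptions the schema does not state: keys are sorted so the result is deterministic, SHA-256 is used as the hash, and a unit-separator character joins the keys before hashing. Contributing records are modeled as plain dicts, and both function names are hypothetical.

```python
import hashlib


def composite_blocking_keys(contributing_records: list[dict]) -> tuple[str, ...]:
    """Union of all contributing records' blocking keys, sorted for stability."""
    keys: set[str] = set()
    for rec in contributing_records:
        keys.update(rec["blocking_keys"])
    return tuple(sorted(keys))


def blocking_key_hash(keys: tuple[str, ...]) -> str:
    """Order-independent hash of the composite keys (SHA-256 is an assumption)."""
    joined = "\x1f".join(sorted(keys))  # unit separator avoids ambiguous joins
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


records = [
    {"record_ref": "cust_uk_00234", "blocking_keys": ["smith_john_1985", "SW1A_1AA"]},
    {"record_ref": "client_ret_08821", "blocking_keys": ["smith_john_1985", "SW1A_1AA"]},
    {"record_ref": "acct_holder_55102", "blocking_keys": ["smith_john_1985", "SW1A_2BX"]},
]
keys = composite_blocking_keys(records)
print(keys)  # ('SW1A_1AA', 'SW1A_2BX', 'smith_john_1985')
```

Sorting before hashing means two CRs with the same key set produce the same hash regardless of the order in which their records were added.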

Status Lifecycle

                    ┌─────────┐
                    │ PROPOSED │  ← Fusion creates CR, no human review yet
                    └────┬────┘
                         │
              human reviews │
                         │
              ┌──────────┼──────────┐
              ▼          ▼          ▼
        ┌──────────┐ ┌────────┐ ┌──────────┐
        │CONFIRMED │ │REJECTED│ │DEFERRED  │
        │(accuracy │ │(accuracy│ │(needs more│
        │  = +3)   │ │  = -3) │ │ evidence) │
        └────┬─────┘ └────────┘ └────┬─────┘
             │                        │
     lens/schema change        new evidence
             │                        │
             ▼                        ▼
        ┌─────────┐             back to PROPOSED
        │  STALE   │  ← Must be re-evaluated
        └────┬─────┘
             │
      not re-evaluated
      within decay window
             │
             ▼
        ┌─────────┐
        │ DECAYED  │  ← Confidence below useful threshold
        └─────────┘

REJECTED correlations are kept (invariant 2: append-only). They serve as negative evidence — "these records were explicitly confirmed as NOT the same entity." This is critical for false positive traps like the P018 test case (two different John Smiths).
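
The lifecycle diagram above can be expressed as a transition table. This sketch encodes only the arrows drawn in the diagram. It treats DEFERRED as a node because the diagram draws it as one, even though the schema lists it as an attestation decision rather than a status. The override of a REJECTED decision by a later attestation, described elsewhere in this spec, is left out. The names ALLOWED_TRANSITIONS and transition are hypothetical.

```python
ALLOWED_TRANSITIONS: dict[str, frozenset[str]] = {
    "PROPOSED": frozenset({"CONFIRMED", "REJECTED", "DEFERRED"}),  # human reviews
    "CONFIRMED": frozenset({"STALE"}),      # lens/schema change
    "DEFERRED": frozenset({"PROPOSED"}),    # new evidence arrives
    "STALE": frozenset({"DECAYED"}),        # not re-evaluated within decay window
    "REJECTED": frozenset(),                # kept as negative evidence
    "DECAYED": frozenset(),
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the diagram has no such arrow."""
    if target not in ALLOWED_TRANSITIONS.get(current, frozenset()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target


print(transition("PROPOSED", "CONFIRMED"))  # CONFIRMED
```

Because CRs are append-only, a real implementation would also append a lineage entry for each accepted transition rather than overwrite the status silently.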

When to Create InsightRoots

Correlation Records are quiet — they accumulate in dataspace without generating noise. InsightRoots are created only when a pattern warrants human attention:

Signal Triggers (configurable per lens)

# In the lens output_semantics section:
insight_triggers:
  # New correlation with high confidence — may need attestation
  - trigger: new_correlation
    condition: "confidence >= 0.85 AND status == PROPOSED"
    severity: medium
    action: create_insight

  # Contradiction detected — same entity resolved differently by two lenses
  - trigger: contradiction
    condition: "record appears in two CRs with different entity_type"
    severity: high
    action: create_insight

  # Escalating pattern — entity gaining contributing records rapidly
  - trigger: rapid_accumulation
    condition: "observation_count increased by >= 3 in 7 days"
    severity: medium
    action: create_insight

  # Stale correlation — was CONFIRMED but evidence has decayed
  - trigger: stale_confirmed
    condition: "status == CONFIRMED AND time_since_last_scored > 180d"
    severity: low
    action: create_insight

  # False positive trap — similar to P018 test case
  - trigger: near_miss
    condition: "pair_score BETWEEN 0.60 AND threshold AND name_similarity > 0.90"
    severity: medium
    action: create_insight
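
The grammar of the condition strings is not defined in this spec, so the following does not parse them. It is a hedged sketch that restates two of the triggers above (new_correlation and stale_confirmed) as plain Python predicates, to make their intended semantics concrete. The function names are hypothetical, and timestamps are assumed to be timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone


def fires_new_correlation(confidence: float, status: str) -> bool:
    """confidence >= 0.85 AND status == PROPOSED"""
    return confidence >= 0.85 and status == "PROPOSED"


def fires_stale_confirmed(status: str, last_scored: datetime, now: datetime) -> bool:
    """status == CONFIRMED AND time_since_last_scored > 180d"""
    return status == "CONFIRMED" and (now - last_scored) > timedelta(days=180)


last_scored = datetime(2026, 3, 7, 14, 0, tzinfo=timezone.utc)
now = datetime(2026, 10, 1, 0, 0, tzinfo=timezone.utc)

# The example CR is already CONFIRMED, so new_correlation stays quiet,
# but more than 180 days without re-scoring fires stale_confirmed.
print(fires_new_correlation(0.87, "CONFIRMED"))              # False
print(fires_stale_confirmed("CONFIRMED", last_scored, now))  # True
```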

InsightRoot from Correlation Record

When a trigger fires:

Correlation Record CR-xxx
    ↓ signal trigger fires
InsightRoot created:
    subject: "Correlated Entity: CR-xxx"
    subject_type: entity_type from CR
    evidence_blocks:
        - scoring_evidence_block from CR lineage
        - blocking_evidence_block from CR lineage
    correlation_record_id: CR-xxx (back-reference)
    signal_type: from trigger config
    severity: from trigger config

The InsightRoot links back to the CR. The human investigates the InsightRoot, reviews evidence, makes a decision via Edition, and the attestation flows back to the CR's attestation field.

Confidence Decay

Correlation confidence is not static. Without re-verification, confidence decays:

from datetime import datetime

def decayed_confidence(
    cr: CorrelationRecord,
    now: datetime,
    half_life_days: float = 90.0,
) -> float:
    """Calculate current confidence with time decay.

    half_life_days comes from the lens's evidence_rules.confidence_decay.
    Default half-life: 90 days.

    Confidence resets to full on:
    - Re-scoring (new fusion run confirms the match)
    - Human attestation (CONFIRMED status)
    - New contributing record added
    """
    # Human-attested correlations decay slower (2x half-life)
    effective_half_life = half_life_days * (2 if cr.status == "CONFIRMED" else 1)

    # CR timestamps are ISO 8601 strings. Map a trailing "Z" to an explicit
    # UTC offset so fromisoformat() also accepts it on Python < 3.11.
    # `now` must be timezone-aware for the subtraction to work.
    last_seen = datetime.fromisoformat(
        cr.time_most_recent_observation.replace("Z", "+00:00")
    )
    days_since_last = max((now - last_seen).days, 0)  # never "un-decay"
    decay_factor = 0.5 ** (days_since_last / effective_half_life)
    return cr.confidence * decay_factor

When decayed_confidence falls below a configurable threshold (default: 0.30), the CR status transitions to DECAYED. DECAYED CRs are excluded from blocking acceleration but preserved for audit.
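
As a worked example of the threshold rule, the decay formula can be inverted to estimate how long a CR survives without re-verification: days = half_life × log2(confidence / threshold). The helper below is a hypothetical illustration using the defaults stated above (90-day half-life, 0.30 threshold, doubled half-life for CONFIRMED). It works in fractional days, whereas the decay function counts whole days.

```python
import math


def days_until_decayed(
    confidence: float,
    half_life_days: float = 90.0,
    threshold: float = 0.30,
    confirmed: bool = False,
) -> float:
    """Days of silence before decayed confidence reaches the threshold."""
    if confidence <= threshold:
        return 0.0  # already at or below the threshold
    effective_half_life = half_life_days * (2 if confirmed else 1)
    return effective_half_life * math.log2(confidence / threshold)


# The example CR has confidence 0.87.
print(days_until_decayed(0.87))                  # roughly 138 days
print(days_until_decayed(0.87, confirmed=True))  # roughly 276 days
```

With the defaults, an unattested CR at 0.87 therefore lasts about a half-life and a half before it drops out of blocking acceleration, and human attestation doubles that window.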

Negative Correlations (False Positive Blocklist)

When a human REJECTS a correlation ("these are NOT the same entity"), the CR status becomes REJECTED with accuracy = -3. This creates a negative correlation that the blocker uses as a blocklist:

NEGATIVE CORRELATION CHECK (before scoring):
    For each candidate pair (record_a, record_b):
        Check if a REJECTED CR exists containing both record_a and record_b
        If yes: SKIP this pair — human already confirmed they are different
        This prevents the P018 problem from recurring

Negative correlations never decay — once a human says "these are different people," the system remembers permanently. They can only be overridden by a new human attestation that explicitly reverses the decision (creating a new lineage entry with action: ATTESTED).
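
One way to make the negative correlation check cheap is to precompute an order-independent set of rejected pairs once per run. The sketch below assumes CRs are available as plain dicts and identifies a record as "federate_id:record_ref", the form used in the lineage entry example. The function names and the second record reference in the example are hypothetical.

```python
from itertools import combinations


def rejected_pair_set(crs: list[dict]) -> set[frozenset[str]]:
    """Collect every pair of records that shares a REJECTED CR."""
    pairs: set[frozenset[str]] = set()
    for cr in crs:
        if cr["status"] != "REJECTED":
            continue
        refs = [
            f'{rec["federate_id"]}:{rec["record_ref"]}'
            for rec in cr["contributing_records"]
        ]
        for a, b in combinations(refs, 2):
            pairs.add(frozenset((a, b)))  # frozenset ignores pair order
    return pairs


def is_blocked(record_a: str, record_b: str, pairs: set[frozenset[str]]) -> bool:
    """True if a human already rejected this pair; the scorer should skip it."""
    return frozenset((record_a, record_b)) in pairs


crs = [{
    "status": "REJECTED",
    "contributing_records": [
        {"federate_id": "barclays.axonis.ai", "record_ref": "cust_uk_00234"},
        {"federate_id": "hsbc.axonis.ai", "record_ref": "client_ret_99999"},
    ],
}]
pairs = rejected_pair_set(crs)
print(is_blocked("hsbc.axonis.ai:client_ret_99999",
                 "barclays.axonis.ai:cust_uk_00234", pairs))  # True
```

Using frozenset pairs means the check gives the same answer whichever side of the candidate pair a record appears on.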

Impact on parallax

New Module: parallax/ops/fusion/correlation.py

"""Correlation Record management for persistent entity resolution.

Handles creation, update, and query of Correlation Records.
Pure Python, no infrastructure dependencies.
"""

from __future__ import annotations

from dataclasses import dataclass, field

@dataclass(frozen=True)
class ContributingRecord:
    federate_id: str
    record_ref: str
    blocking_keys: tuple[str, ...]
    pair_score: float
    last_scored: str
    binding_id: str = ""

@dataclass(frozen=True)
class LineageEntry:
    fusion_run_id: str
    timestamp: str
    action: str  # CREATED | RECORD_ADDED | RECORD_REMOVED | SCORE_UPDATED | ATTESTED | INVALIDATED | DECAYED
    lens_id: str
    lens_version: str
    evidence_block_id: str = ""
    accuracy: int = 0
    credibility: int = 0
    details: dict = field(default_factory=dict)

@dataclass(frozen=True)
class Attestation:
    attested_by: str
    attested_at: str
    decision: str  # CONFIRMED | REJECTED | DEFERRED
    edition_id: str = ""
    notes: str = ""

@dataclass(frozen=True)
class CorrelationRecord:
    correlation_id: str
    entity_type: str
    lens_id: str
    lens_version: str
    status: str  # PROPOSED | CONFIRMED | REJECTED | STALE | DECAYED
    confidence: float
    contributing_records: tuple[ContributingRecord, ...]
    source_lineage: tuple[LineageEntry, ...]
    composite_blocking_keys: tuple[str, ...] = ()
    blocking_key_hash: str = ""
    observation_count: int = 0
    time_first_observation: str = ""
    time_most_recent_observation: str = ""
    attestation: Attestation | None = None
    accuracy: int = 0
    credibility: int = 0
    created_at: str = ""
    updated_at: str = ""

# Functions

def create_correlation(
    pair: MatchCandidate,
    fusion_run_id: str,
    lens: LensSpec,
    bindings: dict[str, LensBinding],
) -> CorrelationRecord:
    """Create a new CR from a confirmed match pair."""

def add_record_to_correlation(
    cr: CorrelationRecord,
    new_record: ContributingRecord,
    pair_score: float,
    fusion_run_id: str,
) -> CorrelationRecord:
    """Add a new contributing record to an existing CR.
    Returns new CR (frozen — creates a copy)."""

def apply_attestation(
    cr: CorrelationRecord,
    attestation: Attestation,
) -> CorrelationRecord:
    """Apply human attestation to a CR."""

def check_negative_correlation(
    record_a_ref: str,
    record_b_ref: str,
    rejected_crs: list[CorrelationRecord],
) -> bool:
    """Check if a REJECTED CR exists for this pair. Returns True if blocked."""

def calculate_decayed_confidence(
    cr: CorrelationRecord,
    now_iso: str,
    half_life_days: int = 90,
) -> float:
    """Calculate current confidence with time decay."""

def compute_composite_blocking_keys(
    cr: CorrelationRecord,
) -> tuple[str, ...]:
    """Union of all contributing records' blocking keys."""

def build_correlation_index(
    crs: list[CorrelationRecord],
) -> dict[str, list[CorrelationRecord]]:
    """Build blocking key → CR index for acceleration.
    Returns {blocking_key: [CRs containing a record with this key]}."""
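
A possible body for build_correlation_index is sketched below. It is illustrative only: MiniRecord and MiniCR are cut-down stand-ins for the dataclasses above so the example runs on its own, and the function is named build_index to avoid suggesting it is the final implementation. DECAYED CRs are skipped because the Confidence Decay section excludes them from blocking acceleration. Whether REJECTED CRs should also be skipped is not stated in this spec, so the sketch leaves them in.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniRecord:  # stand-in for ContributingRecord
    record_ref: str
    blocking_keys: tuple[str, ...]


@dataclass(frozen=True)
class MiniCR:  # stand-in for CorrelationRecord
    correlation_id: str
    status: str
    contributing_records: tuple[MiniRecord, ...]


def build_index(crs: list[MiniCR]) -> dict[str, list[MiniCR]]:
    """Map each blocking key to the CRs containing a record with that key."""
    index: dict[str, list[MiniCR]] = defaultdict(list)
    for cr in crs:
        if cr.status == "DECAYED":
            continue  # excluded from blocking acceleration
        # Deduplicate keys so a CR is listed once per key.
        keys = {k for rec in cr.contributing_records for k in rec.blocking_keys}
        for key in keys:
            index[key].append(cr)
    return dict(index)


live = MiniCR("CR-2026-03-vrs-00427", "CONFIRMED", (
    MiniRecord("cust_uk_00234", ("smith_john_1985", "SW1A_1AA")),
    MiniRecord("acct_holder_55102", ("smith_john_1985", "SW1A_2BX")),
))
index = build_index([live])
print(sorted(index))  # ['SW1A_1AA', 'SW1A_2BX', 'smith_john_1985']
```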

Updated Blocker: Correlation-Aware Blocking

The existing blocker.py gains an optional parameter for the correlation index:

def generate_candidates_v2(
    records_a: list[dict],
    records_b: list[dict],
    blocking_specs: list,
    correlation_index: dict[str, list[CorrelationRecord]] | None = None,
    rejected_pairs: set[tuple[str, str]] | None = None,
    id_field: str = "id",
) -> list[tuple[str, str]]:
    """Generate candidate pairs with correlation-aware blocking.

    If correlation_index provided:
    1. For each record, check if its blocking key hits the correlation index
    2. If hit: generate candidates only against the CR's contributing records
    3. If no hit: fall through to standard blocking (existing behavior)

    If rejected_pairs provided:
    - Skip any pair that appears in the rejected set (negative correlations)

    The existing generate_candidates() function is unchanged.
    """

Storage in UDS

Correlation Records are stored as a new UDS object subtype:

# UDS object metadata
uds.object_type: "correlation_record"
uds.subtype: "fusion_correlation"
uds.model: null                        # CRs are cross-model (they link records)
uds.lens_id: "vrs_vulnerability_v1"    # Queryable: find all CRs for a lens
uds.status: "CONFIRMED"                # Queryable: find all confirmed CRs
uds.created_at: "2026-03-01T09:00:00Z"
uds.updated_at: "2026-03-07T14:00:00Z"

# Indexed for blocking acceleration
uds.composite_blocking_keys: ["smith_john_1985", "SW1A_1AA", "SW1A_2BX"]
uds.blocking_key_hash: "a7f3e2..."     # For fast equality check

The blocking key index on CRs is the critical performance feature. At query time, the blocker issues:

GET correlation_records WHERE composite_blocking_keys CONTAINS "smith_john_1985"

This returns all CRs that contain a record matching that blocking key — instantly narrowing the candidate space.

Relationship to Existing Platform Objects

Platform Object  | Relationship to Correlation Record
Record (raw)     | CR.contributing_records[].record_ref points at records. Records gain uds.correlation_status metadata.
Block (evidence) | CR.source_lineage[].evidence_block_id points at scoring evidence blocks.
InsightRoot      | Created only when signal triggers fire. InsightRoot.correlation_record_id back-references the CR.
Edition          | Human decision on an InsightRoot. CR.attestation links to the Edition.
Dataset          | Fusion input datasets (component.parallax.fusion-binding) are consumed by the engine. CRs are outputs, not datasets.
Project          | The Fusion Project (component.parallax.fusion-binding) orchestrates runs that produce/update CRs.
Lens             | CR.lens_id references the lens that governs the matching rules.

What Needs to Be Built

In parallax (pure Python)

  1. correlation.py — CorrelationRecord dataclass + functions
  2. blocker.py extension — generate_candidates_v2() with correlation index
  3. correlation_index.py — build/query correlation index from CR list
  4. Tests for all of the above

In titan (orchestrator, later)

  1. Write-back to UDS — persist CRs after fusion run
  2. Load CRs from UDS at fusion start — build correlation index
  3. Incremental fusion mode — classify new vs. changed vs. unchanged records
  4. Confidence decay scheduler — periodic job to decay and transition stale CRs

In cortex (MCP tools, later)

  1. query_correlations — find CRs for a record, entity, or lens
  2. attest_correlation — apply human decision to a CR
  3. invalidate_correlations — mark CRs stale when lens/binding changes

Invariant Compliance

Invariant                           | Compliance
1. UDS is sole ABAC authority       | CRs stored in UDS, queried via UDS. Contributing record access governed by federate ABAC.
2. Events are append-only           | Source lineage is append-only. CRs are never deleted. Status transitions are new lineage entries.
3. Blocks are evidence              | Every scoring decision that creates/updates a CR produces an evidence block with hash.
4. Frozen means frozen              | Evidence blocks referenced by CR lineage are frozen. CR confidence is derived, not stored; it's recomputed from frozen evidence.
5. Editions require frozen evidence | Attestation links to an Edition which references frozen evidence blocks.
6. AI assists, humans attest        | Fusion proposes CRs (status: PROPOSED). Humans confirm or reject via attestation.
7. "No action" is a decision        | REJECTED CRs are explicit "these are NOT the same." DEFERRED attestations are explicit "I need more evidence."

Cross-References

Document                                              | Relationship
component.parallax.fusion-binding (Binding & Project) | Fusion input → engine → Correlation Records (this spec defines the output)
component.parallax.blocking-engine (Blocking)         | generate_candidates_v2() extends blocking with correlation index
component.parallax.scoring-engine (Scoring)           | Pair scores stored in CR.contributing_records[].pair_score
component.parallax.lens-parser (Lens Parser)          | Lens output_semantics extended with insight_triggers

Depends on: component.parallax.blocking-engine, component.parallax.fusion-binding, component.parallax.scoring-engine

Realizes: product.fusion

Required by: component.parallax.dissent, component.parallax.fusion-governance-lifecycle, component.parallax.local-persistence-adapter, component.parallax.multi-contributor-combination, component.parallax.observation-reverse-index, component.parallax.quorum, component.parallax.signal-queue-contract, component.parallax.tracking-integration, component.parallax.wire-message-families