Multi-Contributor Combination Function

Status & scope

Stage: DRAFT — ready to implement (replaces the min(pair_scores) placeholder in correlation.py:473-474)
Layer: compute (pure combination math — parallax library)
Depends on: SPEC-PLATFORM-04 (Parallax service), CL-06 (lens emission), CL-09 (evidence chain), CL-08 (classification)
Consumed by: quorum evaluation (weighted policies), dissent (conflict_indicator emission path), consensus session
Milestone: P0 — gates real production federation; gates Q3 2026 GA

Purpose

The current correlation.py:473-474 joint-confidence path is a placeholder:

all_scores = [r.pair_score for r in contributing_tuple if r.pair_score > 0]
confidence = min(all_scores) if all_scores else cluster.aggregate_confidence

min(pair_scores) is wrong for production multi-contributor fusion. It ignores authority weighting (the STANAG 2022 fields accuracy and credibility already in ContributingRecord), it produces no conflict indicator, and it cannot represent dissent. This spec defines the real combiner.

Inputs and Outputs

Input

A list of contributions from N federates that have already produced pair scores against a common candidate match:

@dataclass(frozen=True)
class Contribution:
    contributor_id: str           # federate identifier (also serves as tenant_id when
                                  # contribution arrives via federation per CL-10)
    pair_score: float             # [0.0, 1.0] — output of pair-level scoring
    accuracy: int                 # STANAG 2022 source accuracy: 1 (high) – 6 (low)
    credibility: int              # STANAG 2022 data credibility: 1 (high) – 6 (low)
    signer_key_id: str            # for evidence chain (CL-09)
    classification: str           # contributor-declared classification of THIS contribution
                                  # (per CL-08; combiner computes output_classification from these)

Field Authority

Field	Authority	Notes
`classification` (on Contribution)	Contributor declares its own	What the contributing federate says about its own derived feature
`output_classification` (on CombinationResult)	Combiner computes per CL-08 propagation rule	max-of-inputs by default; declared / downgraded per lens spec
`tenant_id` (NOT on Contribution)	Session-scoped per CL-10	Contributions inherit tenant context from the session they're added to; not duplicated on each contribution

Output

@dataclass(frozen=True)
class CombinationResult:
    joint_confidence: float       # [0.0, 1.0]
    conflict_indicator: float     # [0.0, 1.0]; 0 = full agreement, 1 = maximum disagreement
    method: str                   # "weighted_average" | "dempster_shafer"
    per_contributor_weight: dict  # contributor_id → effective weight used
    output_classification: str    # max-of-inputs per CL-08 unless lens overrides
    inputs_hash: str              # SHA-256 over canonical inputs (for replay)

The Combination Function

Two modes are pack-configurable per lens. Mode is declared in the lens spec under consensus.combination_method.

Mode A — Authority-Weighted Average (default)

For each contribution, compute an authority weight from STANAG 2022:

composite_rating(accuracy, credibility) → [1/6, 1.0]
  high accuracy (1) + high credibility (1) → 1.0
  worst accuracy (6) + worst credibility (6) → 1/6 ≈ 0.167 (the formula below never reaches 0.0)
  formula: ((7 - accuracy) / 6 + (7 - credibility) / 6) / 2

This function already exists at correlation.py:92. Reuse it.
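As a quick check of the formula above, the sketch below evaluates it at both endpoints and at one mixed rating. The function body restates the formula from this spec; it is not a copy of the code at correlation.py:92, which this document does not show.

```python
def composite_rating(accuracy: int, credibility: int) -> float:
    # Formula as stated in this spec; ratings run from 1 (high) to 6 (low).
    return ((7 - accuracy) / 6 + (7 - credibility) / 6) / 2


best = composite_rating(1, 1)   # both axes at their best rating
worst = composite_rating(6, 6)  # both axes at their worst rating
mixed = composite_rating(2, 3)  # (5/6 + 4/6) / 2
```

With the formula as written, the (6, 6) case evaluates to 1/6, so every contributor with valid ratings keeps a nonzero weight.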

Joint confidence is the weighted average:

weight_i = composite_rating(accuracy_i, credibility_i)
joint_confidence = sum(weight_i * pair_score_i) / sum(weight_i)

Conflict indicator measures spread across contributors, normalized:

weighted_mean = joint_confidence
weighted_variance = sum(weight_i * (pair_score_i - weighted_mean)^2) / sum(weight_i)
weighted_stddev = sqrt(weighted_variance)
conflict_indicator = min(1.0, weighted_stddev / 0.5)   # 0.5 is the largest stddev possible for scores in [0, 1]

A conflict_indicator above the lens-declared quorum.conflict_threshold (default 0.3) triggers SPEC-36 IN_CONFLICT state and SPEC-30 machine dissent emission.
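To make the Mode A arithmetic concrete, here is a minimal sketch that applies the formulas above to two disagreeing contributors. The helper name weighted_average and the plain-list inputs are illustrative assumptions, not part of the combiner's API.

```python
import math


def weighted_average(scores, weights):
    """Return (joint_confidence, conflict_indicator) per the Mode A formulas."""
    total_w = sum(weights)
    joint = sum(w * s for w, s in zip(weights, scores)) / total_w
    variance = sum(w * (s - joint) ** 2 for w, s in zip(weights, scores)) / total_w
    conflict = min(1.0, math.sqrt(variance) / 0.5)
    return joint, conflict


# Two equally rated contributors (weight 1.0 each) that disagree.
joint, conflict = weighted_average([0.9, 0.2], [1.0, 1.0])
# joint is about 0.55 and conflict about 0.70
in_conflict = conflict > 0.3  # compared against the default quorum.conflict_threshold

# Scores at opposite ends of the range hit the normalization ceiling.
_, max_conflict = weighted_average([0.0, 1.0], [1.0, 1.0])
```

The second call shows where the 0.5 divisor comes from: two equally weighted scores of 0.0 and 1.0 have a standard deviation of exactly 0.5, which maps to a conflict_indicator of 1.0.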

Mode B — Dempster-Shafer (optional, conflict-visible)

For partners that require evidence-theory combination (some coalition deployments do):

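Each contribution maps to a basic probability assignment (BPA) over the frame {match, no_match}. "unknown" is not a third outcome; it denotes the whole frame {match, no_match}, i.e. mass committed to neither outcome:

m_i({match})    = pair_score_i * weight_i
m_i({no_match}) = (1 - pair_score_i) * weight_i
m_i({unknown})  = 1 - weight_i              # mass removed by authority discounting goes to unknown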
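The mapping above can be sketched as a small function. The name to_bpa is hypothetical; representing "unknown" as the set containing both outcomes follows the reference implementation later in this spec.

```python
def to_bpa(pair_score: float, weight: float) -> dict:
    """Map one contribution to masses on {match}, {no_match}, and unknown."""
    match = frozenset({"match"})
    no_match = frozenset({"no_match"})
    unknown = match | no_match  # "unknown" is the whole frame
    return {
        match: pair_score * weight,
        no_match: (1 - pair_score) * weight,
        unknown: 1 - weight,
    }


bpa = to_bpa(pair_score=0.8, weight=0.75)
# roughly 0.60 on match, 0.15 on no_match, 0.25 on unknown
total = sum(bpa.values())
```

The three masses always sum to 1, so a lower authority weight shifts mass toward unknown without favoring either outcome.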

Dempster's rule of combination over two contributions:

K = sum over all (B, C) where B ∩ C = ∅ of m1(B) * m2(C)     # conflict mass
m_combined(A) = (1 / (1 - K)) * sum over (B ∩ C = A) of m1(B) * m2(C)

For N > 2, apply pairwise iteratively in canonical input order. Dempster's rule is associative and commutative mathematically, but floating-point rounding is not, so the order is fixed.
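A minimal sketch of the two-source rule, assuming BPAs are dicts keyed by frozenset as in the reference implementation. The function name dempster_combine and the example masses are illustrative only.

```python
from itertools import product

MATCH = frozenset({"match"})
NO_MATCH = frozenset({"no_match"})
UNKNOWN = MATCH | NO_MATCH


def dempster_combine(m1: dict, m2: dict):
    """Combine two BPAs; return (normalized masses, conflict mass K)."""
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        overlap = b & c
        if overlap:
            combined[overlap] = combined.get(overlap, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c  # disjoint focal sets add to K
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {a: v / (1 - conflict) for a, v in combined.items()}, conflict


m1 = {MATCH: 0.6, NO_MATCH: 0.2, UNKNOWN: 0.2}
m2 = {MATCH: 0.5, NO_MATCH: 0.3, UNKNOWN: 0.2}
m12, k12 = dempster_combine(m1, m2)  # K = 0.6*0.3 + 0.2*0.5 = 0.28
m21, k21 = dempster_combine(m2, m1)  # same result up to rounding
```

Here the unnormalized mass on {match} is 0.52, so after dividing by 1 - K = 0.72 the combined belief in a match is about 0.722.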

Joint confidence is the combined belief in {match}:

joint_confidence = m_combined({match})      # belief; adding the unknown mass would give plausibility
conflict_indicator = K_total                # combined conflict mass: 1 - product over steps of (1 - K_step)

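When the conflict mass approaches 1, contributors fundamentally disagree and Dempster's rule degenerates, because the 1 / (1 - K) normalization blows up. The reference implementation checks K ≥ 0.999 at each pairwise step and then reports joint_confidence = 0.0 and conflict_indicator = 1.0, which drives the IN_CONFLICT state, rather than forcing a smoothed answer.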

Mode Selection

The lens declares which mode to use:

consensus:
  combination_method: weighted_average   # or: dempster_shafer
  quorum:
    required_contributors: 2
    minimum_authority_sum: 1.5
    conflict_threshold: 0.3
    conflict_policy: flag                # one of: suppress | flag | split

Default: weighted_average. Coalition-specific lenses (FVEY, NATO partner) declare dempster_shafer when partner accreditation requires evidence-theory semantics.
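One way to read and validate this block is sketched below. read_consensus_config is a hypothetical helper, and the lens spec is assumed to be already parsed into nested dicts. The defaults for combination_method and conflict_threshold come from this spec; conflict_policy is treated as required here because no default is stated.

```python
ALLOWED_METHODS = {"weighted_average", "dempster_shafer"}
ALLOWED_POLICIES = {"suppress", "flag", "split"}


def read_consensus_config(lens_spec: dict) -> dict:
    """Hypothetical helper: validate the consensus block of a parsed lens spec."""
    consensus = lens_spec.get("consensus", {})
    method = consensus.get("combination_method", "weighted_average")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"Unknown combination method: {method}")
    quorum = consensus.get("quorum", {})
    policy = quorum.get("conflict_policy")
    if policy not in ALLOWED_POLICIES:
        raise ValueError(f"Unknown conflict policy: {policy}")
    return {
        "combination_method": method,
        "conflict_threshold": float(quorum.get("conflict_threshold", 0.3)),
        "conflict_policy": policy,
    }


config = read_consensus_config({"consensus": {"quorum": {"conflict_policy": "flag"}}})
```

Rejecting unknown values at load time keeps a misspelled method name from silently falling back to the default mode.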

Reproducibility Contract

Same inputs MUST produce byte-identical outputs. Three constraints on implementation:

Canonical input ordering. Sort contributions by (contributor_id, signer_key_id) before any computation.
Deterministic floating-point. Use either decimal.Decimal with 15 significant digits or IEEE-754 double precision with a documented, fixed operation order. Both modes are documented; the pack chooses one.
Inputs hash. Compute inputs_hash as SHA-256 over canonical-JSON of the sorted contribution list. Recorded in the result and in the evidence Block (per CL-09).
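For the Decimal option, a minimal sketch of the weighted mean under a 15-digit context follows. The function name weighted_mean_decimal is hypothetical, and converting floats through str() is an assumption made for illustration; the spec does not say how inputs reach Decimal.

```python
from decimal import Decimal, localcontext


def weighted_mean_decimal(scores, weights, digits=15):
    """Weighted mean computed under a fixed-precision Decimal context."""
    with localcontext() as ctx:
        ctx.prec = digits  # significant digits kept by each arithmetic result
        # Assumption: go through str() so the shortest float repr is what gets parsed.
        s = [Decimal(str(x)) for x in scores]
        w = [Decimal(str(x)) for x in weights]
        total_w = sum(w, Decimal(0))
        numerator = sum((wi * si for wi, si in zip(w, s)), Decimal(0))
        return numerator / total_w


joint = weighted_mean_decimal([0.9, 0.2], [1.0, 0.5])
```

Because the context precision is set inside the function, the result does not depend on whatever Decimal precision the caller happens to have configured.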

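Replay: given an evidence Block referencing this combiner, anyone holding the original contributions can confirm they match inputs_hash, recompute joint_confidence and conflict_indicator, and verify byte-equality with the recorded output. The hash alone is not enough; it identifies the inputs but cannot reproduce them.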
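The ordering and hashing constraints can be sketched as follows. Contributions are shown as plain dicts, and the identifiers fed-a, fed-b, k1, and k7 are made-up sample values.

```python
import hashlib
import json


def inputs_hash(contributions: list) -> str:
    """SHA-256 over canonical JSON of contributions sorted per the contract."""
    ordered = sorted(contributions, key=lambda c: (c["contributor_id"], c["signer_key_id"]))
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"contributor_id": "fed-a", "signer_key_id": "k1", "pair_score": 0.9,
     "accuracy": 1, "credibility": 2, "classification": "U"}
b = {"contributor_id": "fed-b", "signer_key_id": "k7", "pair_score": 0.2,
     "accuracy": 3, "credibility": 3, "classification": "CUI"}

recorded = inputs_hash([a, b])
replayed = inputs_hash([b, a])               # arrival order differs, hash does not
tampered = inputs_hash([a, {**b, "pair_score": 0.3}])
```

A verifier would compare replayed against the hash stored in the evidence Block before recomputing the combination; any change to a contribution, as in tampered, yields a different digest.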

Reference Implementation

This lands as a new file, parallax/ops/fusion/combination.py. correlation.py calls it in place of the placeholder at lines 473-474 (see the migration section). The engineer's job is integration and testing rather than design from scratch.

# parallax/ops/fusion/combination.py — NEW FILE

from __future__ import annotations
import hashlib
import json
import math
from dataclasses import dataclass, asdict
from typing import Sequence


@dataclass(frozen=True)
class Contribution:
    contributor_id: str
    pair_score: float
    accuracy: int
    credibility: int
    signer_key_id: str
    classification: str


@dataclass(frozen=True)
class CombinationResult:
    joint_confidence: float
    conflict_indicator: float
    method: str
    per_contributor_weight: dict
    output_classification: str
    inputs_hash: str


def composite_rating(accuracy: int, credibility: int) -> float:
    """STANAG 2022 two-axis rating → [1/6, 1.0].
    accuracy and credibility are 1 (high) – 6 (low).
    Mirrors the function that already exists at correlation.py:92; production
    code should reuse that one rather than keep a second copy."""
    if not (1 <= accuracy <= 6) or not (1 <= credibility <= 6):
        raise ValueError(f"Invalid STANAG rating: a={accuracy}, c={credibility}")
    return ((7 - accuracy) / 6 + (7 - credibility) / 6) / 2


def _canonical_sort(contribs: Sequence[Contribution]) -> list[Contribution]:
    return sorted(contribs, key=lambda c: (c.contributor_id, c.signer_key_id))


def _inputs_hash(contribs: Sequence[Contribution]) -> str:
    canonical = json.dumps(
        [asdict(c) for c in contribs],
        sort_keys=True, separators=(",", ":"),
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


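def _max_classification(contribs: Sequence[Contribution]) -> str:
    """CL-08 max-of-inputs rule. Order-aware over CL-08 taxonomy."""
    order = ["U", "U_FOUO", "CUI", "PROPRIETARY", "PII", "PHI", "PCI",
             "C", "S", "TS", "TS_SCI", "TS_SAP"]
    levels = [c.classification for c in contribs]
    unknown = [lvl for lvl in levels if lvl not in order]
    if unknown:
        # Fail closed: ranking an unrecognized label lowest could understate the output.
        raise ValueError(f"Unknown classification label(s): {unknown}")
    return max(levels, key=order.index)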


def combine_weighted_average(contribs: Sequence[Contribution]) -> CombinationResult:
    """Mode A — Authority-Weighted Average (default)."""
    sorted_c = _canonical_sort(contribs)
    weights = {c.contributor_id: composite_rating(c.accuracy, c.credibility) for c in sorted_c}
    total_w = sum(weights.values())
    if total_w == 0:
        raise ValueError("All contributors have zero authority; cannot combine")

    joint = sum(weights[c.contributor_id] * c.pair_score for c in sorted_c) / total_w
    var = sum(
        weights[c.contributor_id] * (c.pair_score - joint) ** 2 for c in sorted_c
    ) / total_w
    stddev = math.sqrt(var)
    conflict = min(1.0, stddev / 0.5)

    return CombinationResult(
        joint_confidence=joint,
        conflict_indicator=conflict,
        method="weighted_average",
        per_contributor_weight=weights,
        output_classification=_max_classification(sorted_c),
        inputs_hash=_inputs_hash(sorted_c),
    )


def combine_dempster_shafer(contribs: Sequence[Contribution]) -> CombinationResult:
    """Mode B — Dempster-Shafer combination over {match, no_match, unknown} frame."""
    sorted_c = _canonical_sort(contribs)
    weights = {c.contributor_id: composite_rating(c.accuracy, c.credibility) for c in sorted_c}

    # Initial BPA from first contributor
    first = sorted_c[0]
    w0 = weights[first.contributor_id]
    m = {
        frozenset({"match"}): first.pair_score * w0,
        frozenset({"no_match"}): (1 - first.pair_score) * w0,
        frozenset({"match", "no_match"}): 1 - w0,  # unknown
    }

    total_conflict = 0.0
    for c in sorted_c[1:]:
        w = weights[c.contributor_id]
        m_new = {
            frozenset({"match"}): c.pair_score * w,
            frozenset({"no_match"}): (1 - c.pair_score) * w,
            frozenset({"match", "no_match"}): 1 - w,
        }
        # Dempster's combination
        combined = {}
        K = 0.0
        for A, mA in m.items():
            for B, mB in m_new.items():
                inter = A & B
                if not inter:
                    K += mA * mB
                else:
                    combined[inter] = combined.get(inter, 0.0) + mA * mB
        if K >= 0.999:
            # Total conflict — return IN_CONFLICT marker
            return CombinationResult(
                joint_confidence=0.0,
                conflict_indicator=1.0,
                method="dempster_shafer",
                per_contributor_weight=weights,
                output_classification=_max_classification(sorted_c),
                inputs_hash=_inputs_hash(sorted_c),
            )
        m = {A: mA / (1 - K) for A, mA in combined.items()}
        total_conflict += K * (1 - total_conflict)

    belief_match = m.get(frozenset({"match"}), 0.0)
    return CombinationResult(
        joint_confidence=belief_match,
        conflict_indicator=min(1.0, total_conflict),
        method="dempster_shafer",
        per_contributor_weight=weights,
        output_classification=_max_classification(sorted_c),
        inputs_hash=_inputs_hash(sorted_c),
    )


def combine(contribs: Sequence[Contribution], method: str = "weighted_average") -> CombinationResult:
    if not contribs:
        raise ValueError("No contributions to combine")
    if method == "weighted_average":
        return combine_weighted_average(contribs)
    elif method == "dempster_shafer":
        return combine_dempster_shafer(contribs)
    else:
        raise ValueError(f"Unknown combination method: {method}")

Migration from `min(pair_scores)`

In parallax/ops/fusion/correlation.py:

# OLD (lines 473-474):
all_scores = [r.pair_score for r in contributing_tuple if r.pair_score > 0]
confidence = min(all_scores) if all_scores else cluster.aggregate_confidence

# NEW:
from .combination import Contribution, combine

contribs = [
    Contribution(
        contributor_id=r.contributor_id,
        pair_score=r.pair_score,
        accuracy=r.accuracy,
        credibility=r.credibility,
        signer_key_id=r.signer_key_id,
        classification=r.classification,
    )
    for r in contributing_tuple if r.pair_score > 0
]
if contribs:
    method = lens_spec.consensus.combination_method  # "weighted_average" | "dempster_shafer"
    result = combine(contribs, method=method)
    confidence = result.joint_confidence
    conflict_indicator = result.conflict_indicator   # NEW — drives SPEC-36 session state
    inputs_hash = result.inputs_hash                  # NEW — for evidence block
else:
    confidence = cluster.aggregate_confidence
    conflict_indicator = 0.0
    inputs_hash = None

conflict_indicator feeds the SPEC-36 session state machine and the SPEC-30 dissent emission; inputs_hash is recorded in the CL-09 evidence block.

Invariants

Inputs ordering is canonical before any math. Reproducibility depends on it.
Floating-point determinism. Either Decimal mode or documented IEEE-754 sequence. No arbitrary math.fsum substitution.
composite_rating() from correlation.py:92 is the single source of authority weighting. Do not re-implement.
Output classification follows CL-08 max-of-inputs by default. Lens may declare downgrade per CL-08 invariant.
The combiner is pure. Same inputs → same output, no side effects, no I/O.
Conflict above lens-declared conflict_threshold triggers IN_CONFLICT. Combiner does not silently smooth conflict in flag or split policies.
DS mode degenerates cleanly. If K ≥ 0.999, return joint_confidence=0.0, conflict_indicator=1.0 rather than divide-by-zero.

Test Expectations

Reproducibility test: the same contribs produce the same result in 100/100 runs, byte-identical when serialized as canonical JSON (per the Reproducibility Contract).
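Authority weighting test: two contributors, different pair_scores, different STANAG ratings → joint confidence lies closer to the higher-rated contributor's score than the unweighted mean does. (With identical pair_scores the weighted average equals that score regardless of weights, so the test would prove nothing.)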
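Conflict detection test: two contributors with equal STANAG ratings, scores 0.9 and 0.2 → conflict_indicator = 0.7, which is > 0.5. (With strongly unequal ratings the weighted stddev shrinks and the value can drop below 0.5.)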
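DS degeneration test: two contributors with full authority (ratings 1/1) and totally opposing scores (1.0 and 0.0) → returns joint_confidence=0.0, conflict_indicator=1.0 (the IN_CONFLICT signal) without raising.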
Classification propagation test: mixed U_FOUO and CUI inputs → output CUI.
Migration parity test: for cases where min(pair_scores) happens to equal the weighted average (e.g., all equal scores), the new combiner produces the same numeric result within floating-point tolerance.
Reference fixture test: tests/fixtures/combiner/ contains 50 hand-checked input/output pairs; combiner reproduces all.
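Three of these expectations can be sketched as plain assert-based tests. The helpers max_classification and weighted_joint are simplified stand-ins written for this example, not the combiner's real functions; the taxonomy order is copied from the reference implementation.

```python
CL08_ORDER = ["U", "U_FOUO", "CUI", "PROPRIETARY", "PII", "PHI", "PCI",
              "C", "S", "TS", "TS_SCI", "TS_SAP"]


def max_classification(labels):
    """Highest label by CL-08 order; an unknown label raises ValueError."""
    return max(labels, key=CL08_ORDER.index)


def weighted_joint(scores, weights):
    """Mode A joint confidence for parallel lists of scores and weights."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def test_classification_propagation():
    assert max_classification(["U_FOUO", "CUI"]) == "CUI"


def test_authority_weighting():
    # Ratings (1, 1) give weight 1.0; ratings (6, 6) give weight 1/6.
    joint = weighted_joint([0.9, 0.2], [1.0, 1 / 6])
    unweighted = (0.9 + 0.2) / 2
    assert joint > unweighted  # pulled toward the higher-rated contributor


def test_migration_parity():
    scores = [0.7, 0.7, 0.7]
    joint = weighted_joint(scores, [1.0, 0.5, 0.75])
    assert abs(joint - min(scores)) < 1e-12
```

In the authority weighting case the joint confidence works out to about 0.8, against an unweighted mean of 0.55.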

Cross-Reference

SPEC-36: how conflict_indicator drives session state transitions
SPEC-30: how IN_CONFLICT triggers machine dissent emission
SPEC-31: where this combiner is invoked (Consensus Service)
CL-06: how the result becomes a lens emission
CL-09: how the result becomes an evidence Block

Depends on: component.parallax.correlation-persistence, component.parallax.scoring-engine

Realizes: product.fusion

Required by: component.parallax.consensus-mission-service, component.parallax.consensus-session, component.parallax.wire-message-families