Multi-Contributor Combination Function
Status & scope
- Stage: DRAFT, ready to implement (replaces the min(pair_scores) placeholder in correlation.py:473-474)
- Layer: compute (pure combination math, parallax library)
- Depends on: SPEC-PLATFORM-04 (Parallax service), CL-06 (lens emission), CL-09 (evidence chain), CL-08 (classification)
- Consumed by: quorum evaluation (weighted policies), dissent (conflict_indicator emission path), consensus session
- Milestone: P0; gates real production federation and Q3 2026 GA
Purpose
The current correlation.py:473-474 joint-confidence path is a placeholder:
all_scores = [r.pair_score for r in contributing_tuple if r.pair_score > 0]
confidence = min(all_scores) if all_scores else cluster.aggregate_confidence
min(pair_scores) is wrong for production multi-contributor fusion. It ignores authority weighting (the STANAG 2022 fields accuracy and credibility already in ContributingRecord), it produces no conflict indicator, and it cannot represent dissent. This spec defines the real combiner.
Inputs and Outputs
Input
A list of contributions from N federates that have already produced pair scores against a common candidate match:
@dataclass(frozen=True)
class Contribution:
    contributor_id: str   # federate identifier (also serves as tenant_id when
                          # contribution arrives via federation per CL-10)
    pair_score: float     # [0.0, 1.0]; output of pair-level scoring
    accuracy: int         # STANAG 2022 source accuracy: 1 (high) – 6 (low)
    credibility: int      # STANAG 2022 data credibility: 1 (high) – 6 (low)
    signer_key_id: str    # for evidence chain (CL-09)
    classification: str   # contributor-declared classification of THIS contribution
                          # (per CL-08; combiner computes output_classification from these)
Field Authority
| Field | Authority | Notes |
|---|---|---|
| classification (on Contribution) | Contributor declares its own | What the contributing federate says about its own derived feature |
| output_classification (on CombinationResult) | Combiner computes per CL-08 propagation rule | max-of-inputs by default; declared / downgraded per lens spec |
| tenant_id (NOT on Contribution) | Session-scoped per CL-10 | Contributions inherit tenant context from the session they're added to; not duplicated on each contribution |
Output
@dataclass(frozen=True)
class CombinationResult:
    joint_confidence: float       # [0.0, 1.0]
    conflict_indicator: float     # [0.0, 1.0]; 0 = full agreement, 1 = maximum disagreement
    method: str                   # "weighted_average" | "dempster_shafer"
    per_contributor_weight: dict  # contributor_id → effective weight used
    output_classification: str    # max-of-inputs per CL-08 unless lens overrides
    inputs_hash: str              # SHA-256 over canonical inputs (for replay)
The Combination Function
Two modes are pack-configurable per lens. Mode is declared in the lens spec under consensus.combination_method.
Mode A — Authority-Weighted Average (default)
For each contribution, compute an authority weight from STANAG 2022:
composite_rating(accuracy, credibility) → [0.0, 1.0]
high accuracy (1) + high credibility (1) → 1.0
worst accuracy (6) + worst credibility (6) → 1/6 ≈ 0.167 (the formula below bottoms out at 1/6, not 0.0)
formula: ((7 - accuracy) / 6 + (7 - credibility) / 6) / 2
This function already exists at correlation.py:92. Reuse it.
Joint confidence is the weighted average:
weight_i = composite_rating(accuracy_i, credibility_i)
joint_confidence = sum(weight_i * pair_score_i) / sum(weight_i)
Conflict indicator measures spread across contributors, normalized:
weighted_mean = joint_confidence
weighted_variance = sum(weight_i * (pair_score_i - weighted_mean)^2) / sum(weight_i)
weighted_stddev = sqrt(weighted_variance)
conflict_indicator = min(1.0, weighted_stddev / 0.5) # 0.5 is the largest stddev possible for scores in [0.0, 1.0]
A conflict_indicator above the lens-declared quorum.conflict_threshold (default 0.3) triggers SPEC-36 IN_CONFLICT state and SPEC-30 machine dissent emission.
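As a worked illustration of Mode A, the sketch below applies the formulas above to two hypothetical contributors. The scores and ratings are made-up values, and the helper names are local to this example (assumptions for illustration, not the parallax API).

```python
import math


def composite_rating(accuracy: int, credibility: int) -> float:
    # Same formula as stated above; re-declared only so the sketch runs standalone.
    return ((7 - accuracy) / 6 + (7 - credibility) / 6) / 2


def weighted_average(contributions):
    """Return (joint_confidence, conflict_indicator).

    `contributions` is a list of (pair_score, accuracy, credibility) tuples.
    """
    scores = [s for s, _, _ in contributions]
    weights = [composite_rating(a, c) for _, a, c in contributions]
    total_w = sum(weights)
    joint = sum(w * s for w, s in zip(weights, scores)) / total_w
    variance = sum(w * (s - joint) ** 2 for w, s in zip(weights, scores)) / total_w
    conflict = min(1.0, math.sqrt(variance) / 0.5)
    return joint, conflict


# Hypothetical inputs: a top-rated contributor (1, 1) scoring 0.9 and a
# mid-rated contributor (4, 4) scoring 0.2. Weights come out as 1.0 and 0.5.
joint, conflict = weighted_average([(0.9, 1, 1), (0.2, 4, 4)])
print(round(joint, 3), round(conflict, 3))  # 0.667 0.66
```

The joint confidence (about 0.667) sits closer to the higher-rated contributor than the unweighted mean (0.55) would, and the conflict indicator (about 0.66) exceeds the default threshold of 0.3, so this pair would be flagged as conflicting.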
Mode B — Dempster-Shafer (optional, conflict-visible)
For partners that require evidence-theory combination (some coalition deployments do):
Each contribution maps to a basic probability assignment (BPA) over the frame {match, no_match}; unknown denotes the full set {match, no_match}, i.e. mass not committed to either outcome:
m_i({match}) = pair_score_i * weight_i
m_i({no_match}) = (1 - pair_score_i) * weight_i
m_i(unknown) = 1 - weight_i # discounted mass goes to unknown
Dempster's rule of combination over two contributions:
K = sum over all (B, C) where B ∩ C = ∅ of m1(B) * m2(C) # conflict mass
m_combined(A) = (1 / (1 - K)) * sum over (B ∩ C = A) of m1(B) * m2(C)
For N > 2, apply pairwise iteratively (Dempster's rule is associative and commutative).
Joint confidence is the combined belief in {match}:
joint_confidence = m_combined({match}) # belief, not plausibility (plausibility would add the unknown mass)
conflict_indicator = K_total # accumulated conflict mass: 1 - product over pairwise steps of (1 - K_step)
When K_total approaches 1, contributors fundamentally disagree — Dempster's rule degenerates and the combiner returns IN_CONFLICT without forcing a smoothed answer.
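To make the combination rule concrete, here is a minimal two-contributor sketch. It assumes that unknown is modelled as the full set {match, no_match}; the weights 0.75 and 0.5 and the scores are illustrative values, and `bpa` and `dempster` are names invented for this example.

```python
MATCH = frozenset({"match"})
NO_MATCH = frozenset({"no_match"})
UNKNOWN = frozenset({"match", "no_match"})  # ignorance = the whole frame


def bpa(pair_score: float, weight: float) -> dict:
    """Discounted BPA: the weight scales committed mass, the rest goes to UNKNOWN."""
    return {
        MATCH: pair_score * weight,
        NO_MATCH: (1 - pair_score) * weight,
        UNKNOWN: 1 - weight,
    }


def dempster(m1: dict, m2: dict):
    """Combine two BPAs with Dempster's rule; return (combined_bpa, K)."""
    combined, K = {}, 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_b * mass_c
            else:
                K += mass_b * mass_c  # disjoint focal sets contribute conflict mass
    if K >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {a: v / (1 - K) for a, v in combined.items()}, K


# Contributor 1: score 0.9, weight 0.75. Contributor 2: score 0.2, weight 0.5.
m, K = dempster(bpa(0.9, 0.75), bpa(0.2, 0.5))
print(round(K, 4), round(m[MATCH], 4))  # 0.2775 0.5952
```

Here the conflict mass is K = 0.675 * 0.4 + 0.075 * 0.1 = 0.2775, and the normalized belief in match is 0.43 / 0.7225, roughly 0.595. The combined masses still sum to 1 after normalization.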
Mode Selection
The lens declares which mode to use:
consensus:
  combination_method: weighted_average   # or: dempster_shafer
quorum:
  required_contributors: 2
  minimum_authority_sum: 1.5
  conflict_threshold: 0.3
  conflict_policy: flag                  # one of: suppress | flag | split
Default: weighted_average. Coalition-specific lenses (FVEY, NATO partner) declare dempster_shafer when partner accreditation requires evidence-theory semantics.
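A lens loader might validate these settings before invoking the combiner. The sketch below is hypothetical: it assumes the lens spec has already been parsed into a plain dict, that consensus and quorum are sibling keys, and that conflict_policy sits under quorum. The function name `validate_lens_consensus` is invented for illustration. Only the allowed values and defaults stated above are enforced.

```python
ALLOWED_METHODS = {"weighted_average", "dempster_shafer"}
ALLOWED_POLICIES = {"suppress", "flag", "split"}


def validate_lens_consensus(lens: dict) -> dict:
    """Return the effective combiner settings, applying the documented defaults."""
    consensus = lens.get("consensus", {})
    quorum = lens.get("quorum", {})

    # Default mode is weighted_average when the lens does not declare one.
    method = consensus.get("combination_method", "weighted_average")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"Unknown combination method: {method}")

    # Default conflict threshold is 0.3; it is compared against a [0, 1] indicator.
    threshold = float(quorum.get("conflict_threshold", 0.3))
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"conflict_threshold out of range: {threshold}")

    policy = quorum.get("conflict_policy")
    if policy is not None and policy not in ALLOWED_POLICIES:
        raise ValueError(f"Unknown conflict policy: {policy}")

    return {
        "combination_method": method,
        "conflict_threshold": threshold,
        "conflict_policy": policy,
    }


settings = validate_lens_consensus(
    {"consensus": {"combination_method": "dempster_shafer"},
     "quorum": {"conflict_threshold": 0.3, "conflict_policy": "flag"}}
)
```

Rejecting unknown values at load time keeps a typo in a lens file from silently falling back to a different combination mode.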
Reproducibility Contract
Same inputs MUST produce byte-identical outputs. Three constraints on implementation:
- Canonical input ordering. Sort contributions by (contributor_id, signer_key_id) before any computation.
- Deterministic floating-point. Use decimal.Decimal with 15 significant digits OR a documented IEEE-754 double-precision operation sequence. Both modes documented; pack chooses one.
- Inputs hash. Compute inputs_hash as SHA-256 over canonical JSON of the sorted contribution list. Recorded in the result and in the evidence Block (per CL-09).
Replay: given an evidence Block referencing this combiner, anyone holding the original contributions can check them against inputs_hash, recompute joint_confidence and conflict_indicator, and verify byte-equality with the recorded output.
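The ordering and hashing constraints can be sketched as follows. This is a simplified illustration that uses plain dicts rather than Contribution objects; the identifiers and field values are made up, and the helper is a stand-in for whatever the combiner module exposes.

```python
import hashlib
import json


def inputs_hash(contributions: list[dict]) -> str:
    """SHA-256 over canonical JSON of the contributions, independent of arrival order."""
    ordered = sorted(
        contributions, key=lambda c: (c["contributor_id"], c["signer_key_id"])
    )
    canonical = json.dumps(
        ordered, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Hypothetical contributions from two federates.
a = {"contributor_id": "fed-a", "signer_key_id": "k1", "pair_score": 0.9,
     "accuracy": 1, "credibility": 2, "classification": "U"}
b = {"contributor_id": "fed-b", "signer_key_id": "k7", "pair_score": 0.2,
     "accuracy": 4, "credibility": 4, "classification": "CUI"}

# Arrival order does not affect the hash, but any change to an input does.
same = inputs_hash([a, b]) == inputs_hash([b, a])
changed = inputs_hash([a, b]) != inputs_hash([a, {**b, "pair_score": 0.21}])
```

Sorting before serialization removes arrival order as a source of variation, and sort_keys with fixed separators pins the key order and whitespace of the JSON text, so the same logical inputs hash to the same digest.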
Reference Implementation
The combiner lives in a new module, parallax/ops/fusion/combination.py; its call site in parallax/ops/fusion/correlation.py replaces the placeholder at lines 473-474. The engineering task is integration and testing, not design from scratch.
# parallax/ops/fusion/combination.py (NEW FILE)
from __future__ import annotations

import hashlib
import json
import math
from dataclasses import dataclass, asdict
from typing import Sequence


@dataclass(frozen=True)
class Contribution:
    contributor_id: str
    pair_score: float
    accuracy: int
    credibility: int
    signer_key_id: str
    classification: str


@dataclass(frozen=True)
class CombinationResult:
    joint_confidence: float
    conflict_indicator: float
    method: str
    per_contributor_weight: dict
    output_classification: str
    inputs_hash: str
def composite_rating(accuracy: int, credibility: int) -> float:
    """STANAG 2022 two-axis rating → [0.0, 1.0].

    accuracy and credibility are 1 (high) – 6 (low).
    Already exists at correlation.py:92; reuse, don't re-implement."""
    if not (1 <= accuracy <= 6) or not (1 <= credibility <= 6):
        raise ValueError(f"Invalid STANAG rating: a={accuracy}, c={credibility}")
    return ((7 - accuracy) / 6 + (7 - credibility) / 6) / 2


def _canonical_sort(contribs: Sequence[Contribution]) -> list[Contribution]:
    # Canonical order per the reproducibility contract.
    return sorted(contribs, key=lambda c: (c.contributor_id, c.signer_key_id))


def _inputs_hash(contribs: Sequence[Contribution]) -> str:
    # Callers pass the already-sorted list so the hash is order-independent.
    canonical = json.dumps(
        [asdict(c) for c in contribs],
        sort_keys=True, separators=(",", ":"),
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def _max_classification(contribs: Sequence[Contribution]) -> str:
    """CL-08 max-of-inputs rule. Order-aware over CL-08 taxonomy."""
    order = ["U", "U_FOUO", "CUI", "PROPRIETARY", "PII", "PHI", "PCI",
             "C", "S", "TS", "TS_SCI", "TS_SAP"]
    levels = [c.classification for c in contribs]
    # Labels outside the taxonomy rank lowest (-1).
    return max(levels, key=lambda level: order.index(level) if level in order else -1)
def combine_weighted_average(contribs: Sequence[Contribution]) -> CombinationResult:
    """Mode A: Authority-Weighted Average (default)."""
    sorted_c = _canonical_sort(contribs)
    weights = {c.contributor_id: composite_rating(c.accuracy, c.credibility) for c in sorted_c}
    total_w = sum(weights.values())
    if total_w == 0:
        raise ValueError("All contributors have zero authority; cannot combine")
    joint = sum(weights[c.contributor_id] * c.pair_score for c in sorted_c) / total_w
    var = sum(
        weights[c.contributor_id] * (c.pair_score - joint) ** 2 for c in sorted_c
    ) / total_w
    stddev = math.sqrt(var)
    conflict = min(1.0, stddev / 0.5)
    return CombinationResult(
        joint_confidence=joint,
        conflict_indicator=conflict,
        method="weighted_average",
        per_contributor_weight=weights,
        output_classification=_max_classification(sorted_c),
        inputs_hash=_inputs_hash(sorted_c),
    )
def combine_dempster_shafer(contribs: Sequence[Contribution]) -> CombinationResult:
    """Mode B: Dempster-Shafer combination over the {match, no_match} frame.

    "unknown" is represented as the full set {match, no_match}."""
    sorted_c = _canonical_sort(contribs)
    weights = {c.contributor_id: composite_rating(c.accuracy, c.credibility) for c in sorted_c}
    # Initial BPA from first contributor
    first = sorted_c[0]
    w0 = weights[first.contributor_id]
    m = {
        frozenset({"match"}): first.pair_score * w0,
        frozenset({"no_match"}): (1 - first.pair_score) * w0,
        frozenset({"match", "no_match"}): 1 - w0,  # unknown
    }
    total_conflict = 0.0
    for c in sorted_c[1:]:
        w = weights[c.contributor_id]
        m_new = {
            frozenset({"match"}): c.pair_score * w,
            frozenset({"no_match"}): (1 - c.pair_score) * w,
            frozenset({"match", "no_match"}): 1 - w,
        }
        # Dempster's combination
        combined = {}
        K = 0.0
        for A, mA in m.items():
            for B, mB in m_new.items():
                inter = A & B
                if not inter:
                    K += mA * mB  # disjoint focal sets: conflict mass
                else:
                    combined[inter] = combined.get(inter, 0.0) + mA * mB
        if K >= 0.999:
            # Total conflict: return IN_CONFLICT marker instead of dividing by ~0
            return CombinationResult(
                joint_confidence=0.0,
                conflict_indicator=1.0,
                method="dempster_shafer",
                per_contributor_weight=weights,
                output_classification=_max_classification(sorted_c),
                inputs_hash=_inputs_hash(sorted_c),
            )
        m = {A: mA / (1 - K) for A, mA in combined.items()}
        # Accumulate conflict so that total_conflict = 1 - prod(1 - K_step)
        total_conflict += K * (1 - total_conflict)
    belief_match = m.get(frozenset({"match"}), 0.0)
    return CombinationResult(
        joint_confidence=belief_match,
        conflict_indicator=min(1.0, total_conflict),
        method="dempster_shafer",
        per_contributor_weight=weights,
        output_classification=_max_classification(sorted_c),
        inputs_hash=_inputs_hash(sorted_c),
    )
def combine(contribs: Sequence[Contribution], method: str = "weighted_average") -> CombinationResult:
    if not contribs:
        raise ValueError("No contributions to combine")
    if method == "weighted_average":
        return combine_weighted_average(contribs)
    elif method == "dempster_shafer":
        return combine_dempster_shafer(contribs)
    else:
        raise ValueError(f"Unknown combination method: {method}")
Migration from min(pair_scores)
In parallax/ops/fusion/correlation.py:
# OLD (lines 473-474):
all_scores = [r.pair_score for r in contributing_tuple if r.pair_score > 0]
confidence = min(all_scores) if all_scores else cluster.aggregate_confidence

# NEW:
from .combination import Contribution, combine

contribs = [
    Contribution(
        contributor_id=r.contributor_id,
        pair_score=r.pair_score,
        accuracy=r.accuracy,
        credibility=r.credibility,
        signer_key_id=r.signer_key_id,
        classification=r.classification,
    )
    for r in contributing_tuple if r.pair_score > 0
]
if contribs:
    method = lens_spec.consensus.combination_method  # "weighted_average" | "dempster_shafer"
    result = combine(contribs, method=method)
    confidence = result.joint_confidence
    conflict_indicator = result.conflict_indicator  # NEW: drives SPEC-36 session state
    inputs_hash = result.inputs_hash                # NEW: for evidence block
else:
    confidence = cluster.aggregate_confidence
    conflict_indicator = 0.0
    inputs_hash = None
The conflict_indicator feeds the SPEC-36 session state machine and SPEC-30 dissent emission; inputs_hash is recorded in the CL-09 evidence Block.
Invariants
- Inputs ordering is canonical before any math. Reproducibility depends on it.
- Floating-point determinism. Either Decimal mode or documented IEEE-754 sequence. No arbitrary math.fsum substitution.
- composite_rating() from correlation.py:92 is the single source of authority weighting. Do not re-implement.
- Output classification follows CL-08 max-of-inputs by default. Lens may declare downgrade per CL-08 invariant.
- The combiner is pure. Same inputs → same output, no side effects, no I/O.
- Conflict above lens-declared conflict_threshold triggers IN_CONFLICT. Combiner does not silently smooth conflict in flag or split policies.
- DS mode degenerates cleanly. If K ≥ 0.999, return joint_confidence=0.0, conflict_indicator=1.0 rather than divide-by-zero.
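For the floating-point determinism invariant, a Decimal-mode weighted mean might look like the sketch below. It assumes inputs arrive as decimal strings (so no binary-float noise enters the computation) and uses a local context with 15 significant digits; the function name is hypothetical, and this covers only the joint-confidence step.

```python
from decimal import Decimal, localcontext


def weighted_mean_decimal(scores: list[str], weights: list[str]) -> Decimal:
    """Weighted mean computed with 15 significant digits, in input order."""
    with localcontext() as ctx:
        ctx.prec = 15  # applies to every arithmetic operation in this block
        s = [Decimal(x) for x in scores]
        w = [Decimal(x) for x in weights]
        total_w = sum(w, Decimal(0))
        weighted_sum = sum((wi * si for wi, si in zip(w, s)), Decimal(0))
        return weighted_sum / total_w


# Scores 0.9 and 0.2 with weights 1 and 0.5: (0.9 + 0.10) / 1.5
result = weighted_mean_decimal(["0.9", "0.2"], ["1", "0.5"])
print(result)  # 0.666666666666667
```

Because the precision and rounding mode are fixed by the context rather than by the platform's floating-point behaviour, the serialized result is the same string wherever the computation is replayed.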
Test Expectations
- Reproducibility test: same contribs produces same result 100/100 runs (ideally byte-identical via JSON serialization).
- Authority weighting test: two contributors, different pair_score, different STANAG ratings → joint confidence lies closer to the higher-rated contributor's score than the unweighted mean does. (With identical pair_score values the weighted average equals that score regardless of ratings, so the test needs differing scores.)
- Conflict detection test: two equally rated contributors, scores 0.9 and 0.2 → conflict_indicator > 0.5 (weighted_average mode yields 0.7).
- DS degeneration test: two contributors, totally opposing BPAs → returns IN_CONFLICT without crash.
- Classification propagation test: mixed U_FOUO and CUI inputs → output CUI.
- Migration parity test: for cases where min(pair_scores) happens to equal the weighted average (e.g., all equal scores), new combiner in weighted_average mode produces same numeric result.
- Reference fixture test: tests/fixtures/combiner/ contains 50 hand-checked input/output pairs; combiner reproduces all.
Cross-Reference
- SPEC-36: how conflict_indicator drives session state transitions
- SPEC-30: how IN_CONFLICT triggers machine dissent emission
- SPEC-31: where this combiner is invoked (Consensus Service)
- CL-06: how the result becomes a lens emission
- CL-09: how the result becomes an evidence Block
Depends on: component.parallax.correlation-persistence, component.parallax.scoring-engine
Realizes: product.fusion
Required by: component.parallax.consensus-mission-service, component.parallax.consensus-session, component.parallax.wire-message-families