Tracking Integration & Advanced Matching
Status & scope
- Stage: DRAFT — Architecture Decision
- Milestone: M4 (Phase 5 — Tracking & Advanced Matching)
- Primitives covered: MF-03, MF-05, MF-12, MA-08, MA-11, MA-12, CF-07, CF-08
Problem
The tracking primitives (TR-01 through TR-07) are already implemented as standalone algorithms in parallax/ops/fusion/tracking.py — staypoint detection, trip extraction, kinematic gating, Kalman smoothing, and ST-DBSCAN all work. They have 43 tests passing.
But they're disconnected from the fusion pipeline. The fusion engine doesn't know how to:
- Extract tracking features from raw records. A GPS track is not a flat record; it is a time-ordered sequence of observations. The existing extract_features() works on tabular rows, not sequences.
- Use tracking output as matching evidence. Staypoints, trips, and motion profiles are powerful correlation signals. Two records that share staypoints at the same times very likely refer to the same entity. The scoring engine has no metrics for this.
- Gate impossible matches. Kinematic gating (TR-06) can eliminate candidate pairs before scoring: if two records were observed 500 km apart within 10 minutes of each other, they cannot be the same entity regardless of how similar their names are.
- Disambiguate with LLM assistance. For borderline cases (0.60–0.80 confidence), the current pipeline either matches or doesn't. Phase 5 adds LLM confidence adjustment: feeding context to an LLM to nudge borderline scores up or down.
Decision
component.parallax.tracking-integration bridges the existing tracking primitives into the fusion matching pipeline through three integration points: tracking-aware feature extraction, three new tracking metrics, and three advanced matching algorithms.
The tracking module stays where it is — tracking.py is a pure algorithm library. This spec adds:
- A track extractor that converts raw GPS/sensor sequences into trackable features (staypoints, motion profiles, spatial footprints)
- Three tracking-aware metrics (MF-03, MF-05, MF-12) that score entity pairs based on movement patterns
- A kinematic gate (MA-07 extension) that pre-filters impossible candidates before scoring
- An LLM confidence adjuster (MA-12) that uses lens-specified context to disambiguate borderline pairs
- Connected components with edge pruning (MA-11) that improves clustering quality
- ML ranking (MA-08) as a supervised scoring alternative for production deployments
Architecture
Where Tracking Integrates into the Pipeline
               CSV/records (tabular)                      GPS/sensor tracks (sequential)
                         │                                               │
                         ▼                                               ▼
    ┌─────────────────────────────────────────┐    ┌───────────────────────────────────────────┐
    │ extract_features                        │    │ extract_track_features                    │ ← NEW
    │ (component.parallax.feature-extraction) │    │ (component.parallax.tracking-integration) │
    └────────────────────┬────────────────────┘    └─────────────────────┬─────────────────────┘
                         │                                               │
                         │ FeatureVector                                 │ TrackFeatureVector
                         │ (flat fields)                                 │ (staypoints, motion)
                         └───────────────────────┬───────────────────────┘
                                                 │
                                        ┌────────▼──────────┐
                                        │ KINEMATIC GATE    │ ← NEW (pre-filter)
                                        │ reject impossible │
                                        │ candidate pairs   │
                                        └────────┬──────────┘
                                                 │
                                        ┌────────▼──────────┐
                                        │ SCORING ENGINE    │
                                        │ existing metrics  │
                                        │ + 3 tracking      │ ← NEW metrics
                                        │ metrics           │
                                        └────────┬──────────┘
                                                 │
                                        ┌────────▼──────────┐
                                        │ LLM ADJUSTMENT    │ ← NEW (post-score)
                                        │ borderline pairs  │
                                        │ get LLM review    │
                                        └────────┬──────────┘
                                                 │
                                        ┌────────▼──────────┐
                                        │ CLUSTERING        │
                                        │ union-find        │
                                        │ + edge pruning    │ ← NEW algorithm
                                        └────────┬──────────┘
                                                 │
                                        ┌────────▼──────────┐
                                        │ CORRELATION       │ (component.parallax.correlation-persistence)
                                        └───────────────────┘
Track Feature Extraction
Not all lenses involve tracking. A VRS customer lens works on tabular records. A maritime surveillance lens works on AIS tracks. The lens scope.object_types determines which extraction path runs.
    # Lens with tracking:
    scope:
      object_types: ["vessel_track", "ais_observation"]
      time_window: "30d"
      federation: "all_participants"
    # This triggers track extraction because object_types
    # includes a track-type entity (configured per domain)
The track extractor produces a TrackFeatureVector — an extension of FeatureVector that includes spatial-temporal summaries:
    from dataclasses import dataclass
    from typing import Any

    @dataclass(frozen=True)
    class TrackFeatureVector:
        """Feature vector enriched with tracking-derived features."""
        # Standard fields (from FeatureVector)
        federate_id: str
        lens_id: str
        lens_version: str
        entity_id: str
        features: dict[str, Any]
        extraction_ts: str
        blocking_keys: tuple[str, ...]
        # Tracking-derived fields
        staypoints: tuple[StaypointSummary, ...] = ()
        motion_profile: MotionProfile | None = None
        spatial_footprint: SpatialFootprint | None = None
        track_quality: TrackQuality | None = None
StaypointSummary
Compressed representation of staypoints — not the raw points (those stay at the federate), but enough for matching:
    @dataclass(frozen=True)
    class StaypointSummary:
        """Lightweight staypoint representation for cross-federate matching.

        Contains geohash-level location (not exact coordinates) to allow
        matching without exposing precise positions.
        """
        geohash: str          # Geohash at configurable precision (default: 6 = ±0.6km)
        arrival_hour: int     # Hour of day (0-23), used for temporal bucketing
        departure_hour: int   # Hour of day (0-23)
        duration_bucket: str  # "brief" (<30min), "medium" (30min-2h), "extended" (>2h)
        day_of_week: int      # 0=Monday, 6=Sunday
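As an illustration of how a raw staypoint might be reduced to this summary, the sketch below encodes the location with the standard geohash algorithm and buckets the dwell time using the boundaries from the field comments. The helper names (`encode_geohash`, `duration_bucket`, `summarize_staypoint`) are hypothetical and not part of tracking.py, and the sketch assumes timestamps are already normalized to one timezone.

```python
from dataclasses import dataclass
from datetime import datetime

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def encode_geohash(lat: float, lon: float, precision: int = 6) -> str:
    """Standard geohash: interleave longitude/latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars: list[str] = []
    bits, value, use_lon = 0, 0, True  # the first bit is a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value, lon_lo = value * 2 + 1, mid
            else:
                value, lon_hi = value * 2, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value, lat_lo = value * 2 + 1, mid
            else:
                value, lat_hi = value * 2, mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def duration_bucket(duration_s: float) -> str:
    """Bucket boundaries taken from the StaypointSummary field comment."""
    if duration_s < 30 * 60:
        return "brief"
    if duration_s <= 2 * 3600:
        return "medium"
    return "extended"


@dataclass(frozen=True)
class StaypointSummary:
    geohash: str
    arrival_hour: int
    departure_hour: int
    duration_bucket: str
    day_of_week: int


def summarize_staypoint(
    lat: float, lon: float, arrival: datetime, departure: datetime, precision: int = 6
) -> StaypointSummary:
    """Hypothetical helper: only coarse location and time buckets leave the federate."""
    return StaypointSummary(
        geohash=encode_geohash(lat, lon, precision),
        arrival_hour=arrival.hour,
        departure_hour=departure.hour,
        duration_bucket=duration_bucket((departure - arrival).total_seconds()),
        day_of_week=arrival.weekday(),  # Monday == 0, matching the field comment
    )
```

The exact coordinates never appear in the result, which is the property the summary relies on for cross-federate exchange.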
MotionProfile
Statistical summary of movement patterns:
    @dataclass(frozen=True)
    class MotionProfile:
        """Aggregate motion statistics for an entity's track."""
        avg_speed_ms: float
        max_speed_ms: float
        total_distance_m: float
        total_duration_s: float
        staypoint_count: int
        trip_count: int
        dominant_heading: float | None  # Most common direction of travel (degrees)
        speed_variance: float           # How variable is the speed?
        active_hours: tuple[int, ...]   # Hours of day with movement (24h format)
SpatialFootprint
The geographic area an entity occupies:
    @dataclass(frozen=True)
    class SpatialFootprint:
        """Geographic coverage of an entity's observations."""
        bounding_geohashes: tuple[str, ...]  # Set of geohashes visited
        home_geohash: str | None             # Most frequent staypoint location
        range_km: float                      # Max distance between any two observations
        geohash_precision: int = 5           # Precision level used
Three Tracking Metrics
MF-03: Uncertainty Region Overlap
Two entities whose position uncertainty regions overlap at the same time could be co-located. This metric scores based on spatial and temporal overlap of observations.
    def uncertainty_overlap(
        footprint_a: SpatialFootprint,
        footprint_b: SpatialFootprint,
        temporal_window_s: float = 3600.0,
    ) -> float:
        """MF-03: Score based on overlap of spatial footprints.

        Uses geohash intersection as a proxy for spatial overlap.
        Returns ratio of shared geohashes to total unique geohashes
        (the Jaccard index of the two geohash sets).
        """
        # NOTE: temporal_window_s is not used by this body. SpatialFootprint
        # carries no timestamps, so only spatial overlap is scored here.
        set_a = set(footprint_a.bounding_geohashes)
        set_b = set(footprint_b.bounding_geohashes)
        if not set_a or not set_b:
            return 0.0
        intersection = set_a & set_b
        union = set_a | set_b
        return len(intersection) / len(union)
MF-05: Motion Consistency
Two entities with similar motion profiles (speed patterns, activity hours, movement range) are more likely to be the same entity.
    def motion_consistency(
        profile_a: MotionProfile,
        profile_b: MotionProfile,
    ) -> float:
        """MF-05: Score based on similarity of motion profiles.

        Compares:
        - Speed distributions (avg, max, variance)
        - Activity hour overlap
        - Range similarity
        - Staypoint/trip count similarity

        Returns weighted average of component similarities [0.0-1.0].
        """
Component weights (configurable via lens):
| Component | Weight | Rationale |
|---|---|---|
| Speed similarity | 0.25 | Entities of the same type move at similar speeds |
| Active hours overlap | 0.25 | Same entity is active at the same times |
| Range similarity | 0.20 | Similar operating areas |
| Staypoint count ratio | 0.15 | Similar behavior patterns |
| Speed variance similarity | 0.15 | Consistent movement style |
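One way the weighted average could be computed is sketched below. The component formulas are assumptions, not the specified implementation: numeric fields are compared with a min/max ratio, active hours with Jaccard overlap, and `total_distance_m` stands in for "range" because MotionProfile has no dedicated range field. `ratio_similarity`, `jaccard`, and `DEFAULT_WEIGHTS` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MotionProfile:
    avg_speed_ms: float
    max_speed_ms: float
    total_distance_m: float
    total_duration_s: float
    staypoint_count: int
    trip_count: int
    dominant_heading: float | None
    speed_variance: float
    active_hours: tuple[int, ...]


# Weights copied from the table above; a lens could override them.
DEFAULT_WEIGHTS = {
    "speed": 0.25,
    "active_hours": 0.25,
    "range": 0.20,
    "staypoints": 0.15,
    "speed_variance": 0.15,
}


def ratio_similarity(a: float, b: float) -> float:
    """min/max ratio in [0, 1]; two zeros count as identical."""
    hi = max(a, b)
    return 1.0 if hi == 0 else min(a, b) / hi


def jaccard(a: tuple[int, ...], b: tuple[int, ...]) -> float:
    """Set overlap; returns 0.0 when neither side has any entries."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def motion_consistency(
    a: MotionProfile, b: MotionProfile, weights: dict[str, float] = DEFAULT_WEIGHTS
) -> float:
    components = {
        # Assumption: average the avg-speed and max-speed ratios.
        "speed": (
            ratio_similarity(a.avg_speed_ms, b.avg_speed_ms)
            + ratio_similarity(a.max_speed_ms, b.max_speed_ms)
        ) / 2,
        "active_hours": jaccard(a.active_hours, b.active_hours),
        # Assumption: total distance used as a proxy for operating range.
        "range": ratio_similarity(a.total_distance_m, b.total_distance_m),
        "staypoints": ratio_similarity(a.staypoint_count, b.staypoint_count),
        "speed_variance": ratio_similarity(a.speed_variance, b.speed_variance),
    }
    return sum(weights[name] * components[name] for name in weights)
```

Because the weights sum to 1.0 and every component lies in [0, 1], the result stays in the [0.0-1.0] range the docstring promises.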
MF-12: Staypoint Similarity
Two entities that share staypoints (same place, overlapping times) are strong candidates for being the same entity. This is the most powerful tracking metric.
    def staypoint_similarity(
        staypoints_a: tuple[StaypointSummary, ...],
        staypoints_b: tuple[StaypointSummary, ...],
    ) -> float:
        """MF-12: Score based on shared staypoint patterns.

        A shared staypoint means: same geohash + overlapping time windows.

        Scoring:
        - For each staypoint in A, check if B has a matching staypoint
        - Match = same geohash + overlapping hours + same day_of_week (or close)
        - Score = (2 * matched_count) / (count_a + count_b) [Dice coefficient]

        Returns [0.0-1.0]. High scores (>0.5) are strong correlation signals.
        """
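The scoring rule in the docstring can be sketched as follows. This is a minimal illustration under stated assumptions: matching is greedy and one-to-one (so the Dice score cannot exceed 1.0), day_of_week must match exactly (the "or close" relaxation is left out), and hour windows that wrap past midnight are handled. `_hours` and `_matches` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaypointSummary:
    geohash: str
    arrival_hour: int
    departure_hour: int
    duration_bucket: str
    day_of_week: int


def _hours(sp: StaypointSummary) -> set[int]:
    """Hours of day covered by a stay, wrapping past midnight if needed."""
    if sp.departure_hour >= sp.arrival_hour:
        return set(range(sp.arrival_hour, sp.departure_hour + 1))
    return set(range(sp.arrival_hour, 24)) | set(range(0, sp.departure_hour + 1))


def _matches(x: StaypointSummary, y: StaypointSummary) -> bool:
    return (
        x.geohash == y.geohash
        and x.day_of_week == y.day_of_week
        and bool(_hours(x) & _hours(y))
    )


def staypoint_similarity(
    staypoints_a: tuple[StaypointSummary, ...],
    staypoints_b: tuple[StaypointSummary, ...],
) -> float:
    if not staypoints_a or not staypoints_b:
        return 0.0
    unused = list(staypoints_b)  # each staypoint in B may be matched once
    matched = 0
    for sp in staypoints_a:
        for i, candidate in enumerate(unused):
            if _matches(sp, candidate):
                matched += 1
                del unused[i]
                break
    # Dice coefficient over the two staypoint collections.
    return 2 * matched / (len(staypoints_a) + len(staypoints_b))
```

Greedy matching is not guaranteed to find the maximum number of pairs when staypoints are ambiguous; a bipartite matching could replace it if that matters in practice.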
How Tracking Metrics Integrate with Existing Scoring
The lens identity_fusion.match_function can reference tracking metrics just like any other metric:
    identity_fusion:
      match_function:
        # Standard fields
        full_name:
          weight: 0.25
          metric: jaro_winkler
        date_of_birth:
          weight: 0.15
          metric: exact
        # Tracking fields (only present if lens has tracking entities)
        staypoint_pattern:
          weight: 0.30
          metric: staypoint_similarity
        motion_profile:
          weight: 0.15
          metric: motion_consistency
        spatial_coverage:
          weight: 0.15
          metric: uncertainty_overlap
When tracking metrics are defined in the lens but the entity has no tracking data, the scorer redistributes weight to non-tracking metrics (existing null-field handling from component.parallax.scoring-engine).
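To make the redistribution concrete, here is a minimal sketch that assumes proportional renormalization over the fields that actually produced a score. The real null-field handling lives in component.parallax.scoring-engine and may differ; `weighted_score` is a hypothetical name.

```python
from __future__ import annotations


def weighted_score(
    field_scores: dict[str, float | None], weights: dict[str, float]
) -> float:
    """Weighted average over fields with a score; None means 'no data'."""
    present = {f: w for f, w in weights.items() if field_scores.get(f) is not None}
    total_weight = sum(present.values())
    if total_weight == 0:
        return 0.0
    return sum(w * field_scores[f] for f, w in present.items()) / total_weight


# Weights from the lens example above.
LENS_WEIGHTS = {
    "full_name": 0.25,
    "date_of_birth": 0.15,
    "staypoint_pattern": 0.30,
    "motion_profile": 0.15,
    "spatial_coverage": 0.15,
}

# An entity pair with no tracking data: only 0.40 of the weight is usable,
# so the two tabular fields are rescaled to carry the whole score.
no_tracking = {
    "full_name": 0.9,
    "date_of_birth": 1.0,
    "staypoint_pattern": None,
    "motion_profile": None,
    "spatial_coverage": None,
}
score = weighted_score(no_tracking, LENS_WEIGHTS)  # (0.225 + 0.15) / 0.40
```

With these numbers the pair scores 0.9375 rather than 0.375, which is what it would get if the missing tracking fields were counted as zeros.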
Kinematic Gate (Pre-Filter)
MA-07 Extension: Travel Feasibility as Candidate Filter
Currently, kinematic gating lives in tracking.py as a point-to-point check. component.parallax.tracking-integration promotes it to a candidate-pair pre-filter in the blocker:
    def kinematic_filter(
        candidates: list[tuple[str, str]],
        track_features: dict[str, TrackFeatureVector],
        max_speed_kmh: float = 120.0,
        time_window_s: float = 3600.0,
    ) -> list[tuple[str, str]]:
        """Filter candidate pairs that are physically impossible.

        For each candidate pair (a, b):
        1. Get the most recent observations for both entities
        2. Check if travel between them is feasible given max_speed
        3. If not feasible: remove from candidates (no need to score)

        This runs AFTER blocking but BEFORE scoring. It is cheap and
        eliminates obvious non-matches that happen to share blocking keys.

        Returns filtered candidate list (impossible pairs removed).
        """
Configuration (CF-03)
Max speeds are configurable per entity type in the lens:
    # In lens configuration:
    tracking:
      travel_feasibility:
        person: 120     # km/h (driving)
        vessel: 60      # km/h (fast vessel)
        aircraft: 900   # km/h (commercial)
        vehicle: 200    # km/h (highway)
        default: 120    # km/h (fallback)
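The feasibility check and the per-type speed lookup might combine as in the sketch below. It is illustrative only: `LastObservation` is a hypothetical structure (TrackFeatureVector as specified carries no raw last position), the great-circle distance uses the haversine formula, the faster of the two entity types is used so that a pair is only rejected when it is infeasible for both, and `tolerance_km` is an assumed allowance for position noise. The `time_window_s` parameter from the signature above is not modeled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LastObservation:  # hypothetical: most recent fix for an entity
    lat: float
    lon: float
    ts: float  # epoch seconds
    entity_type: str


# Values from the lens configuration above (km/h).
TRAVEL_FEASIBILITY_KMH = {
    "person": 120, "vessel": 60, "aircraft": 900, "vehicle": 200, "default": 120,
}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def is_feasible(a: LastObservation, b: LastObservation,
                speeds: dict[str, float] = TRAVEL_FEASIBILITY_KMH,
                tolerance_km: float = 1.0) -> bool:
    max_kmh = max(
        speeds.get(a.entity_type, speeds["default"]),
        speeds.get(b.entity_type, speeds["default"]),
    )
    elapsed_h = abs(a.ts - b.ts) / 3600.0
    distance = haversine_km(a.lat, a.lon, b.lat, b.lon)
    return distance <= max_kmh * elapsed_h + tolerance_km


def kinematic_filter(candidates: list[tuple[str, str]],
                     last_obs: dict[str, LastObservation]) -> list[tuple[str, str]]:
    kept = []
    for a, b in candidates:
        oa, ob = last_obs.get(a), last_obs.get(b)
        # Pairs without tracking data pass through untouched.
        if oa is None or ob is None or is_feasible(oa, ob):
            kept.append((a, b))
    return kept
```

For the example from the problem statement, two person records roughly 500 km apart and 10 minutes apart allow at most about 20 km of travel, so the pair is dropped before scoring.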
LLM Confidence Adjustment (MA-12)
Problem
Borderline pairs (confidence 0.60–0.80) are where most false positives and false negatives occur. The current pipeline uses a hard threshold — above it matches, below it doesn't. This misses context that could resolve ambiguity.
Design
The LLM adjuster is a post-scoring step that reviews borderline pairs. It receives lens-specified context fields (never raw data) and returns an adjusted confidence score.
    @dataclass(frozen=True)
    class LLMDisambiguationRequest:
        """Context for LLM to evaluate a borderline match pair."""
        pair_id: str
        original_confidence: float
        per_field_scores: dict[str, float]
        context_fields_a: dict[str, str]  # Lens-specified context only
        context_fields_b: dict[str, str]  # Lens-specified context only
        lens_id: str
        entity_type: str
        disambiguation_prompt: str        # From lens config

    @dataclass(frozen=True)
    class LLMDisambiguationResult:
        """LLM's adjustment to pair confidence."""
        pair_id: str
        adjusted_confidence: float
        reasoning: str
        action: str  # "INCREASE" | "DECREASE" | "NO_CHANGE"
        confidence_delta: float

    def llm_disambiguate(
        request: LLMDisambiguationRequest,
        llm_adapter: LLMAdapter,
    ) -> LLMDisambiguationResult:
        """MA-12: Use LLM to adjust confidence of borderline pairs.

        The LLM receives:
        - Per-field similarity scores
        - Context fields specified in the lens (never raw record data)
        - A disambiguation prompt from the lens configuration
        - The entity type and current confidence

        The LLM returns:
        - Adjusted confidence (clamped to ±0.15 of original)
        - Reasoning text (stored as evidence)
        - Action taken

        Constraints:
        - LLM can only adjust confidence by ±0.15 (prevents over-reliance)
        - LLM never sees raw data, only lens-specified context fields
        - LLM decisions are logged as evidence blocks
        - This step can be disabled per lens (enabled: false)
        """
LLM Adapter Interface
The actual LLM call is abstracted behind an adapter — parallax doesn't import LLM infrastructure:
    class LLMAdapter:
        """Interface for LLM calls. Implemented by titan/cortex, not parallax."""

        def complete(self, prompt: str, system: str = "") -> str:
            """Send prompt to LLM, return response text."""
            raise NotImplementedError

        def is_available(self) -> bool:
            """Check if LLM is accessible."""
            return False
For standalone testing, a MockLLMAdapter returns deterministic responses.
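A deterministic mock and the ±0.15 clamp could look roughly like this. Everything beyond the `LLMAdapter` interface is an assumption for illustration: the canned-response lookup by prompt substring, the `apply_action` helper, and its `step` parameter are hypothetical and not taken from the spec.

```python
from __future__ import annotations


class LLMAdapter:
    """Interface as given in the spec."""

    def complete(self, prompt: str, system: str = "") -> str:
        raise NotImplementedError

    def is_available(self) -> bool:
        return False


class MockLLMAdapter(LLMAdapter):
    """Hypothetical test double: returns a canned action per prompt substring."""

    def __init__(self, canned: dict[str, str] | None = None, default: str = "NO_CHANGE"):
        self._canned = canned or {}
        self._default = default

    def complete(self, prompt: str, system: str = "") -> str:
        for needle, action in self._canned.items():
            if needle in prompt:
                return action
        return self._default

    def is_available(self) -> bool:
        return True


def apply_action(original: float, action: str,
                 step: float = 0.10, max_adjustment: float = 0.15) -> float:
    """Translate an action into a confidence, clamped to ±max_adjustment and [0, 1]."""
    delta = {"INCREASE": step, "DECREASE": -step}.get(action, 0.0)
    delta = max(-max_adjustment, min(max_adjustment, delta))
    return max(0.0, min(1.0, original + delta))


adapter = MockLLMAdapter({"Jon Smith": "INCREASE"})
action = adapter.complete("Compare 'Jon Smith' with 'John Smith'")
adjusted = apply_action(0.70, action, step=0.50)  # the oversized step is clamped to +0.15
```

Clamping the delta rather than the final score keeps the limit independent of where in the borderline range the pair started.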
Lens Configuration for LLM Disambiguation
    identity_fusion:
      llm_disambiguation:
        enabled: true
        model: "claude-sonnet-4-20250514"
        borderline_range: [0.60, 0.80]  # Only review pairs in this range
        max_adjustment: 0.15            # Max confidence delta
        context_fields:                 # What the LLM sees (lens-controlled)
          - full_name
          - city
          - date_of_birth_year
        prompt: |
          You are evaluating whether two records refer to the same person.
          Consider name variations, common addresses, and age proximity.
          Return INCREASE if likely same person, DECREASE if likely different.
Connected Components with Edge Pruning (MA-11)
Problem
The current union-find clustering (MA-10 in algorithms/clustering.py) uses transitive closure — if A matches B and B matches C, then A-B-C are one cluster even if A and C score poorly directly. This can cause "cluster drift" where weakly connected entities get grouped together.
Design
Edge pruning validates transitive connections:
    def cluster_with_pruning(
        matches: list[tuple[str, str, float]],
        min_intra_cluster_confidence: float = 0.50,
    ) -> list[EntityCluster]:
        """MA-11: Cluster matches with edge pruning to prevent drift.

        Steps:
        1. Build initial clusters using union-find (existing MA-10)
        2. For each cluster with > 2 members:
           a. Build the pairwise confidence graph
           b. Prune edges below min_intra_cluster_confidence
           c. Re-cluster the pruned graph
           d. If cluster splits: create separate EntityClusters
        3. Return refined clusters

        This prevents weak transitive links from merging distinct entities.
        """
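The steps in the docstring can be sketched with a small union-find. This version returns plain sets of entity ids instead of EntityCluster objects, since that type is defined elsewhere, and `_components` is a hypothetical helper rather than the existing MA-10 code.

```python
from collections import defaultdict


def _components(nodes: set[str], edges: list[tuple[str, str]]) -> list[set[str]]:
    """Connected components via union-find with path halving."""
    parent = {n: n for n in nodes}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups: dict[str, set[str]] = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return list(groups.values())


def cluster_with_pruning(
    matches: list[tuple[str, str, float]],
    min_intra_cluster_confidence: float = 0.50,
) -> list[set[str]]:
    nodes = {n for a, b, _ in matches for n in (a, b)}
    # Step 1: initial clusters by transitive closure over all matches.
    initial = _components(nodes, [(a, b) for a, b, _ in matches])

    refined: list[set[str]] = []
    for cluster in initial:
        if len(cluster) <= 2:
            refined.append(cluster)  # step 2 only applies to larger clusters
            continue
        # Steps 2a-2c: keep only strong edges inside this cluster, then re-cluster.
        strong = [
            (a, b) for a, b, conf in matches
            if a in cluster and b in cluster and conf >= min_intra_cluster_confidence
        ]
        refined.extend(_components(cluster, strong))  # step 2d: splits become clusters
    return refined
```

Given the matches A-B at 0.9 and B-C at 0.4, plain union-find yields one cluster {A, B, C}; after pruning, the weak B-C edge is removed and C becomes its own cluster.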
ML Ranking (MA-08)
Problem
Weighted aggregate scoring uses fixed weights from the lens. For production deployments with labeled data (confirmed/rejected pairs from component.parallax.correlation-persistence attestations), a learned model can outperform hand-tuned weights.
Design
ML ranking is optional and requires training data:
    @dataclass(frozen=True)
    class MLRankingModel:
        """Trained model for pair scoring. Created from labeled data."""
        model_id: str
        lens_id: str
        feature_names: tuple[str, ...]
        weights: tuple[float, ...]     # Learned weights per feature
        intercept: float
        training_pairs: int            # How many pairs were used
        precision_at_threshold: float  # Performance metric
        recall_at_threshold: float
        threshold: float
        trained_at: str

    def train_ml_ranker(
        labeled_pairs: list[tuple[dict[str, float], bool]],
        feature_names: list[str],
    ) -> MLRankingModel:
        """MA-08: Train a ranking model from labeled pairs.

        Input: list of (per_field_scores, is_match) pairs from attestation data.
        Output: MLRankingModel with learned weights.

        Uses simple logistic regression first (no external ML library).
        Can be upgraded to gradient boosting when sklearn is available.

        Minimum training data: 100 labeled pairs (50+ positive, 50+ negative).
        Below this threshold, falls back to weighted aggregate scoring.
        """

    def ml_score_pair(
        model: MLRankingModel,
        per_field_scores: dict[str, float],
    ) -> float:
        """Score a pair using the trained ML model.

        Returns confidence [0.0-1.0] from model prediction.
        Falls back to weighted aggregate if model is unavailable.
        """
Training Data from Attestations
The key insight: component.parallax.correlation-persistence's attestation flow generates labeled training data automatically. Every CONFIRMED correlation is a positive example. Every REJECTED correlation is a negative example. Over time, enough attestations accumulate to train a lens-specific scoring model.
    Attestation flow (component.parallax.correlation-persistence)
        ↓
    Labeled pairs
        ↓
    ML training (component.parallax.tracking-integration)
        ↓
    MLRankingModel
        ↓
    Production scoring
This is a flywheel: better scoring → more accurate proposals → faster attestation → more training data → even better scoring.
Configuration Primitives
CF-07: Passport TTL (Time-To-Live)
A "passport" is a CorrelationRecord that has been human-attested (CONFIRMED). The TTL defines how long it remains valid without re-verification.
# In lens evidence_rules:
evidence_rules:
confidence_decay:
half_life_days: 90 # Unverified correlations
confirmed_half_life_days: 180 # Attested correlations (2x)
passport_ttl_days: 365 # CF-07: Max time between re-attestations
When a CONFIRMED correlation exceeds passport_ttl_days since last attestation, it transitions to STALE. This is implemented via component.parallax.correlation-persistence's decay mechanism — CF-07 just adds the TTL check as an additional STALE trigger.
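The TTL check itself is small. The sketch below pairs it with a half-life decay for context; the exponential form is the usual reading of "half-life" and is an assumption here, as are the function names. The actual decay mechanism is defined by component.parallax.correlation-persistence.

```python
from datetime import datetime, timedelta, timezone


def is_passport_stale(
    last_attested_at: str, now: datetime, passport_ttl_days: int = 365
) -> bool:
    """CF-07: True once more than passport_ttl_days have passed since attestation."""
    attested = datetime.fromisoformat(last_attested_at)
    return now - attested > timedelta(days=passport_ttl_days)


def decayed_confidence(confidence: float, age_days: float, half_life_days: float) -> float:
    """Assumed exponential decay: confidence halves every half_life_days."""
    return confidence * 0.5 ** (age_days / half_life_days)


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
stale = is_passport_stale("2024-01-01T00:00:00+00:00", now)   # attested 517 days ago
fresh = is_passport_stale("2025-01-01T00:00:00+00:00", now)   # attested 151 days ago
```

Timestamps are written with an explicit UTC offset so that `datetime.fromisoformat` returns timezone-aware values that can be subtracted from `now`.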
CF-08: Consent Linkage & Revocation
In regulated environments, entity correlation may require consent. CF-08 adds consent tracking to Correlation Records:
    @dataclass(frozen=True)
    class ConsentRecord:
        """Tracks consent for cross-federate correlation."""
        consent_id: str
        granted_by: str            # Entity or authorized representative
        granted_at: str            # ISO8601
        scope: str                 # "full" | "limited" | "anonymized"
        lens_ids: tuple[str, ...]  # Which lenses this consent covers
        expires_at: str | None     # Optional expiry
        revoked: bool = False
        revoked_at: str | None = None
When consent is revoked:
- All Correlation Records linked to this consent transition to INVALIDATED
- The invalidation is a lineage entry (append-only, invariant 2)
- Affected records return to UNRESOLVED status
- The correlation data is NOT deleted (invariant 2) — it is marked as no longer actionable
This is enforced at the titan layer (consent checks on write-back), not in parallax (pure compute has no consent awareness).
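Because ConsentRecord is frozen, revocation naturally produces a new record instead of mutating the old one, which fits the append-only invariant. The sketch below shows that pattern together with a check the titan layer might run before write-back; `revoke` and `is_consent_active` are hypothetical helpers, not specified functions.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentRecord:
    consent_id: str
    granted_by: str
    granted_at: str
    scope: str
    lens_ids: tuple[str, ...]
    expires_at: str | None
    revoked: bool = False
    revoked_at: str | None = None


def revoke(consent: ConsentRecord, revoked_at: str) -> ConsentRecord:
    """Return a new revoked record; the original stays untouched for lineage."""
    return replace(consent, revoked=True, revoked_at=revoked_at)


def is_consent_active(consent: ConsentRecord, lens_id: str, now: datetime) -> bool:
    if consent.revoked or lens_id not in consent.lens_ids:
        return False
    if consent.expires_at is not None:
        return now < datetime.fromisoformat(consent.expires_at)
    return True


granted = ConsentRecord(
    consent_id="c1",
    granted_by="alice",
    granted_at="2024-01-01T00:00:00+00:00",
    scope="full",
    lens_ids=("lens-a",),
    expires_at=None,
)
revoked = revoke(granted, "2024-05-01T00:00:00+00:00")
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
```

Both `granted` and `revoked` remain available afterwards, so the lineage can record the transition without deleting anything.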
What Needs to Be Built (In parallax)
New Files
| File | Content | Effort |
|---|---|---|
| models/track_features.py | TrackFeatureVector, StaypointSummary, MotionProfile, SpatialFootprint, TrackQuality | 2d |
| track_extractor.py | extract_track_features(): converts raw sequences → TrackFeatureVector using existing tracking.py | 3d |
| metrics_tracking.py | MF-03 (uncertainty_overlap), MF-05 (motion_consistency), MF-12 (staypoint_similarity) | 4d |
| algorithms/kinematic_filter.py | kinematic_filter(): pre-scoring candidate elimination | 2d |
| algorithms/edge_pruning.py | cluster_with_pruning(): MA-11 connected components with pruning | 3d |
| algorithms/ml_ranking.py | train_ml_ranker(), ml_score_pair(): MA-08 learned scoring | 4d |
| llm_adapter.py | LLMAdapter interface + MockLLMAdapter for testing | 1d |
| llm_disambiguator.py | llm_disambiguate(): MA-12 borderline pair LLM review | 3d |
| models/consent.py | ConsentRecord, ConsentStatus: CF-08 consent model | 1d |
Extended Files
| File | Change | Effort |
|---|---|---|
| scorer.py | Add score_pair_v2() that integrates kinematic gate + tracking metrics + LLM adjustment | 2d |
| metrics.py | Register tracking metrics in METRIC_REGISTRY | 1d |
| blocker.py | Add kinematic_filter call in generate_candidates_v2() | 1d |
Tests
| Test File | Tests | Coverage |
|---|---|---|
| test_track_features.py | ~15 tests | TrackFeatureVector creation, staypoint summarization, motion profile |
| test_metrics_tracking.py | ~20 tests | MF-03, MF-05, MF-12 with edge cases (no tracking data, partial, identical) |
| test_kinematic_filter.py | ~10 tests | Impossible pairs filtered, feasible pairs kept, edge cases |
| test_edge_pruning.py | ~12 tests | Cluster splitting, single edge pruned, no pruning needed |
| test_ml_ranking.py | ~10 tests | Training, scoring, minimum data check, fallback |
| test_llm_disambiguator.py | ~10 tests | MockLLM responses, clamp ±0.15, disabled path |
Estimated: ~77 new tests, 27 days effort
In titan (later)
- Consent enforcement on write-back
- Passport TTL scheduler (extends component.parallax.correlation-persistence decay scheduler)
- LLM adapter implementation (connects to Cortex LLM infrastructure)
- ML model persistence (save/load trained models to UDS)
In cortex (later)
- query_track_features: MCP tool for track feature retrieval
- train_ml_ranker: MCP tool to trigger ML training from attestation data
- llm_disambiguate: MCP tool wrapper for LLM disambiguation
Dependencies Between component.parallax.tracking-integration Components
                      TrackFeatureVector model
                                  │
              ┌───────────────────┼───────────────────┐
              ▼                   ▼                   ▼
       track_extractor    metrics_tracking    kinematic_filter
              │                   │                   │
              │                   ▼                   │
              │             scorer.py v2 ◄────────────┘
              │                   │
              │                   ▼
              │           llm_disambiguator
              │                   │
              │                   ▼
              │       edge_pruning (clustering)
              │                   │
              │                   ▼
              │    ml_ranking (needs labeled data)
              │
              └──► All feed into component.parallax.correlation-persistence correlation pipeline
Build order:
1. models/track_features.py — no deps
2. track_extractor.py — depends on (1) + existing tracking.py
3. metrics_tracking.py — depends on (1)
4. algorithms/kinematic_filter.py — depends on (1)
5. scorer.py v2 — depends on (3, 4)
6. llm_adapter.py + llm_disambiguator.py — depends on (5)
7. algorithms/edge_pruning.py — depends on existing clustering.py
8. algorithms/ml_ranking.py — depends on (5) + component.parallax.correlation-persistence attestation data
Steps 2-4 can be parallelized once step 1 is in place. Steps 7-8 can be parallelized.
Invariant Compliance
| Invariant | Compliance |
|---|---|
| 1. UDS is sole ABAC authority | Track features extracted per federate under ABAC. StaypointSummary uses geohash (not exact coords) to limit exposure. |
| 2. Events are append-only | Consent revocation is an append (new lineage entry), not a delete. ML models versioned, not overwritten. |
| 3. Blocks are evidence | LLM disambiguation creates evidence blocks with prompt, response, and reasoning. |
| 4. Frozen means frozen | Training data snapshot is frozen when ML model trains. Model weights are immutable once created. |
| 5. Editions require frozen evidence | LLM-adjusted scores become frozen evidence before any attestation. |
| 6. AI assists, humans attest | LLM adjusts confidence (AI assist). ML proposes scores (AI assist). Humans confirm/reject via component.parallax.correlation-persistence. |
| 7. "No action" is a decision | Borderline pairs that LLM cannot resolve explicitly marked DEFERRED (not silently dropped). |
Future Sensor Extensions (Out of Scope)
The tracking architecture supports additional sensor modalities without pipeline changes:
- LIDAR point clouds — Dense 3D TrackPoints with altitude + intensity in metadata. ST-DBSCAN (TR-07) handles the spatial clustering. Needs a LIDAR-specific track extractor and point cloud decimation for cross-federate exchange (raw LIDAR is too large).
- GMTI radar tracks — Sparse detections with velocity vectors. Map directly to TrackPoint (lat, lon, speed, heading). Kinematic gating (TR-06) applies with aircraft/vehicle speed profiles. Low observation density means staypoint detection is less useful — motion consistency (MF-05) becomes the primary matching signal.
- ADS-B / AIS transponder — Already close to the current model. AIS vessel tracks and ADS-B aircraft tracks are timestamped position reports with identity metadata. Good candidates for the first tracking-integrated fusion lens.
These are sensor modality extensions, not architecture changes. Each needs a domain-specific track extractor and possibly custom blocking keys (ICAO hex for ADS-B, MMSI for AIS), but the pipeline, metrics, and correlation layer are unchanged.
Cross-References
| Document | Relationship |
|---|---|
| component.parallax.scoring-engine (Scoring) | Extended with tracking metrics and LLM adjustment |
| component.parallax.primitives-framework (Primitives) | MF-03, MF-05, MF-12, MA-07, MA-08, MA-11, MA-12 specified here |
| component.parallax.correlation-persistence (Correlation) | Attestation data feeds ML training. Consent revocation invalidates CRs. |
| component.parallax.fusion-binding (Binding) | Track entities use same binding model (local_model, field_mappings) |
| tracking.py | Existing TR-01..TR-07 implementations consumed by track_extractor |
Depends on: component.parallax.correlation-persistence, component.parallax.primitives-framework, component.parallax.scoring-engine
Realizes: product.fusion