Universal Lens Parser

Status & scope

  • Module: lens/parser.py, lens/models/spec.py
  • Milestone: Phase 1 (complete)

Purpose

Parse YAML lens specifications into frozen, validated LensSpec dataclasses. This is the cost-lens engine's parser: one parser handles the four cost lens types (traversal, threat, observation, temporal) — domain-specific config lives in domain submodules, not here.

Two-engine boundary. prism is the cost-lens engine; the entity (semantic) lens is owned end-to-end by parallax (the Fusion engine — see dev-env SPEC-FUSION and parallax SPEC-01-LENS-PARSER). prism does not parse or execute entity lenses. The only cross-engine touchpoint is composition: a cost lens consuming an entity lens's output as a layer via the correlation source adapter (SPEC-13).

Public API

parse_lens(yaml_path: str | Path) → LensSpec

  • Load YAML file, parse and validate into frozen LensSpec.
  • Reads the file and has no other side effects.
  • Raises LensValidationError subclasses on invalid specs.

parse_lens_from_dict(data: dict) → LensSpec

  • Parse from pre-loaded dict (already deserialized from YAML/JSON).
  • Handles both bare lens dicts and pack-wrapped envelopes.
  • Pure function.

parse_lens_pack(data: dict) → dict

  • Parse pack-wrapped envelope → {spec, sharing, governance, pack_id, pack_version}.
  • Pack envelope format: pack_type: lens, pack_id, pack_version, sharing, governance, lens: { ... }.
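
The envelope handling described above can be sketched roughly as follows. This is an illustration only, not the code in lens/parser.py: unwrap_pack and lens_body are hypothetical helpers, the sample values are invented, treating sharing and governance as optional mappings is an assumption, and the raw lens dict stands in for the parsed LensSpec that the real parse_lens_pack returns under spec.

```python
from typing import Any


def unwrap_pack(data: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: split a pack envelope into its documented parts.

    The real parser would turn the raw ``lens`` mapping into a LensSpec;
    here it is passed through unchanged to keep the sketch self-contained.
    """
    if data.get("pack_type") != "lens":
        raise ValueError("not a lens pack envelope")
    return {
        "spec": data["lens"],  # raw lens dict; stand-in for the parsed LensSpec
        "sharing": data.get("sharing"),        # assumption: optional
        "governance": data.get("governance"),  # assumption: optional
        "pack_id": data["pack_id"],
        "pack_version": data["pack_version"],
    }


def lens_body(data: dict[str, Any]) -> dict[str, Any]:
    """Accept either a bare lens dict or a pack-wrapped envelope."""
    return data["lens"] if data.get("pack_type") == "lens" else data


# Illustrative values only.
envelope = {
    "pack_type": "lens",
    "pack_id": "example-pack",
    "pack_version": "1.0.0",
    "sharing": {},
    "governance": {},
    "lens": {"lens_id": "flood_risk_midwest_v1", "type": "traversal"},
}
parts = unwrap_pack(envelope)
```

Keying the bare-versus-wrapped decision on pack_type is one plausible way for parse_lens_from_dict to accept both shapes; the actual detection logic may differ.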

validate_lens(spec: LensSpec) → list[LensValidationError]

  • Semantic validation: rules V-01, V-02, V-06, and V-07 (V-03 through V-05 are enforced at runtime; see Validation Rules).
  • Pure function — takes frozen spec, returns list of errors (empty = valid).

validate_schema(data: dict) → list[str]

  • JSON Schema structural validation.
  • Returns list of error messages.

Dataclasses (all frozen=True)

LensSpec

Field        Type               Default  Notes
lens_id      str                -        Unique identifier (e.g., "flood_risk_midwest_v1")
type         LensType           -        "traversal" | "threat" | "observation" | "temporal" (cost types; entity/semantic is parallax's)
name         str                -        Human-readable name
version      str                -        Semver (X.Y.Z)
description  str                ""       Optional description
layers       tuple[Layer, ...]  -        One or more weighted layers
thresholds   ThresholdConfig    -        Signal generation boundaries
policy       PolicyConfig       -        Governance config
output       OutputConfig       -        What the lens produces

Layer

Field             Type         Default   Notes
name              str          -         Layer identifier
weight            float        -         0.0–1.0, all layers sum to 1.0
source            str          -         Data source URI (pack://, memory://, lens://)
cost_model        str          -         Primitive name or legacy registry key
refresh           RefreshMode  "static"  "static" | "dynamic"
params            dict         {}        Cost model parameters
transform         str          ""        Transform name (empty = identity)
transform_params  dict         {}        Transform parameters

ThresholdConfig

Field      Type   Default  Notes
confirm    float  -        High severity threshold
candidate  float  -        Medium severity threshold
reject     float  -        Below this = no action

PolicyConfig

Field              Type             Default  Notes
suppressed_fields  tuple[str, ...]  ()       Fields to redact
classification     str              ""       Security classification
consent_required   bool             False    Requires user consent
governance         GovernanceLevel  "full"   "full" | "lightweight" | "none"

OutputConfig

Field        Type             Default  Notes
signal_type  str              -        DES signal type name
artifacts    tuple[str, ...]  ()       Domain artifact types
evidence     bool             True     Create frozen evidence block
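
As a rough illustration of what the tables above describe, two of the dataclasses could be declared as below. This is a sketch, not a copy of lens/models/spec.py: modeling RefreshMode as a Literal is an assumption (the real alias may be an Enum), and the layer values are made up.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any, Literal

# Assumption: a Literal alias; the project may define RefreshMode differently.
RefreshMode = Literal["static", "dynamic"]


@dataclass(frozen=True)
class Layer:
    name: str
    weight: float
    source: str
    cost_model: str
    refresh: RefreshMode = "static"
    params: dict[str, Any] = field(default_factory=dict)
    transform: str = ""  # empty string means identity
    transform_params: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class ThresholdConfig:
    confirm: float
    candidate: float
    reject: float


# Illustrative values only.
layer = Layer(
    name="elevation",
    weight=1.0,
    source="pack://example/elevation",
    cost_model="example_model",
)

try:
    layer.weight = 0.5  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    # frozen=True rejects attribute reassignment after construction
    mutated = False
```

Note that frozen=True blocks attribute reassignment but does not deep-freeze mutable values: a dict held in params can still be modified in place unless the parser copies or wraps it.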

Validation Rules

Rule  Description                      Enforcement
V-01  Layer weights sum to 1.0 ±0.001  validate_lens()
V-02  reject < candidate < confirm     validate_lens()
V-03  Layer source is resolvable       Runtime (adapter dispatch)
V-04  Cost model is registered         Runtime (resolve_cost_model())
V-05  Suppressed fields not leaked     Enforced at evidence-creation time in lens/evidence.py:create_evidence_block (recursive walk over inputs + output; raises SuppressedFieldLeakError)
V-06  Version is valid semver (X.Y.Z)  validate_lens()
V-07  At least one layer required      parse_lens_from_dict() + validate_lens()
V-08  All objects frozen               Enforced by frozen=True on all dataclasses
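
The parse-time rules (V-01, V-02, V-06, V-07) could be checked along these lines. This is a hedged sketch rather than the actual validate_lens: check_rules is a hypothetical function that takes plain values instead of a LensSpec, the error classes are local stand-ins that reuse the names from the test fixtures, and the X.Y.Z pattern assumes no pre-release or build suffixes.

```python
import re


class LensValidationError(Exception):
    """Local stand-in for the project's base validation error."""


class WeightSumError(LensValidationError):
    pass


class ThresholdOrderError(LensValidationError):
    pass


class InvalidVersionError(LensValidationError):
    pass


class NoLayersError(LensValidationError):
    pass


# Assumption: plain X.Y.Z only, no pre-release or build metadata.
SEMVER_RE = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")


def check_rules(weights, confirm, candidate, reject, version):
    """Return a list of errors; an empty list means the values are valid."""
    errors: list[LensValidationError] = []
    if not weights:
        errors.append(NoLayersError("V-07: at least one layer is required"))
    elif abs(sum(weights) - 1.0) > 0.001:
        errors.append(WeightSumError(f"V-01: weights sum to {sum(weights)}"))
    if not (reject < candidate < confirm):
        errors.append(ThresholdOrderError("V-02: need reject < candidate < confirm"))
    if not SEMVER_RE.fullmatch(version):
        errors.append(InvalidVersionError(f"V-06: {version!r} is not X.Y.Z"))
    return errors


ok = check_rules([0.6, 0.4], 0.9, 0.6, 0.3, "1.2.0")
bad = check_rules([0.25, 0.25], 0.6, 0.9, 0.3, "1.0")
```

Collecting errors in a list rather than raising on the first failure matches the documented return type of validate_lens. Skipping the weight-sum check when there are no layers is a choice made for this sketch to avoid reporting two errors for the same cause.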

Relationship to the entity-lens parser (parallax)

parallax's SPEC-01-LENS-PARSER parses the entity lens (the Fusion matching contract). This parser is the peer for the four cost lens types — not a superset. The two engines are firewalled; neither parser handles the other's lens kind. This one uses a layer/weight/cost-model structure with domain-specific config (vehicle profiles, satellite TLEs, flood models) in domain submodules, keeping the parser generic across the cost types.

The Layer.cost_model field drives op selection. Layer.transform and Layer.transform_params extract or reshape layer data before the cost primitive runs. This is new relative to Fusion, where transforms were hardcoded in the pipeline.

Test Fixtures

Test                        Input                     Expected                        File
Parse valid traversal YAML  examples/traversal.yaml   LensSpec with type="traversal"  tests/test_parser.py
Parse all 5 types           5 example YAMLs           All parse without error         tests/test_parser.py
V-01: bad weights           weights sum to 0.5        WeightSumError                  tests/test_parser.py
V-02: bad thresholds        confirm < candidate       ThresholdOrderError             tests/test_parser.py
V-06: bad version           "1.0"                     InvalidVersionError             tests/test_parser.py
V-07: no layers             layers: []                NoLayersError                   tests/test_parser.py
Pack envelope               pack_type: lens wrapper   Unwraps correctly               tests/test_parser.py
Transform fields            transform: extract_field  Layer has transform set         tests/test_models.py

Current test count: 23 parser tests + 26 model tests = 49 tests (all passing)

File Layout

lens/
  models/
    spec.py              ← LensSpec, Layer, ThresholdConfig, PolicyConfig, OutputConfig
  parser.py              ← parse_lens, parse_lens_from_dict, parse_lens_pack, validate_lens
schemas/
  lens_spec.schema.json  ← JSON Schema for structural validation

Integration Points

  • Cortex: cortex/tools/lens.py already consumes lens YAML from packs — same parse pattern. The COST_MODELS dict in cortex maps to our Layer.cost_model field.
  • Athena: No direct integration — parser is pure config parsing.
  • Beacon: No direct integration — Beacon gets parsed results via Cortex.

DO NOT

  • Add domain-specific fields to LensSpec or Layer — domain config goes in submodules
  • Import from other lens modules — spec.py is a leaf (imports nothing from project)
  • Put validation logic in models — models are pure data, parser validates
  • Hardcode cost model names — resolution is deferred to binding.py

Realizes: product.lens

Required by: component.prism.bayesian-posterior-engine, component.prism.composition, component.prism.cost-lens-governance, component.prism.cost-primitives, component.prism.evidence-model, component.prism.mobility-surface, component.prism.observation-engine, component.prism.operational-lifecycle, component.prism.scenario-runner, component.prism.scoring-engine, component.prism.semantic-adapter, component.prism.serialization, component.prism.source-adapters, component.prism.temporal-engine, component.prism.threat-engine, component.prism.traversal-engine