Cortex — Data Shape Contract

Eliminate frontend guessing by making Cortex emit stable, render-ready data shapes while keeping (1) ABAC safety, (2) dataspace types sacred, and (3) visualization choice in Beacon's ViewerController. Cortex normalizes polymorphic Elasticsearch aggregation responses into canonical data shapes; Beacon renders from those, never from raw ES JSON.

Vision (Non-negotiables)

Blocks Are Evidence, Not Views

A Block is an immutable-ish evidence object produced by Cortex and rendered by Beacon. Blocks carry provenance and enough shape metadata to render deterministically. Blocks MUST NOT encode a binding "viz type" (no "timeseries chart", "heatmap chart"); Beacon chooses how to render, and any viz_hints a block carries are recommendations only.

Deterministic Shape, Emitted by Cortex

ES aggregation responses are polymorphic. Beacon must not reverse-engineer semantics from raw ES JSON. Cortex normalizes results into canonical data shapes and (optionally) derived projections. Raw ES responses may be included for audit/debug, but viewers should not depend on them.

Dataspace Types Are Sacred

Do not mutate uds_type or inferred domain types inside describe_model. Use render_hints and column_roles as secondary, non-authoritative metadata only.

ABAC Safety

Logs and shape metadata MUST NOT leak ABAC-hidden field names. If a user cannot see a field, blocks must not reveal its name or existence. Shape fingerprints are computed from operator structure, not field names.

Goals

Make aggregation rendering deterministic across: terms + metrics; date_histogram + terms + metrics; filters (multi-window temporal buckets); geohash_grid / geotile_grid + metrics; composite aggs with after_key pagination; pipeline metrics (bucket_script / derivatives).
Make new shape variants visible during testing via a stable shape signature + fingerprint and first-seen logging.
Keep the UI simple: Beacon renders from canonical projections (bucket_rows, geo_cells, etc.); ViewerController chooses the best viewer from column_roles/hints and projection availability.

Block Payload Contract

Common Block Metadata (required)

Outcome Envelope (required)

outcome (OK | NO_DATA | PARTIAL | ERROR | POLICY_DENIED); warnings[] (strings, no ABAC leaks); errors[] (no secrets, safe summaries); optional federation: { federates_ok, federates_total }.
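
As an illustration of the envelope fields listed above, here is a minimal Python sketch. The class names `Outcome` and `OutcomeEnvelope` and the `to_dict` helper are hypothetical; only the field names and the five outcome values come from this contract.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Outcome(str, Enum):
    """The five outcome values named by the contract."""
    OK = "OK"
    NO_DATA = "NO_DATA"
    PARTIAL = "PARTIAL"
    ERROR = "ERROR"
    POLICY_DENIED = "POLICY_DENIED"


@dataclass
class OutcomeEnvelope:
    """Hypothetical container for the outcome envelope fields."""
    outcome: Outcome
    warnings: list = field(default_factory=list)  # strings, no ABAC leaks
    errors: list = field(default_factory=list)    # safe summaries only
    federation: Optional[dict] = None             # optional per the contract

    def to_dict(self) -> dict:
        out = {
            "outcome": self.outcome.value,
            "warnings": list(self.warnings),
            "errors": list(self.errors),
        }
        # Emit federation only when it was provided.
        if self.federation is not None:
            out["federation"] = self.federation
        return out


partial = OutcomeEnvelope(
    Outcome.PARTIAL,
    warnings=["2 of 3 federates responded"],
    federation={"federates_ok": 2, "federates_total": 3},
)
```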

Shape ID (required for aggregation/geo_agg)

shape_signature (human-readable, no field names); shape_fingerprint (sha256 of canonical signature JSON); shape_features[] (e.g. ["filters_named_keys", "composite_after_key", "pipeline_metrics"]).
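
The fingerprint rule can be sketched as follows. The contract only says "sha256 of canonical signature JSON"; this sketch assumes that "canonical" means sorted keys with compact separators, and the signature keys shown are hypothetical examples, not a fixed schema.

```python
import hashlib
import json


def shape_fingerprint(signature: dict) -> str:
    """Return the sha256 hex digest of a canonicalized signature.

    Assumption: canonical JSON = sorted keys + compact separators, so two
    dicts with the same content always hash to the same value.
    """
    canonical = json.dumps(signature, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical signature: operator structure only, no field names.
signature = {
    "bucket_ops": ["terms", "filters"],
    "depth": 2,
    "filter_key_count": 5,
    "metric_ops": ["avg", "max"],
    "features": ["filters_named_keys"],
}
fp = shape_fingerprint(signature)
```

Because keys are sorted before hashing, the order in which Cortex builds the signature dict does not affect the fingerprint. List order (for example the bucket operator sequence) still does, which is intended here since the sequence is part of the shape.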

Column Metadata

column_meta is advisory; it MUST NOT contradict ABAC or overwrite uds_type.

Visualization Hints

Cortex emits viz_hints to guide Beacon's ViewerController. These are recommendations, not commands.

viz_hints.recommended (required)

Value	block_kind	Beacon Renderer	Description
`table`	query, aggregation	renderTable	Default tabular view
`kpi_card`	kpi	renderKPI	Key metric cards
`bar_chart`	aggregation, histogram	renderBar	Vertical bar chart
`timeseries_chart`	timeseries	renderTimeseries	Line chart with time axis
`geo_grid`	geo_agg	renderGeoGrid	Grid cells with color scale + legend
`geo_point`	query (with geo)	renderGeo	Point markers on map
`geo_heat`	geo_agg	renderGeoHeatmap	Heat intensity layer
`tabbed_view`	multi_agg	renderTabbed	Tabbed view for multiple aggs
`schema`	schema	renderSchema	Field/type table

viz_hints.alternatives (optional): array of alternative viz types the user can switch to, e.g. ["table", "bar_chart"].

MCP Response Envelope

All MCP tools that return data MUST use this envelope. Beacon reads from structuredContent directly.

{
  "success": true,
  "block_kind": "query_result | artifact_evidence | ai_summary | kpi | schema",
  "evidence_class": "query | aggregation | geo_agg | timeseries | ...",
  "projections": {
    "rows": [],        "columns": [],
    "bucket_rows": {}, "geo_cells": {}
  },
  "column_meta": [ {"field": "sensor.site", "label": "Site", "role": "dimension", "type": "string"} ],
  "viz_hints": { "recommended": "table", "alternatives": ["bar_chart"] },
  "block": {}
}

Key rule: Beacon reads structuredContent.projections.rows and structuredContent.column_meta — these MUST be at the TOP level of the MCP response, not nested inside block. block carries the full shape-contract-compliant block for storage/curation.
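
A minimal sketch of the key rule, assuming the stored block keeps projections, column_meta, and viz_hints under the same key names. The helper name `build_envelope`, the `success` parameter, and the `table` default for a missing hint are assumptions for illustration.

```python
def build_envelope(block: dict, success: bool = True) -> dict:
    """Lift render-critical keys to the top level; keep the full block nested.

    Assumption: the block stores these values under the same key names
    that the envelope uses.
    """
    return {
        "success": success,
        "block_kind": block.get("block_kind"),
        "evidence_class": block.get("evidence_class"),
        # Beacon reads these two from the top level, never from "block".
        "projections": block.get("projections", {}),
        "column_meta": block.get("column_meta", []),
        # Assumed fallback: "table" is the default tabular view.
        "viz_hints": block.get("viz_hints", {"recommended": "table"}),
        # Full block travels along for storage/curation.
        "block": block,
    }


block = {
    "block_kind": "query_result",
    "evidence_class": "query",
    "projections": {"rows": [{"site": "alpha"}], "columns": ["site"]},
    "column_meta": [
        {"field": "site", "label": "Site", "role": "dimension", "type": "string"}
    ],
}
envelope = build_envelope(block)
```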

Canonical Data Shapes

query (rows)

For "hits"-style queries. Required: projections.rows (data rows), projections.columns (visible fields only). Optional: column_meta, total_hits, sampling.

aggregation

Used when ES returns aggregations. Required: aggregation_tree (raw ES agg response, safe subset or full per policy); projections (at least one of the below, preferably bucket_rows whenever buckets exist).

projections.bucket_rows (preferred default) — a flattened, stable fact table: dimensions: [string], metrics: [string], rows: [object] (each row has all dims + metrics). Rules: always include doc_count if buckets exist; dimension keys use stable names chosen by Cortex (not ABAC-sensitive raw field names); missing metric → null (preferred) or 0; for filters agg include a window dimension using the filter keys.
projections.series_rows (optional) — when the aggregation naturally represents series: x_key, optional series_key, metrics, rows.
projections.pivot (optional) — a pivoted/wide table: row_key, column_key, value_key, rows.
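
To make the bucket_rows rules concrete, here is a hedged sketch that flattens list-style nested buckets (for example terms inside terms) into a fact table. It assumes the aggregation names in the ES request are the stable dimension and metric names chosen by Cortex, and that metrics are single-value aggregations exposing `value`; the function name `flatten_buckets` is hypothetical.

```python
def flatten_buckets(agg_response: dict, dim_names: list, metric_names: list) -> dict:
    """Flatten nested list-style buckets into rows holding all dims + metrics."""
    rows = []

    def walk(node: dict, depth: int, dims: dict) -> None:
        if depth == len(dim_names):
            # Leaf bucket: emit one row with every dimension and metric.
            row = dict(dims)
            row["doc_count"] = node.get("doc_count")
            for metric in metric_names:
                # Missing metric becomes None (JSON null), per the rules.
                row[metric] = (node.get(metric) or {}).get("value")
            rows.append(row)
            return
        name = dim_names[depth]
        for bucket in node.get(name, {}).get("buckets", []):
            key = bucket.get("key_as_string", bucket["key"])
            walk(bucket, depth + 1, {**dims, name: key})

    walk(agg_response, 0, {})
    return {
        "dimensions": list(dim_names),
        "metrics": ["doc_count", *metric_names],
        "rows": rows,
    }


aggs = {
    "site": {
        "buckets": [
            {"key": "alpha", "doc_count": 3, "avg_temp": {"value": 21.5}},
            {"key": "beta", "doc_count": 1},  # metric absent in this bucket
        ]
    }
}
bucket_rows = flatten_buckets(aggs, ["site"], ["avg_temp"])
```

Every row carries the same keys, so a table or chart viewer can consume `rows` without inspecting the original nesting.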

geo_agg

For geohash_grid / geotile_grid. Required: geo_cells (rows describing grid cells with location + metrics). Minimum geo_cells shape: cell_id (a geohash string or a geotile zoom/x/y key), metrics: {doc_count, ...}. Optional: centroid_lat/centroid_lon (only if ES provides geo_centroid), bounds, join_keys, and projections.bucket_rows if geo is one dimension among others. Beacon's ViewerController decodes cell_id natively (_decodeCellId() handles both geohash and geotile); Cortex does NOT pre-compute centroids from cell_id.
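
A sketch of building the minimum geo_cells shape from grid buckets. It returns a plain list of cells, which is an assumption since the exact wrapper shape is not fixed here. The function name `to_geo_cells` and the `centroid_agg` parameter (the name of an optional geo_centroid sub-aggregation) are hypothetical.

```python
def to_geo_cells(buckets: list, metric_names=(), centroid_agg=None) -> list:
    """Turn geohash_grid / geotile_grid buckets into geo_cells rows."""
    cells = []
    for bucket in buckets:
        cell = {
            # ES uses the geohash string or "zoom/x/y" as the bucket key.
            "cell_id": bucket["key"],
            "metrics": {"doc_count": bucket["doc_count"]},
        }
        for metric in metric_names:
            cell["metrics"][metric] = (bucket.get(metric) or {}).get("value")
        # Centroids are copied only when ES returned a geo_centroid result;
        # nothing is computed from cell_id.
        if centroid_agg is not None:
            location = (bucket.get(centroid_agg) or {}).get("location")
            if location:
                cell["centroid_lat"] = location["lat"]
                cell["centroid_lon"] = location["lon"]
        cells.append(cell)
    return cells


buckets = [
    {"key": "u17", "doc_count": 12,
     "centroid": {"location": {"lat": 52.3, "lon": 4.9}, "count": 12}},
    {"key": "8/131/84", "doc_count": 3},
]
cells = to_geo_cells(buckets, centroid_agg="centroid")
```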

timeseries

For date_histogram aggregations. Required: projections.series_rows with x_key (time dimension), metrics, rows; also projections.bucket_rows for table fallback. Optional: series_key for stacked/grouped series.
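
The series_rows projection might be produced along these lines. This is a sketch assuming a date_histogram with single-value metrics and, optionally, one nested terms aggregation whose name doubles as the series_key; `to_series_rows` and the default x_key `ts` are hypothetical names.

```python
def to_series_rows(histogram_buckets: list, metric_names: list,
                   x_key: str = "ts", series_agg=None) -> dict:
    """Build series_rows from date_histogram buckets."""
    rows = []
    for bucket in histogram_buckets:
        # Prefer the formatted timestamp when ES supplies one.
        x_value = bucket.get("key_as_string", bucket["key"])
        if series_agg is None:
            nodes = [(None, bucket)]
        else:
            nested = bucket.get(series_agg, {}).get("buckets", [])
            nodes = [(sub["key"], sub) for sub in nested]
        for series_value, node in nodes:
            row = {x_key: x_value, "doc_count": node["doc_count"]}
            if series_agg is not None:
                row[series_agg] = series_value
            for metric in metric_names:
                row[metric] = (node.get(metric) or {}).get("value")
            rows.append(row)
    series = {"x_key": x_key, "metrics": ["doc_count", *metric_names], "rows": rows}
    if series_agg is not None:
        series["series_key"] = series_agg
    return series


buckets = [
    {"key": 1699999200000, "key_as_string": "2023-11-14T22:00:00.000Z",
     "doc_count": 5, "avg_temp": {"value": 20.0}},
    {"key": 1700002800000, "key_as_string": "2023-11-14T23:00:00.000Z",
     "doc_count": 0, "avg_temp": {"value": None}},
]
series_rows = to_series_rows(buckets, ["avg_temp"])
```

Empty buckets are kept with null metrics rather than dropped, so a line chart can show a gap instead of silently joining neighbouring points.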

multi_agg

When a query contains multiple top-level aggregations or deeply nested bucket aggregations. Required: projections.bucket_rows (flattened fact table with all dimensions and metrics); agg_blocks (per-aggregation structured data for separate rendering). Recommended viz_hint: tabbed_view; falls back to table.

Hard Shapes To Support

Multi-window temporal filters per site (terms → filters): compute metrics across multiple lookback windows per site. Pattern: terms(site_id) → filters(window keys) → metric(s). Projection: bucket_rows dims ["site_id","window"]; window order w15m, w30m, w60m, w6h, w24h (explicit in column_meta.window.order).
Geohash/grid aggregation: geo_grid(cell) → metric(s), optionally nested with terms(category). Projection: geo_cells primary, bucket_rows secondary for multi-dim analysis.
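
For the multi-window pattern, a sketch of the terms → filters flattening follows. It assumes ES returns the filters buckets as an object keyed by window name, and that the aggregations are named `site_id` and `window`; the function name is hypothetical. Windows absent from the response are skipped in this sketch rather than emitted as null rows.

```python
# Explicit window order from the contract; dict order in the ES response
# is not relied on.
WINDOW_ORDER = ["w15m", "w30m", "w60m", "w6h", "w24h"]


def flatten_site_windows(aggs: dict, metric_names: list,
                         site_agg: str = "site_id",
                         window_agg: str = "window") -> dict:
    """Flatten terms(site) -> filters(window keys) -> metrics into bucket_rows."""
    rows = []
    for site in aggs[site_agg]["buckets"]:
        named = site[window_agg]["buckets"]  # dict keyed by filter name
        for window in WINDOW_ORDER:
            if window not in named:
                continue
            node = named[window]
            row = {site_agg: site["key"], "window": window,
                   "doc_count": node["doc_count"]}
            for metric in metric_names:
                row[metric] = (node.get(metric) or {}).get("value")
            rows.append(row)
    return {
        "dimensions": [site_agg, "window"],
        "metrics": ["doc_count", *metric_names],
        "rows": rows,
    }


aggs = {"site_id": {"buckets": [{
    "key": "alpha", "doc_count": 40,
    "window": {"buckets": {
        "w24h": {"doc_count": 40, "max_temp": {"value": 31.0}},
        "w15m": {"doc_count": 2, "max_temp": {"value": 29.5}},
    }},
}]}}
bucket_rows = flatten_site_windows(aggs, ["max_temp"])
```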

Shape Fingerprinting and Logging

Shape signature rules: do NOT include field names unless confirmed visible and safe.
Include: bucket operator sequence (terms/date_histogram/filters/geotile_grid/composite/nested); bucket depth; count of filter keys; presence of key_as_string, after_key, pipeline metrics; metric operator types present (sum, avg, max, percentiles, bucket_script).
Emit structured logs: NEW_AGG_SHAPE (WARN) when a fingerprint is first seen, AGG_SHAPE (DEBUG/INFO) when known. Both include fingerprint, signature, tool, model, query_hash, query_group_id.
Optionally persist fingerprint counts to the insights index.
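
The signature and first-seen logging rules could be prototyped as below. This sketch derives the signature from the aggregation request (so response-only traits such as key_as_string or after_key are not covered), tracks seen fingerprints in a process-local set, and omits query_group_id for brevity. The logger name, function names, and signature keys are hypothetical.

```python
import hashlib
import json
import logging

BUCKET_OPS = {"terms", "date_histogram", "filters", "geohash_grid",
              "geotile_grid", "composite", "nested"}
PIPELINE_OPS = {"bucket_script", "derivative"}
NON_OPERATOR_KEYS = {"aggs", "aggregations", "meta"}

log = logging.getLogger("cortex.shape")  # hypothetical logger name
_seen_fingerprints = set()  # process-local; persistence is out of scope here


def build_signature(request_aggs: dict) -> dict:
    """Record operator structure only; agg names and field names are ignored."""
    bucket_ops, metric_ops, features = [], set(), set()
    stats = {"depth": 0, "filter_key_count": 0}

    def walk(aggs: dict, depth: int) -> None:
        for body in aggs.values():
            op = next(k for k in body if k not in NON_OPERATOR_KEYS)
            if op in BUCKET_OPS:
                bucket_ops.append(op)
                stats["depth"] = max(stats["depth"], depth)
                if op == "filters":
                    named = body[op].get("filters", {})
                    stats["filter_key_count"] += len(named)
                    if isinstance(named, dict):
                        features.add("filters_named_keys")
                walk(body.get("aggs") or body.get("aggregations") or {}, depth + 1)
            else:
                metric_ops.add(op)
                if op in PIPELINE_OPS:
                    features.add("pipeline_metrics")

    walk(request_aggs, 1)
    return {"bucket_ops": bucket_ops, "depth": stats["depth"],
            "filter_key_count": stats["filter_key_count"],
            "metric_ops": sorted(metric_ops), "features": sorted(features)}


def log_shape(signature: dict, tool: str, model: str, query_hash: str) -> str:
    """Log NEW_AGG_SHAPE on first sight of a fingerprint, AGG_SHAPE afterwards."""
    canonical = json.dumps(signature, sort_keys=True, separators=(",", ":"))
    fingerprint = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    first_seen = fingerprint not in _seen_fingerprints
    _seen_fingerprints.add(fingerprint)
    event = "NEW_AGG_SHAPE" if first_seen else "AGG_SHAPE"
    payload = json.dumps({"event": event, "fingerprint": fingerprint,
                          "signature": signature, "tool": tool,
                          "model": model, "query_hash": query_hash})
    log.log(logging.WARNING if first_seen else logging.DEBUG, payload)
    return event


request_aggs = {"by_site": {
    "terms": {"field": "sensor.site"},
    "aggs": {"windows": {
        "filters": {"filters": {"w15m": {"match_all": {}}, "w24h": {"match_all": {}}}},
        "aggs": {"avg_temp": {"avg": {"field": "sensor.temp"}}},
    }},
}}
signature = build_signature(request_aggs)
```

Calling `log_shape(signature, ...)` twice with the same signature yields NEW_AGG_SHAPE and then AGG_SHAPE. The serialized signature contains operator names only, so no field name from the request reaches the log.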

Viewer Contract (Beacon)

Beacon renders using projections first: prefer projections.bucket_rows or geo_cells; raw aggregation_tree is debug-only. ViewerController selection considers block_kind, presence of projections, column_meta roles/hints, and shape_features (e.g. composite pagination). On viewer failure: log viewer_id + shape_fingerprint + missing_requirements (e.g. bucket_rows missing).
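
The selection and failure-reporting logic can be illustrated with a language-neutral sketch, written in Python for consistency with the other examples even though Beacon is the frontend. The `REQUIRES` table mapping viewers to projections is a hypothetical assumption, as are the function names; real selection would also weigh block_kind, column_meta roles, and shape_features.

```python
# Hypothetical mapping: which projection each viewer needs before it can render.
REQUIRES = {
    "geo_grid": ["geo_cells"],
    "geo_heat": ["geo_cells"],
    "bar_chart": ["bucket_rows"],
    "timeseries_chart": ["series_rows"],
    "table": [],
}


def missing_requirements(viewer_id: str, projections: dict) -> list:
    """List the projections a viewer needs that are absent or empty."""
    return [p for p in REQUIRES.get(viewer_id, []) if not projections.get(p)]


def choose_viewer(envelope: dict):
    """Try the recommended viewer, then alternatives, then fall back to table.

    Returns (viewer_id, failures); each failure carries the fields the
    viewer contract asks to be logged.
    """
    hints = envelope.get("viz_hints", {})
    projections = envelope.get("projections", {})
    fingerprint = envelope.get("block", {}).get("shape_fingerprint")
    candidates = [hints.get("recommended"), *hints.get("alternatives", []), "table"]
    failures = []
    for viewer_id in candidates:
        if viewer_id is None:
            continue
        missing = missing_requirements(viewer_id, projections)
        if not missing:
            return viewer_id, failures
        failures.append({"viewer_id": viewer_id,
                         "shape_fingerprint": fingerprint,
                         "missing_requirements": missing})
    return "table", failures


envelope = {
    "viz_hints": {"recommended": "geo_grid", "alternatives": ["bar_chart"]},
    "projections": {"bucket_rows": {"dimensions": ["site"], "metrics": ["doc_count"],
                                    "rows": [{"site": "alpha", "doc_count": 3}]}},
    "block": {"shape_fingerprint": "abc123"},
}
viewer, failures = choose_viewer(envelope)
```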

Acceptance Criteria

Any aggregation block with buckets MUST include projections.bucket_rows whenever outcome is OK or NO_DATA.
Geo aggregations MUST include geo_cells.
No viewer should parse raw ES aggregation JSON in the normal render path.
New agg shapes must be discoverable via NEW_AGG_SHAPE logs.
No ABAC-hidden field names appear in logs or block metadata.
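
Some of these criteria lend themselves to an automated check in tests. The sketch below assumes a block dict with `evidence_class`, `outcome`, and `projections` keys, takes bucket presence as a caller-supplied flag, and uses a naive substring scan for hidden field names; the function name `check_acceptance` is hypothetical.

```python
import json


def check_acceptance(block: dict, hidden_fields=(), has_buckets: bool = True) -> list:
    """Return a list of violated criteria (empty list means the block passes)."""
    violations = []
    kind = block.get("evidence_class")
    projections = block.get("projections", {})

    # bucket_rows is mandatory for bucketed aggregations with OK / NO_DATA.
    if (kind == "aggregation" and has_buckets
            and block.get("outcome") in ("OK", "NO_DATA")
            and "bucket_rows" not in projections):
        violations.append("bucket_rows missing")

    # Geo aggregations must carry geo_cells.
    if kind == "geo_agg" and "geo_cells" not in projections:
        violations.append("geo_cells missing")

    # Naive leak scan over the serialized block. The message deliberately
    # does not repeat the hidden name.
    serialized = json.dumps(block)
    if any(name in serialized for name in hidden_fields):
        violations.append("ABAC-hidden field name present")

    return violations


good = {"evidence_class": "aggregation", "outcome": "OK",
        "projections": {"bucket_rows": {"dimensions": ["site"],
                                        "metrics": ["doc_count"], "rows": []}}}
bad = {"evidence_class": "geo_agg", "outcome": "OK",
       "projections": {}, "warnings": ["filter on secret.clearance ignored"]}
good_result = check_acceptance(good, hidden_fields=["secret.clearance"])
bad_result = check_acceptance(bad, hidden_fields=["secret.clearance"])
```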

Open Decisions

Missing metric values: null vs 0 (recommend null).
Bucket truncation/topN: include warnings[] when truncated.
Composite pagination: include after_key in raw + optionally in shape_features.
Federation partials: outcome=PARTIAL + warnings.

Depends on: component.cortex.block-card, component.cortex.intelligence

Realizes: product.block