Cortex — Data Shape Contract
Eliminate frontend guessing by making Cortex emit stable, render-ready data shapes while keeping (1) ABAC safety, (2) dataspace types sacred, and (3) visualization choice in Beacon's ViewerController. Cortex normalizes polymorphic Elasticsearch aggregation responses into canonical data shapes; Beacon renders from those, never from raw ES JSON.
Vision (Non-negotiables)
Blocks Are Evidence, Not Views
A Block is an immutable-ish evidence object produced by Cortex and rendered by Beacon. Blocks carry provenance and enough shape metadata to render deterministically. Blocks MUST NOT encode "viz type" (no "timeseries chart", "heatmap chart") — Beacon chooses how to render.
Deterministic Shape, Emitted by Cortex
ES aggregation responses are polymorphic. Beacon must not reverse-engineer semantics from raw ES JSON. Cortex normalizes results into canonical data shapes and (optionally) derived projections. Raw ES responses may be included for audit/debug, but viewers should not depend on them.
Dataspace Types Are Sacred
Do not mutate uds_type or inferred domain types inside describe_model. Use render_hints and column_roles as secondary, non-authoritative metadata only.
ABAC Safety
Logs and shape metadata MUST NOT leak ABAC-hidden field names. If a user cannot see a field, blocks must not reveal its name or existence. Shape fingerprints are computed from operator structure, not field names.
Goals
Make aggregation rendering deterministic across:
- terms + metrics
- date_histogram + terms + metrics
- filters (multi-window temporal buckets)
- geohash_grid / geotile_grid + metrics
- composite aggs with after_key pagination
- pipeline metrics (bucket_script / derivatives)
Make new shape variants visible during testing via a stable shape signature + fingerprint and first-seen logging.
Keep the UI simple: Beacon renders from canonical projections (bucket_rows, geo_cells, etc.); ViewerController chooses the best viewer from column_roles/hints and projection availability.
Block Payload Contract
Common Block Metadata (required)
- id, ts
- data_source_type (e.g. elasticsearch)
- model, source
- block_kind (one of query_result | artifact_evidence | ai_summary | kpi | schema)
- evidence_class (granular sub-type for routing: query | aggregation | geo_agg | timeseries | multi_agg | kpi | schema | histogram | model_card | dataset_card)
- materialization_mode (live | frozen; replaces storage_mode)
- lifecycle_stage (transient | curated | frozen)
- origin_surface (explore | monitor)
- source_tool, query_hash
- optional query_group_id (groups sibling blocks from one user request)
- optional sibling_index / sibling_count (multi-block responses)
Outcome Envelope (required)
outcome (OK | NO_DATA | PARTIAL | ERROR | POLICY_DENIED); warnings[] (strings, no ABAC leaks); errors[] (no secrets, safe summaries); optional federation: { federates_ok, federates_total }.
Shape ID (required for aggregation/geo_agg)
shape_signature (human-readable, no field names); shape_fingerprint (sha256 of canonical signature JSON); shape_features[] (e.g. ["filters_named_keys", "composite_after_key", "pipeline_metrics"]).
Column Metadata
column_meta maps output column keys to roles and hints (optional but strongly recommended):
- role (dimension | metric | time | geo | id)
- optional unit
- optional render_hint (date_like_iso_string | wkt_point_string | rate | magnitude | categorical | ordinal)
- optional order (explicit ordering, e.g. ["w15m","w30m","w60m","w6h","w24h"])
column_meta is advisory; it MUST never contradict ABAC or overwrite uds_type.
Visualization Hints
Cortex emits viz_hints to guide Beacon's ViewerController. These are recommendations, not commands.
viz_hints.recommended (required)
| Value | block_kind | Beacon Renderer | Description |
|---|---|---|---|
| table | query, aggregation | renderTable | Default tabular view |
| kpi_card | kpi | renderKPI | Key metric cards |
| bar_chart | aggregation, histogram | renderBar | Vertical bar chart |
| timeseries_chart | timeseries | renderTimeseries | Line chart with time axis |
| geo_grid | geo_agg | renderGeoGrid | Grid cells with color scale + legend |
| geo_point | query (with geo) | renderGeo | Point markers on map |
| geo_heat | geo_agg | renderGeoHeatmap | Heat intensity layer |
| tabbed_view | multi_agg | renderTabbed | Tabbed view for multiple aggs |
| schema | schema | renderSchema | Field/type table |
viz_hints.alternatives (optional): array of alternative viz types the user can switch to, e.g. ["table", "bar_chart"].
MCP Response Envelope
All MCP tools that return data MUST use this envelope. Beacon reads from structuredContent directly.
{
  "success": true,
  "block_kind": "query_result | artifact_evidence | ai_summary | kpi | schema",
  "evidence_class": "query | aggregation | geo_agg | timeseries | ...",
  "projections": {
    "rows": [], "columns": [],
    "bucket_rows": {}, "geo_cells": {}
  },
  "column_meta": [ {"field": "sensor.site", "label": "Site", "role": "dimension", "type": "string"} ],
  "viz_hints": { "recommended": "table", "alternatives": ["bar_chart"] },
  "block": {}
}
Key rule: Beacon reads structuredContent.projections.rows and structuredContent.column_meta — these MUST be at the TOP level of the MCP response, not nested inside block. block carries the full shape-contract-compliant block for storage/curation.
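The key rule above can be sketched as a small builder that lifts the render-critical fields to the top level while keeping the full block nested. This is a minimal sketch, not Cortex's actual code: `build_envelope` is a hypothetical helper, it assumes the block stores its own `projections`, `column_meta`, and `viz_hints` under those keys, and the mapping from `outcome` to `success` is an assumption that the contract does not specify.

```python
from typing import Any


def build_envelope(block: dict[str, Any]) -> dict[str, Any]:
    """Build an MCP envelope with projections and column_meta at the top level.

    Hypothetical helper. Assumption: `success` is true for any outcome that
    still carries renderable data (OK, NO_DATA, PARTIAL).
    """
    return {
        "success": block.get("outcome") in ("OK", "NO_DATA", "PARTIAL"),
        "block_kind": block["block_kind"],
        "evidence_class": block["evidence_class"],
        # Beacon reads these two directly; they must not be nested in `block`.
        "projections": block.get("projections", {}),
        "column_meta": block.get("column_meta", []),
        # Assumed default: fall back to the table viewer when no hint is set.
        "viz_hints": block.get("viz_hints", {"recommended": "table"}),
        # The full block travels along for storage/curation.
        "block": block,
    }
```

With this layout Beacon never has to look inside `block` to render; the nested copy exists only so the block can be stored or curated as a unit.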
Canonical Data Shapes
query (rows)
For "hits"-style queries. Required: projections.rows (data rows), projections.columns (visible fields only). Optional: column_meta, total_hits, sampling.
aggregation
Used when ES returns aggregations. Required: aggregation_tree (raw ES agg response, safe subset or full per policy); projections (at least one of the below, preferably bucket_rows whenever buckets exist).
projections.bucket_rows (preferred default): a flattened, stable fact table.
- dimensions: [string]
- metrics: [string]
- rows: [object] (each row has all dims + metrics)
Rules:
- always include doc_count if buckets exist
- dimension keys use stable names chosen by Cortex (not ABAC-sensitive raw field names)
- missing metric → null (preferred) or 0
- for filters agg, include a window dimension using the filter keys
projections.series_rows (optional): when the aggregation naturally represents series. Fields: x_key, optional series_key, metrics, rows.
projections.pivot (optional): a pivoted/wide table. Fields: row_key, column_key, value_key, rows.
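To make the bucket_rows rules concrete, here is a minimal sketch that flattens a single-level terms aggregation response with single-value metrics. `flatten_terms` and its parameters are hypothetical names; the caller supplies the stable dimension name (`dim`) so that no raw field name needs to appear in the output.

```python
from typing import Any


def flatten_terms(agg: dict[str, Any], dim: str, metrics: list[str]) -> dict[str, Any]:
    """Flatten one ES terms aggregation response into a bucket_rows projection.

    Hypothetical helper: `dim` is the stable dimension key chosen by Cortex,
    `metrics` lists the names of single-value metric sub-aggregations.
    """
    rows = []
    for bucket in agg.get("buckets", []):
        # doc_count is always included when buckets exist.
        row: dict[str, Any] = {dim: bucket["key"], "doc_count": bucket["doc_count"]}
        for name in metrics:
            # A missing metric becomes None (JSON null), the preferred option.
            row[name] = bucket.get(name, {}).get("value")
        rows.append(row)
    return {"dimensions": [dim], "metrics": ["doc_count", *metrics], "rows": rows}
```

Every row carries every dimension and metric key, so a table viewer can render the projection without checking for absent columns.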
geo_agg
For geohash_grid / geotile_grid. Required: geo_cells (rows describing grid cells with location + metrics). Minimum geo_cells shape: cell_id (geohash or geotile zoom/x/y), metrics: {doc_count, ...}. Optional: centroid_lat/centroid_lon (only if ES provides geo_centroid), bounds. Beacon's ViewerController decodes cell_id natively (_decodeCellId() handles both geohash and geotile); Cortex does NOT pre-compute centroids from cell_id. Optional: join_keys, projections.bucket_rows if geo is one dimension among others.
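The minimum geo_cells shape can be sketched as follows. This assumes the grid aggregation response has the usual `buckets` list and that, when a geo_centroid sub-aggregation was requested, it was named `centroid`; that name and the `to_geo_cells` helper are illustrative assumptions. In line with the rule above, the sketch copies a centroid only when ES returned one and never derives it from `cell_id`.

```python
from typing import Any, Sequence


def to_geo_cells(
    agg: dict[str, Any],
    metrics: Sequence[str] = (),
    centroid_agg: str = "centroid",  # assumed sub-aggregation name
) -> list[dict[str, Any]]:
    """Turn a geohash_grid / geotile_grid response into geo_cells rows."""
    cells = []
    for bucket in agg.get("buckets", []):
        cell_metrics: dict[str, Any] = {"doc_count": bucket["doc_count"]}
        for name in metrics:
            cell_metrics[name] = bucket.get(name, {}).get("value")
        # cell_id is the geohash string or the geotile "zoom/x/y" key as-is.
        cell: dict[str, Any] = {"cell_id": bucket["key"], "metrics": cell_metrics}
        location = bucket.get(centroid_agg, {}).get("location")
        if location:
            # Only present when ES computed a geo_centroid for the bucket.
            cell["centroid_lat"] = location["lat"]
            cell["centroid_lon"] = location["lon"]
        cells.append(cell)
    return cells
```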
timeseries
For date_histogram aggregations. Required: projections.series_rows with x_key (time dimension), metrics, rows; also projections.bucket_rows for table fallback. Optional: series_key for stacked/grouped series.
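A minimal sketch of the dual projection for a flat date_histogram (no `series_key`) might look like this. `to_timeseries` and the default `x_key` name `ts` are assumptions for illustration; the sketch prefers `key_as_string` and falls back to the epoch-millisecond `key` if it is absent.

```python
from typing import Any


def to_timeseries(agg: dict[str, Any], metrics: list[str], x_key: str = "ts") -> dict[str, Any]:
    """Build series_rows plus the bucket_rows table fallback from a date_histogram."""
    rows = []
    for bucket in agg.get("buckets", []):
        row: dict[str, Any] = {
            x_key: bucket.get("key_as_string", bucket["key"]),
            "doc_count": bucket["doc_count"],
        }
        for name in metrics:
            row[name] = bucket.get(name, {}).get("value")
        rows.append(row)
    all_metrics = ["doc_count", *metrics]
    return {
        "series_rows": {"x_key": x_key, "metrics": all_metrics, "rows": rows},
        # Same rows, described as a fact table for the table fallback.
        "bucket_rows": {"dimensions": [x_key], "metrics": all_metrics, "rows": rows},
    }
```

Both projections share one list of rows here, which keeps the chart and the table fallback trivially consistent with each other.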
multi_agg
When a query contains multiple top-level aggregations or deeply nested bucket aggregations. Required: projections.bucket_rows (flattened fact table with all dimensions and metrics); agg_blocks (per-aggregation structured data for separate rendering). Recommended viz_hint: tabbed_view; falls back to table.
Hard Shapes To Support
- Multi-window temporal filters per site (terms → filters): compute metrics across multiple lookback windows per site. Pattern: terms(site_id) → filters(window keys) → metric(s). Projection: bucket_rows dims ["site_id", "window"]; window order w15m, w30m, w60m, w6h, w24h (explicit in column_meta.window.order).
- Geohash/grid aggregation: geo_grid(cell) → metric(s), optionally nested with terms(category). Projection: geo_cells primary, bucket_rows secondary for multi-dim analysis.
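The first hard shape is a good illustration of ES polymorphism: a filters aggregation with named keys returns `buckets` as an object keyed by filter name rather than a list. The sketch below flattens terms → filters → metric(s) into bucket_rows with a `window` dimension and emits the explicit order in column_meta. `flatten_site_windows`, `filters_name`, and the `ordinal` render hint on `window` are illustrative assumptions.

```python
from typing import Any

WINDOW_ORDER = ["w15m", "w30m", "w60m", "w6h", "w24h"]


def _window_rank(window: str) -> tuple[int, str]:
    # Known windows sort by contract order; unknown keys go last, by name.
    rank = WINDOW_ORDER.index(window) if window in WINDOW_ORDER else len(WINDOW_ORDER)
    return (rank, window)


def flatten_site_windows(agg: dict[str, Any], filters_name: str, metrics: list[str]) -> dict[str, Any]:
    """Flatten terms(site_id) -> filters(window keys) -> metric(s)."""
    rows = []
    for site in agg.get("buckets", []):
        # Named filters arrive as a dict keyed by filter name, not a list.
        windows = site.get(filters_name, {}).get("buckets", {})
        for window in sorted(windows, key=_window_rank):
            bucket = windows[window]
            row: dict[str, Any] = {
                "site_id": site["key"],
                "window": window,
                "doc_count": bucket["doc_count"],
            }
            for name in metrics:
                row[name] = bucket.get(name, {}).get("value")
            rows.append(row)
    return {
        "bucket_rows": {
            "dimensions": ["site_id", "window"],
            "metrics": ["doc_count", *metrics],
            "rows": rows,
        },
        "column_meta": {
            "window": {"role": "dimension", "render_hint": "ordinal", "order": WINDOW_ORDER},
        },
    }
```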
Shape Fingerprinting and Logging
Shape signature rules: do NOT include field names unless confirmed visible & safe. Include: bucket operator sequence (terms/date_histogram/filters/geotile_grid/composite/nested); bucket depth; count of filter keys; presence of key_as_string, after_key, pipeline metrics; metric operator types present (sum, avg, max, percentiles, bucket_script). Emit structured logs: NEW_AGG_SHAPE (WARN) when a fingerprint is first seen, AGG_SHAPE (DEBUG/INFO) when known, including fingerprint, signature, tool, model, query_hash, query_group_id. Optionally persist fingerprint counts to the insights index.
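The signature rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Cortex implementation: it derives the signature from the aggregation request body (the `aggs` tree) only, so response-side features such as `key_as_string` or `after_key` are not covered; the signature keys (`bucket_paths`, `metric_ops`, and so on), the logger name, and the in-memory first-seen set are all hypothetical. Aggregation names and field names are never read into the signature.

```python
import hashlib
import json
import logging
from typing import Any

BUCKET_OPS = {"terms", "date_histogram", "filters", "geohash_grid",
              "geotile_grid", "composite", "nested"}
PIPELINE_OPS = {"bucket_script", "derivative"}
SKIP_KEYS = {"aggs", "aggregations", "meta"}

log = logging.getLogger("cortex.shape")  # hypothetical logger name
_seen: set[str] = set()  # in-memory only; a real service would persist this


def shape_signature(aggs: dict[str, Any]) -> dict[str, Any]:
    """Describe an ES aggs request by operator structure only."""
    paths: set[str] = set()
    metric_ops: set[str] = set()
    filter_keys = 0

    def walk(node: dict[str, Any], prefix: tuple[str, ...]) -> None:
        nonlocal filter_keys
        for body in node.values():  # aggregation names are deliberately ignored
            path = prefix
            for op, params in body.items():
                if op in SKIP_KEYS:
                    continue
                if op in BUCKET_OPS:
                    path = prefix + (op,)
                    paths.add(">".join(path))
                    if op == "filters":
                        filter_keys += len(params.get("filters", {}))
                else:
                    metric_ops.add(op)
            walk(body.get("aggs") or body.get("aggregations") or {}, path)

    walk(aggs, ())
    return {
        "bucket_paths": sorted(paths),
        "depth": max((p.count(">") + 1 for p in paths), default=0),
        "filter_keys": filter_keys,
        "metric_ops": sorted(metric_ops),
        "pipeline_metrics": bool(metric_ops & PIPELINE_OPS),
    }


def shape_fingerprint(signature: dict[str, Any]) -> str:
    """sha256 over the canonical (sorted, compact) signature JSON."""
    canonical = json.dumps(signature, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def log_shape(signature: dict[str, Any], **context: Any) -> str:
    """Log NEW_AGG_SHAPE (WARN) on first sight, AGG_SHAPE (DEBUG) afterwards."""
    fingerprint = shape_fingerprint(signature)
    first_seen = fingerprint not in _seen
    _seen.add(fingerprint)
    event = "NEW_AGG_SHAPE" if first_seen else "AGG_SHAPE"
    payload = {"fingerprint": fingerprint, "signature": signature, **context}
    log.log(logging.WARNING if first_seen else logging.DEBUG,
            "%s %s", event, json.dumps(payload, sort_keys=True))
    return event
```

Because bucket paths and metric operators are sorted before hashing, two requests with the same operator structure but different aggregation names, field names, or sibling order produce the same fingerprint.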
Viewer Contract (Beacon)
Beacon renders using projections first: prefer projections.bucket_rows or geo_cells; raw aggregation_tree is debug-only. ViewerController selection considers block_kind, presence of projections, column_meta roles/hints, and shape_features (e.g. composite pagination). On viewer failure: log viewer_id + shape_fingerprint + missing_requirements (e.g. bucket_rows missing).
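The selection and failure-logging behavior can be sketched as below. Beacon's actual ViewerController is not shown in this document, so this is written in Python purely for illustration: the `REQUIRES` table (which projections each viewer needs), the fallback order (recommended, then alternatives, then table), and treating an empty projection as unavailable are all assumptions.

```python
from typing import Any, Optional

# Assumed requirements: a viewer is usable if ANY listed projection is present.
REQUIRES: dict[str, tuple[str, ...]] = {
    "table": ("rows", "bucket_rows"),
    "bar_chart": ("bucket_rows",),
    "timeseries_chart": ("series_rows",),
    "geo_grid": ("geo_cells",),
    "geo_heat": ("geo_cells",),
}


def choose_viewer(envelope: dict[str, Any]) -> tuple[Optional[str], list[dict[str, Any]]]:
    """Pick the first viewer whose projection requirements are met.

    Returns the viewer id (or None) plus one failure record per rejected
    candidate, shaped for the viewer-failure log described above.
    """
    projections = envelope.get("projections", {})
    hints = envelope.get("viz_hints", {})
    fingerprint = envelope.get("block", {}).get("shape_fingerprint")
    candidates = [hints.get("recommended", "table"), *hints.get("alternatives", []), "table"]
    failures: list[dict[str, Any]] = []
    for viewer_id in candidates:
        needed = REQUIRES.get(viewer_id, ())
        # Empty projections count as unavailable in this sketch.
        if any(projections.get(name) for name in needed):
            return viewer_id, failures
        failures.append({
            "viewer_id": viewer_id,
            "shape_fingerprint": fingerprint,
            "missing_requirements": list(needed),
        })
    return None, failures
```

Note that the decision reads only projections and hints; the raw aggregation_tree is never consulted.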
Acceptance Criteria
- Any aggregation block with buckets MUST include projections.bucket_rows (unless outcome ∉ {OK, NO_DATA}).
- Geo aggregations MUST include geo_cells.
- No viewer should parse raw ES aggregation JSON in the normal render path.
- New agg shapes must be discoverable via NEW_AGG_SHAPE logs.
- No ABAC-hidden field names appear in logs or block metadata.
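Some of these criteria lend themselves to an automated check in tests. The following is a rough sketch under several assumptions: `check_block` and `hidden_fields` are hypothetical, "aggregation block" is interpreted as `evidence_class == "aggregation"`, geo_cells is looked up under `projections`, and the ABAC check is a crude substring scan over a few metadata keys rather than a complete leak detector.

```python
import json
from typing import Any


def has_buckets(node: Any) -> bool:
    """True if any level of a raw ES aggregation response has non-empty buckets."""
    if isinstance(node, dict):
        return bool(node.get("buckets")) or any(has_buckets(v) for v in node.values())
    if isinstance(node, list):
        return any(has_buckets(v) for v in node)
    return False


def check_block(block: dict[str, Any], hidden_fields: set[str]) -> list[str]:
    """Return a list of acceptance-criteria violations (empty means pass)."""
    problems: list[str] = []
    projections = block.get("projections", {})
    evidence_class = block.get("evidence_class")
    if (
        evidence_class == "aggregation"
        and block.get("outcome") in ("OK", "NO_DATA")
        and has_buckets(block.get("aggregation_tree"))
        and "bucket_rows" not in projections
    ):
        problems.append("bucket_rows missing")
    if evidence_class == "geo_agg" and "geo_cells" not in projections:
        problems.append("geo_cells missing")
    # Crude scan of metadata only; the message itself must not repeat the name.
    metadata = json.dumps(
        {k: block.get(k) for k in ("warnings", "errors", "shape_signature", "shape_features")}
    )
    if any(name in metadata for name in hidden_fields):
        problems.append("ABAC-hidden field name in metadata")
    return problems
```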
Open Decisions
- Missing metric values: null vs 0 (recommend null).
- Bucket truncation/topN: include warnings[] when truncated.
- Composite pagination: include after_key in raw + optionally in shape_features.
- Federation partials: outcome=PARTIAL + warnings.
Depends on: component.cortex.block-card, component.cortex.intelligence
Realizes: product.block