Cortex — Block and Card Contract

This spec defines the Block/Card data contract between Cortex (data layer) and Beacon (presentation layer). It follows MVC principles: the Model is Cards (rows + column_meta, source-agnostic data), the View is Beacon rendering Cards with compatible visualizations, and the Controller is the UI controls that switch between views. It complements component.cortex.data-shape (the MCP response/projection contract) by defining the Block container, the Card data unit, and the LLM summary.

Design Principles

Source agnostic — same output format whether data comes from ES, SQL, or APIs.
View agnostic — same data can render as table, chart, map, etc.
Traceable — clear lineage from query → response → card → render.
Self-contained — each card has everything needed to render independently.
LLM optimized — compressed summaries for efficient LLM context usage.

Schema

Block (Container)

{
  "id": "uuid",
  "block_kind": "query | aggregation | multi_agg",
  "ts": "2024-01-14T12:00:00Z",
  "query_hash": "abc123",
  "cards": [ {} ],          // Card objects — full data for Beacon UI
  "llm_summary": [ {} ],    // CardSummary objects — compressed for LLM context
  "rows": [],               // convenience alias → cards[0].rows (single-card blocks)
  "column_meta": {},        // alias → cards[0].column_meta
  "viz_hints": {}           // alias → cards[0].viz_hints
}

Card (Data Unit)

{
  "id": "card-uuid",
  "source": { "type": "es_agg | es_query | sql | api", "agg_name": "current_state", "query_fragment": "terms[site]" },
  "title": "Current State by Site",
  "rows": [ {"site": "Site-A", "state": "OK", "value": 42.5} ],
  "column_meta": {
    "site":  {"role": "dimension", "type": "string", "label": "Site"},
    "state": {"role": "dimension", "type": "string", "label": "State"},
    "value": {"role": "metric", "type": "number", "format": "decimal", "label": "Value"}
  },
  "viz_hints": { "recommended": "table", "axes": {"x": "site", "y": ["value"], "group_by": "state"} }
}
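
As an illustration of how the Block container and its single-card aliases relate, here is a minimal Python sketch. The helper name make_block, its signature, and the empty defaults used when a block has no cards are assumptions for illustration, not part of this spec.

```python
import uuid
from datetime import datetime, timezone


def make_block(block_kind, query_hash, cards, llm_summary):
    """Assemble a Block dict. Helper name and signature are hypothetical."""
    first = cards[0] if cards else {}
    return {
        "id": str(uuid.uuid4()),
        "block_kind": block_kind,
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "query_hash": query_hash,
        "cards": cards,
        "llm_summary": llm_summary,
        # Aliases reference the first card's objects instead of copying them,
        # so they cannot drift from cards[0]. Empty defaults are an assumption.
        "rows": first.get("rows", []),
        "column_meta": first.get("column_meta", {}),
        "viz_hints": first.get("viz_hints", {}),
    }


card = {
    "id": "card-uuid",
    "title": "Current State by Site",
    "rows": [{"site": "Site-A", "state": "OK", "value": 42.5}],
    "column_meta": {
        "site": {"role": "dimension", "type": "string", "label": "Site"},
        "state": {"role": "dimension", "type": "string", "label": "State"},
        "value": {"role": "metric", "type": "number", "label": "Value"},
    },
    "viz_hints": {"recommended": "table"},
}
block = make_block("aggregation", "abc123", [card], [])
```

Sharing references keeps block["rows"] and block["cards"][0]["rows"] identical by construction; a serializer will still write both copies out.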

LLM Summary (Compressed for Context)

The llm_summary array contains compressed versions of each card, optimized for LLM token efficiency while preserving semantic understanding — the LLM understands data structure without seeing all rows, enabling follow-up queries within context limits.

CardSummary fields:
id: matches the parent card.
title: the card's title.
schema: the full column_meta, in compact form; always included.
stats: row_count, plus per-column statistics (dimensions → unique_values, top_values, null_count; metrics → min, max, sum, avg; time → earliest, latest, interval).
sample_rows: the first N rows (default 10).
truncated: true if the sample contains fewer rows than the card.

Compression rules: always include schema + stats.row_count; dimensions → top 5–10 unique values + cardinality; metrics → min/max/avg/sum (no individual values); time → range + interval; sample rows → max 10 by default; geo → bounding box + centroid (not individual points).
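
The compression rules can be sketched roughly as follows. This is a minimal Python illustration, not Cortex's implementation: the function name summarize_card, the nesting of per-column statistics under stats["columns"], and the top_n default of 5 are assumptions. The time interval and geo bounding box are omitted for brevity, and time values are assumed to be mutually comparable (all epoch_ms, or all ISO 8601 strings in one format).

```python
from collections import Counter


def summarize_card(card, sample_size=10, top_n=5):
    """Build a CardSummary-like dict from a card (hypothetical helper)."""
    rows = card["rows"]
    columns = {}
    for name, meta in card["column_meta"].items():
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        role = meta["role"]
        if role == "dimension":
            counts = Counter(present)
            columns[name] = {
                "unique_values": len(counts),  # cardinality
                "top_values": [v for v, _ in counts.most_common(top_n)],
                "null_count": len(values) - len(present),
            }
        elif role == "metric" and present:
            total = sum(present)
            # Aggregates only; individual metric values are not copied.
            columns[name] = {
                "min": min(present),
                "max": max(present),
                "sum": total,
                "avg": total / len(present),
            }
        elif role == "time" and present:
            columns[name] = {"earliest": min(present), "latest": max(present)}
    sample = rows[:sample_size]
    return {
        "id": card["id"],
        "title": card["title"],
        "schema": card["column_meta"],  # always included
        "stats": {"row_count": len(rows), "columns": columns},
        "sample_rows": sample,
        "truncated": len(sample) < len(rows),
    }
```

For a 12-row card this returns 10 sample rows with truncated set to true, while stats still describes all 12 rows.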

Column Roles and Types

Roles: dimension (categorical grouping), metric (numeric measure), time (temporal), geo (geographic), id (unique identifier). Types: string, number (formats: decimal, currency, percent), datetime (formats: epoch_ms, iso8601), geo_point (formats: wkt, geojson), boolean.

View Compatibility Matrix

Beacon determines compatible views from column_meta roles:

Roles Present	Compatible Views
dimension + metric	table, bar, pie, treemap
time + metric	line, area, sparkline
time + dimension + metric	grouped line, stacked area
geo + metric	map (points, heatmap)
geo + dimension + metric	map with categories
metric only	KPI card, gauge, number
dimension only	table, list
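
The matrix above could be encoded as a lookup keyed by the set of roles present. The sketch below is one possible reading, with several assumptions: role sets are matched exactly, the id role is ignored, an unlisted combination yields an empty list, and the view identifiers are simplified from the table labels.

```python
# Keys are the exact set of roles present in column_meta (assumption).
VIEW_MATRIX = {
    frozenset({"dimension", "metric"}): ["table", "bar", "pie", "treemap"],
    frozenset({"time", "metric"}): ["line", "area", "sparkline"],
    frozenset({"time", "dimension", "metric"}): ["grouped line", "stacked area"],
    frozenset({"geo", "metric"}): ["map"],
    frozenset({"geo", "dimension", "metric"}): ["map with categories"],
    frozenset({"metric"}): ["KPI card", "gauge", "number"],
    frozenset({"dimension"}): ["table", "list"],
}


def compatible_views(column_meta):
    """Return the views compatible with the roles found in column_meta."""
    roles = frozenset(meta["role"] for meta in column_meta.values()) - {"id"}
    return VIEW_MATRIX.get(roles, [])


views = compatible_views({
    "site": {"role": "dimension", "type": "string"},
    "value": {"role": "metric", "type": "number"},
})
# views == ["table", "bar", "pie", "treemap"]
```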

Data Flow and Traceability

1. User asks question / Profile defines monitor blocks
2. LLM generates ES query (knows ES DSL)
3. Cortex executes against ES
4. Cortex transforms each aggregation → Card + Summary
     (flatten buckets → rows; infer column_meta; infer viz_hints; create card + llm_summary)
5. Block with dual outputs: cards (full data → Beacon), llm_summary (compressed → LLM)
6a. Beacon renders UI (view switcher, charts, tables)
6b. LLM interprets (schema, distribution, insights)
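
Step 4 (flatten buckets → rows, infer column_meta) might look like the following for a simple Elasticsearch terms aggregation with single-value metric sub-aggregations. The function name flatten_terms_agg and the choice to expose doc_count as a metric column are assumptions; nested bucket aggregations are not handled in this sketch.

```python
def flatten_terms_agg(agg, dimension, metrics):
    """Flatten a terms aggregation response into rows plus column_meta.

    agg       -- the aggregation body, e.g. response["aggregations"][agg_name]
    dimension -- column name to use for the bucket key
    metrics   -- names of single-value metric sub-aggregations
    """
    rows = []
    for bucket in agg.get("buckets", []):
        row = {dimension: bucket["key"], "doc_count": bucket["doc_count"]}
        for name in metrics:
            # Single-value metric sub-aggregations come back as {"value": ...}.
            row[name] = bucket[name]["value"]
        rows.append(row)
    column_meta = {
        dimension: {"role": "dimension", "type": "string"},
        "doc_count": {"role": "metric", "type": "number"},
    }
    for name in metrics:
        column_meta[name] = {"role": "metric", "type": "number"}
    return rows, column_meta


agg = {
    "buckets": [
        {"key": "Site-A", "doc_count": 3, "avg_value": {"value": 42.5}},
        {"key": "Site-B", "doc_count": 1, "avg_value": {"value": 17.0}},
    ]
}
rows, column_meta = flatten_terms_agg(agg, "site", ["avg_value"])
```

Every key that appears in a row also gets a column_meta entry, which is what lets Beacon pick views without inspecting values.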

Why Two Outputs

Aspect	`cards` (Beacon)	`llm_summary` (LLM)
Purpose	Render visualizations	Understand & reason
Size	All rows (could be 10K+)	Sample + stats (~10 rows)
Schema	Full column_meta	Full column_meta
Data	Complete rows	Sample + aggregated stats
Tokens	N/A	Optimized for context

Token efficiency (full rows → summary): 50 rows, ~2,500 → ~400 tokens (84% savings); 500 rows, ~25,000 → ~500 tokens (98%); 5,000 rows, ~250,000 → ~600 tokens (99.8%).
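
The savings percentages follow directly from the token estimates, which imply roughly 50 tokens per full row. A short check in Python, using only the figures quoted above (the function name is illustrative):

```python
def savings_pct(full_tokens, summary_tokens):
    """Percentage of tokens saved by sending the summary instead of all rows."""
    return (1 - summary_tokens / full_tokens) * 100


# (row count, approx. tokens for full rows, approx. tokens for summary)
estimates = [(50, 2_500, 400), (500, 25_000, 500), (5_000, 250_000, 600)]
for row_count, full, summary in estimates:
    print(row_count, round(savings_pct(full, summary), 1))
```

The summary size grows only slightly with row count because the sample is capped and the stats are fixed-size per column, so savings approach 100% as the card grows.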

Cortex as Source of Truth

Cortex is the authoritative source for data interpretation. Beacon reads metadata from cards; it does not guess.

Field	Purpose	Beacon Usage
`cards[0].rows`	Actual data rows	Render in tables, charts
`cards[0].column_meta`	Column roles and types	Determine compatible views
`cards[0].viz_hints.recommended`	Best visualization	Default view selection

Beacon must NOT guess. It checks card metadata instead of applying heuristics:
Not "column name contains 'date'" but column_meta.time.role === "time".
Not "has lat/lon fields" but column_meta.location.role === "geo".
Not "looks like KPI data" but viz_hints.recommended === "kpi".
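
A consumer following this rule selects columns by declared role, never by name. The sketch below is written in Python for consistency with the other examples here; the helper names and the observed_at column are hypothetical.

```python
def columns_with_role(card, role):
    """Return column names whose declared role matches, ignoring the names."""
    return [
        name
        for name, meta in card["column_meta"].items()
        if meta.get("role") == role
    ]


def default_view(card):
    """The default view comes from Cortex's recommendation, not from heuristics."""
    return card["viz_hints"]["recommended"]


card = {
    "column_meta": {
        # No "date" or "time" in the name; the declared role is what matters.
        "observed_at": {"role": "time", "type": "datetime", "format": "iso8601"},
        "value": {"role": "metric", "type": "number"},
    },
    "viz_hints": {"recommended": "line"},
}
time_columns = columns_with_role(card, "time")  # ["observed_at"]
```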

Data Normalization Requirements

Cortex must normalize all data before creating cards:
column_defs must be dicts ([{"field": "name", "type": "auto"}]), never strings.
rows is a flat list of dicts with consistent keys.
Every column in rows must have a column_meta entry with role and type.
viz_hints must always include recommended.
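
A contract check for the card-level requirements could be sketched as below. The function name validate_card and the error strings are assumptions; column_defs is left out because it does not appear in the Card schema in this spec.

```python
def validate_card(card):
    """Return a list of normalization errors; an empty list means the card passes."""
    rows = card.get("rows")
    if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
        return ["rows must be a list of dicts"]
    errors = []
    if any(set(r) != set(rows[0]) for r in rows):
        errors.append("rows must have consistent keys")
    column_meta = card.get("column_meta", {})
    for column in sorted({key for r in rows for key in r}):
        entry = column_meta.get(column)
        if not isinstance(entry, dict) or "role" not in entry or "type" not in entry:
            errors.append(f"column_meta for '{column}' must include role and type")
    if "recommended" not in card.get("viz_hints", {}):
        errors.append("viz_hints must include recommended")
    return errors


card = {
    "rows": [{"site": "Site-A", "state": "OK", "value": 42.5}],
    "column_meta": {
        "site": {"role": "dimension", "type": "string"},
        "value": {"role": "metric", "type": "number"},
    },
    "viz_hints": {"recommended": "table"},
}
errors = validate_card(card)  # flags the missing column_meta entry for "state"
```

Returning all errors at once, instead of raising on the first, suits the contract tests described under Testing Strategy, where a full list of violations per card is more useful.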

Externalized Block/Insights API

The Block and Insights surface is intended for external consumption beyond Beacon's in-platform rendering — a durable, contract-stable API that outside consumers can read against. The data unit is the Block container and its Cards (#schema); the externalized surface exposes that contract plus Insights for consumption and learning capture. This is a post-MVP capability: scoped here so the contract is captured, sequenced after the MVP block/card path.

#REQ.externalized-block-api — the Block/Card contract defined in #schema (Block container, Card data unit, and llm_summary) is the externalized payload; external consumers read the same source-agnostic, view-agnostic, self-contained Cards Beacon does — no separate or divergent external shape. The externalization realizes product.block.
#REQ.externalized-insights-api — an Insights API exposes insight objects for external consumption, governed by product.insight (lifecycle, entry context, attestation state). External reads remain subject to the same capability/profile gating and visibility boundary as the in-platform path (per #cortex.mcp-boundary in component.cortex.intelligence) — externalization changes the consumer, not the governance.
#REQ.externalized-learning-capture — the externalized surface captures learning signal from external consumption (which blocks/insights are consumed and how) for the post-MVP learning loop. Deferred to a post-MVP release; captured here as durable intent so the API contract and the learning-capture hook are designed together, not retrofitted.

Migration Path

Phase 1: Cortex emits both the old format (projections) and the new format (cards).
Phase 2: Beacon reads cards and falls back to projections.
Phase 3: Remove the old projection code from Cortex.
Phase 4: Remove the fallback from Beacon.

Testing Strategy

Unit tests for Cortex card generation per aggregation type; contract tests validating card schema against this spec; integration tests for the full flow (profile → ES → card → render); visual tests verifying each view type renders correctly.

Depends on: component.cortex.intelligence

Realizes: product.block

Required by: component.cortex.data-shape