Service Contract — Dual Interface Standard
Status: Adopted. First formal conformance audit complete. All deployed services conform with documented exceptions below.
Depends on: platform.axonis-core
Milestone: P1 (all services must conform)
Purpose
Every Axonis backend service exposes two interfaces over the same business logic: REST (OpenAPI) and MCP (Streamable HTTP). This spec defines the mandatory structure, authentication, registration, and operational requirements for both.
Service Anatomy
Every service follows this directory layout:
<service>/
    <service>/                  # Domain package, same name as the repo
        __init__.py
        ...                     # Pure-Python domain code (compute, models, ops)
    server/
        __init__.py
        __main__.py             # Starlette app: /agentspace, /api/vN, /health, /service-info
        api/
            __init__.py
            routes.py           # FastAPI REST endpoints
            schema/             # OpenAPI YAML component schemas
        mcp/
            __init__.py
            server.py           # FastMCP tools + resources
            commands.py         # Command layer (shared by REST + MCP)
    charts/<service>/           # Bitnami Helm chart
    .gitlab-ci.yml              # CI/CD pipeline
    .gitlab-ci-templates/       # Pipeline stage templates
    Dockerfile
    pyproject.toml              # uv + hatchling
Domain package naming
The top-level domain subdirectory must share its name with the repo (titan/titan/, parallax/parallax/, cortex/cortex/, oracle/oracle/, sentinel/sentinel/, beacon/beacon/). This is the repo's primary Python package — pyproject.toml's [tool.hatch.build.targets.wheel] packages lists it alongside server (and protocol for repos that own federation message types).
Documented exceptions (intentional; must not be replicated by other repos):
- axonis-core uses axonis-core/axonis/. The pip name (axonis-core) differs from the import root (axonis) so axonis can host future sub-libraries.
- xanadu uses xanadu/xanaqu/. The package name xanaqu predates the repo rename and is kept as-is to avoid churn across consumers (titan, parallax) that import xanaqu.*.
- conduit uses conduit/conduit/. The domain library (conduit) contains the Airflow client, domain classes, git client, config, and templates. server/ is a thin web wrapper that imports from conduit.*. pyproject.toml ships ["conduit", "server"].
- prism places its MCP server at mcp_server/ (repo root) rather than server/mcp/. The compute domain (lens/) is the primary library; server/ is a thin REST wrapper. The MCP server imports business logic directly from lens.governance.* modules rather than a commands.py; those modules are the command layer.
Entry Point Pattern
Every server/__main__.py must follow this pattern:
import os
from contextlib import asynccontextmanager

from starlette.applications import Starlette
from starlette.routing import Mount, Route
from starlette.responses import JSONResponse

from axonis.auth.oauth import OauthAuthentication
from server.mcp.server import mcp_app, session_manager
from server.api.routes import app as rest_app

SERVICE_NAME = "<service>"
SERVICE_VERSION = "<version>"


@asynccontextmanager
async def lifespan(_app):
    async with session_manager.run():
        yield


async def health(request):
    return JSONResponse({"status": "ok", "service": SERVICE_NAME, "version": SERVICE_VERSION})


async def service_info(request):
    return JSONResponse({
        "name": SERVICE_NAME,
        "version": SERVICE_VERSION,
        "description": "<description>",
        "mcp_path": "/agentspace",
        "health_path": "/health",
        "api_path": "/api/vN",
        "tools_count": <N>,
        "resources_count": <N>,
        "capabilities": [<list>],
    })


app = Starlette(
    routes=[
        Route("/health", health),
        Route("/service-info", service_info),
        Mount("/agentspace", app=mcp_app),
        Mount("/api/vN", app=rest_app),
    ],
    lifespan=lifespan,
)
Authentication
REST endpoints
Every REST endpoint must validate the Bearer token:
from axonis.auth.decorators import requires_auth

@router.post("/lens/execute")
@requires_auth
async def execute_lens(request, token_payload):
    ...
MCP tools
MCP tool calls arrive through Starlette middleware. The service must wrap the Starlette app with auth middleware that validates the Bearer token from the MCP request headers and injects the token payload into the request scope:
from axonis.auth.middleware import OAuthMiddleware
app = OAuthMiddleware(app) # wraps the Starlette app
Inside MCP tools, the token payload is available from the request context.
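The wrapping step described above can be sketched as a pure-ASGI middleware. This is an illustration only, not the OAuthMiddleware implementation from axonis-core: validate_token, the accepted token value, and the token_payload scope key are hypothetical names chosen for the example.

```python
import asyncio
import json
from typing import Optional


def validate_token(token: str) -> Optional[dict]:
    """Hypothetical stand-in for real token validation (not the axonis-core API)."""
    return {"sub": "demo-user"} if token == "valid-token" else None


class BearerAuthMiddleware:
    """Validate the Bearer token and place its payload in the ASGI scope."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Lifespan and other non-HTTP events pass through untouched.
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers", []))
        auth = headers.get(b"authorization", b"").decode("latin-1")
        scheme, _, token = auth.partition(" ")
        payload = validate_token(token) if scheme.lower() == "bearer" else None
        if payload is None:
            body = json.dumps({"error": "unauthorized"}).encode()
            await send({"type": "http.response.start", "status": 401,
                        "headers": [(b"content-type", b"application/json")]})
            await send({"type": "http.response.body", "body": body})
            return
        scope["token_payload"] = payload  # hypothetical scope key
        await self.app(scope, receive, send)


async def echo_app(scope, receive, send):
    """Downstream app that reads the injected payload, as an MCP tool might."""
    body = json.dumps(scope["token_payload"]).encode()
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": body})


def run_request(app, headers):
    """Drive one request through an ASGI app and return (status, body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "POST", "path": "/agentspace", "headers": headers}
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"], sent[1]["body"]
```

Wrapping echo_app with BearerAuthMiddleware and calling run_request with and without an Authorization header shows the two paths: a valid token reaches the downstream app with the payload attached, and anything else is rejected with 401 before the app runs.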
Service-to-service calls
Internal calls between services use the gateway client with a service account token:
from axonis.gateway.client import ServiceClient

parallax = ServiceClient(
    base_url=os.getenv("PARALLAX_URL", "http://parallax:8000"),
    service_token=os.getenv("SERVICE_TOKEN"),
)
result = parallax.post("/api/v2/fusion/run", body={...})
MCP Server Pattern
Every MCP server must use FastMCP with stateless HTTP transport:
import json
import os

from mcp.server.fastmcp import FastMCP
from mcp.server.transport_security import TransportSecuritySettings

from server.mcp import commands  # command layer (server/mcp/commands.py in the layout above)

domain = os.getenv("FEDERATE_DOMAIN", "localhost")

mcp = FastMCP(
    "<service>",
    stateless_http=True,
    json_response=True,
    transport_security=TransportSecuritySettings(
        enable_dns_rebinding_protection=True,
        allowed_hosts=[f"{domain}:*", f"{domain}", "localhost:*", "127.0.0.1:*"],
    ),
)

# Tools call the command layer, not Elasticsearch directly
@mcp.tool()
def some_tool(param: str) -> str:
    result = commands.some_function(param)
    return json.dumps(result, indent=2, default=str)
Invariant: MCP tools and REST endpoints must call the same command layer. No business logic in tools or routes — only in commands.py.
Documented exception — conduit: Conduit has no commands.py. Its MCP tools (server/tools/) and REST routes (server/api/routes.py) both import the same module-level singletons from the conduit library: airflow_client (Phase 1 Airflow proxy) and pipeline / pipeline_connection (Phase 2 domain). These singletons are the command layer — they encapsulate all business logic and are shared between MCP and REST. A separate commands.py indirection layer adds no value for a service that is itself a proxy.
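A minimal sketch of the invariant above, using only the standard library. The command summarize_scores and both adapter functions are hypothetical; in a real service the adapters would be a FastMCP tool and a FastAPI route, but the shape is the same: the adapters translate input and output, and the logic exists once.

```python
import json


# Command layer (would live in server/mcp/commands.py)
def summarize_scores(scores: list) -> dict:
    """Hypothetical command: all business logic lives here."""
    if not scores:
        return {"count": 0, "mean": None}
    return {"count": len(scores), "mean": sum(scores) / len(scores)}


# MCP adapter: serialises the result to a JSON string, as the tools in this spec do
def summarize_scores_tool(scores: list) -> str:
    return json.dumps(summarize_scores(scores), indent=2, default=str)


# REST adapter: unpacks the request body and returns a dict for the framework to serialise
def summarize_scores_route(payload: dict) -> dict:
    return summarize_scores(payload["scores"])
```

Because neither adapter computes anything itself, a test that compares the decoded tool output with the route output for the same input is a cheap way to detect logic drifting into one interface.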
REST Server Pattern
Every REST server uses FastAPI:
from fastapi import FastAPI

app = FastAPI(title="<Service> API", version="<version>")

@app.post("/lens/execute")
async def execute_lens(request: ExecuteLensRequest) -> ExecuteLensResponse:
    result = commands.execute_lens(request.spec, request.inputs)
    return ExecuteLensResponse(**result)
Invariant: REST endpoints use Pydantic models for request/response validation. OpenAPI schemas are auto-generated from these models and also maintained as YAML in server/api/schema/ for cross-reference.
Dual-path mounting: Services that own domain objects must mount their routers at both the new and legacy path prefixes. The Ingress layer routes both formats to the service without rewriting the path, so the service must handle both:
# server/api/routes.py — mount the same router at both prefixes
app.include_router(insight_router, prefix="/api/v1/insight")
app.include_router(insight_router, prefix="/userspace/insight") # legacy — backward compat
The object name used in both path prefixes must match the name declared in the objects field of
/service-info (lowercase, singular). Services that declare no owned objects do not mount legacy paths.
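The naming rule can be made mechanical by deriving both prefixes from the objects list. The helpers below are a hypothetical sketch, not part of axonis-core; they assume the /api/<version>/<object> and /userspace/<object> shapes shown in the example above and check only the lowercase half of the "lowercase, singular" rule.

```python
def route_prefixes(object_name: str, api_version: str = "v1") -> list:
    """Return the new and legacy mount prefixes for one owned object."""
    if not object_name or object_name != object_name.lower():
        raise ValueError(f"object name must be non-empty and lowercase: {object_name!r}")
    return [f"/api/{api_version}/{object_name}", f"/userspace/{object_name}"]


def all_prefixes(service_info: dict, api_version: str = "v1") -> list:
    """Expand every declared object into its two prefixes; no objects, no legacy paths."""
    prefixes = []
    for name in service_info.get("objects", []):
        prefixes.extend(route_prefixes(name, api_version))
    return prefixes
```

A service could loop over route_prefixes(name) when calling include_router, so the mounted paths and the /service-info declaration come from one list.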
Documented exceptions (intentional; must not be replicated by other repos):
- oracle uses OracleAuthMiddleware (from server/middleware/auth.py) instead of OAuthMiddleware from axonis-core. Oracle is a gateway that implements a full OIDC callback flow (/callback route, token exchange) in addition to Bearer validation. OAuthMiddleware only validates existing tokens and cannot handle the callback flow.
Command Layer Pattern
The command layer contains all business logic. It is imported by both MCP tools and REST routes:
# server/mcp/commands.py
from axonis.memory.service import MemoryService
from parallax.userspace.fusion import Lens, EntityMatch  # parallax owns fusion domain

_memory = MemoryService(user_id=..., conversation_id=..., profile_id=...)

def execute_lens(spec, inputs, conversation_id, user_id, **kwargs):
    result = ...
    _memory.store(f"Lens {spec['lens_id']} executed", memory_type="fact", tags=["lens", "run"])
    return result
LLM Capability (Optional)
Any service may include an optional LLM capability, configured via environment variables with the service-name prefix:
from axonis.llm.spec import LLMSpec
from axonis.llm.client import LLMClient

# In service config (e.g., cortex/cortex/core/config.py):
llm_spec = LLMSpec.from_env(prefix="CORTEX_LLM")

# In a command or executor:
if llm_spec.is_configured():
    client = LLMClient(llm_spec)
    response = await client.complete(messages)
else:
    response = fallback_implementation()
Rules:
- The {SERVICE}_LLM_PROVIDER, {SERVICE}_LLM_MODEL, {SERVICE}_LLM_API_KEY env vars configure the LLM. If unset, llm_spec.is_configured() returns False.
- Services must always implement a fallback for any LLM-dependent operation (see each service spec for its fallback).
- Services that expose a chat endpoint must use ConversationStore from axonis.memory.conversation for conversation history.
- No service "owns" LLM exclusively. Each service manages its own LLM config independently.
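The env-var rule can be illustrated with a stand-in for LLMSpec. EnvLLMSpec below is hypothetical and is not the axonis.llm.spec implementation; in particular, treating the spec as configured only when all three variables are set is an assumption, since the rule above says only that an unset configuration returns False.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EnvLLMSpec:
    """Hypothetical stand-in for LLMSpec; field names are assumptions."""
    provider: Optional[str]
    model: Optional[str]
    api_key: Optional[str]

    @classmethod
    def from_env(cls, prefix: str, env: Mapping[str, str] = os.environ) -> "EnvLLMSpec":
        # Empty strings are treated the same as unset variables.
        return cls(
            provider=env.get(f"{prefix}_PROVIDER") or None,
            model=env.get(f"{prefix}_MODEL") or None,
            api_key=env.get(f"{prefix}_API_KEY") or None,
        )

    def is_configured(self) -> bool:
        # Assumption: all three values are required.
        return all((self.provider, self.model, self.api_key))


def summarize(text: str, spec: EnvLLMSpec) -> str:
    """Pick the LLM path or the mandatory fallback."""
    if spec.is_configured():
        return f"[{spec.model}] summary requested"  # placeholder for a real LLM call
    return text[:40]  # deterministic fallback: truncate
```

Passing env explicitly keeps the sketch testable without touching the process environment; the per-service prefix (for example CORTEX_LLM) is what keeps each service's LLM config independent.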
Chat Endpoint Pattern (Optional)
Services that expose a chat interface must implement it at POST /api/vN/chat with this request/response contract:
class ChatRequest(BaseModel):
    message: str
    conversation_id: str = ""  # empty = new conversation
    model: str = "default"
    stream: bool = False  # when true, respond as text/event-stream (SSE); see below

class ChatResponse(BaseModel):
    response: str
    conversation_id: str
    tool_calls: list = []
    model_used: str = ""
    tokens: dict = {}  # {"input": N, "output": N}
This contract is identical across all services that implement chat (Oracle, Cortex, Beacon, etc.). Clients can use the same code regardless of which service they talk to.
The chat endpoint requires the llm_chat capability in the active profile. Return HTTP 503 if LLM is not configured. Return HTTP 403 if the profile lacks the capability.
Streaming (optional). When stream: true and the service supports it, respond with Content-Type: text/event-stream and emit SSE events: delta ({"text": "..."}) for incremental assistant text, tool_call / tool_result for the tool-use lifecycle, and a terminal done carrying the full ChatResponse (or error). The terminal done payload is byte-identical to the non-streaming ChatResponse, so streaming is a strict superset — a client may ignore intermediate events and read done alone. A service that does not support streaming ignores the flag and returns the normal ChatResponse (the flag is a hint, never an error). Streaming maps directly onto axonis-core's Client.stream() / StreamChunk (platform.axonis-core).
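The event sequence can be sketched with plain generators. This is an illustration under stated assumptions, not the axonis-core Client.stream() implementation: sse_event, stream_chat, and read_done are hypothetical names, the framing follows the standard event: / data: SSE line format, and tool events and token counts are omitted.

```python
import json
from typing import Iterable, Iterator


def sse_event(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame (json.dumps emits no raw newlines)."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_chat(chunks: Iterable[str], conversation_id: str, model_used: str) -> Iterator[str]:
    """Emit one delta per text chunk, then a terminal done with the full response."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        yield sse_event("delta", {"text": chunk})
    final = {
        "response": "".join(parts),
        "conversation_id": conversation_id,
        "tool_calls": [],
        "model_used": model_used,
        "tokens": {},
    }
    yield sse_event("done", final)


def read_done(frames: Iterable[str]) -> dict:
    """Client side: ignore intermediate events and return the done payload."""
    for frame in frames:
        lines = frame.strip().split("\n")
        if lines[0] == "event: done":
            return json.loads(lines[1][len("data: "):])
    raise ValueError("stream ended without a done event")
```

read_done shows why streaming is a superset: a client that understands only the non-streaming contract can skip every delta and still obtain the same ChatResponse fields from the final frame.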
Service Registration
Every service exposes GET /service-info (defined above). The oracle gateway discovers services by calling this endpoint. There is no platform-specific registration mechanism (no K8s ConfigMaps, no Consul).
The /service-info response is the registration contract. It must include:
- name — unique service identifier
- version — semantic version
- mcp_path — path to MCP mount (always /agentspace)
- health_path — path to health endpoint (always /health)
- api_path — path to REST API (e.g. /api/v1)
- tools_count — number of MCP tools
- capabilities — list of capability tags for routing
- objects — list of domain object names this service owns (lowercase, singular); used by Oracle to build
OBJECT_ROUTES and manage Ingress resources. Services that own no domain objects (e.g. Oracle itself)
emit an empty list.
Example:
{
  "name": "cortex",
  "version": "1.4.2",
  "description": "Intelligence service",
  "mcp_path": "/agentspace",
  "health_path": "/health",
  "api_path": "/api/v1",
  "tools_count": 24,
  "resources_count": 0,
  "capabilities": ["intelligence", "llm_chat"],
  "objects": ["insight", "signal", "block", "profile", "report", "task", "edition"]
}
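A conformance check for the required fields listed above might look like the following. The function and its message strings are hypothetical, and the check applies to a payload that already uses mcp_path; it is a sketch of what a service-info test could assert, not Oracle's registry logic.

```python
REQUIRED_FIELDS = {
    "name": str,
    "version": str,
    "mcp_path": str,
    "health_path": str,
    "api_path": str,
    "tools_count": int,
    "capabilities": list,
    "objects": list,
}


def validate_service_info(info: dict) -> list:
    """Return a list of human-readable problems; an empty list means conformant."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in info:
            problems.append(f"missing field: {field}")
        elif not isinstance(info[field], expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    # The two fixed paths are "always" values in the contract.
    if info.get("mcp_path", "/agentspace") != "/agentspace":
        problems.append("mcp_path must be /agentspace")
    if info.get("health_path", "/health") != "/health":
        problems.append("health_path must be /health")
    objects = info.get("objects")
    if isinstance(objects, list):
        for name in objects:
            if not isinstance(name, str) or name != name.lower():
                problems.append(f"object name must be lowercase: {name!r}")
    return problems
```

Run against the cortex example above, the function returns an empty list; removing objects or changing mcp_path yields one message per violation.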
mcp_endpoint legacy field: Some services emit mcp_endpoint: "/agentspace/mcp" instead of mcp_path: "/agentspace". Oracle's registry normalises both — it strips the trailing /mcp from mcp_endpoint to derive mcp_path. New services must use mcp_path. Existing services may keep mcp_endpoint until their next major refactor.
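The normalisation amounts to a suffix strip. The helper below is a hypothetical sketch of that rule, not Oracle's registry code; preferring mcp_path when both fields are present is an assumption.

```python
def normalise_mcp_path(info: dict) -> str:
    """Derive mcp_path from a /service-info payload, accepting the legacy field."""
    if "mcp_path" in info:
        return info["mcp_path"]  # assumption: the new field wins when both are present
    endpoint = info["mcp_endpoint"]  # KeyError if neither field is declared
    suffix = "/mcp"
    return endpoint[: -len(suffix)] if endpoint.endswith(suffix) else endpoint
```

With this rule, {"mcp_endpoint": "/agentspace/mcp"} and {"mcp_path": "/agentspace"} resolve to the same mount path.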
Elasticsearch Access
All Elasticsearch operations go through axonis-core classes:
from axonis.elastic.client import get_client
from axonis.userspace.intelligence import Insight # any UDS-backed class
# Direct ES query (rare — prefer UDS)
client = get_client()
result = client.search(index="insight", body=query)
# Preferred: UDS CRUD
insights = Insight()
all_insights = insights.read()
one_insight = insights.read(uid="insight_001")
insights.create({"title": "AML pattern detected", ...})
(Domain-specific UDS classes that depend on heavy compute live in their owning service: e.g. Lens, EntityMatch, EntityCluster are imported from parallax.userspace.fusion, not from axonis-core.)
Invariant: No service may construct its own Elasticsearch client. All access goes through axonis.elastic.
Try-Elasticsearch-first with local fallback
Elasticsearch is the preferred persistence path and a service MUST attempt it first
(through axonis.elastic / UDS as above). But ES is not guaranteed to be reachable — a
federate-role deployment (e.g. parallax running as a Customer Intelligence Node) frequently
runs with no ES at all. Such a service MUST degrade gracefully to a sanctioned local store
abstraction rather than failing.
A local store fallback (e.g. parallax's CorrelationStore SQLite backend,
component.parallax.local-persistence-adapter) is not an Elasticsearch client and does not
violate the invariant above — it is a distinct backend selected only when ES is unavailable. The
invariant constrains how ES is reached (always axonis.elastic, never a hand-rolled client); it
does not require that ES be the only persistence backend.
Rules:
- ES-first. When ES is reachable, reads and writes go through axonis.elastic / UDS. The local
store is the fallback, not the default-when-convenient.
- One interface, pluggable backend. The ES path and the local path implement the same store
interface so callers (REST controllers, MCP tools, the compute pipeline) are backend-agnostic.
- No own ES client. The fallback never reimplements ES access; it is a different store entirely.
- Backend selection is observable. The active backend is reported on /health (the issues
array names elasticsearch when ES is unavailable and the service is running on its local store).
The same principle applies to Redis: a service MUST attempt Redis first for cache / ephemeral state and MUST degrade to an in-process equivalent when Redis is unavailable, never failing solely because Redis is down (see §Health Check).
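The "one interface, pluggable backend" rule can be sketched as follows. Everything here is hypothetical: Store, EsStore, SqliteStore, and select_store are illustrative names, and EsStore is an in-memory stand-in because a real ES path would go through axonis.elastic / UDS. Only the SQLite fallback uses a real backend (the standard-library sqlite3 module).

```python
import sqlite3
from typing import Callable, Optional, Protocol


class Store(Protocol):
    """Interface that both backends implement, so callers stay backend-agnostic."""
    name: str

    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...


class EsStore:
    """In-memory stand-in for the ES path (a real one would use axonis.elastic / UDS)."""
    name = "elasticsearch"

    def __init__(self) -> None:
        self._docs = {}

    def put(self, key: str, value: str) -> None:
        self._docs[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._docs.get(key)


class SqliteStore:
    """Local fallback: a different store entirely, not a reimplemented ES client."""
    name = "sqlite"

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")

    def put(self, key: str, value: str) -> None:
        with self._db:  # commits on success, rolls back on error
            self._db.execute("INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value))

    def get(self, key: str) -> Optional[str]:
        row = self._db.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
        return row[0] if row else None


def select_store(es_reachable: Callable[[], bool]):
    """Try ES first; otherwise return the local store plus the issue to report on /health."""
    try:
        if es_reachable():
            return EsStore(), []
    except Exception:
        pass  # a failing probe is treated the same as an unreachable cluster
    return SqliteStore(), ["elasticsearch"]
```

select_store returns the issues list alongside the store so that the health endpoint can report the active backend without probing ES a second time.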
Health Check
Every service responds to GET /health with:
{"status": "ok", "service": "<name>", "version": "<version>"}
If one or more dependencies are unavailable, return:
{
  "status": "degraded",
  "service": "<name>",
  "version": "<version>",
  "issues": ["elasticsearch", "redis"]
}
HTTP 200 for ok and degraded. HTTP 503 only if the service cannot process any requests.
External-state availability degrades gracefully — a service is never 503 solely because Redis or
Elasticsearch is unavailable. This covers memory and persistence: a service that backs entity /
correlation / audit data attempts Elasticsearch first and falls back to its sanctioned local store
(§Try-Elasticsearch-first with local fallback) without erroring; a service that uses Redis attempts
it first and falls back to an in-process equivalent. The issues array in the health response
reports each unavailable dependency (elasticsearch, redis) so operators can observe which
backends are degraded.
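The status rules reduce to a small pure function. build_health is a hypothetical helper; the body returned with HTTP 503 is an assumption, since this spec defines the response shape only for ok and degraded.

```python
def build_health(service: str, version: str, issues: list, can_serve: bool = True):
    """Return (http_status, body) for GET /health."""
    if not can_serve:
        # Assumption: the 503 body shape is not defined by the contract.
        return 503, {"status": "unavailable", "service": service, "version": version}
    body = {"status": "ok", "service": service, "version": version}
    if issues:
        # Unavailable dependencies degrade the service but never change the HTTP status.
        body["status"] = "degraded"
        body["issues"] = sorted(set(issues))
    return 200, body
```

A handler would pass in the issues collected at startup or by periodic probes, for example ["elasticsearch"] when the service is running on its local store, and reserve can_serve=False for the case where no request can be processed at all.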
Helm Chart
Every service has a Bitnami-pattern Helm chart at charts/<service>/ with:
- Chart.yaml with bitnami/common 2.31.4 dependency
- values.yaml with standardized sections: <service>.config, <service>.secrets, <service>.image, probes, HPA, ingress, service account
- All 10 standard templates: _helpers.tpl, deployment.yaml, service.yaml, configmap.yaml, secret.yaml, ingress.yaml, hpa.yaml, service-account.yaml, clusterrolebinding.yaml, NOTES.txt
Known gap — conduit: No Helm chart exists. Conduit is deployed manually or via Argo CD direct manifest. Adding a standard chart is tracked as an open item.
CI/CD Pipeline
Every service has .gitlab-ci.yml with stages: qa, package, deploy, release, security.
Templates in .gitlab-ci-templates/:
- prepare-environment.yml — workflow rules, file change triggers
- qa-code-analysis.yml — lint (ruff/flake8) + test (pytest) + coverage
- package.yml — Docker build + push
- helm-release.yml — Helm chart publish
- deploy-branch.yml — branch deployment
- deploy-staging.yml — staging deployment
- deployment-setup.yml — K8s context setup
- semantic-release.yml — version bump + changelog
- security-scanner.yml — container scanning, SAST, secret detection
Cross-Cutting Requirements
Every service must:
- Extend AxonisSettings from axonis.settings for all configuration. Declare service-specific fields in a Settings(AxonisSettings) subclass. Expose config only via a @lru_cache get_settings() function. No module-level singletons, no os.getenv in service source files. See platform.service-configuration for the full pattern, naming conventions, and migration path.
- Use MemoryService from axonis.memory.service for all conversational memory operations. Pass service="<name>" when constructing: this field gates both writes (stamps the record) and reads (filters recall to records this service wrote). There is no cross-service recall through MemoryService; cross-service knowledge transfer is Apollo's responsibility (component.oracle.apollo). Do not construct Memory(UDS) directly for conversational use.
Test Requirements
Every service must have:
- Unit tests for all command layer functions
- Integration tests for MCP tool calls (mock backend)
- Integration tests for REST endpoint calls (TestClient)
- Auth tests verifying token validation on both interfaces
- Health check test
- Service-info test
Cross-cutting service requirements
These are the platform-wide MUST / MUST-NOT clauses that apply to every Axonis backend service, independent of which domain it implements. (Relocated from the former per-service implementations spec, whose per-service sections now live in each owning repo.)
Every service must
- Import axonis-core for auth, elastic, redis, UDS, userspace, schema, and memory (axonis.userspace.intelligence.Memory)
- Validate Bearer tokens on every request (REST and MCP)
- Expose /health and /service-info per platform.service-contract
- Use the command layer pattern (business logic shared by REST + MCP)
- Use MemoryService from axonis.memory.service for all conversational memory. Pass service="<name>" when constructing: this field gates both writes (stamps the record) and reads (filters recall to records this service wrote). There is no cross-service recall through MemoryService; cross-service knowledge transfer is Apollo's responsibility (component.oracle.apollo). For non-conversational bulk reads (analytics, admin), direct Memory(UDS) use is acceptable.
- Have a Bitnami Helm chart at charts/<service>/
- Have a CI/CD pipeline matching the standard template
- Use uv for dependency management with the hatchling build backend
- Use ruff for linting
No service may
- Import another service's code (use REST/MCP to communicate)
- Construct its own Elasticsearch client (use axonis.elastic). A sanctioned local store fallback used when ES is unavailable (§Try-Elasticsearch-first with local fallback) is not an ES client and is permitted.
- Skip authentication on any endpoint
- Expose itself directly outside the cluster (only oracle is exposed)
- Define Schema constants (all in axonis.schema)
Depends on: platform.axonis-core
Required by: component.beacon.ticketing, component.beacon.workbench, component.conduit.service, component.cortex.intelligence, component.geodex.operations, component.oracle.apollo, component.oracle.gateway, component.parallax.service-interface, component.postern.proxy, component.prism.service-interface, component.sentinel.alerting, component.titan.runtime, component.xanadu.messaging, platform.apollo, platform.devops-cicd, platform.ingress-routing, platform.observability, platform.service-configuration