
Testing — Real-Service Integration Standard

How every Axonis repo tests: against real services, never mocks. A test that can't reach its dependency skips with a clear reason — it never silently substitutes a double. This is the platform-wide standard; axonis-core is the reference implementation (see #reference).

Ground rules

  • #REQ.no-mocks — No mocks. No unittest.mock, MagicMock, AsyncMock, @patch, patch.dict('sys.modules', …), or hand-rolled in-memory fakes. A test exercises the real service (Redis, Elasticsearch, Keycloak, MCP, the LLM) or it skips.
  • #REQ.skip-clean — Skip, never fake. When a service is unreachable, the test skips with an explicit reason (require_*() gate). A skip is honest; a silent double is not.
  • #REQ.fix-prod-not-test — Fix production, not the test. Mocks hide real bugs. When converting a test surfaces a production defect, fix it in production code; do not paper over it in the test.
  • Enter from the earliest real wire. Integration tests take serialized input shaped like real client traffic (JSON/CSV/YAML), not a hand-built domain object fed to the consumer (see the workspace CLAUDE.md "Testing" rule). The create-test skill encodes this.
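As an illustration of the last rule, here is a minimal sketch of a test that enters from serialized input. The `Reading` type, `parse_reading()` and the payload shape are hypothetical stand-ins, not Axonis APIs. The point is only that the test starts from the JSON text a client would send, so parsing and coercion are exercised too.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical domain object; not part of any Axonis package."""
    sensor_id: str
    celsius: float


def parse_reading(raw: str) -> Reading:
    """Hypothetical consumer entry point: accepts the wire format, not a Reading."""
    doc = json.loads(raw)
    return Reading(sensor_id=str(doc["sensor_id"]), celsius=float(doc["celsius"]))


def test_reading_enters_from_wire_json():
    # The input is the serialized payload a real client would send.
    raw = '{"sensor_id": "s-17", "celsius": "21.5"}'
    reading = parse_reading(raw)
    # The string-to-float coercion is covered here; a hand-built
    # Reading("s-17", 21.5) would have bypassed it.
    assert reading == Reading(sensor_id="s-17", celsius=21.5)
```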

LLM transport is Claude Code, not the Anthropic SDK

Tests that need an LLM use a Claude Code CLI adapter (ClaudeCliProvider) that shells out to claude -p --output-format json and conforms to the axonis.llm.client.Client.complete() interface. The CLI binary is on PATH and auth-bound to the developer's Claude session — no ANTHROPIC_API_KEY. Gate with require_claude_cli(). Expect ~5–15s per call (Sonnet via Claude Code's own auth).
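A rough sketch of what such an adapter could look like is shown below. It is not the code in tests/_claude_cli.py: the `complete(prompt)` signature, the timeout, and the assumption that the CLI's JSON envelope carries the reply under a top-level `result` key (with an `is_error` flag) are all assumptions made for illustration.

```python
import json
import subprocess


def extract_text(stdout: str) -> str:
    """Pull the reply text out of the CLI's JSON envelope.

    Assumption: the envelope has a top-level "result" string and an
    optional "is_error" boolean.
    """
    envelope = json.loads(stdout)
    if envelope.get("is_error"):
        raise RuntimeError(f"claude CLI reported an error: {envelope.get('result')}")
    return envelope["result"]


class ClaudeCliProvider:
    """Sketch of a CLI-backed provider; the real complete() interface may differ."""

    def __init__(self, binary: str = "claude", timeout_s: float = 120.0):
        self.binary = binary
        self.timeout_s = timeout_s

    def complete(self, prompt: str) -> str:
        # Shell out to the CLI in print mode and ask for a JSON envelope.
        proc = subprocess.run(
            [self.binary, "-p", prompt, "--output-format", "json"],
            capture_output=True,
            text=True,
            timeout=self.timeout_s,
            check=True,  # a non-zero exit raises CalledProcessError
        )
        return extract_text(proc.stdout)
```

Keeping the envelope parsing in its own function means the parsing logic can be checked against a captured CLI response without spawning the process.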

Real-service test infrastructure

The pattern (reference impl in axonis-core/tests/):

  • #REQ.autosource-env — conftest auto-sources the dev env. conftest.py sources developers-environment/env/development.axonis.ai.env + tokens.env so tests use the same configuration surface as the running platform (platform.service-configuration). It provides per-test-namespaced redis_client and es_client / disposable_index fixtures.
  • require_*() gates (tests/_integration.py): require_redis(), require_es(), require_keycloak(), require_authenticated_user() (validates the real AUTHORIZATION token via Keycloak introspect, cached per-process), require_claude_cli(). Each skips cleanly when its dependency is unreachable.
  • Disposable indices. disposable_index_for(monkeypatch, alias) redirects a Schema alias to a fresh axonis-test-* index with production mapping JSON and auto-teardown: real ES, with no pollution of dev data. Redis tests isolate in the same spirit (e.g. ratelimit tests use Redis db 15).
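The gate pattern above can be sketched as follows. This is a minimal illustration, not the contents of tests/_integration.py: the `tcp_reachable()` helper and the `REDIS_HOST` / `REDIS_PORT` variable names are assumptions, and a real gate may well do a protocol-level ping rather than a bare TCP connect.

```python
import os
import socket

import pytest


def tcp_reachable(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:  # refused, timed out, or name resolution failed
        return False


def require_redis() -> tuple[str, int]:
    """Skip the calling test, with an explicit reason, unless Redis accepts connections.

    REDIS_HOST / REDIS_PORT are assumed variable names for this sketch.
    """
    host = os.environ.get("REDIS_HOST", "localhost")
    port = int(os.environ.get("REDIS_PORT", "6379"))
    if not tcp_reachable(host, port):
        pytest.skip(f"Redis unreachable at {host}:{port}")
    return host, port
```

A test would call `require_redis()` on its first line, so an unreachable service shows up in the pytest report as a skip with its reason rather than as a failure or a silent substitution.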

Token refresh

If require_authenticated_user() skips with the reason "Keycloak introspect failed: Inactive Authentication Session", refresh the test tokens by running utils/auth/refresh_test_tokens.py.

CI-gated destructive tests

Tests that mutate a real cluster are gated behind opt-in environment variables, so they skip locally by default and run in CI, where the runner sets the flag:

| Gate | Opt-in | Skips when |
| --- | --- | --- |
| require_k8s() | AXONIS_RUN_K8S_TESTS=true (+ valid KUBE_CONFIG) | unset, kubeconfig missing, or cluster API unreachable |
| require_airflow() | gated on TCP reach + valid AIRFLOW_HOST | the Airflow port isn't listening |

Local access to such services is via utils/forwarding/port_forward.py (e.g. an airflow-webserver forward to localhost:9443).
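A minimal sketch of the opt-in logic behind a gate like require_k8s() follows. It assumes that KUBE_CONFIG holds a filesystem path to a kubeconfig and that the flag must equal the literal string "true"; the `k8s_skip_reason()` helper is hypothetical, and the cluster-API reachability check is left out.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

import pytest


def k8s_skip_reason(env: Mapping[str, str]) -> Optional[str]:
    """Return why cluster tests should skip, or None when they may run."""
    if env.get("AXONIS_RUN_K8S_TESTS") != "true":
        return "AXONIS_RUN_K8S_TESTS is not set to 'true'"
    # Assumption: KUBE_CONFIG is a path to a kubeconfig file.
    kubeconfig = env.get("KUBE_CONFIG", "")
    if not kubeconfig or not Path(kubeconfig).is_file():
        return f"KUBE_CONFIG does not point to a file: {kubeconfig!r}"
    return None


def require_k8s() -> None:
    """Skip unless the run opted in; API reachability is not checked in this sketch."""
    reason = k8s_skip_reason(os.environ)
    if reason is not None:
        pytest.skip(reason)
```

Separating the decision (a function that returns a reason) from the skip itself keeps the default-off behavior easy to verify: with no variables set, the function always returns a reason.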

Accepted residue

A few tests may legitimately keep a unittest.mock import, documented per-file:

  • Auth tests that @patch an Authenticator.validate collaborator where real conversion would require bypassing Keycloak's roles/markings or running multiple KC users.
  • from unittest.mock import ANY used purely as a wildcard matcher (not a mock) for binary/opaque fields.

The "done" check greps for mock usage and filters the documented residue out.
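For the wildcard case, a short illustration of ANY used only as a matcher is given below. The record and its field names are made up for the example; in a real test the record would be read back from the live service.

```python
from unittest.mock import ANY  # wildcard matcher only; nothing is patched


def test_record_shape_ignores_opaque_blob():
    # Stand-in for a record read back from a real service (illustrative values).
    record = {"id": "doc-1", "checksum": b"\x8f\x02\xaa", "status": "stored"}
    # ANY compares equal to every value, so only the stable fields are pinned.
    assert record == {"id": "doc-1", "checksum": ANY, "status": "stored"}
```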

Acceptance check

# no mock usage outside the documented residue
grep -rEn "from unittest\.mock|MagicMock|@patch|AsyncMock|\bMock\(" tests/ \
  | grep -v __pycache__ | grep -vE "<documented-residue-files>"
# (no output)

# suite runs: every test passes or skips with a clear reason
python -m pytest tests/ -q

Reference implementation — axonis-core

axonis-core is the first repo fully converted to this standard. Concrete artifacts: tests/conftest.py, tests/_integration.py, tests/_claude_cli.py.

Converting its suite off mocks surfaced four production bugs that were fixed in production code (do not revert):

| File | Bug | Fix |
| --- | --- | --- |
| axonis/redis/client.py | Client.delete() passed a list to hdel (expects positional str) → DataError; hidden by a patched hdel. | super().hdel(self.namespace, key) |
| axonis/memory/store.py | _get_redis() ignored REDIS_TLS/REDIS_VERIFY/REDIS_USERNAME → silently failed against the TLS dev Redis. | added ssl_kwargs gated on REDIS_TLS + username lookup |
| axonis/middleware/ratelimit.py | same TLS/auth gap in _get_redis(). | same fix |
| axonis/memory/service.py | same TLS/auth gap in Service._get_redis(). | same fix |
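The shape of the TLS/auth fix applied to the _get_redis() helpers can be sketched roughly as follows. This is not the production code: the `redis_kwargs_from_env()` helper, the accepted truthy spellings, and the reading of REDIS_VERIFY as a certificate-verification toggle that defaults to on are assumptions. The keyword names (`ssl`, `ssl_cert_reqs`, `username`) are redis-py connection parameters.

```python
from typing import Any, Mapping


def _truthy(value: str) -> bool:
    """Interpret common truthy spellings of an environment variable."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def redis_kwargs_from_env(env: Mapping[str, str]) -> dict[str, Any]:
    """Build redis-py connection kwargs; TLS options appear only when REDIS_TLS is set."""
    kwargs: dict[str, Any] = {}
    if _truthy(env.get("REDIS_TLS", "")):
        kwargs["ssl"] = True
        # Assumption: REDIS_VERIFY toggles certificate verification, default on.
        verify = _truthy(env.get("REDIS_VERIFY", "true"))
        kwargs["ssl_cert_reqs"] = "required" if verify else "none"
    username = env.get("REDIS_USERNAME")
    if username:
        kwargs["username"] = username
    return kwargs
```

The resulting dictionary would be splatted into the client constructor alongside host and port. With a mocked Redis, omitting these kwargs changes nothing observable, which is why the gap only appeared once tests connected to the TLS dev instance.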

These bugs were invisible while the tests mocked Redis — the strongest argument for the no-mocks rule. New repos adopt the axonis-core/tests/ infrastructure as the template.


Depends on: platform.service-configuration