Testing — Real-Service Integration Standard
How every Axonis repo tests: against real services, never mocks. A test that can't reach its
dependency skips with a clear reason — it never silently substitutes a double. This is the
platform-wide standard; axonis-core is the reference implementation (see #reference).
Ground rules
- #REQ.no-mocks: No mocks. No unittest.mock, MagicMock, AsyncMock, @patch, patch.dict('sys.modules', …), or hand-rolled in-memory fakes. A test exercises the real service (Redis, Elasticsearch, Keycloak, MCP, the LLM) or it skips.
- #REQ.skip-clean: Skip, never fake. When a service is unreachable, the test skips with an explicit reason (require_*() gate). A skip is honest; a silent double is not.
- #REQ.fix-prod-not-test: Fix production, not the test. Mocks hide real bugs. When converting a test surfaces a production defect, fix it in production code; do not paper over it in the test.
- Enter from the earliest real wire. Integration tests take serialized input shaped like real
client traffic (JSON/CSV/YAML), not a hand-built domain object fed to the consumer (see the
workspace CLAUDE.md "Testing" rule). The create-test skill encodes this.
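
A minimal sketch of the earliest-real-wire rule follows. The Order type, parse_order, total_units, and the payload shape are hypothetical and not taken from any Axonis repo; the only point is that the test starts from serialized bytes, so decoding and validation run along with the consumer.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:  # hypothetical domain object, for illustration only
    order_id: str
    quantity: int


def parse_order(raw: bytes) -> Order:
    """Hypothetical wire entry point: decode JSON as a client would send it."""
    doc = json.loads(raw.decode("utf-8"))
    quantity = int(doc["quantity"])
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return Order(order_id=str(doc["order_id"]), quantity=quantity)


def total_units(orders: list[Order]) -> int:
    """Hypothetical consumer under test."""
    return sum(order.quantity for order in orders)


def test_total_units_from_wire() -> None:
    # Start from serialized traffic, not from Order(...) built by hand,
    # so decoding and validation are exercised together with the consumer.
    wire = [
        b'{"order_id": "a-1", "quantity": "2"}',
        b'{"order_id": "a-2", "quantity": 3}',
    ]
    assert total_units([parse_order(raw) for raw in wire]) == 5
```

A test that built Order("a-1", 2) directly would never notice that a client sends the quantity as a string, which is the kind of gap this rule is meant to close.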
LLM transport is Claude Code, not the Anthropic SDK
Tests that need an LLM use a Claude Code CLI adapter (ClaudeCliProvider) that shells out to
claude -p --output-format json and conforms to the axonis.llm.client.Client.complete()
interface. The CLI binary is on PATH and auth-bound to the developer's Claude session — no
ANTHROPIC_API_KEY. Gate with require_claude_cli(). Expect ~5–15s per call (Sonnet via Claude
Code's own auth).
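
The adapter idea can be sketched roughly as below. This is not the actual ClaudeCliProvider: the class name CliProviderSketch, the complete(prompt) signature, the timeout value, and the assumption that the CLI's JSON output carries the reply under a "result" key are all illustrative and should be checked against tests/_claude_cli.py and the CLI version in use.

```python
import json
import shutil
import subprocess


def claude_cli_available() -> bool:
    """True when a `claude` binary is on PATH (the kind of check a gate might do)."""
    return shutil.which("claude") is not None


def parse_cli_result(stdout: str) -> str:
    """Extract the reply text from the CLI's JSON output.

    Assumption: the JSON object carries the reply under a "result" key.
    """
    payload = json.loads(stdout)
    if not isinstance(payload, dict) or "result" not in payload:
        raise ValueError("unexpected CLI output: no 'result' field")
    return payload["result"]


class CliProviderSketch:
    """Illustrative stand-in for ClaudeCliProvider; the real class may differ."""

    def __init__(self, timeout_s: float = 60.0) -> None:
        self.timeout_s = timeout_s

    def complete(self, prompt: str) -> str:
        # Signature is an assumption; the real Client.complete() may take more arguments.
        proc = subprocess.run(
            ["claude", "-p", prompt, "--output-format", "json"],
            capture_output=True,
            text=True,
            timeout=self.timeout_s,
            check=True,  # a non-zero exit raises CalledProcessError
        )
        return parse_cli_result(proc.stdout)
```

Keeping the JSON parsing in its own function means the slow CLI call is the only part that needs the gate; given the ~5–15s latency, a generous timeout is a reasonable default.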
Real-service test infrastructure
The pattern (reference impl in axonis-core/tests/):
- #REQ.autosource-env: conftest auto-sources the dev env. conftest.py sources developers-environment/env/development.axonis.ai.env + tokens.env so tests use the same configuration surface as the running platform (platform.service-configuration). It provides per-test-namespaced redis_client and es_client/disposable_index fixtures.
- require_*() gates (tests/_integration.py): require_redis(), require_es(), require_keycloak(), require_authenticated_user() (validates the real AUTHORIZATION token via Keycloak introspect, cached per-process), require_claude_cli(). Each skips cleanly when its dependency is unreachable.
- Disposable indices. disposable_index_for(monkeypatch, alias) redirects a Schema alias to a fresh axonis-test-* index with production mapping JSON and auto-teardown: real ES, no dev keyspace pollution (e.g. ratelimit tests use Redis db 15).
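
The require_*() gates described above might look something like the following. This is a stdlib-only sketch, not the code in tests/_integration.py: tcp_reachable and require_tcp_service are hypothetical names, and a plain TCP connect is a weaker check than the real gates presumably perform (for example an authenticated ping).

```python
import socket
import unittest


def tcp_reachable(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def require_tcp_service(name: str, host: str, port: int) -> None:
    """Hypothetical gate: skip the calling test, with a reason, if the service is down.

    Raises unittest.SkipTest to stay stdlib-only; a pytest suite would more
    likely call pytest.skip(reason) here.
    """
    if not tcp_reachable(host, port):
        raise unittest.SkipTest(f"{name} unreachable at {host}:{port}")
```

A test would call the gate as its first statement, so an unreachable dependency produces a named skip instead of a failure or a substituted double.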
Token refresh
If require_authenticated_user() skips with the reason "Keycloak introspect failed: Inactive
Authentication Session", refresh the test tokens with utils/auth/refresh_test_tokens.py.
CI-gated destructive tests
Tests that mutate a real cluster are gated behind opt-in env so they skip locally by default and run in CI where the runner sets the flag:
| Gate | Opt-in | Skips when |
|---|---|---|
| require_k8s() | AXONIS_RUN_K8S_TESTS=true (+ valid KUBE_CONFIG) | unset, kubeconfig missing, or cluster API unreachable |
| require_airflow() | gated on TCP reach + valid AIRFLOW_HOST | the Airflow port isn't listening |
Local access to such services is via utils/forwarding/port_forward.py (e.g. an
airflow-webserver forward to localhost:9443).
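
The opt-in logic for a destructive gate can be sketched as a function that returns a skip reason. This is an illustration under assumptions, not the real require_k8s(): the name k8s_skip_reason is hypothetical, KUBE_CONFIG is assumed to hold a path to a kubeconfig file, the flag is compared case-sensitively, and the cluster API reachability check is left out.

```python
import os
from typing import Mapping, Optional


def k8s_skip_reason(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return why cluster tests should skip, or None when they may run.

    Assumption: KUBE_CONFIG is a filesystem path. The cluster API
    reachability check from the table is not part of this sketch.
    """
    if env.get("AXONIS_RUN_K8S_TESTS", "") != "true":
        return "AXONIS_RUN_K8S_TESTS is not 'true'; destructive cluster tests are opt-in"
    kube_config = env.get("KUBE_CONFIG", "")
    if not kube_config or not os.path.isfile(kube_config):
        return "KUBE_CONFIG is unset or does not point to a file"
    return None
```

Because the default is to skip, a developer who never sets the flag cannot mutate a cluster by accident; only a CI runner that sets it deliberately runs these tests.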
Accepted residue
A few tests may legitimately keep a unittest.mock import, documented per-file:
- Auth tests that @patch an Authenticator.validate collaborator where real conversion would
require bypassing Keycloak's roles/markings or running multiple KC users.
- from unittest.mock import ANY used purely as a wildcard matcher (not a mock) for binary/opaque
fields.
The "done" check greps for mock usage and filters the documented residue out.
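
For the second kind of residue, a small example shows why ANY is a matcher and not a mock: nothing is patched, and the code under test runs for real. The issue_ticket function is hypothetical.

```python
import os
from unittest.mock import ANY  # wildcard matcher only; nothing is patched


def issue_ticket(user: str) -> dict:
    """Hypothetical function whose record includes an opaque random token."""
    return {"user": user, "token": os.urandom(16)}


def test_issue_ticket_shape() -> None:
    ticket = issue_ticket("alice")
    # ANY compares equal to every value, so only the opaque field is wildcarded.
    assert ticket == {"user": "alice", "token": ANY}
    assert len(ticket["token"]) == 16
```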
Acceptance check
# no mock usage outside the documented residue
grep -rEn "from unittest\.mock|MagicMock|@patch|AsyncMock|\bMock\(" tests/ \
| grep -v __pycache__ | grep -vE "<documented-residue-files>"
# (no output)
# suite runs: every test passes or skips with a clear reason
python -m pytest tests/ -q
Reference implementation — axonis-core
axonis-core is the first repo fully converted to this standard. Concrete artifacts:
tests/conftest.py, tests/_integration.py, tests/_claude_cli.py.
Converting its suite off mocks surfaced four production bugs that were fixed in production code (do not revert):
| File | Bug | Fix |
|---|---|---|
| axonis/redis/client.py | Client.delete() passed a list to hdel (expects positional str) → DataError; hidden by a patched hdel. | super().hdel(self.namespace, key) |
| axonis/memory/store.py | _get_redis() ignored REDIS_TLS/REDIS_VERIFY/REDIS_USERNAME → silently failed against the TLS dev Redis. | added ssl_kwargs gated on REDIS_TLS + username lookup |
| axonis/middleware/ratelimit.py | same TLS/auth gap in _get_redis(). | same fix |
| axonis/memory/service.py | same TLS/auth gap in Service._get_redis(). | same fix |
These bugs were invisible while the tests mocked Redis — the strongest argument for the no-mocks
rule. New repos adopt the axonis-core/tests/ infrastructure as the template.
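
To make the TLS/auth gap concrete, here is a rough sketch of environment-driven connection kwargs. It is not the code that was merged: the function name redis_connection_kwargs, the set of accepted truthy strings, and the choice to verify certificates when REDIS_VERIFY is unset are assumptions. The keys ssl, ssl_cert_reqs, and username are redis-py connection parameters.

```python
import os
from typing import Any, Mapping


def _truthy(value: str) -> bool:
    """Interpret common boolean-like strings (assumed convention)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def redis_connection_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Build redis-py connection kwargs from the environment.

    Assumptions: REDIS_TLS and REDIS_VERIFY are boolean-like strings, and
    certificate verification stays on when REDIS_VERIFY is unset.
    """
    kwargs: dict[str, Any] = {}
    username = env.get("REDIS_USERNAME")
    if username:
        kwargs["username"] = username
    if _truthy(env.get("REDIS_TLS", "")):
        kwargs["ssl"] = True
        verify = _truthy(env.get("REDIS_VERIFY", "true"))
        kwargs["ssl_cert_reqs"] = "required" if verify else "none"
    return kwargs
```

The result would be splatted into the client constructor, for example redis.Redis(host=..., port=..., **redis_connection_kwargs()). A test with a patched client never opens a socket, so it cannot tell whether these kwargs were passed at all; a test against the TLS dev Redis fails immediately when they are missing.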
Depends on: platform.service-configuration