Model

Definition

A Model is the untrained model definition — the architecture plus the source code/notebook/files that define a model before any training run. It is the template a TrainedModel realizes: it names an architecture (e.g. HeteroSecureBoost, TorchClassifier), from which the platform derives the model's type, library, and framework, and it carries the source payload that gets pushed to git storage. The Model itself is never "trained" — it is the immutable-ish source definition; training produces a separate TrainedModel that points back at it.

Defined in axonis-core/axonis/ml_userspace/model.py:9 (class Model(UDS), alias Schema.MODEL). Persistence: the Elasticsearch userspace index (metadata only, subtype = "model"), plus a per-uid Gitea git repo holding the source, where branch = version. (Note: an unrelated class Model(Enum) in rest/server/api/models/objects.py:706 is a codegen artifact, not this object.)

Lifecycle

created (definition recorded; source pushed to a git branch, default main) → referenced by zero or more TrainedModels → deleted. No status enum — a Model is the template, not a run. On create, the decoded source files are pushed to Gitea (model.py:24, storage.push_decoded_source_model); a framework = 'pretrained' model skips the source push (model.py:23).
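The create step above can be sketched as follows. This is a minimal illustration, not the code in model.py: FakeStorage, the REPOS dict, create_model, and the "files" key are hypothetical stand-ins, and only the names get_storage and push_decoded_source_model come from the text.

```python
REPOS = {}  # hypothetical in-memory stand-in for Gitea: (uid, branch) -> files


class FakeStorage:
    """Stand-in for the real storage object; it only mimics the push call."""

    def __init__(self, uid, branch):
        self.uid = uid
        self.branch = branch

    def push_decoded_source_model(self, obj):
        # "files" is an assumed key holding already-decoded source files.
        REPOS[(self.uid, self.branch)] = dict(obj.get("files", {}))


def get_storage(uid, branch):
    return FakeStorage(uid, branch)


def create_model(obj):
    """Record the definition; push source unless framework is 'pretrained'."""
    branch = obj.get("branch", "main")  # default branch per the text
    if obj.get("framework") != "pretrained":
        get_storage(obj["uid"], branch).push_decoded_source_model(obj)
    # Return metadata only: the source payload is stripped after the push.
    metadata = {k: v for k, v in obj.items() if k != "files"}
    metadata["branch"] = branch
    return metadata
```

Under these assumptions, a model with framework = 'pretrained' yields a metadata record but no entry in REPOS, which mirrors the skip described above.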

Journey through the code

A REST create with target=model lands in the rest userspace. Because model is not in the rest USERSPACE map (commands.py:63), the REST process falls through to generic UDS CRUD; the rich Model class, with its git-push logic, runs inside the axonis-core / titan context after federation propagation. create resolves type from MODEL_TYPE_MAP[definition] (constants.py:17), writes the ES record, then calls get_storage(uid, branch).push_decoded_source_model(obj) to write the source to Gitea. create_from_federate (model.py:43) is the inter-federate path and requires a notebook_uid or model_file. The consumer is titan's predictor/training code, which reads the Model to realize a TrainedModel.
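The fall-through behaviour in the REST process can be pictured with a small dispatch sketch. Everything here is an assumption for illustration: the real USERSPACE map's contents and handler classes are not shown in this page, so RegisteredHandler, the "some_registered_target" key, and resolve_handler are made up.

```python
class GenericUDS:
    """Hypothetical generic CRUD handler: records metadata, nothing more."""

    def create(self, payload):
        return {**payload, "handled_by": type(self).__name__}


class RegisteredHandler(GenericUDS):
    """Hypothetical specialised handler for a target that is in the map."""


# Assumed shape of the map; "model" is deliberately absent, as in the text.
USERSPACE = {"some_registered_target": RegisteredHandler}


def resolve_handler(target):
    # Targets missing from the map fall back to the generic handler.
    return USERSPACE.get(target, GenericUDS)()
```

With this shape, resolve_handler("model") returns the generic handler, so no git push would happen in the REST process itself.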

Data shape

  • definition: architecture name, key into MODEL_TYPE_MAP.
  • type / library / framework: derived from the map (constants.py:18), e.g. HeteroSecureBoost → {tabular_classification, federated, federated}, TorchClassifier → {tabular_classification, pytorch, simple}.
  • branch: git branch, default main.
  • files_bin / model_file / notebook_uid: the source payload, stripped from the response after the push.

Storage: ES userspace index (metadata) + Gitea repo per uid, branch = version.
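As a rough sketch of how the derived fields could be filled from the map: the two entries below use the values quoted above, but the dict-of-dicts layout and the helper fill_derived_fields are assumptions, not the actual contents of constants.py.

```python
# Assumed layout; only the two example entries come from the text.
MODEL_TYPE_MAP = {
    "HeteroSecureBoost": {
        "type": "tabular_classification",
        "library": "federated",
        "framework": "federated",
    },
    "TorchClassifier": {
        "type": "tabular_classification",
        "library": "pytorch",
        "framework": "simple",
    },
}


def fill_derived_fields(record):
    """Return a copy of record with absent derived fields filled from the map."""
    derived = MODEL_TYPE_MAP.get(record.get("definition"), {})
    out = dict(record)
    for key, value in derived.items():
        # setdefault leaves fields the caller supplied untouched.
        out.setdefault(key, value)
    return out
```

In this sketch an unknown definition leaves the record unchanged, and an explicitly supplied type is never overwritten.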

Invariants

  • Non-pretrained Models always push source to git storage; create_from_federate requires model content (model.py:56).
  • type is auto-filled from MODEL_TYPE_MAP when definition is set and type is absent.

Relationships

  • product.trained-model — a TrainedModel references a Model (source['model']) and copies its library / framework / type; the Model is the config, the TrainedModel is the run plus the resulting weights.
  • product.dataset — a training run pairs a Model (architecture) with a Dataset (data).
  • product.fusion — federated fusion can reference titan-trained models (rankers, LLM adapters) by registry key.
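The TrainedModel link described above can be illustrated with a short sketch. It assumes the reference in source['model'] is the Model's uid and that records are plain dicts; the helper new_trained_model is hypothetical.

```python
def new_trained_model(model):
    """Build a TrainedModel-like record that points back at its Model."""
    return {
        # Assumption: the back-reference stores the Model's uid.
        "source": {"model": model["uid"]},
        # Copied from the Model, as the relationship describes.
        "library": model["library"],
        "framework": model["framework"],
        "type": model["type"],
    }
```

The Model record itself is not modified here, which matches the idea that training produces a separate object.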

Open questions

  • Source-of-truth location: like Dataset/TrainedModel, the real Model class ships in the published axonis-core wheel, not a tracked local branch; the canonical path needs pinning.
  • REST vs federate split: model is absent from the rest USERSPACE map, so the rich create/push logic executes only after federation propagation lands on a federate. Is this intentional (REST = thin proxy, federate = executor) or a gap?

Realized by: component.titan.runtime