Seldon Core 2.10 Install Guide: Enterprise ML Serving

Deploy Seldon Core 2.10 for enterprise ML serving on Kubernetes. Real Helm commands, Kafka gotchas, the SSE4.2 CPU trap, and verification steps.

Jordan West · 2026-05-27 · 8 min read · Intermediate

You have twelve models, three teams pushing updates weekly, and infra costs that scale linearly with how many pods you spin up. This is the wall every ML platform team hits around model number ten – and it’s the exact problem enterprise ML serving with Seldon Core 2 is built to solve. According to the official Seldon Core 2 documentation, it’s an MLOps and LLMOps framework for deploying, managing and scaling AI systems in Kubernetes – from singular models to modular, data-centric applications, in a standardized way across a wide range of model types, on-prem or in any cloud.

This guide covers the 2.10.0 release – focused on scalability, usability and bugfixes, as noted in the official release notes. Not the legacy v1 path (which Seldon is winding down). The v2 stack that actually gets installed in production today.

What you’re actually installing

Core 2 is not a single binary. It’s an operator, a runtime, a set of inference servers, and optionally Kafka glue for pipelines. Four Helm charts get installed: CRDs, the setup chart (operator), the runtime, and example servers (mlserver-0 and triton-0).

Before anything else: Core 2 ships under the Business Source License (BSL 1.1), not Apache 2.0 – confirmed in both the GitHub repo and PyPI listing. If your procurement team is allergic to BSL, find that out now, before you spend three days deploying it. The license question trips up more enterprise installs than any Helm misconfiguration.

Worth pausing on that for a second. BSL 1.1 means the software is source-available but commercial use restrictions apply until the “change date” – after which it typically converts to an open-source license. What does that mean for a team running it internally? Probably nothing. For a vendor embedding it in a product? Entirely different conversation. The honest answer is: check with your legal team before the cluster is built, not after.

System requirements (read this before anything else)

The official prerequisites list looks short. The CPU one is the trap.

Requirement	Minimum	Notes
Kubernetes	Check chart compatibility	Verify against the specific chart version you’re installing
Helm	3.x	Required, not optional
CPU instruction set	SSE4.2 / x86-64-v2	Silently breaks installs – see errors section
Per-server resources	1 CPU, 2Gi RAM	Defaults for both mlserver and triton in components-values.yaml (as of 2.10)
Kafka	Optional	Only if you need Pipelines

Download and prepare

Everything ships via Helm. Add the chart repo and create the two namespaces you’ll need:

helm repo add seldon-charts https://seldonio.github.io/helm-charts
helm repo update seldon-charts

kubectl create ns seldon-system
kubectl create ns seldon-mesh

The split matters. The operator lives in seldon-system. Your actual model-serving runtime lives in seldon-mesh (or whatever you name it). Unlike Seldon Core v1’s global installation model, Core 2 is installed in the namespace where you want the inference multi-model servers to run – a deliberate design choice that makes multi-team isolation much cleaner.

Install Seldon Core 2.10 – the four-chart sequence

Order matters. CRDs first, then operator, then runtime, then servers. Reverse it and you’ll watch the operator fail to find resources it expects.

# 1. CRDs (cluster-scoped)
helm upgrade seldon-core-v2-crds seldon-charts/seldon-core-v2-crds \
  --namespace default --install

# 2. Operator (cluster-wide mode), installed into seldon-system per the namespace split above
helm upgrade seldon-core-v2-setup seldon-charts/seldon-core-v2-setup \
  --namespace seldon-system \
  --set controller.clusterwide=true \
  --install

# 3. Runtime
helm upgrade seldon-core-v2-runtime seldon-charts/seldon-core-v2-runtime \
  --namespace seldon-mesh --install

# 4. Example servers (mlserver-0 and triton-0)
helm upgrade seldon-core-v2-servers seldon-charts/seldon-core-v2-servers \
  --namespace seldon-mesh --install

One flag on the operator: cluster-wide installation is only available from version 2.6.0 onwards (per the Seldon Enterprise Platform install docs). On something older? You’re stuck installing per-namespace and re-running the setup chart for every model namespace. That’s the single biggest reason teams upgrade from 2.5.

The Kafka decision: Kafka is only required for Pipelines – the Kafka-backed dataflow between models. If you don’t use Pipelines, skip seldon-modelgateway, seldon-pipelinegateway, and seldon-dataflow-engine entirely. Installing those three without Kafka leaves the dataflow engine pod in a crash loop. Most install guides don’t flag this distinction at all.
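Before switching anything off, it helps to see which toggles your chart version actually exposes. A minimal sketch, assuming the seldon-charts repo alias added earlier: it prints the values stanzas for the three Pipeline components from the runtime chart. The helper name show_pipeline_toggles is invented for this guide, and the stanza names in the pattern (modelgateway, pipelinegateway, dataflow) are an assumption about how the chart lays out its values. If nothing prints, read the full helm show values output instead.

```shell
# Hypothetical helper: print the runtime chart's value stanzas for the three
# Pipeline components, so you can see how your chart version lets you turn
# them off. The stanza names in the pattern are assumptions, not guarantees.
show_pipeline_toggles() {
  helm show values seldon-charts/seldon-core-v2-runtime \
    | grep -E -A3 '^(modelgateway|pipelinegateway|dataflow):'
}

# Usage:
#   show_pipeline_toggles
```

Whatever keys show up there are the ones to pass via --set or a values file when you run helm upgrade on the runtime chart.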

First-time configuration: the minimal components-values.yaml

The defaults work as a starting point. The trimmed version that actually matters for a non-Kafka install:

controller:
  clusterwide: true
envoy:
  service:
    type: ClusterIP
scheduler:
  service:
    type: ClusterIP
serverConfig:
  mlserver:
    resources:
      cpu: 1
      memory: 2Gi
  triton:
    resources:
      cpu: 1
      memory: 2Gi

Pipelines needed? Add Strimzi. The docs name it as the recommended operator for self-hosted Kafka in dev/test environments. Production is a different story – Confluent Cloud, MSK, or another managed Kafka solution handles the operational burden that Strimzi inside the same cluster can’t.

Verify the install

Three checks. Skip any and you’ll find out things are broken at the worst possible moment.

Pods are running: kubectl get pods -n seldon-mesh – you want seldon-scheduler, seldon-envoy, and your server pods (mlserver-0, triton-0) all in Running state. The seldon-v2-controller-manager pod shows up in whichever namespace received the setup chart, so run the same command there if it’s missing from this list. If any pod is in Pending or CrashLoopBackOff, the errors section below covers the three most common causes.
CRDs registered: kubectl get crd | grep seldon – should list models.mlops.seldon.io, servers.mlops.seldon.io, pipelines.mlops.seldon.io, experiments.mlops.seldon.io, plus seldonruntimes and seldonconfigs.
Scheduler healthy: kubectl logs -n seldon-mesh statefulset/seldon-scheduler – clean startup looks like timestamped lines with no FATAL or Kafka connection errors. If you’re not running Pipelines and you see Kafka errors anyway, a gateway component got installed that shouldn’t have been.
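The first two checks are easy to script. A rough sketch, not an official health check: two small filters (both names invented here) that read kubectl output on stdin. The first flags any pod whose STATUS column is not Running; the second reports which of the CRDs listed above are absent. It assumes seldonruntimes and seldonconfigs live in the same mlops.seldon.io group as the other four.

```shell
# Print "<pod> <status>" for every pod that is not Running.
# Reads `kubectl get pods --no-headers` output on stdin (STATUS is column 3).
not_running() {
  awk '$3 != "Running" { print $1, $3 }'
}

# Print each expected Seldon CRD that is missing from the list on stdin.
# Reads `kubectl get crd -o name` output, one resource per line.
missing_crds() {
  crd_list="$(cat)"
  for crd in models servers pipelines experiments seldonruntimes seldonconfigs; do
    printf '%s\n' "$crd_list" | grep -q "/${crd}\.mlops\.seldon\.io\$" \
      || echo "${crd}.mlops.seldon.io"
  done
}

# Usage against a live cluster:
#   kubectl get pods -n seldon-mesh --no-headers | not_running
#   kubectl get crd -o name | missing_crds
```

Empty output from both means the check passed. A pod that ran to completion would also be flagged, which is fine for this stack since every pod here is meant to stay up.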

Apply a tiny Model CR pointing at any public model URI. If the model reaches Available within a minute or two, you’re done.
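For reference, a tiny Model CR can be generated like this. Treat it as a sketch: model_manifest is a made-up helper, the storage URI is deliberately left as a parameter because the right public sample depends on your MLServer or Triton version, and the default sklearn requirement assumes an MLServer-backed model. The kubectl wait line in the usage comment follows the pattern in Seldon's own examples, but the condition name is worth confirming against your release.

```shell
# Hypothetical helper: emit a minimal Model manifest on stdout.
# $1 = model name
# $2 = storage URI of the model artifact (your choice; none is assumed here)
# $3 = server capability the model requires (defaults to sklearn)
model_manifest() {
  cat <<EOF
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: $1
spec:
  storageUri: "$2"
  requirements:
  - ${3:-sklearn}
EOF
}

# Usage:
#   model_manifest iris "<public-model-uri>" | kubectl apply -n seldon-mesh -f -
#   kubectl wait --for condition=ready --timeout=300s model/iris -n seldon-mesh
```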

Common install errors and what they actually mean

Dataflow engine in CrashLoopBackOff. Almost always Kafka. Either Kafka isn’t reachable at the bootstrap address configured in your values file, or you installed the Pipeline components but never set up Kafka at all. Fix: install Strimzi, or remove the three Pipeline components (seldon-modelgateway, seldon-pipelinegateway, and seldon-dataflow-engine).

Pods stuck Pending – no clear error. Resource shortage is the usual suspect. But if node events look fine and pods still won’t start, the CPU instruction set is the next thing to check. Run lscpu | grep sse4_2 on the node. As of the 2.10 docs, Core 2 requires SSE4.2 / x86-64-v2 – older VM types and certain cloud instance families don’t have it. The failure mode is silent: Triton and MLServer pods simply never come up cleanly, with no error message that points at the CPU.
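If you’d rather script the CPU check than eyeball lscpu output, here is a minimal sketch. It reads the kernel’s CPU flags from /proc/cpuinfo on a Linux node; has_sse42 is a name made up for this guide. One caveat: it only tests the sse4_2 flag, so a pass does not prove the full x86-64-v2 feature level.

```shell
# Return success if the CPU flags in the given file include sse4_2.
# $1 = path to a cpuinfo-style file (defaults to /proc/cpuinfo on Linux).
has_sse42() {
  # -w matches sse4_2 as a whole word, so sse4_1 alone does not count.
  grep -qw 'sse4_2' "${1:-/proc/cpuinfo}"
}

# Usage, run on the node itself:
#   has_sse42 && echo "SSE4.2 present" || echo "SSE4.2 missing"
```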

Pipelines stuck in PipelineTerminated after a network blip. This is a documented bug in 2.10.0. When a network partition occurs between the dataflow-engine and Kafka, and the dataflow-engine restarts, pipelines can end up marked as PipelineTerminated with the message “pipeline removed” – even after connectivity is restored. The workaround: delete any Pipeline in that state and re-deploy the same manifest. Annoying but fast.
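The delete-and-redeploy workaround is two kubectl calls. A sketch under stated assumptions: redeploy_pipeline is a hypothetical helper, seldon-mesh is assumed as the default namespace, and the KUBECTL override exists only so you can preview the commands with KUBECTL=echo before touching the cluster.

```shell
# Hypothetical helper for the PipelineTerminated workaround:
# delete the stuck Pipeline, then re-apply the same manifest.
# $1 = pipeline name, $2 = manifest path, $3 = namespace (default seldon-mesh)
redeploy_pipeline() {
  name="$1"
  manifest="$2"
  ns="${3:-seldon-mesh}"
  # The fully qualified resource name avoids clashing with other CRDs
  # that also call themselves "pipeline".
  ${KUBECTL:-kubectl} delete pipelines.mlops.seldon.io "$name" -n "$ns" || return 1
  ${KUBECTL:-kubectl} apply -n "$ns" -f "$manifest"
}

# Usage:
#   KUBECTL=echo redeploy_pipeline my-pipeline my-pipeline.yaml   # preview only
#   redeploy_pipeline my-pipeline my-pipeline.yaml
```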

There’s a pattern across all three: most Seldon Core 2 failures come from missing dependencies, not broken ones. Kafka, the right CPU, the right install order. Get those right and most failure modes disappear.

Upgrading and uninstalling

Moving from 2.9 to 2.10? The 2.10.0 release notes flag one thing worth reading before you bump: new scaling configuration options landed under config.ScalingConfig in SeldonConfig. Set them now if they’re relevant to your workload – don’t wait until the next upgrade cycle.

Upgrade order mirrors install – CRDs first, then setup, runtime, servers. Each chart with helm upgrade --install.
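The ordered upgrade can be wrapped in a small loop. This is a sketch, not Seldon’s tooling: upgrade_core2, SETUP_NS, and the HELM override are all invented here. It assumes chart versions track the release number (check with helm search repo seldon-charts --versions) and that the setup release lives in seldon-system. If yours was installed elsewhere, set SETUP_NS to that namespace, because upgrading into a different namespace creates a second release instead of upgrading the first.

```shell
# Hypothetical helper: upgrade (or install) the four charts in order.
# $1 = chart version to pin via Helm's --version flag.
# SETUP_NS must match the namespace your setup release already lives in.
# HELM can be overridden (e.g. HELM=echo) to preview the commands.
upgrade_core2() {
  version="$1"
  setup_ns="${SETUP_NS:-seldon-system}"
  for pair in \
    "seldon-core-v2-crds:default" \
    "seldon-core-v2-setup:${setup_ns}" \
    "seldon-core-v2-runtime:seldon-mesh" \
    "seldon-core-v2-servers:seldon-mesh"
  do
    chart="${pair%%:*}"   # text before the colon
    ns="${pair##*:}"      # text after the colon
    extra=""
    # Pass the cluster-wide flag again so the upgrade matches the install.
    if [ "$chart" = "seldon-core-v2-setup" ]; then
      extra="--set controller.clusterwide=true"
    fi
    # $extra is intentionally unquoted so it splits into two arguments.
    ${HELM:-helm} upgrade "$chart" "seldon-charts/$chart" \
      --namespace "$ns" --version "$version" $extra --install || return 1
  done
}

# Usage:
#   HELM=echo upgrade_core2 2.10.0   # preview only
#   upgrade_core2 2.10.0
```

The loop stops at the first failing chart, which is what you want here: pushing on to the runtime after a failed CRD upgrade only produces more confusing errors.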

To uninstall cleanly:

helm uninstall seldon-core-v2-servers -n seldon-mesh
helm uninstall seldon-core-v2-runtime -n seldon-mesh
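# the setup release lives in the operator's namespace (seldon-system in the split described earlier)
helm uninstall seldon-core-v2-setup -n seldon-system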
helm uninstall seldon-core-v2-crds -n default
kubectl delete ns seldon-mesh seldon-system

Strimzi and any PodMonitors for Prometheus need separate cleanup. The Helm charts won’t touch them.

For deeper config – service meshes, observability, autoscaling tuning – the official Seldon Core 2 docs, the production environment install page, and the release notes are the only sources worth trusting. Third-party tutorials lag the actual chart values by months.

FAQ

Do I really need Kafka to run Seldon Core 2?

No. Kafka is only required for Pipelines. Skip seldon-modelgateway, seldon-pipelinegateway, and seldon-dataflow-engine and you have a fully working multi-model serving stack – no Kafka anywhere in the picture.

Can I run this in a namespace-scoped mode instead of cluster-wide?

Yes. Namespace-scoped mode works on every Core 2 release; it’s cluster-wide mode that only exists from 2.6.0 onwards, and before that namespace-scoped was the only option. The tradeoff: namespace-scoped means re-running the setup chart for every namespace where you want inference servers – manageable for one team, tedious for three. Cluster-wide mode was introduced specifically to eliminate that overhead, which is why it’s the default recommendation in current docs (as of 2.10).

What’s the realistic minimum cluster size for production?

The official defaults are 1 CPU and 2Gi RAM per server pod (mlserver-0 and triton-0 each), plus whatever the operator, scheduler, and envoy consume. Two server pods alone are 2 CPU / 4Gi before any model is loaded. The actual constraint is usually memory – Triton in particular, because model weights live in RAM. Multi-model serving helps here: Core 2’s overcommit feature lets you deploy more models than available memory technically allows, loading and unloading them as traffic demands. That’s the main reason to choose Core 2 over a simpler serving stack when your model count grows past a handful.

Next step: deploy a real model. Pick the smallest sklearn or ONNX artifact you have, write a Model CR pointing at its URI, and apply it. If it reaches Available, you’re ready to start migrating workloads.