Architecture
XTrinode has a control plane that reconciles runtime intent and a query plane that routes SQL traffic to the right Trino coordinator.
Architecture Diagram
Component Overview
| Component | Responsibility |
|---|---|
| Operator | Reconciles XTrinode and XTrinodeCatalog resources into Kubernetes resources. |
| API server | Coordinates runtime lifecycle operations such as resume, suspend, status, and control-plane actions. |
| Gateway | Routes Trino client traffic by hostname, X-Trino-XTrinode header, shared pool, or default route. |
| KEDA | Optionally scales worker pools from configured metrics. |
| Cluster API providers | Optionally create per-runtime cloud node pools. |
| XTrinodeCatalog | Keeps catalog declarations separate from runtime lifecycle. |
Query Plane
The gateway is the Trino-facing entrypoint. It can sit behind an ingress for external traffic or be reached as an internal Kubernetes service. It is not an Ingress controller; cloud load balancers, DNS, and TLS edge policy stay outside the gateway.
It provides:
- hostname-based routing;
- X-Trino-XTrinode header routing;
- default-route fallback when no explicit selector is provided;
- shared-pool load balancing;
- sticky query routing;
- optional authentication and rate limiting;
- active health checks and circuit breaking;
- auto-resume when a selected backend is paused or unavailable.
The operator writes route entries to the trino-gateway-routes ConfigMap in
the gateway namespace. Health and metrics endpoints bypass gateway auth and
rate limiting so platform probes keep working.
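The selection order can be sketched as follows. This is only an illustration: the precedence (hostname, then header, then shared pool, then default route) is assumed from the order in which the routing options are listed, and the `RouteTable` class, its field names, and the round-robin balancing are hypothetical rather than the gateway's real schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RouteTable:
    # Hypothetical in-memory view of the routes; not the real ConfigMap schema.
    by_host: dict[str, str] = field(default_factory=dict)
    by_runtime: dict[str, str] = field(default_factory=dict)
    shared_pool: list[str] = field(default_factory=list)
    default_backend: Optional[str] = None
    _next: int = 0  # round-robin cursor for the shared pool

    def select(self, host: str, headers: dict[str, str]) -> Optional[str]:
        # 1. Hostname match.
        if host in self.by_host:
            return self.by_host[host]
        # 2. Explicit runtime selector header. Real HTTP headers are
        #    case-insensitive; this sketch does an exact-key lookup.
        runtime = headers.get("X-Trino-XTrinode")
        if runtime in self.by_runtime:
            return self.by_runtime[runtime]
        # 3. Round-robin over the shared pool, if one is configured (assumed policy).
        if self.shared_pool:
            backend = self.shared_pool[self._next % len(self.shared_pool)]
            self._next += 1
            return backend
        # 4. Default route, which may be None when nothing is configured.
        return self.default_backend
```

Sticky query routing, authentication, and health checks are deliberately left out to keep the selection logic visible.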
Control Plane
The operator is the reconciliation owner. It resolves catalogs, applies runtime guardrails, renders Trino resources, configures scaling, registers gateway routes, evaluates lifecycle state, and updates status.
The API server handles lifecycle requests that require coordination. Resume
requests use Kubernetes Lease objects so many simultaneous first queries do
not stampede the control plane.
When a gateway resume call wins the lease, the API server records resume intent on the runtime and returns retry guidance. If another caller already holds the lease, later callers get retry guidance instead of triggering another resume.
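A minimal sketch of the lease gating idea, using an in-memory stand-in instead of real Kubernetes Lease objects. The `ResumeGate` class, the response fields, the 30-second TTL, and the 5-second retry hint are all assumptions made for illustration; a real implementation would rely on the Kubernetes API's create and update conflict semantics rather than a local dictionary.

```python
import time
from dataclasses import dataclass


@dataclass
class Lease:
    holder: str
    expires_at: float


class ResumeGate:
    """In-memory stand-in for per-runtime Lease objects (illustrative only)."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.leases: dict[str, Lease] = {}
        self.resume_intents: list[str] = []  # runtimes with recorded resume intent

    def request_resume(self, runtime: str, caller: str) -> dict:
        now = self.clock()
        lease = self.leases.get(runtime)
        if lease is None or lease.expires_at <= now:
            # This caller wins the lease: record resume intent once.
            self.leases[runtime] = Lease(caller, now + self.ttl)
            self.resume_intents.append(runtime)
            return {"resume_started": True, "retry_after_seconds": 5}
        # Another caller holds the lease: no second resume, only retry guidance.
        return {"resume_started": False, "retry_after_seconds": 5}
```

With five near-simultaneous callers for the same runtime, only the first call records resume intent; the other four receive retry guidance alone.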
Runtime Lifecycle
Runtimes move through a declarative lifecycle:
Pending -> Reconciling -> Ready -> Suspending -> Suspended -> Resuming -> Reconciling -> Ready
The expanded operating model also includes route draining, runtime readiness, resume leases, and finalizer-backed cleanup.
| State | Meaning |
|---|---|
| Pending | A runtime resource exists and needs initial reconciliation. |
| Reconciling | Kubernetes resources, catalogs, routes, and scaling objects are being applied. |
| Ready | The coordinator is reachable and the gateway can send new queries. |
| Suspending | The controller is enforcing suspended invariants. |
| Suspended | Intent or idle policy has paused compute. |
| Resuming | Demand or an explicit command is bringing compute back. |
| Error | Reconciliation or lifecycle control failed and needs operator attention. |
Deletion is handled by finalizers and route cleanup, but it is not a
status.phase value.
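The lifecycle path can be expressed as a small transition table. This sketch encodes only the path given above; treating Error as reachable from any other phase is an assumption, and recovery out of Error is not specified here, so it is deliberately not modelled. The `Phase` enum and `can_transition` helper are hypothetical names.

```python
from enum import Enum


class Phase(str, Enum):
    PENDING = "Pending"
    RECONCILING = "Reconciling"
    READY = "Ready"
    SUSPENDING = "Suspending"
    SUSPENDED = "Suspended"
    RESUMING = "Resuming"
    ERROR = "Error"


# Transitions taken directly from the documented lifecycle path.
HAPPY_PATH = {
    Phase.PENDING: {Phase.RECONCILING},
    Phase.RECONCILING: {Phase.READY},
    Phase.READY: {Phase.SUSPENDING},
    Phase.SUSPENDING: {Phase.SUSPENDED},
    Phase.SUSPENDED: {Phase.RESUMING},
    Phase.RESUMING: {Phase.RECONCILING},
}


def can_transition(src: Phase, dst: Phase) -> bool:
    # Assumption: any non-Error phase may fail into Error.
    if dst is Phase.ERROR:
        return src is not Phase.ERROR
    return dst in HAPPY_PATH.get(src, set())
```

Note that a suspended runtime does not jump straight to Ready: it passes through Resuming and Reconciling first.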
Gateway route state is more specific than the high-level runtime phase:
| Route state | Meaning |
|---|---|
| RUNNING | New queries can be routed to the backend. |
| RESUMING | Resume is in progress; clients should retry later. |
| PAUSED | Compute is intentionally unavailable. |
| DRAINING | Existing sticky queries can continue, but new queries should avoid the backend. |
| REMOVED | The backend has been deregistered. |
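As a rough illustration, a gateway could map each route state to a per-request action along these lines. The action strings and the `decide` function are hypothetical; mapping PAUSED to a resume request follows the auto-resume behaviour described for the query plane, but the exact handling is an assumption.

```python
from enum import Enum


class RouteState(Enum):
    RUNNING = "RUNNING"
    RESUMING = "RESUMING"
    PAUSED = "PAUSED"
    DRAINING = "DRAINING"
    REMOVED = "REMOVED"


def decide(state: RouteState, is_sticky_followup: bool) -> str:
    """Return a hypothetical action name for one incoming request."""
    if state is RouteState.RUNNING:
        return "forward"
    if state is RouteState.DRAINING:
        # Requests that belong to an existing query may finish on this backend;
        # new queries should go elsewhere.
        return "forward" if is_sticky_followup else "pick_another_backend"
    if state is RouteState.RESUMING:
        return "retry_later"
    if state is RouteState.PAUSED:
        # Assumed: the gateway asks the API server to resume, then tells the
        # client to retry.
        return "request_resume"
    return "no_backend"  # REMOVED
```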
Runtime Reconciliation
For create and update operations, the operator:
- Reads the XTrinode spec.
- Resolves selected XTrinodeCatalog resources.
- Extracts secret references for catalog credentials.
- Applies namespace guardrails before runtime resources.
- Waits for requested node-pool readiness before scheduling Trino pods.
- Applies services, config maps, service accounts, coordinator, workers, and optional monitoring.
- Applies wake TTL and fixed-worker or KEDA scaling resources.
- Publishes gateway route state as RESUMING until runtime readiness passes, then switches the route to RUNNING and updates status.
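The ordering of these steps can be sketched as a simple pipeline. The step names, the handler dictionary, and the mapping of any failure to an Error result are assumptions for illustration; the real operator is a Kubernetes controller and is not structured this literally.

```python
from typing import Callable

# Hypothetical step names, in the order the steps are listed above.
STEP_ORDER = [
    "read_spec",
    "resolve_catalogs",
    "extract_secret_refs",
    "apply_namespace_guardrails",
    "wait_for_node_pool",
    "apply_runtime_resources",
    "apply_scaling",
]


def reconcile(
    handlers: dict[str, Callable[[], None]],
    is_ready: Callable[[], bool],
    publish_route: Callable[[str], None],
) -> str:
    """Run every step in order and return the resulting phase name."""
    for name in STEP_ORDER:
        try:
            handlers[name]()
        except Exception:
            # Assumed: a failed step surfaces as the Error phase.
            return "Error"
    # The route stays RESUMING until readiness passes.
    publish_route("RESUMING")
    if not is_ready():
        return "Reconciling"
    publish_route("RUNNING")
    return "Ready"
```

Calling `reconcile` again after readiness passes publishes RUNNING, which mirrors how a controller converges over repeated reconcile loops.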
Catalog Flow
Catalogs are separated from runtimes so data-source definitions can be reused across teams and compute units.
Secret -> XTrinodeCatalog -> catalog ConfigMap with secret-backed placeholders -> selected by XTrinode runtime -> mounted into Trino coordinator and workers
Scaling Model
XTrinode supports three worker modes:
- fixed worker counts for predictable small deployments;
- KEDA-managed worker pools for dynamic runtime capacity;
- native HPA-managed workers through the privileged valuesOverlay.server.autoscaling escape hatch.
KEDA can react to metrics, including Prometheus-backed query pressure when that integration is configured. Gateway-observed query pressure is useful for scale-from-zero because worker metrics do not exist while the worker count is zero.
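The scale-from-zero point can be illustrated with a simplified calculation. This is not KEDA's actual algorithm: the function names, the choice of pending queries as the signal, and the ceiling-division target are assumptions used to show why a gateway-side signal is needed when no workers exist.

```python
import math
from typing import Optional


def pick_signal(
    current_workers: int,
    worker_metric: Optional[float],
    gateway_pending: float,
) -> float:
    # With zero workers there is nothing to scrape, so fall back to the
    # gateway-observed query pressure.
    if current_workers == 0 or worker_metric is None:
        return gateway_pending
    return worker_metric


def desired_workers(
    signal: float,
    target_per_worker: float,
    min_workers: int,
    max_workers: int,
) -> int:
    # Enough workers to keep the per-worker load at or below the target,
    # clamped to the configured bounds.
    wanted = math.ceil(signal / target_per_worker)
    return max(min_workers, min(max_workers, wanted))
```

For example, three pending queries observed at the gateway with a target of two per worker would request two workers, even though no worker metric exists yet.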
Failure Boundaries
| Failure or condition | Expected containment |
|---|---|
| One runtime overloads | Gateway routing, worker limits, backend state, and namespace resources isolate other runtimes. |
| First query hits a suspended runtime | Gateway asks the API server to resume and returns retry guidance. |
| Many clients trigger resume together | API server lease gating lets one resume operation win. |
| Runtime is not ready yet | Operator keeps the backend out of normal routing until readiness passes. |
| Cloud node pool cannot provision | XTrinode status and Kubernetes events expose scheduling and provider failures. |
| Gateway route ConfigMap has invalid YAML or no valid entries | Gateway keeps the last-good in-memory routes instead of replacing them with invalid state. |
| Delete is interrupted | Finalizers and route deregistration let reconciliation resume cleanup. |
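The last-good behaviour for the route ConfigMap can be sketched like this. The `RouteStore` class and the name-to-URL entry shape are hypothetical, and `json.loads` stands in for the YAML parser only so that the example runs with the standard library.

```python
import json
from typing import Callable


class RouteStore:
    """Keeps the last successfully loaded routes in memory (illustrative only)."""

    def __init__(self, parse: Callable[[str], object]):
        self.parse = parse
        self.routes: dict[str, str] = {}

    def reload(self, raw: str) -> bool:
        try:
            doc = self.parse(raw)
        except Exception:
            return False  # unparseable document: keep last-good routes
        if not isinstance(doc, dict):
            return False
        # Keep only entries that look like name -> non-empty backend string.
        valid = {
            k: v for k, v in doc.items()
            if isinstance(k, str) and isinstance(v, str) and v
        }
        if not valid:
            return False  # no valid entries: keep last-good routes
        self.routes = valid
        return True


store = RouteStore(parse=json.loads)  # JSON stands in for the YAML parser
store.reload('{"team-a": "http://coordinator-a:8080"}')
store.reload("{not valid")  # rejected; previous routes stay active
store.reload("{}")          # no valid entries; previous routes stay active
```

The key design choice is that a reload either replaces the whole table with a validated one or changes nothing, so a bad edit cannot empty the routing table.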
Platform Namespaces
A common deployment separates:
- xtrinode-system for the operator and API server;
- xtrinode-gateway for the query gateway;
- team namespaces for XTrinode, XTrinodeCatalog, secrets, runtime pods, KEDA objects, and optional node-pool ownership objects.