Getting Started

Start with the path that matches what you are trying to prove.

An XTrinode resource represents a Trino compute runtime. It is not a new SQL engine. It is the platform-owned identity around a Trino coordinator, workers, catalog selection, routing, lifecycle, scaling, and status.

The practical model is:

  • platform teams install the control plane once;
  • workload teams request named runtimes;
  • the operator reconciles Kubernetes resources;
  • the gateway gives users a stable Trino endpoint;
  • the API server coordinates lifecycle actions such as resume and suspend.

Use the local k3d and Tilt workflow from the main XTrinode repository when you want to see the operator, gateway, API server, KEDA, Redis, and a real Trino deployment working together.

The local path is best for:

  • understanding reconciliation behavior;
  • testing gateway routing and resume behavior;
  • iterating on operator or gateway code;
  • validating example manifests before moving to cloud infrastructure.

Use the umbrella Helm chart to install the control-plane components and custom resource definitions into Kubernetes:

  • xtrinode-operator for reconciliation;
  • xtrinode-api-server for lifecycle operations;
  • xtrinode-gateway for Trino query routing;
  • CRDs for XTrinode and XTrinodeCatalog;
  • optional KEDA resources.

Keep chart versions and component image versions aligned for the release you deploy. Control-plane images are published through GHCR. See Versioning and Packaging.

Create an XTrinode custom resource in the namespace owned by the team or workload. Start with a small runtime and add catalogs, routing, KEDA, or a node-pool spec when those boundaries are needed.

apiVersion: analytics.xtrinode.io/v1
kind: XTrinode
metadata:
  name: analytics
  namespace: team-a
spec:
  size: s
  minWorkers: 0
  maxWorkers: 2
  suspended: false
  routing:
    hostnameDomain: trino.example.com
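
A minimal sketch of applying and inspecting that manifest with kubectl, assuming it is saved to a local file. The function names are hypothetical, and the resource name `xtrinode` assumes the CRD registers the lowercase kind as a kubectl name. Patching `spec.suspended` directly is also an assumption: the API server coordinates suspend and resume, so check whether your installation expects lifecycle changes to go through it instead.

```shell
# Hypothetical helper: build a JSON merge patch that sets spec.suspended.
suspend_patch() {
  printf '{"spec":{"suspended":%s}}\n' "$1"
}

# Apply the manifest, then print the resource the operator will reconcile.
# Defined but not invoked here; needs kubectl access to the cluster.
apply_runtime() {
  manifest="$1"   # path to the saved XTrinode manifest
  kubectl apply -f "$manifest"
  kubectl get xtrinode analytics --namespace team-a --output yaml
}

# Assumption: toggling spec.suspended with a merge patch is acceptable
# in your installation. Pass "true" to suspend or "false" to resume.
set_suspended() {
  kubectl patch xtrinode analytics --namespace team-a \
    --type merge --patch "$(suspend_patch "$1")"
}

# Usage (requires a cluster with the control plane installed):
#   apply_runtime analytics.yaml
#   set_suspended true
```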

The size field selects the starting machine and pod-resource profile. Use it as the first sizing decision before reaching for low-level overrides.

Common first choices:

| Workload shape | Runtime shape |
| --- | --- |
| Small ad hoc SQL | size: s, fixed low worker count |
| Intermittent analytics | minWorkers: 0, KEDA query scaling |
| Team-isolated production runtime | namespace boundary, catalog selector, dedicated route |
| Cost or blast-radius isolation | optional spec.nodePool |

Define XTrinodeCatalog resources for the data sources a team should query, then select those catalogs from the runtime. Secrets remain Kubernetes secrets; catalog resources reference them instead of embedding credentials.

spec:
  catalogSelector:
    matchLabels:
      team: analytics

Connect through the gateway by hostname, X-Trino-XTrinode header, or a default route. Dedicated runtimes normally get dedicated routing groups. Shared pools can load-balance multiple runtime backends.

Hostname routing is the cleanest user experience when you can create DNS names:

spec:
  routing:
    hostnameDomain: trino.example.com

That gives a stable runtime hostname such as:

trino --server https://team-a--analytics.trino.example.com

The generated hostname uses the routing group. For a dedicated runtime without an explicit routingGroup, the operator derives namespace--name.
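
As a rough illustration of that naming rule, the shell sketch below derives the hostname for a dedicated runtime. The function name `runtime_hostname` is hypothetical, and the assumption that the routing group is simply prefixed to `hostnameDomain` is inferred from the example hostname above, not from the operator's code.

```shell
# Hypothetical helper: derive the hostname for a dedicated runtime
# that has no explicit routingGroup.
# Assumes the form "<namespace>--<name>.<hostnameDomain>".
runtime_hostname() {
  namespace="$1"
  name="$2"
  domain="$3"
  printf '%s--%s.%s\n' "$namespace" "$name" "$domain"
}

runtime_hostname team-a analytics trino.example.com
# prints: team-a--analytics.trino.example.com
```

This can help when pre-creating DNS records or certificates for a runtime before it exists.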

Header routing is useful when several runtimes share the same gateway hostname:

spec:
  routing:
    header: X-Trino-XTrinode=team-a/analytics

Then connect through the shared gateway endpoint:

trino \
  --server https://trino-gateway.example.com \
  --http-header "X-Trino-XTrinode: team-a/analytics"
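
Clients that speak the Trino HTTP protocol directly can send the same header. The sketch below uses curl against the `/v1/statement` endpoint of the Trino client REST API. The helper names and the user `alice` are hypothetical, and it assumes the gateway forwards unauthenticated requests, which may not hold in your environment.

```shell
# Hypothetical helper: build the routing header for a runtime,
# assuming the value has the form "<namespace>/<name>".
routing_header() {
  printf 'X-Trino-XTrinode: %s/%s\n' "$1" "$2"
}

# Submit a statement through the shared gateway.
# Defined but not invoked here; needs a reachable gateway.
submit_query() {
  gateway_url="$1"   # e.g. https://trino-gateway.example.com
  sql="$2"
  curl --silent --show-error \
    --request POST "${gateway_url}/v1/statement" \
    --header "X-Trino-User: alice" \
    --header "$(routing_header team-a analytics)" \
    --data "$sql"
}

# Usage:
#   submit_query https://trino-gateway.example.com 'SELECT 1'
```

The response is a JSON document. While the query is still running it includes a `nextUri` field that the client polls for further results.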

Shared pools use one routingGroup across multiple runtimes:

spec:
  routing:
    routingGroup: shared-analytics

Use shared pools when several compatible runtimes can load-balance query traffic. Use dedicated routing when a team or workload needs clear ownership, isolation, or troubleshooting boundaries.

Choose a deployment path based on current platform maturity:

  • use GCP/GKE/CAPG when you want the fully exercised checked-in cloud path;
  • use generic Kubernetes when you already have a cluster and registry access;
  • use AWS or Azure when you are ready to work with the provider-specific Terraform and registry/deploy material while live CAPA/CAPZ parity continues to mature.

| Path | Use when | Start here |
| --- | --- | --- |
| Local k3d/Tilt | You want to validate operator, gateway, API server, KEDA, Redis, and real Trino behavior quickly. | Main repository local development workflow |
| Generic Kubernetes | You already have cluster, registry, ingress, DNS, and Helm ownership. | Deployment |
| GCP/GKE/CAPG | You want the fully exercised cloud path with Terraform, GKE, Artifact Registry, CAPG, runtime node pools, and KEDA/resume smoke coverage. | GCP |
| AWS/EKS/CAPA | You want the experimental provider-validation path with checked-in Terraform, registry/deploy material, and tested provider resource generation. | AWS |
| Azure/AKS/CAPZ | You want the experimental provider-validation path with checked-in Terraform, registry/deploy material, and tested provider resource generation. | Azure |

Current component image tags are listed in Versioning.

For a first GCP deployment from the main repository, use the ordered from-scratch sequence:

make gcp-management-up
make gcp-images-push
make gcp-control-plane-deploy

make gcp-management-up creates the GKE management cluster first, configures kubectl, then applies the Kubernetes and cloud add-ons. If the GKE infrastructure already exists and images are already pushed, make deploy-gcp is fine for redeploying the control plane.

Start with fixed workers, then enable KEDA once you have Prometheus or another scaling signal ready.

spec:
  minWorkers: 0
  maxWorkers: 8
  keda:
    enabled: true
    scalerType: prometheus
    scalingMetric: query
    threshold: "1"
    prometheusServer: http://prometheus-operated.monitoring.svc.cluster.local:9090

When you need node-level isolation or chargeback on GCP, add a managed node pool spec. The runtime remains the user-facing unit; the node pool is just the capacity boundary behind it.

spec:
  size: s
  nodePool:
    name: analytics-nodes
    provider: gcp
    providerMode: managed
    clusterName: xtrinode-gke-test
    kubernetesVersion: v1.35.3
    minNodes: 0
    maxNodes: 10
    gcp:
      machineType: e2-standard-4

Read Scaling when choosing between fixed workers, KEDA query scaling, scale-from-zero, warm floors, and node-pool behavior. Read Sizing and overrides before using valuesOverlay or changing provider machine types.