GCP
GCP/GKE/CAPG is the fully exercised checked-in cloud integration path for the current release line. It includes Terraform, deployment automation, CAPG bootstrap, managed node-pool smoke coverage, and KEDA/resume smoke coverage. AWS/EKS and Azure/AKS also have Terraform and registry/deploy material, but they remain experimental provider-validation paths until live CAPA/CAPZ parity is proven.
What The GCP Path Covers
The GCP path combines:
- Terraform for base infrastructure;
- GKE for the Kubernetes runtime;
- Artifact Registry when a private image mirror is required;
- GHCR images for the public XTrinode component image tags;
- CAPI/CAPG for managed GKE cluster and node-pool integration;
- Helm for operator, API server, gateway, and CRD installation.
IAM Bootstrap
The checked-in GCP path currently uses the already-authenticated gcloud
account for sandbox provider validation. Use a project-admin or Owner-level
account only in a disposable project, and replace it with a dedicated
least-privilege service account when the provider path is hardened.
Do not commit credentials, generated kubeconfigs, Terraform state, or
terraform.tfvars.
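One way to enforce this rule locally is a small pre-commit check. The sketch below is not part of the repository: the function names and the exact file patterns are assumptions chosen to match the file types listed above (`application_default_credentials.json` is the file name gcloud uses for application-default credentials).

```shell
# Hypothetical helper: returns 0 when a path looks like one of the file types
# that must stay out of version control, 1 otherwise.
is_sensitive_path() {
  case "$(basename "$1")" in
    terraform.tfvars|*.tfstate|*.tfstate.*|*kubeconfig*|application_default_credentials.json)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Reads paths on stdin (one per line), prints the sensitive ones, and
# returns non-zero if any were found.
check_staged_paths() {
  found=0
  while IFS= read -r path; do
    if is_sensitive_path "$path"; then
      echo "refusing to commit: $path"
      found=1
    fi
  done
  return "$found"
}
```

From a Git hook it could be fed with `git diff --cached --name-only | check_staged_paths`. A `.gitignore` entry for the same patterns is the simpler first line of defense; this check only catches files that were force-added.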
Configure local auth:
export GCP_PROJECT_ID="<YOUR_PROJECT_ID>"
gcloud auth login
gcloud auth application-default login
gcloud config set project "$GCP_PROJECT_ID"
Enable the APIs used by Terraform:
gcloud services enable \
  container.googleapis.com \
  sqladmin.googleapis.com \
  artifactregistry.googleapis.com \
  servicenetworking.googleapis.com \
  compute.googleapis.com
From-Scratch GCP Deploy
Terraform prepares the base environment:
VPC -> subnet -> firewall rules -> NAT -> optional database services -> Artifact Registry -> IAM
For a first deploy from scratch, use the main repository’s ordered GCP targets:
make gcp-management-up
make gcp-images-push
make gcp-control-plane-deploy
make gcp-management-up wraps the safer two-phase Terraform flow: create the
GKE management cluster first, configure kubectl, then apply the Kubernetes and
cloud add-ons. Use the lower-level Terraform targets only for Terraform-only
debugging.
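Because the three first-deploy targets depend on each other, it can help to run them from one wrapper that stops at the first failure. The following is a minimal sketch, not a script from the repository: the function name `gcp_first_deploy` and the `MAKE_BIN` override are hypothetical, and only the target names come from the text above.

```shell
# Runs the first-deploy targets in order and stops at the first failure.
# MAKE_BIN is a hypothetical override so the sequence can be dry-run
# (for example MAKE_BIN=true) without touching any cloud resources.
gcp_first_deploy() {
  for target in gcp-management-up gcp-images-push gcp-control-plane-deploy; do
    echo "==> $target"
    "${MAKE_BIN:-make}" "$target" || {
      echo "target failed: $target" >&2
      return 1
    }
  done
}
```

Calling `gcp_first_deploy` from the repository root runs the real targets. Stopping early matters here because pushing images or deploying the control plane against a half-created management cluster tends to produce misleading errors.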
For private clusters, obtain Kubernetes credentials from a location that can reach the control plane: Cloud Shell, a bastion, a VPN, or a host covered by an authorized-networks entry.
gcloud container clusters get-credentials xtrinode-gke-test \
  --zone us-central1-a \
  --project "$GCP_PROJECT_ID"
Images
Current component image tags are listed in Versioning. If your GCP organization requires private regional images, mirror those tags into Artifact Registry and configure the Helm image registry values accordingly.
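To illustrate what mirroring a tag involves, the sketch below maps a source image reference to an Artifact Registry reference, which follows the form `LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG`. The function names, the use of the `docker` CLI, and every example value (organization, repository, tag) are assumptions for illustration; the repository's own image target may work differently.

```shell
# Hypothetical helper: prints the Artifact Registry reference for a source
# image. Everything up to the last "/" of the source reference is dropped,
# so only IMAGE:TAG is carried over.
ar_image_ref() {
  src=$1 location=$2 project=$3 repo=$4
  echo "${location}-docker.pkg.dev/${project}/${repo}/${src##*/}"
}

# Hypothetical helper: pulls, retags, and pushes one image with the docker CLI.
# Assumes docker is already authenticated against both registries.
mirror_image() {
  dst=$(ar_image_ref "$1" "$2" "$3" "$4")
  docker pull "$1" &&
    docker tag "$1" "$dst" &&
    docker push "$dst"
}
```

For example, `ar_image_ref ghcr.io/example-org/operator:v1.2.3 us-central1 my-project xtrinode` prints `us-central1-docker.pkg.dev/my-project/xtrinode/operator:v1.2.3`. The registry host in the Helm values would then point at the `us-central1-docker.pkg.dev/my-project/xtrinode` prefix.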
The main repository provides a Makefile target for the Artifact Registry image path:
make gcp-images-push
If the GKE infrastructure already exists and images are already pushed,
deploy-gcp is the shorter control-plane redeploy path:
make deploy-gcp
CAPI/CAPG Flow
CAPI/CAPG can create the GKE workload cluster and provider-managed node pools after the GCP control plane is deployed. The high-level sequence is:
- Bring up the GCP management cluster and XTrinode control plane.
- Bootstrap CAPI/CAPG management components.
- Create the CAPG-managed GKE workload cluster.
- Run the managed node-pool smoke path.
- Inspect the CAPG workload cluster nodes.
CAPG GKE support uses the EXP_CAPG_GKE=true feature gate.
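A missing feature gate is easy to overlook, so a small guard before the bootstrap step can fail early. This is a sketch under assumptions: the function name is hypothetical, and whether the repository's make targets already set the gate themselves is not stated here.

```shell
# Hypothetical guard: returns non-zero unless the CAPG GKE feature gate is
# set to "true" in the current environment.
require_capg_gke_gate() {
  if [ "${EXP_CAPG_GKE:-}" != "true" ]; then
    echo "EXP_CAPG_GKE must be set to true before bootstrapping CAPG" >&2
    return 1
  fi
}
```

A typical use would be `export EXP_CAPG_GKE=true` followed by `require_capg_gke_gate && make gcp-capg-management-up`.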
After the GCP control plane is deployed, the CAPG validation flow uses:
make gcp-capg-management-up
make gcp-capg-workload-up
make gcp-capg-nodepool-smoke
make gcp-capg-workload-nodes
Runtime Node Pool Example
apiVersion: analytics.xtrinode.io/v1
kind: XTrinode
metadata:
  name: analytics
  namespace: team-a
spec:
  size: s
  minWorkers: 0
  maxWorkers: 8
  nodePool:
    name: analytics-nodes
    provider: gcp
    providerMode: managed
    clusterName: xtrinode-gke-test
    kubernetesVersion: v1.35.3
    minNodes: 0
    maxNodes: 10
    nodeLabels:
      xtrinode.io/runtime: analytics
      xtrinode.io/node-pool: analytics-nodes
    gcp:
      machineType: e2-standard-4
The operator creates the provider node-pool resources. CAPG and the cloud provider create the actual nodes inside the existing GCP network. Cluster Autoscaler then scales the node pool based on pod scheduling demand.
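To watch the autoscaler add and remove nodes for one pool, the labels from the manifest can be turned into a selector. The sketch below assumes the `nodeLabels` values end up on the created nodes; the function names are hypothetical.

```shell
# Builds a label selector from the two nodeLabels used in the example manifest.
node_pool_selector() {
  pool=$1 runtime=$2
  echo "xtrinode.io/node-pool=${pool},xtrinode.io/runtime=${runtime}"
}

# Streams node changes for one pool. With minNodes: 0 the list may start
# empty and only fill once pods are pending for this pool.
watch_pool_nodes() {
  kubectl get nodes -l "$(node_pool_selector "$1" "$2")" --watch
}
```

For the manifest above, `watch_pool_nodes analytics-nodes analytics` would follow nodes as they join and leave the pool.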
GCP Checks
kubectl get cluster -A
kubectl get machinepool -A
kubectl describe machinepool analytics-nodes -n team-a
kubectl get nodes -L xtrinode.io/node-pool,xtrinode.io/runtime
kubectl get events -n team-a --sort-by='.lastTimestamp'
For failed node pools, verify the GCP project, region, cluster name, IAM roles, quota, CAPG controller health, and machine type availability.
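Some of these checks can be scripted. The sketch below covers three of them (CAPG controller health, cluster status, and machine type availability) and leaves IAM and quota to manual inspection. It assumes the CAPG controller runs in the `capg-system` namespace, which is the default for a clusterctl install; the function names and the `DRY_RUN` switch are hypothetical.

```shell
# Runs a command, or only prints it when DRY_RUN is set (hypothetical switch).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Arguments: project, zone, cluster name, machine type.
triage_node_pool() {
  project=$1 zone=$2 cluster=$3 machine_type=$4
  # CAPG controller health (namespace is an assumption, see above).
  run kubectl get pods -n capg-system
  # Cluster status as reported by GKE.
  run gcloud container clusters describe "$cluster" \
    --zone "$zone" --project "$project" --format="value(status)"
  # Empty output here means the machine type is not offered in this zone.
  run gcloud compute machine-types list \
    --filter="name=$machine_type" --zones="$zone" --project="$project"
}
```

Running `DRY_RUN=1 triage_node_pool "$GCP_PROJECT_ID" us-central1-a xtrinode-gke-test e2-standard-4` prints the commands without executing them, which is a quick way to review them before use.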
Hardening Later
After live provider paths are hardened, replace direct project-admin usage with a dedicated service account and document the exact required roles.