Azure

Azure/AKS/CAPZ support is present as checked-in infrastructure and deployment material, but it is an experimental provider-validation path. Treat this page as a status and orientation note, not a hardened runbook or parity claim with the GCP path.

Area: Status
Terraform: Checked-in module material for resource group, VNet, AKS, ACR, optional PostgreSQL, and namespaces.
Deploy automation: Checked-in scripts exist for provider validation; live parity with GCP is still tracked.
Registry: ACR-backed image publishing is wired in the repository path.
Runtime node pools: Azure provider resource generation is covered locally; live CAPZ-managed node-pool parity remains tracked.
Auth posture: Current validation uses the already-authenticated Azure CLI account in a sandbox subscription.

Use a sandbox subscription or resource group. Do not commit credentials, generated kubeconfigs, Terraform state, or terraform.tfvars.
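
One way to back up the rule above is a small pre-commit check over staged file names. The following is a minimal sketch, not something the repository is known to ship: the pattern list and the example paths are assumptions, and credential files are not covered because their names vary by setup.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical patterns for files that should stay out of version control.
# Adjust them to the naming the repository actually uses.
SENSITIVE_PATTERNS = (
    "terraform.tfvars",
    "*.tfstate",
    "*.tfstate.backup",
    "kubeconfig*",
    "*.kubeconfig",
)


def find_sensitive(paths):
    """Return the paths whose file name matches a sensitive pattern."""
    flagged = []
    for path in paths:
        name = PurePosixPath(path).name  # match on the file name only
        if any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS):
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    # Example input; in practice this could be the output of
    # `git diff --cached --name-only`, one path per line.
    staged = [
        "infra/azure/main.tf",
        "infra/azure/terraform.tfvars",
        "out/terraform.tfstate",
    ]
    for path in find_sensitive(staged):
        print(f"refusing to commit: {path}")
```

A check like this complements ignore rules rather than replacing them, since it also catches files that were force-added.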

The repository includes:

  • Terraform for the base AKS environment;
  • ACR image publishing flow for XTrinode components;
  • deployment script wiring for CRDs, API server auth Secrets, Helm charts, and optional observability;
  • node-pool resource generation code and tests;
  • private-cluster access notes.

The main repository carries Azure deploy targets and scripts, but this page intentionally stays at status level until the live provider smoke path is promoted.

  • The path is not yet documented as a production runbook.
  • Live CAPZ-managed node-pool parity is still tracked.
  • Least-privilege IAM permissions are not yet documented.
  • Terraform defaults create a private AKS cluster, so Helm and kubectl need a network path to the AKS API server.
  • Validate suspend, resume, gateway routing, KEDA scaling, image pulls, and node-pool behavior in the target subscription before relying on the path.
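
Because the default cluster is private, a quick reachability probe can save time before running Helm or kubectl. The sketch below only checks that a TCP connection to the API server endpoint can be opened from the current machine; it does not authenticate or validate TLS. The function names and the example host name are hypothetical, and the server URL is assumed to be the `server` value from a kubeconfig.

```python
import socket
from urllib.parse import urlparse


def api_server_endpoint(server_url):
    """Split a kubeconfig-style server URL into (host, port), defaulting to 443."""
    parsed = urlparse(server_url)
    if not parsed.hostname:
        raise ValueError(f"no host in server URL: {server_url!r}")
    return parsed.hostname, parsed.port or 443


def can_reach(server_url, timeout=3.0):
    """Return True if a TCP connection to the API server endpoint succeeds."""
    host, port = api_server_endpoint(server_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical URL; use the server value from your own kubeconfig.
    url = "https://aks-example.invalid:443"
    print("reachable" if can_reach(url) else "no network path to the API server")
```

A failed probe from a workstation usually means the command has to run from a host inside or peered with the cluster VNet, or through whichever access path the private-cluster access notes describe.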

Before this page becomes a full runbook, Azure needs live validation for:

  • Terraform from-zero deployment;
  • registry image push and pull from AKS nodes;
  • Helm install and upgrade from a valid private-cluster access path;
  • API server auth and gateway resume-token flow;
  • gateway routing, suspend, resume, KEDA scaling, and teardown;
  • managed node-pool behavior through the chosen Azure provider integration.

After that validation, the broad CLI login should be replaced with a least-privilege service principal or managed identity, and the exact permissions it requires should be documented.