Introduction: Production AKS is Different
Running Kubernetes in a demo is straightforward: a single node pool, default settings, and a few kubectl apply commands give you something that works. Production is a different challenge entirely. The configuration decisions you make when provisioning a cluster are often difficult and costly to change later: node pool architecture, networking model, identity strategy, and autoscaling configuration all have long-term implications.
AKS abstracts the Kubernetes control plane, but everything else is your responsibility: node sizing, pod resource limits, network policies, secret management, and high availability. This guide covers the production best practices we apply on every AKS cluster we deploy — not theory, but the specific configuration choices that prevent incidents.
“The most expensive AKS mistakes are not the ones that cause outages — they are the configuration decisions made in week one that force a cluster rebuild in month six. Get the foundations right from the start.”
- Separate system and user node pools to protect cluster-critical components from noisy workloads.
- Set resource requests and limits on every pod — without them, one bad deployment can starve the entire node.
- Use Pod Disruption Budgets to guarantee availability during node upgrades and autoscaling events.
- Replace the deprecated pod-managed identity feature (AAD Pod Identity) with Azure Workload Identity; staying on the deprecated approach is a security risk.
- Deploy across availability zones for resilience against single datacenter failures.
Cluster Configuration Best Practices
Separate system and user node pools
AKS requires at least one system node pool to run cluster-critical components such as CoreDNS, metrics-server, and kube-proxy. If application workloads share that pool and consume all available CPU or memory, these components can be starved or evicted and the cluster becomes unstable. Keep application workloads on separate user node pools, and taint the system pool so that only critical add-ons schedule onto it.
# System node pool: cluster-critical components only
az aks nodepool add \
  --cluster-name myAKSCluster \
  --resource-group myRG \
  --name system \
  --mode System \
  --node-count 3 \
  --node-vm-size Standard_D2s_v3 \
  --zones 1 2 3 \
  --node-taints CriticalAddonsOnly=true:NoSchedule
# User node pool: application workloads
az aks nodepool add \
  --cluster-name myAKSCluster \
  --resource-group myRG \
  --name apps \
  --mode User \
  --node-count 3 \
  --node-vm-size Standard_D4s_v3 \
  --zones 1 2 3 \
  --enable-cluster-autoscaler \
  --min-count 2 \
  --max-count 10
Resource requests and limits
Kubernetes schedules pods based on resource requests, the amount of CPU and memory reserved for a pod on its node. Without requests, the scheduler has no information about what a pod needs and can pack too many pods onto one node, leading to overload and evictions. Without limits, a single runaway pod can consume an entire node's resources.
# Every container must have requests and limits defined
spec:
  containers:
    - name: api
      image: myregistry.azurecr.io/api:1.0.0
      resources:
        requests:
          cpu: 100m      # 0.1 CPU cores reserved
          memory: 128Mi  # 128 MiB reserved
        limits:
          cpu: 500m      # max 0.5 CPU cores; throttled above this
          memory: 256Mi  # max 256 MiB; OOMKilled if exceeded
Use a LimitRange in each namespace to apply default requests and limits to containers that do not specify them. This prevents a missing resources block from silently deploying with no constraints.
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:
        cpu: 200m
        memory: 256Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi
Pod Disruption Budgets
During node upgrades, autoscaler scale-downs, and other voluntary disruptions, Kubernetes evicts pods from the affected nodes. Without a Pod Disruption Budget (PDB), it can evict all replicas of a deployment at the same time, causing a full outage. A PDB limits voluntary evictions so that a minimum number of pods stays available; it does not protect against involuntary failures such as a node crash.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
  namespace: production
spec:
  minAvailable: 2 # at least 2 pods must remain running
  selector:
    matchLabels:
      app: api
Network Policies for micro-segmentation
By default, every pod in a Kubernetes cluster can communicate with every other pod. In production, apply Network Policies to restrict traffic to only what is explicitly needed — deny all, then allow specific paths.
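Network Policy objects are only enforced when the cluster runs a network policy engine; on AKS that is part of the cluster's network profile (Azure Network Policy Manager, Calico, or Cilium). Without one, the manifests apply cleanly but have no effect. A quick check, assuming the cluster and resource group names used elsewhere in this guide:

```shell
# Print the network policy engine configured for the cluster.
# If this does not print an engine name (azure, calico or cilium),
# NetworkPolicy objects are accepted by the API server but not enforced.
az aks show \
  --resource-group myRG \
  --name myAKSCluster \
  --query networkProfile.networkPolicy \
  --output tsv

# List the policies currently applied in the namespace
kubectl get networkpolicy --namespace production
```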
# Deny all ingress by default for a namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes: [Ingress]
---
# Allow ingress to the API only from the ingress controller
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
Workload Identity and Secret Management
Applications running on AKS frequently need to access other Azure services: Key Vault for secrets, Storage for files, Service Bus for messaging. The wrong way to do this is to put service principal credentials in environment variables or Kubernetes Secrets. Secret values are only base64-encoded, so anyone who can read the Secret object can read the credential, and the credential still has to be distributed and rotated. The right way is Azure Workload Identity.
Azure Workload Identity
Azure Workload Identity allows a Kubernetes pod to authenticate to Azure services using a federated identity — no client secrets, no certificates, no credentials to rotate. The pod's Kubernetes service account is linked to an Azure Managed Identity via OIDC federation. When the pod calls an Azure SDK, it automatically gets a token.
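Before the manifests in this section take effect, the cluster needs its OIDC issuer and the workload identity webhook enabled, and the managed identity needs a federated credential that trusts the service account. A sketch of those steps with the Azure CLI follows. The identity name `api-identity` and the credential name `api-federated-credential` are hypothetical; the cluster, resource group, namespace, and service account names follow the examples in this guide.

```shell
# Enable the OIDC issuer and the workload identity webhook on an existing cluster
az aks update \
  --resource-group myRG \
  --name myAKSCluster \
  --enable-oidc-issuer \
  --enable-workload-identity

# Look up the cluster's OIDC issuer URL
OIDC_ISSUER="$(az aks show \
  --resource-group myRG \
  --name myAKSCluster \
  --query oidcIssuerProfile.issuerUrl \
  --output tsv)"

# Trust tokens issued to the production/api-service-account service account.
# "api-identity" is a hypothetical user-assigned managed identity that already exists.
az identity federated-credential create \
  --name api-federated-credential \
  --identity-name api-identity \
  --resource-group myRG \
  --issuer "$OIDC_ISSUER" \
  --subject system:serviceaccount:production:api-service-account \
  --audience api://AzureADTokenExchange

# Client ID to place in the service account annotation
az identity show \
  --name api-identity \
  --resource-group myRG \
  --query clientId \
  --output tsv
```

The subject must match the namespace and service account name exactly. The managed identity also needs its own permissions on the target resource, for example a role assignment on the Key Vault.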
# 1. Annotate the Kubernetes service account with the managed identity client ID
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service-account
  namespace: production
  annotations:
    azure.workload.identity/client-id: "<managed-identity-client-id>"
---
# 2. Label the pod to use workload identity
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: production
spec:
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: api-service-account
      containers:
        - name: api
          image: myregistry.azurecr.io/api:1.0.0
// In the application: DefaultAzureCredential picks up the workload identity token automatically
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

var client = new SecretClient(
    new Uri("https://my-keyvault.vault.azure.net/"),
    new DefaultAzureCredential() // uses workload identity when running on AKS
);
// secret.Value.Value holds the secret string
var secret = await client.GetSecretAsync("DatabaseConnectionString");
Secrets Store CSI Driver
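The Secrets Store CSI Driver mounts Azure Key Vault secrets into pods as files on a volume, so they do not have to be stored as Kubernetes Secret objects. Exposing them as environment variables is possible, but only by syncing them into a Kubernetes Secret, which gives up that benefit. Secrets are fetched from Key Vault when the pod starts; with auto-rotation enabled, the driver polls Key Vault and updates the mounted files when a secret changes.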
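On AKS the driver and its Azure provider are typically installed as a managed add-on rather than by hand. A sketch of enabling the add-on with rotation turned on, assuming the cluster and resource group names used in this guide; the two-minute poll interval is only an illustrative value:

```shell
# Enable the Key Vault provider for the Secrets Store CSI Driver, with rotation
az aks enable-addons \
  --addons azure-keyvault-secrets-provider \
  --resource-group myRG \
  --name myAKSCluster \
  --enable-secret-rotation \
  --rotation-poll-interval 2m

# Confirm the driver and provider pods are running on the nodes
kubectl get pods --namespace kube-system \
  --selector 'app in (secrets-store-csi-driver,secrets-store-provider-azure)'
```

A pod then consumes a SecretProviderClass through a CSI volume that uses the `secrets-store.csi.k8s.io` driver and names the class in its volume attributes.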
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: keyvault-secrets
  namespace: production
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    clientID: "<managed-identity-client-id>"
    keyvaultName: "my-keyvault"
    tenantId: "<tenant-id>"
    objects: |
      array:
        - |
          objectName: DatabaseConnectionString
          objectType: secret
        - |
          objectName: ApiKey
          objectType: secret
High Availability and Autoscaling
A production AKS cluster must survive the failure of a single node, availability zone, or even a temporary Azure platform issue — without an outage. High availability on AKS is achieved through a combination of multi-zone node pools, the cluster autoscaler, and the Horizontal Pod Autoscaler.
Multi-zone node pools
Deploy node pools across all availability zones in your Azure region (three, in regions that support zones). AKS balances nodes across zones on a best-effort basis. Combined with topology spread constraints that distribute replicas evenly across zones, your application survives a full zone outage.
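To see how nodes and replicas are actually distributed, list them together with their zone labels. These are standard kubectl queries; the `production` namespace and the `app=api` label are taken from the manifests in this guide.

```shell
# Show each node with its availability zone
kubectl get nodes --label-columns topology.kubernetes.io/zone

# Show which node each api replica is running on
kubectl get pods --namespace production --selector app=api --output wide
```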
# Force replicas to spread across availability zones
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: api
Cluster Autoscaler
The Cluster Autoscaler adds nodes when pods cannot be scheduled due to insufficient resources, and removes nodes when they are underutilised. Configure it with a sensible min/max range and tune the scale-down delay to avoid aggressive deprovisioning that causes pod churn.
# Cluster autoscaler profile: applied at cluster level
az aks update \
  --resource-group myRG \
  --name myAKSCluster \
  --cluster-autoscaler-profile \
    scale-down-delay-after-add=10m \
    scale-down-unneeded-time=10m \
    scale-down-utilization-threshold=0.5 \
    max-graceful-termination-sec=600
Horizontal Pod Autoscaler (HPA)
The HPA scales the number of pod replicas based on CPU utilisation, memory, or custom metrics. It works in tandem with the Cluster Autoscaler: the HPA adds pod replicas, and when those pods cannot be scheduled, the Cluster Autoscaler adds nodes to accommodate them.
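The HPA's core calculation is desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), where utilisation is measured relative to the pods' resource requests. A small worked sketch in shell arithmetic follows. The function name `desired_replicas` is hypothetical, and the sketch deliberately ignores the controller's tolerance band, stabilisation window, and min/max replica clamping.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_UTILIZATION_PCT TARGET_UTILIZATION_PCT
# Integer ceiling division: ceil(a*b/c) == (a*b + c - 1) / c for positive integers
desired_replicas() {
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

desired_replicas 3 90 70   # 3 replicas at 90% against a 70% target -> 4
desired_replicas 4 35 70   # 4 replicas at 35% against a 70% target -> 2
```

This is also why requests matter for autoscaling: a container without a CPU request has no baseline to compute utilisation against.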
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # scale up when avg CPU > 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
Cluster upgrades without downtime
AKS releases new Kubernetes versions regularly. Staying within the supported version window (the latest minor version and the two before it, N-2) is required for Microsoft support. Use surge upgrades, configured through the max surge setting on each node pool, to provision extra nodes before old ones are drained. This avoids capacity shortfalls during upgrades and, combined with PDBs and multiple replicas, makes zero-downtime rolling upgrades achievable.
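Max surge is a per-node-pool setting. A sketch of configuring it and starting an upgrade with the Azure CLI, assuming the `apps` pool and the cluster names used in this guide; the 33% value is only an example, and `<target-version>` is a placeholder to fill in from the list of available upgrades:

```shell
# Allow up to a third of the pool's node count to be added as surge nodes
az aks nodepool update \
  --resource-group myRG \
  --cluster-name myAKSCluster \
  --name apps \
  --max-surge 33%

# List the Kubernetes versions the cluster can upgrade to
az aks get-upgrades \
  --resource-group myRG \
  --name myAKSCluster \
  --output table

# Upgrade the control plane and node pools to the chosen version
az aks upgrade \
  --resource-group myRG \
  --name myAKSCluster \
  --kubernetes-version "<target-version>"
```

A higher surge value shortens the upgrade but needs more spare quota, so the right number depends on the pool size and the subscription's limits.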
Closing Thoughts
Production AKS is not complicated — but it requires deliberate configuration from the start. Separate your node pools, set resource requests and limits on every pod, deploy across availability zones, and replace any pod-managed identities with Azure Workload Identity. These changes alone will make your cluster significantly more reliable, secure, and cost-efficient.
Add Pod Disruption Budgets before you enable automatic upgrades, configure the Cluster Autoscaler with conservative scale-down settings, and use the Secrets Store CSI Driver to keep credentials out of Kubernetes Secrets. The teams that invest in these foundations in week one never have to deal with the painful cluster rebuilds that come from skipping them.



