Kubernetes on AKS: Production Best Practices

Introduction: Production AKS is Different

Running Kubernetes in a demo is straightforward: create a single node pool with default settings, run a few kubectl apply commands, and you have something working. Production is a different challenge entirely. The configuration decisions you make when provisioning a cluster are often difficult and costly to change later: node pool architecture, networking model, identity strategy, and autoscaling configuration all have long-term implications.

AKS abstracts the Kubernetes control plane, but everything else is your responsibility: node sizing, pod resource limits, network policies, secret management, and high availability. This guide covers the production best practices we apply on every AKS cluster we deploy — not theory, but the specific configuration choices that prevent incidents.

“The most expensive AKS mistakes are not the ones that cause outages — they are the configuration decisions made in week one that force a cluster rebuild in month six. Get the foundations right from the start.”

Separate system and user node pools to protect cluster-critical components from noisy workloads.
Set resource requests and limits on every pod — without them, one bad deployment can starve the entire node.
Use Pod Disruption Budgets to guarantee availability during node upgrades and autoscaling events.
Replace pod-managed identities with Azure Workload Identity — the deprecated approach is a security risk.
Deploy across availability zones for resilience against single datacenter failures.

Cluster Configuration Best Practices

Separate system and user node pools

AKS requires at least one system node pool to host cluster-critical pods such as CoreDNS and metrics-server. If application workloads share that pool and exhaust its CPU or memory, these components can be starved or evicted and the cluster becomes unstable. Always run applications in separate user node pools, and taint the system pool so that only critical add-ons schedule onto it.

bash

# System node pool — cluster-critical components only
az aks nodepool add \
  --cluster-name myAKSCluster \
  --resource-group myRG \
  --name system \
  --mode System \
  --node-count 3 \
  --node-vm-size Standard_D2s_v3 \
  --zones 1 2 3 \
  --node-taints CriticalAddonsOnly=true:NoSchedule

# User node pool — application workloads
az aks nodepool add \
  --cluster-name myAKSCluster \
  --resource-group myRG \
  --name apps \
  --mode User \
  --node-count 3 \
  --node-vm-size Standard_D4s_v3 \
  --zones 1 2 3 \
  --enable-cluster-autoscaler \
  --min-count 2 \
  --max-count 10
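
The CriticalAddonsOnly taint keeps application pods off the system pool, but it does not say which user pool they should land on. A minimal sketch of pinning a workload to the apps pool follows. It assumes the kubernetes.azure.com/agentpool node label that AKS applies to its nodes, and the Deployment itself is illustrative rather than taken from a real cluster.

```yaml
# Illustrative Deployment pinned to the "apps" user node pool
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: production
spec:
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      nodeSelector:
        kubernetes.azure.com/agentpool: apps   # assumed AKS node label; value is the pool name
      containers:
        - name: api
          image: myregistry.azurecr.io/api:1.0.0
```

With only one user pool the selector is redundant; it starts to matter once you add specialised pools (GPU, spot, memory-optimised) alongside it.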

Resource requests and limits

Kubernetes schedules pods based on resource requests — the guaranteed minimum CPU and memory a pod needs. Without requests, the scheduler places pods arbitrarily, leading to overloaded nodes and evictions. Without limits, a single runaway pod can consume an entire node's resources.

yaml

# Every container must have requests and limits defined
spec:
  containers:
    - name: api
      image: myregistry.azurecr.io/api:1.0.0
      resources:
        requests:
          cpu: 100m       # 0.1 CPU cores reserved by the scheduler
          memory: 128Mi   # 128 MiB reserved by the scheduler
        limits:
          cpu: 500m       # throttled above 0.5 CPU cores
          memory: 256Mi   # container is OOMKilled above 256 MiB

Use a LimitRange in each namespace to enforce default requests and limits for pods that do not specify them. This prevents a missing resources block from silently deploying with no constraints.

yaml

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:
        cpu: 200m
        memory: 256Mi
      defaultRequest:
        cpu: 100m
        memory: 128Mi

Pod Disruption Budgets

During node upgrades, cluster autoscaler scale-down, and other voluntary disruptions, Kubernetes evicts pods from the nodes being drained. Without a Pod Disruption Budget (PDB), nothing stops it from evicting every replica of a deployment at the same time, which causes a full outage. A PDB caps how many pods can be evicted at once, so a minimum number stay available throughout a voluntary disruption. It does not protect against involuntary disruptions such as a node crash.

yaml

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
  namespace: production
spec:
  minAvailable: 2        # at least 2 pods must remain running
  selector:
    matchLabels:
      app: api
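
A fixed minAvailable only works when the deployment runs more replicas than the budget. With minAvailable: 2 and two replicas, no pod can ever be evicted and node drains stall. As a hedged alternative, the budget can be expressed as a share of the current replica count. The name api-pdb-percent and the 25% figure below are illustrative assumptions.

```yaml
# Alternative to api-pdb: allow up to 25% of matching pods to be evicted at once
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb-percent    # hypothetical name
  namespace: production
spec:
  maxUnavailable: 25%      # scales with the replica count set by the HPA
  selector:
    matchLabels:
      app: api
```

Use this instead of the budget above, not alongside it: a pod should be matched by a single PDB.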

Network Policies for micro-segmentation

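By default, every pod in a Kubernetes cluster can communicate with every other pod. In production, apply Network Policies to restrict traffic to only what is explicitly needed: deny all, then allow specific paths. On AKS, policies are only enforced when the cluster runs a network policy engine (Azure Network Policy Manager, Calico, or Cilium); without one, the manifests are accepted but have no effect.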

yaml

# Deny all ingress by default for a namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes: [Ingress]
---
# Allow ingress to the API only from the ingress controller
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
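
The policies above restrict ingress only. A sketch of the matching egress side follows: deny all outbound traffic from the namespace, then allow DNS lookups so that service discovery keeps working. It assumes CoreDNS runs in kube-system with the k8s-app: kube-dns label; the policy name is illustrative.

```yaml
# Deny all egress for the namespace except DNS queries to CoreDNS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-egress-except-dns   # hypothetical name
  namespace: production
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns    # assumed CoreDNS pod label
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Each workload then needs its own egress rule for the destinations it legitimately calls, such as a database or another service in the namespace.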

Workload Identity and Secret Management

Applications running on AKS frequently need to access other Azure services: Key Vault for secrets, Storage for files, Service Bus for messaging. The wrong way to do this is to put service principal credentials in environment variables or Kubernetes Secrets, which are base64-encoded rather than encrypted at the API level, so anyone with permission to read the Secret can decode it. The right way is Azure Workload Identity.

Azure Workload Identity

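Azure Workload Identity allows a Kubernetes pod to authenticate to Azure services using a federated identity: no client secrets, no certificates, no credentials to rotate. The pod's Kubernetes service account is linked to an Azure managed identity via OIDC federation, which requires the OIDC issuer and workload identity to be enabled on the cluster and a federated credential on the managed identity whose subject is system:serviceaccount:<namespace>:<service-account-name>. When the application calls an Azure SDK, the SDK exchanges the pod's projected service account token for a Microsoft Entra access token.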

yaml

# 1. Annotate the Kubernetes service account with the managed identity client ID
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service-account
  namespace: production
  annotations:
    azure.workload.identity/client-id: "<managed-identity-client-id>"

---
# 2. Label the pod to use workload identity
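apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: production        # must match the service account's namespace
spec:
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api               # must match spec.selector
        azure.workload.identity/use: "true"
    spec:
      serviceAccountName: api-service-account
      containers:
        - name: api
          image: myregistry.azurecr.io/api:1.0.0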

csharp

// In the application — DefaultAzureCredential picks up the workload identity token automatically
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

var client = new SecretClient(
    new Uri("https://my-keyvault.vault.azure.net/"),
    new DefaultAzureCredential()   // uses workload identity when running on AKS
);

var secret = await client.GetSecretAsync("DatabaseConnectionString");

Secrets Store CSI Driver

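The Secrets Store CSI Driver mounts Azure Key Vault secrets directly into pods as files on a volume, so they never need to be stored in Kubernetes Secret objects. (Exposing them as environment variables is possible, but only by syncing them into a Kubernetes Secret, which gives up that benefit.) Secrets are fetched from Key Vault when the pod starts; if rotation is enabled on the driver, the mounted files are refreshed periodically, although the application must re-read them to pick up new values.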

yaml

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: keyvault-secrets
  namespace: production
spec:
  provider: azure
  parameters:
    usePodIdentity: "false"
    clientID: "<managed-identity-client-id>"
    keyvaultName: "my-keyvault"
    tenantId: "<tenant-id>"
    objects: |
      array:
        - |
          objectName: DatabaseConnectionString
          objectType: secret
        - |
          objectName: ApiKey
          objectType: secret
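
A SecretProviderClass does nothing until a pod mounts it. The sketch below shows a pod that references the keyvault-secrets class through a CSI volume. The pod name and the /mnt/secrets mount path are illustrative assumptions, and the service account is assumed to be the one federated with the managed identity.

```yaml
# Pod mounting the Key Vault secrets defined in the keyvault-secrets class
apiVersion: v1
kind: Pod
metadata:
  name: api-secrets-demo          # hypothetical name
  namespace: production
spec:
  serviceAccountName: api-service-account
  containers:
    - name: api
      image: myregistry.azurecr.io/api:1.0.0
      volumeMounts:
        - name: keyvault
          mountPath: /mnt/secrets   # assumed path; each secret becomes a file here
          readOnly: true
  volumes:
    - name: keyvault
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: keyvault-secrets
```

With this in place the application would read /mnt/secrets/DatabaseConnectionString and /mnt/secrets/ApiKey as ordinary files.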

High Availability and Autoscaling

A production AKS cluster must survive the failure of a single node, availability zone, or even a temporary Azure platform issue — without an outage. High availability on AKS is achieved through a combination of multi-zone node pools, the cluster autoscaler, and the Horizontal Pod Autoscaler.

Multi-zone node pools

Deploy node pools across all the availability zones your Azure region offers (three, in regions that support them). AKS makes a best-effort attempt to balance nodes across the selected zones. Combined with topology spread constraints or pod anti-affinity rules that stop replicas from piling up in a single zone, your application can survive a full zone outage.

yaml

# Force replicas to spread across availability zones
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: api
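
Pod anti-affinity is the other way to express the same intent. The fragment below is a sketch of the soft form, which asks the scheduler to prefer zones that do not already run an api replica but still schedules the pod if every zone has one. The weight of 100 is an illustrative choice.

```yaml
# Prefer placing each replica in a zone that does not already host one
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: topology.kubernetes.io/zone
                labelSelector:
                  matchLabels:
                    app: api
```

The required form (requiredDuringSchedulingIgnoredDuringExecution) would cap the deployment at one replica per zone, which is why topology spread constraints are usually the better fit once replicas outnumber zones.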

Cluster Autoscaler

The Cluster Autoscaler adds nodes when pods cannot be scheduled due to insufficient resources, and removes nodes when they are underutilised. Configure it with a sensible min/max range and tune the scale-down delay to avoid aggressive deprovisioning that causes pod churn.

bash

# Cluster autoscaler profile — applied at cluster level
az aks update \
  --resource-group myRG \
  --name myAKSCluster \
  --cluster-autoscaler-profile \
    scale-down-delay-after-add=10m \
    scale-down-unneeded-time=10m \
    scale-down-utilization-threshold=0.5 \
    max-graceful-termination-sec=600

Horizontal Pod Autoscaler (HPA)

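The HPA scales the number of pod replicas based on CPU utilisation, memory, or custom metrics. Utilisation targets are percentages of each container's resource requests, so the HPA only works for pods that define them. It works in tandem with the Cluster Autoscaler: the HPA requests more pods, and the Cluster Autoscaler adds nodes to accommodate them.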

yaml

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale up when avg CPU > 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
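
Scale-down can be damped on the HPA side in the same spirit as the Cluster Autoscaler's scale-down delay. The fragment below is a sketch of the optional behavior field in autoscaling/v2; the window and percentage are illustrative values, not recommendations from a specific workload.

```yaml
# Added under the HPA's spec: slow down replica removal to limit pod churn
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # act on the highest recommendation from the last 5 minutes
      policies:
        - type: Percent
          value: 50                     # remove at most 50% of current replicas
          periodSeconds: 60             # per 60-second period
```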

Cluster upgrades without downtime

AKS releases new Kubernetes versions regularly. Staying within the supported version window (the latest GA minor version and the two before it, N-2) is required for Microsoft support. Use surge upgrades, controlled by the node pool's max surge setting, to provision extra nodes before old ones are drained. This avoids capacity constraints during upgrades and, combined with PDBs and multiple replicas, makes zero-downtime rolling upgrades achievable.

Closing Thoughts

Production AKS is not complicated — but it requires deliberate configuration from the start. Separate your node pools, set resource requests and limits on every pod, deploy across availability zones, and replace any pod-managed identities with Azure Workload Identity. These changes alone will make your cluster significantly more reliable, secure, and cost-efficient.

Add Pod Disruption Budgets before you enable automatic upgrades, configure the Cluster Autoscaler with conservative scale-down settings, and use the Secrets Store CSI Driver to keep credentials out of Kubernetes Secrets. Teams that invest in these foundations in week one avoid most of the painful cluster rebuilds that come from skipping them.
