Azure Cost Optimisation: Cut Your Cloud Bill by 40%

Author

Artan Ajredini

CEO & Cloud Architect

24 March 2025

Why Cloud Costs Spiral

Cloud computing promised to reduce costs. Pay only for what you use, scale down when demand drops, no more idle hardware. In practice, most organisations find their Azure bill grows faster than expected — and faster than the business value it delivers.

The root cause is almost always the same: cloud resources are easy to provision and easy to forget. A developer spins up a VM to test something, the test finishes, the VM stays running. A project ends, but the storage accounts, public IPs, and load balancers quietly accumulate charges every hour.

In our experience working with Azure customers across industries, most organisations have between 25% and 45% in immediate savings available — without any impact on performance or reliability.

The good news: cloud cost optimisation is not a one-time project. It is a discipline. Once you establish the right habits, monitoring processes, and automated guardrails, costs become predictable and controllable. This guide walks through the most impactful techniques, in order of effort vs. return.

Right-Sizing Virtual Machines

Virtual machines are typically the largest line item on an Azure bill. They are also the most common source of waste. Teams provision VMs based on peak estimated load, then those VMs run at 5–15% CPU utilisation for months.

Azure Advisor analyses your VM usage and flags machines that are consistently over-provisioned. It recommends smaller SKUs that would handle your actual workload at a fraction of the cost. This is the first place to look.

How to use Azure Advisor for right-sizing

  1. Open the Azure Portal and search for "Azure Advisor".
  2. Go to the Cost tab — this lists all right-sizing recommendations.
  3. Review each recommendation: Advisor shows current vs. recommended SKU, estimated monthly savings, and CPU/memory percentile data.
  4. For each VM, verify the workload profile: is the low CPU expected (e.g. a scheduled batch job) or is it genuinely idle?
  5. Resize or deallocate as appropriate. For production VMs, schedule the resize during a maintenance window.
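
The check in step 4 can be sketched in a few lines. This is a minimal Python illustration, assuming you have exported CPU utilisation samples (in percent) for a VM. The function name and both thresholds are hypothetical and are not the rules Azure Advisor itself applies.

```python
import math

def classify_vm(cpu_samples, idle_p95=15.0, burst_peak=60.0):
    """Classify a VM from CPU utilisation samples given in percent (0-100).

    idle_p95 and burst_peak are illustrative thresholds, not Advisor's rules.
    """
    if not cpu_samples:
        raise ValueError("cpu_samples must not be empty")
    ordered = sorted(cpu_samples)
    # Nearest-rank 95th percentile: about 95% of samples are at or below it.
    p95 = ordered[math.ceil(0.95 * len(ordered)) - 1]
    peak = ordered[-1]
    if p95 > idle_p95:
        return "busy"    # sustained load: leave the size alone
    if peak >= burst_peak:
        return "bursty"  # mostly quiet with real spikes, e.g. a scheduled batch job
    return "idle"        # quiet all the time: resize or deallocate
```

A VM that comes back as "bursty" is usually a better fit for a burstable SKU than for a smaller fixed-performance one.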

A common pattern: teams run D4s_v3 (4 vCPUs, 16 GB RAM) when a B2ms (2 vCPUs, 8 GB RAM) or even a B1ms would suffice. The price difference can be 50–70% for the same availability.

Use Burstable VMs for variable workloads

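The B-series (Burstable) VMs are ideal for workloads that are mostly idle but occasionally spike: development servers, CI build agents, internal tooling. They accumulate CPU credits while running below their baseline and spend them on bursts. Once the credits run out, the VM is throttled back to its baseline, so sustained heavy load belongs on a D-series VM. For workloads that do fit the pattern, a B-series VM is often markedly cheaper than a D-series VM with the same vCPU count, although the exact gap depends on size and region.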
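
To make the credit mechanism concrete, here is a rough Python simulation for a single vCPU. It assumes one credit equals one vCPU running at 100% for one minute. The baseline, the credit cap, and the function name are placeholders for illustration, not the published figures for any particular B-series size.

```python
def simulate_credits(demand_pct, baseline_pct, start_credits=0.0, max_credits=864.0):
    """Simulate a burstable vCPU minute by minute.

    demand_pct: CPU the workload asks for each minute, in percent of one vCPU.
    Returns (delivered_pct_per_minute, final_credit_balance).
    All default values are placeholders, not real SKU limits.
    """
    balance = start_credits
    delivered = []
    for demand in demand_pct:
        # The VM can always run at baseline; banked credits allow more than that.
        available_pct = baseline_pct + balance * 100.0
        used = min(demand, available_pct)
        # Running below baseline banks credits, running above it spends them.
        balance += (baseline_pct - used) / 100.0
        balance = max(0.0, min(max_credits, balance))
        delivered.append(used)
    return delivered, balance
```

With a 20% baseline, ten idle minutes bank two credits. A burst at 100% then runs at full speed for two minutes, partially for a third, and is throttled to 20% after that.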

Reserved Instances and Savings Plans

If you have workloads that run continuously — production web servers, databases, always-on services — you are almost certainly overpaying by using pay-as-you-go pricing.

Azure offers two commitment-based discount models:

  • Reserved Instances (RIs): commit to a specific VM size in a specific region for 1 or 3 years. Discounts of 40–72% off pay-as-you-go. Best for stable, predictable workloads.
  • Azure Savings Plans: commit to a fixed hourly spend (e.g. $5/hour) across eligible compute services. More flexible than RIs: the discount applies across VM sizes, regions, and even Azure App Service. Typical savings: 15–37%.
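
A commitment is billed for every hour of the term whether or not the resource runs, so it only pays off above a break-even utilisation. The sketch below shows that arithmetic. The function names are hypothetical, and the model ignores upfront-payment options and currency effects.

```python
def breakeven_utilisation(discount):
    """Fraction of hours a resource must run for a commitment to beat pay-as-you-go.

    discount is the commitment discount as a fraction, e.g. 0.40 for 40%.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount

def commitment_saving(payg_hourly, discount, hours_used, hours_in_term):
    """Money saved (negative = money lost) by committing instead of paying as you go."""
    payg_cost = payg_hourly * hours_used
    committed_cost = payg_hourly * (1.0 - discount) * hours_in_term
    return payg_cost - committed_cost
```

At a 40% discount the break-even is 60% utilisation: a VM that runs only half of the month costs more under a commitment than on pay-as-you-go.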

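Reservations are not a lock-in trap; they are a pricing instrument. Azure lets you exchange a reservation or cancel it for a prorated refund, subject to Microsoft's refund limits and any early termination fee in force at the time, so check the current reservation policy before you commit. Savings Plans are less forgiving: once purchased, they cannot be cancelled or refunded.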

Which one to choose?

Use Reserved Instances when the workload is stable and you know the exact VM SKU and region. Use Savings Plans when your compute footprint is more dynamic — you scale between sizes, move across regions, or run a mix of VMs and App Service.

A practical approach: run your workload on pay-as-you-go for 30–60 days, export your usage from Azure Cost Management, identify the steady-state baseline, then cover that baseline with reservations. Keep the variable capacity on pay-as-you-go.
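
Finding the steady-state baseline can be sketched as a small search over the exported usage. This assumes you have reduced the export to a list with the number of identical VMs running in each hour. The function name and the input shape are hypothetical, and costs are expressed in pay-as-you-go VM-hours so no real prices are needed.

```python
def best_reservation_count(hourly_vm_counts, discount):
    """Return (reserved_count, total_cost) that minimises cost for the period.

    hourly_vm_counts: number of identical VMs running in each hour.
    discount: reservation discount as a fraction, e.g. 0.40.
    Cost unit: one pay-as-you-go VM-hour.
    """
    if not hourly_vm_counts:
        raise ValueError("hourly_vm_counts must not be empty")
    hours = len(hourly_vm_counts)
    # Start from the no-reservation case: everything on pay-as-you-go.
    best = (0, float(sum(hourly_vm_counts)))
    for reserved in range(1, max(hourly_vm_counts) + 1):
        # Reserved VMs are paid for every hour, used or not.
        reserved_cost = reserved * hours * (1.0 - discount)
        # Anything above the reserved count runs on pay-as-you-go.
        overflow = sum(max(0, n - reserved) for n in hourly_vm_counts)
        total = reserved_cost + overflow
        if total < best[1]:
            best = (reserved, total)
    return best
```

For a day with 2 VMs running for 16 hours and 6 VMs for 8 hours, a 40% discount favours reserving 2 VMs: the extra four run only a third of the time, which is below the 60% break-even.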

Auto-Shutdown and Serverless Replacements

Non-production environments — development, staging, QA — do not need to run 24/7. A dev VM running overnight and on weekends that nobody is using is pure waste.

Auto-shutdown for dev/test VMs

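Azure DevTest Labs and the native VM auto-shutdown feature let you schedule VMs to shut down automatically at a defined time. Enable it across all non-production VMs and you can cut their compute cost by 65% or more: a VM that runs 9am–7pm on weekdays is up for 50 hours a week instead of 168, roughly 70% fewer billable hours. Auto-shutdown only stops (deallocates) the VM, so someone or something still has to start it again in the morning, and attached disks continue to be billed.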

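On a standalone VM, the schedule is stored as a Microsoft.DevTestLab/schedules resource. In an ARM template it looks like this, where values in angle brackets are placeholders:

json
{
  "type": "Microsoft.DevTestLab/schedules",
  "apiVersion": "2018-09-15",
  "name": "shutdown-computevm-<vm-name>",
  "location": "<region>",
  "properties": {
    "status": "Enabled",
    "taskType": "ComputeVmShutdownTask",
    "dailyRecurrence": { "time": "1900" },
    "timeZoneId": "UTC",
    "notificationSettings": {
      "status": "Enabled",
      "timeInMinutes": 30,
      "emailRecipient": "<team-email>"
    },
    "targetResourceId": "<vm-resource-id>"
  }
}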
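
The saving from a shutdown schedule is simple arithmetic, sketched here in Python. The helper names are hypothetical, and the result covers compute hours only, since disks are billed while a VM is deallocated.

```python
HOURS_PER_WEEK = 7 * 24  # 168

def weekly_running_hours(start_hour, stop_hour, days_per_week):
    """Hours per week a VM is up when it runs start_hour to stop_hour on the given days."""
    return (stop_hour - start_hour) * days_per_week

def compute_saving_fraction(running_hours):
    """Fraction of compute hours avoided compared with running 24/7."""
    return 1.0 - running_hours / HOURS_PER_WEEK
```

A 9am–7pm weekday schedule gives `weekly_running_hours(9, 19, 5)`, which is 50 hours, and `compute_saving_fraction(50)` comes out at about 0.70.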

Replace always-on VMs with serverless

For batch workloads, background jobs, and event-driven processing, an always-on VM is overkill. Azure Container Apps, Azure Functions, and Azure Container Instances offer consumption-based pricing — you pay only when code is running.

  • Azure Functions: best for short-lived, event-triggered tasks (queue processing, HTTP triggers, timers). The Consumption plan includes a monthly free grant of 1 million executions.
  • Azure Container Apps: best for containerised microservices that scale to zero. No minimum instance cost when idle.
  • Azure Container Instances: best for one-off batch jobs. Billed per second, no standing infrastructure.

A common migration pattern: a VM that exists only to run a cron job every 15 minutes, tying up an entire $150/month machine, is replaced by an Azure Function on a Consumption plan costing under $5/month.
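
A rough estimate for such a job can be sketched as follows. The prices and the free grant in the constants are assumptions based on commonly quoted Consumption plan list prices ($0.20 per million executions, $0.000016 per GB-second, 1 million executions and 400,000 GB-seconds free per month). Check the current pricing page for your region before relying on them.

```python
# Assumed Consumption plan figures; verify against current Azure pricing.
FREE_EXECUTIONS = 1_000_000
FREE_GB_SECONDS = 400_000
PRICE_PER_MILLION_EXECUTIONS = 0.20
PRICE_PER_GB_SECOND = 0.000016

def consumption_monthly_cost(executions, avg_seconds, memory_gb):
    """Estimate the monthly cost in USD of a function under the assumptions above."""
    gb_seconds = executions * avg_seconds * memory_gb
    # Only usage above the monthly free grant is billed.
    billable_executions = max(0, executions - FREE_EXECUTIONS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    return (billable_executions / 1_000_000 * PRICE_PER_MILLION_EXECUTIONS
            + billable_gb_seconds * PRICE_PER_GB_SECOND)
```

A job that fires every 15 minutes runs about 2,976 times in a 31-day month. At one minute and 0.5 GB per run it stays inside the free grant. Longer or heavier runs move the figure quickly, so measure the real duration and memory before quoting a saving.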

Identifying and Removing Orphaned Resources

Orphaned resources are Azure assets that were created to support something that no longer exists. They generate charges silently in the background. The most common culprits:

  • Managed disks: VMs are deleted but their OS and data disks remain. A 256 GB Premium SSD costs ~$35/month unattached.
  • Public IP addresses: static public IPs that are not attached to any resource still incur a small hourly charge.
  • Load balancers: provisioned for a service that was decommissioned.
  • Snapshots: point-in-time disk snapshots accumulate over time and are rarely cleaned up.
  • App Service Plans: the plan incurs cost even if all apps running on it are deleted.
  • Unused Azure SQL databases: serverless databases in the "paused" state, which are still billed for storage, or low-DTU databases left over from old projects.

Automated orphan detection

Azure Resource Graph lets you query your subscriptions for unattached resources. The query below finds all managed disks that are not attached to any VM:

kusto
Resources
// Managed disks only; =~ compares case-insensitively
| where type =~ "microsoft.compute/disks"
// "Unattached" means no VM currently references the disk
| where tostring(properties.diskState) == "Unattached"
| project name, resourceGroup,
          sku = tostring(sku.name),
          sizeGB = toint(properties.diskSizeGB),
          location
// Largest (and usually most expensive) disks first
| order by sizeGB desc

Run this query monthly in Azure Resource Graph Explorer (Portal → Resource Graph Explorer). Build a similar query for unattached public IPs, empty App Service Plans, and unused snapshots. Tag anything you are unsure about and set a 30-day review reminder before deleting.
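
The tag-and-review step can be automated with a small script. The sketch below assumes each suspected orphan carries a tag holding the ISO date on which it was flagged. The tag name `OrphanFlaggedOn`, the function name, and the shape of the resource dictionaries are all hypothetical.

```python
from datetime import date, timedelta

def due_for_deletion(resources, today, grace_days=30, tag="OrphanFlaggedOn"):
    """Return names of resources flagged at least grace_days ago.

    resources: list of dicts with "name" and an optional "tags" dict.
    The tag is expected to hold an ISO date such as "2025-03-24".
    """
    due = []
    for res in resources:
        flagged_on = (res.get("tags") or {}).get(tag)
        if flagged_on is None:
            continue  # never flagged: not ours to delete
        if today - date.fromisoformat(flagged_on) >= timedelta(days=grace_days):
            due.append(res["name"])
    return due
```

Keeping the flag date on the resource itself means the review survives staff changes and does not depend on anyone's calendar reminder.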

Cost Alerts, Budgets, and Governance

Cost optimisation is not a one-time cleanup — it requires ongoing visibility. Without alerts, a misconfigured autoscaler or a runaway background job can double your bill before anyone notices.

Set up budgets in Azure Cost Management

  1. Open Azure Cost Management + Billing in the Portal.
  2. Navigate to Budgets and click Add.
  3. Set a monthly budget at the subscription or resource group level.
  4. Configure alert thresholds at 80%, 100%, and 120% of budget.
  5. Add email recipients — include the team lead and the Azure admin.
  6. Optionally trigger an action group to notify via Teams or PagerDuty.
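
The threshold logic behind these alerts is easy to reason about with a short sketch. This is an illustration of the idea only, not how Cost Management is implemented. The function names are hypothetical, and the forecast is a plain linear extrapolation.

```python
def crossed_thresholds(budget, spend, thresholds=(0.8, 1.0, 1.2)):
    """Return the alert thresholds (as fractions of budget) that spend has reached."""
    return [t for t in thresholds if spend >= budget * t]

def forecast_month_end(spend_to_date, day_of_month, days_in_month):
    """Linear projection of month-end spend from the spend so far."""
    return spend_to_date / day_of_month * days_in_month
```

Spending $600 of a $1,000 budget by day 15 of a 30-day month crosses no actual threshold yet, but the projection of $1,200 already reaches the 120% mark. That is the case for alerting on forecast as well as actual spend.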

Use cost allocation tags

Enforce a tagging policy so every resource is tagged with at minimum: Environment (dev/staging/prod), Team or CostCentre, and Project. This lets you break down costs by team and project in Cost Management, making it immediately clear who owns the spend.

bicep
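// Assigns the built-in "Require a tag on resources" policy so that resources
// without a CostCentre tag are denied at creation. Add one assignment per required tag.
resource taggingPolicy 'Microsoft.Authorization/policyAssignments@2022-06-01' = {
  name: 'require-cost-tags'
  properties: {
    displayName: 'Require cost allocation tags'
    // Built-in definitions are identified by GUID, not by a friendly name.
    // Confirm this ID under Policy > Definitions before deploying.
    policyDefinitionId: '/providers/Microsoft.Authorization/policyDefinitions/871b6d14-10aa-478d-b590-94f262ecfa99'
    parameters: {
      tagName: { value: 'CostCentre' }
    }
    // 'Default' enforces the policy; 'DoNotEnforce' only reports compliance
    enforcementMode: 'Default'
  }
}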
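
Once resources are tagged, the breakdown itself is a simple aggregation. The sketch below assumes a cost export reduced to rows with a `cost` value and a `tags` dictionary. That row shape and the function name are hypothetical, not the Cost Management export schema.

```python
from collections import defaultdict

def cost_by_tag(cost_rows, tag):
    """Sum cost per value of the given tag; rows without the tag go to "(untagged)"."""
    totals = defaultdict(float)
    for row in cost_rows:
        key = (row.get("tags") or {}).get(tag, "(untagged)")
        totals[key] += row["cost"]
    return dict(totals)
```

The size of the "(untagged)" bucket is a useful health metric in its own right: if it is large, the breakdown by team cannot be trusted yet.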

Azure Policy for cost guardrails

Use Azure Policy to prevent expensive resources from being created without approval. Common guardrails include: blocking VM SKUs above a certain size, requiring tags on all resources, and denying resource creation in regions outside your approved list. These prevent cost surprises before they happen.
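
The three guardrails amount to a deny-list check at creation time. The Python sketch below mirrors that logic for a hypothetical VM request, for example as a pre-deployment check in a pipeline. It illustrates the rules only. It is not an Azure Policy definition, and the request shape is an assumption.

```python
def guardrail_violations(request, allowed_regions, allowed_vm_sizes, required_tags):
    """Return the reasons a VM request would be denied (empty list = allowed).

    request: dict with "location", "vm_size" and an optional "tags" dict.
    """
    reasons = []
    if request["location"] not in allowed_regions:
        reasons.append(f"region {request['location']} is not approved")
    if request["vm_size"] not in allowed_vm_sizes:
        reasons.append(f"VM size {request['vm_size']} is not approved")
    # Tags that are required but absent from the request.
    missing = sorted(set(required_tags) - set(request.get("tags") or {}))
    if missing:
        reasons.append("missing tags: " + ", ".join(missing))
    return reasons
```

Returning every violation at once, rather than stopping at the first, saves the requester a round trip per rule.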

Closing Thoughts

A 25–40% cost reduction is achievable in most Azure subscriptions without touching architecture or sacrificing reliability. The work is not glamorous — right-sizing VMs, cleaning up orphaned disks, turning off dev environments at night — but the compound effect is significant.

Start with Azure Advisor and Resource Graph. They will surface the quick wins. Then establish budgets, alerts, and a monthly cost review cadence. Cost optimisation is a culture, not a project — and the teams that treat it that way consistently get more from their cloud investment.
