Reduce costs with Azure Spot virtual machines

What is Azure Spot VM?

Azure Spot VM is an Azure feature that lets you take advantage of unused capacity on the underlying platform. If a host has some compute capacity left, these ‘spots’ are filled with your Spot-enabled virtual machines. When you enable this feature, you receive a discount of up to 90 percent off the normal pricing in some cases.

Pricing and eviction are the only differences between Spot-enabled virtual machines and regular virtual machines. Compute, networking, storage, etc. are exactly the same. The virtual machine can be attached to a virtual network or a load balancing solution, such as an internal or external load balancer. The management capabilities are also exactly the same and are available through the Azure Portal or with Infrastructure as Code (IaC) tools like Bicep, ARM, PowerShell or Terraform.

But what is the main difference between Azure Spot virtual machines and regular virtual machines? Availability! When you enable the Spot feature on your virtual machine, its availability can be interrupted. Microsoft will claim back the unused capacity of the hosts where your Spot-enabled virtual machines are running, so that customers who pay the full price can allocate that capacity for their virtual machines. Your Spot-enabled virtual machines will be stopped and deallocated.

What are the eviction options?

There are two eviction options when enabling Azure Spot on your virtual machine.

Capacity Only: Azure will only evict your virtual machine when capacity is needed. In this case your Spot virtual machine will be stopped (deallocated) so that regular virtual machines can start.

Price or Capacity: in this scenario, Azure evicts your virtual machine when the price of the Spot VM exceeds the maximum price you configured when creating the virtual machine, or when Azure needs compute capacity for regular virtual machines.
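
In Terraform, these two options come down to a single argument. The following is a minimal sketch, assuming the max_bid_price argument of the azurerm virtual machine resources: the provider default of -1 means the VM is not evicted for price reasons (Capacity Only), while a positive hourly price in US dollars gives the Price or Capacity behavior. The variable name spot_max_price is hypothetical and is not part of the module used later in this post.

```hcl
# Hypothetical input variable (not part of the original module).
variable "spot_max_price" {
  description = "Maximum hourly price in USD for the Spot VM, or -1 to evict on capacity only."
  type        = number
  default     = -1

  validation {
    condition     = var.spot_max_price == -1 || var.spot_max_price > 0
    error_message = "Use -1 for capacity-only eviction, or a positive hourly price in USD."
  }
}

# Inside the virtual machine resource, next to priority = "Spot":
#   max_bid_price = var.spot_max_price
```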

How to configure Spot VM?

You can enable the Spot feature when creating the virtual machine. You can do this through the Azure Portal, but also with Infrastructure as Code (IaC), for example ARM, Bicep or Terraform. In this blog post, I’m using Terraform to enable Azure Spot on my virtual machine.

You can configure one of two eviction policies: Stop/Deallocate or Delete. With Stop/Deallocate, the Spot-enabled virtual machine will be stopped and deallocated. You can manually start the VM again at any time. If unused compute capacity is available, the VM will start; otherwise your VM will not be started.

When you choose Delete, your VM will be stopped and deleted.
Important note! Only the machine configuration will be deleted; the disk(s) and network interface(s) are not. You will still be charged for the disk storage used.
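
The choice between the two policies can be exposed as a module input. This is a sketch only: the variable name spot_eviction_policy is hypothetical, and the two allowed values are the ones the azurerm provider accepts for its eviction_policy argument.

```hcl
# Hypothetical input variable (not part of the original module).
variable "spot_eviction_policy" {
  description = "What happens to the Spot VM on eviction: Deallocate or Delete."
  type        = string
  default     = "Deallocate"

  validation {
    condition     = contains(["Deallocate", "Delete"], var.spot_eviction_policy)
    error_message = "The eviction policy must be either Deallocate or Delete."
  }
}

# Inside the virtual machine resource:
#   priority        = "Spot"
#   eviction_policy = var.spot_eviction_policy
```

The validation block makes a typo fail at plan time rather than during deployment.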

What are the restrictions of Azure Spot virtual machines?

There are some restrictions on Azure Spot-enabled virtual machines.

Not all machine sizes are supported for Azure Spot

For example, the B-series VM sizes are excluded. Promo versions of any size (such as the NV, NC and H promo sizes) are excluded as well. You can find a complete overview of the supported machine series at the link below.

Azure Spot Virtual Machines – Pricing | Microsoft Azure

Subscription support

All free Microsoft Azure subscriptions are excluded from the Spot feature. Subscriptions linked to an MPN or Partner Agreement are excluded as well. Supported subscriptions include Enterprise Agreement, Pay-as-you-go (PAYG) and NCE/CSP (New Commerce Experience or Cloud Solution Provider).

Storage restrictions

Azure Spot-enabled virtual machines only support regular disk storage, not ephemeral disks (local storage).

What about converting a virtual machine?

It is not possible to convert a regular virtual machine to a Spot-enabled virtual machine, or the other way around. Instead, you can delete the virtual machine configuration, create a new virtual machine and attach the existing disks and network interface to the new machine.

Important note! Depending on your workload, there is a risk of interrupted availability. So in every scenario, please make the business case for using Azure Spot VM. For mission-critical workloads, you definitely do NOT want to use Azure Spot VM. But for a Dev/Test environment, where you want to test applications or software, Azure Spot VM is a pretty good fit. You get the benefits of low compute costs and the massive functionality of the Azure platform.

Let’s start configuring an Azure Spot VM using Terraform. In this scenario I have deployed two virtual machines: one regular virtual machine with Azure Hybrid Benefit (license) enabled, and one Azure Spot-enabled virtual machine with Azure Hybrid Benefit (license) enabled. Later in this blog we will look at the cost perspective and the differences in pricing.

resource "azurerm_windows_virtual_machine" "vm" {
  admin_username                  = var.vm_username
  # Password is read from Key Vault instead of var.vm_password
  admin_password                  = data.azurerm_key_vault_secret.vmpassword.value
  location                        = var.location
  name                            = var.vm_name
  network_interface_ids           = [azurerm_network_interface.netinterface.id]
  resource_group_name             = var.vm_rg_name
  license_type                    = "Windows_Server"
  secure_boot_enabled             = true
  provision_vm_agent              = true
  size                            = var.vm_size
  tags = {
    Environment                   = var.tag_environment
    Workload                      = var.tag_workload
  }
  timezone                        = var.vm_timezone
  vtpm_enabled                    = true
  zone                            = var.vm_avzone
  # Spot-specific settings: omit these two lines to deploy a regular virtual machine
  priority                        = "Spot"
  eviction_policy                 = "Deallocate"
  os_disk {
    name                          = "${var.vm_name}-osdisk"
    caching                       = "ReadWrite"
    storage_account_type          = var.vm_storage
  }
  source_image_reference {
    offer                         = var.offer
    publisher                     = var.publisher
    sku                           = var.sku
    version                       = "latest"
  }
  boot_diagnostics {
    storage_account_uri           = data.azurerm_storage_account.vm_bootdiag.primary_blob_endpoint
  }
}
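
The only Spot-specific arguments in the resource above are priority and eviction_policy. As a sketch of an alternative to keeping two variants of the code, a single boolean could switch both. The variable spot_enabled and the two locals are hypothetical names; "Regular" is the value the azurerm provider uses for a non-Spot virtual machine.

```hcl
# Hypothetical toggle (not part of the original module).
variable "spot_enabled" {
  description = "Deploy the virtual machine as an Azure Spot VM when true."
  type        = bool
  default     = false
}

locals {
  # "Regular" is the provider's value for a non-Spot virtual machine.
  vm_priority = var.spot_enabled ? "Spot" : "Regular"

  # eviction_policy is only valid for Spot VMs, so pass null (argument unset) otherwise.
  vm_eviction_policy = var.spot_enabled ? "Deallocate" : null
}

# Inside the virtual machine resource, replacing the two hard-coded lines:
#   priority        = local.vm_priority
#   eviction_policy = local.vm_eviction_policy
```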

I’m using exactly the same Terraform module for both virtual machines, but for the regular one I removed the priority and eviction_policy lines that enable Azure Spot. After deploying both virtual machines, this is the result.

The virtual machine sizes are exactly the same (Standard_D2as_v5), both with an OS disk of 127 GiB. The first VM has Azure Hybrid Benefit enabled. The second VM has Azure Hybrid Benefit enabled and Spot enabled with eviction policy Stop/Deallocate. Both servers run Windows Server 2022 Azure Edition and have Secure Boot, vTPM and Trusted Launch enabled.

After a few days, we can take a look in Azure Cost Management to see our scenario from a cost perspective. Here we can see the differences between a regular virtual machine and an Azure Spot-enabled virtual machine.

I’ve created two little dashboards within Cost Management. The first one (light-blue) is the compute cost of virtual machine ‘mss-hb-001’ with only Azure Hybrid Benefit enabled. The second one is the compute cost of virtual machine ‘mss-hbspot-001’ with Azure Hybrid Benefit and Spot enabled.

As you can see, the compute costs are much lower for the Spot-enabled virtual machine. In the next screenshots you can see that the pricing of the attached disks is the same for both virtual machines. This is because Azure Spot only reduces the compute costs.

Summary

Azure Spot VM is a great feature to enable in some cases. When using Infrastructure as Code, you can easily enable this feature with a couple of lines of code. Make sure you don’t enable this feature for mission-critical virtual machines or workloads, because their availability can be interrupted.

Thanks for reading this blog.