gp3 For AWS EBS Volumes: Why, When and How To Switch.

January 8, 2021

Aled Sage

During re:Invent 2020, AWS announced the next-generation general purpose SSD volume for EBS: gp3.

If you’re using EC2 instances, then you should seriously consider switching to gp3 for your EBS volumes. It will save you about 20% on your EBS bill and, in many cases, improve performance.

Importantly, it’s simple to change - you can do it to a running instance.

What are we talking about?

EBS (Elastic Block Store) volumes are the disks you attach to EC2 instances. They give you block level storage volumes that can also persist independently from the life of the instance. For many VMs, this is the type of the root volume and of any other attached disks.

EBS offers different volume types - gp2 was the default and often the best choice. It’s a general purpose SSD that performs well for workloads with random IO (as opposed to long sequential reads/writes). The new gp3 is the next generation of this general purpose volume type. As such it will replace gp2. And unlike switching EC2 instance types, it’s super-easy and low risk to change.

When to change

When it comes to cost and performance, gp3 is better than gp2 in every way - so definitely worth changing, but see the limitations section. The gp3 volume type is also better than io1 and io2 for some use-cases.

Performance Comparison

The gp3 baseline is a consistent 3,000 IOPS and 125 MiB/s throughput, regardless of volume size. You can also pay extra to increase each of those parameters independently, up to 16,000 IOPS and 1,000 MiB/s. In comparison, gp2 gives you 3 IOPS/GB (minimum 100 IOPS) and provides up to 250 MiB/s of throughput depending on the volume size and burst credits.
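To make the comparison concrete, here is a minimal Python sketch of the baseline IOPS for each volume type at a given size. It uses the figures quoted above. The 16,000 IOPS ceiling for gp2 is an assumption taken from AWS's documentation rather than from this article, and the function names are purely illustrative.

```python
GP2_IOPS_PER_GB = 3        # from the comparison above
GP2_MIN_IOPS = 100         # from the comparison above
GP2_MAX_IOPS = 16_000      # assumption: gp2 ceiling as documented by AWS
GP3_BASELINE_IOPS = 3_000  # flat baseline, independent of size


def gp2_baseline_iops(size_gb: int) -> int:
    """Baseline (non-burst) IOPS for a gp2 volume of the given size."""
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, GP2_IOPS_PER_GB * size_gb))


def gp3_baseline_iops(size_gb: int) -> int:
    """Baseline IOPS for a gp3 volume; the size does not matter."""
    return GP3_BASELINE_IOPS


if __name__ == "__main__":
    for size_gb in (8, 100, 1000, 2000):
        print(f"{size_gb:>5} GB  gp2={gp2_baseline_iops(size_gb):>6}  "
              f"gp3={gp3_baseline_iops(size_gb):>6}")
```

Under these assumptions, gp2 only overtakes the gp3 baseline once the volume is larger than 1,000 GB.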

With gp3, we no longer worry about burst credits - we get consistent performance. With gp2, burst credits allowed the volume to burst to 3000 IOPS for 30 minutes or so, before being throttled to the base level. This was great for spiky loads, but also tripped people up where load tests passed because they weren’t run for long enough, and then hit performance issues in production.
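The "30 minutes or so" figure can be estimated with a small sketch. It assumes the gp2 credit model as I understand it from the AWS documentation (not stated in this article): a full bucket holds 5.4 million I/O credits, bursting spends credits at 3,000 IOPS, and the bucket refills at the volume's baseline rate. The function name is hypothetical.

```python
GP2_BURST_IOPS = 3_000
GP2_CREDIT_BUCKET = 5_400_000  # assumption: full I/O credit bucket per AWS docs


def gp2_burst_minutes(size_gb: int):
    """Minutes a gp2 volume can sustain 3,000 IOPS from a full credit bucket.

    Returns None when the baseline is already 3,000 IOPS or more,
    because such a volume is never throttled below the burst rate.
    """
    baseline = max(100, min(16_000, 3 * size_gb))
    if baseline >= GP2_BURST_IOPS:
        return None
    # Credits drain at the burst rate but refill at the baseline rate.
    net_drain_per_second = GP2_BURST_IOPS - baseline
    return GP2_CREDIT_BUCKET / net_drain_per_second / 60


if __name__ == "__main__":
    for size_gb in (8, 100, 500, 1000):
        print(size_gb, "GB:", gp2_burst_minutes(size_gb))
```

With these numbers a 100 GB volume lasts roughly 33 minutes at full burst, so a load test shorter than that would never see the throttled baseline.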

The ability to set IOPS and throughput independently means it is better than io1 and io2 for some use-cases: you can provision moderately high-performance volumes more cheaply. However, you can’t go as high (e.g. io1 and io2 can go up to 64,000 IOPS, and io2 Block Express can go up to 256,000 IOPS but that is still in preview). Another advantage of io2 is better reliability: gp2 and gp3 are designed to provide 99.8% - 99.9% durability (i.e. an annual failure rate of 0.1% - 0.2%), compared to 99.999% durability for io2.

Cost Comparison

Cost is the other big improvement. The base-line cost for gp3 is 20% less than for gp2 (e.g. in us-east-1 the cost is $0.08 per GB per month, compared to $0.10 for gp2; prices in many other regions are slightly higher, but still give the 20% saving). You can pay extra to increase performance: $0.005 per IOPS per month over 3,000; and for throughput $0.04 per MiB/s per month over 125 MiB/s. Francisco Gimeno’s blog shows some nice graphs of percentage saving for different volume sizes.

To compare like-for-like with gp2, you’d have to explicitly set a higher IOPS for gp3 on volumes above 1,000 GB (where gp2’s 3 IOPS/GB exceeds the 3,000 IOPS gp3 baseline), and a higher throughput on volumes above 334 GB. However, the cost comparison should really be for what your use-case actually needs, tweaking the performance accordingly: not all large volumes need higher performance, while some small volumes do.
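As a rough illustration of the like-for-like comparison, the sketch below works out the gp3 settings needed to match a gp2 volume of a given size, and what each would cost per month at the us-east-1 prices quoted above. It simplifies gp2 throughput to 250 MiB/s above 334 GB and 125 MiB/s otherwise, and assumes a 16,000 IOPS cap for both types. The function names are illustrative only.

```python
GP2_PRICE_PER_GB = 0.10
GP3_PRICE_PER_GB = 0.08
GP3_PRICE_PER_EXTRA_IOPS = 0.005   # per IOPS above 3,000
GP3_PRICE_PER_EXTRA_MIBPS = 0.04   # per MiB/s above 125


def gp2_monthly_cost(size_gb: int) -> float:
    return GP2_PRICE_PER_GB * size_gb


def gp3_like_for_like(size_gb: int):
    """Return (iops, throughput_mibps, monthly_cost) for a gp3 volume
    configured to match a gp2 volume of the same size."""
    iops = max(3_000, min(16_000, 3 * size_gb))
    throughput = 250 if size_gb > 334 else 125  # simplification, see lead-in
    cost = (GP3_PRICE_PER_GB * size_gb
            + GP3_PRICE_PER_EXTRA_IOPS * (iops - 3_000)
            + GP3_PRICE_PER_EXTRA_MIBPS * (throughput - 125))
    return iops, throughput, cost


if __name__ == "__main__":
    for size_gb in (100, 500, 2000):
        iops, throughput, gp3_cost = gp3_like_for_like(size_gb)
        print(f"{size_gb} GB: gp2 ${gp2_monthly_cost(size_gb):.2f}, "
              f"gp3 ${gp3_cost:.2f} ({iops} IOPS, {throughput} MiB/s)")
```

For example, matching a 2,000 GB gp2 volume needs 6,000 IOPS and 250 MiB/s on gp3, which comes to about $180 per month against $200 for gp2.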

Compared to io1 and io2, the cost savings of gp3 can be huge. The base cost of io2 in us-east-1 is $0.125/GB-month plus $0.065 per IOPS per month (for the first 32,000 IOPS, less after that). For a 100GB volume with 16,000 IOPS and 256 MiB/s, gp3 would cost $78 per month (i.e. 0.08*100 + 0.005 * (16000-3000) + 0.04 * (256-125)), and io2 would cost $1052 per month (i.e. 0.125*100 + 0.065*16000; but an oversized 5TB gp2 would also have been a lot cheaper than io2 at $533).
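The arithmetic above can be reproduced with a short sketch. The prices are the us-east-1 figures quoted in this article. The io2 formula only applies up to 32,000 IOPS, and the helper names are made up for this example.

```python
import math


def gp3_monthly_cost(size_gb, iops=3_000, throughput_mibps=125):
    """gp3: storage plus any IOPS above 3,000 and throughput above 125 MiB/s."""
    return (0.08 * size_gb
            + 0.005 * max(0, iops - 3_000)
            + 0.04 * max(0, throughput_mibps - 125))


def io2_monthly_cost(size_gb, iops):
    """io2: storage plus every provisioned IOPS (first pricing tier only)."""
    if iops > 32_000:
        raise ValueError("this sketch only covers the first 32,000 IOPS")
    return 0.125 * size_gb + 0.065 * iops


def gp2_monthly_cost_for_iops(iops):
    """gp2 oversized until 3 IOPS/GB reaches the requested IOPS."""
    size_gb = math.ceil(iops / 3)
    return 0.10 * size_gb


if __name__ == "__main__":
    print(f"gp3: ${gp3_monthly_cost(100, 16_000, 256):.2f}")
    print(f"io2: ${io2_monthly_cost(100, 16_000):.2f}")
    print(f"gp2: ${gp2_monthly_cost_for_iops(16_000):.2f}")
```

Running it gives about $78 for gp3, $1,052 for io2 and $533 for the oversized gp2 volume, matching the figures above.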

How to change

First, think about the IOPS and throughput. For small volumes (below 1TB) the defaults are usually fine. If changing from larger gp2 volumes, do you need to match the higher default performance that gp2 offers, or will a lower number do?

In most cases, it’s simple to change a volume to gp3. The change can be made to a volume while it is in active use. It takes a few clicks, or one property in your infrastructure-as-code!

Changing via the AWS Console

In the AWS console, navigate to the EC2 service and find the volume. Click Actions -> Modify Volume, and choose the new volume type:

Modifying via the AWS console

The confirmation dialog warns that “it may take some time for performance changes to take effect.” However, it seemed fast for me (for an admittedly small 8GB volume). The EC2 console’s Volumes listing showed gp3 within seconds, though the volume’s state showed “in-use - optimizing (..%)” for about 5 minutes. When running benchmarking tests with fio on the volume during the change, I saw the IOPS performance improve within seconds. With a larger 1000GB production volume, there was no noticeable interruption to service but it took 39 hours to go from the “optimizing” state to “completed”!

One thing to be careful of is the rollback strategy for your change. If you reach the “maximum volume modification rate per volume limit”, you’ll have to wait at least six hours before changing the volume type again. For me, this meant I could only change the volume type once so any rollback would be delayed for six hours!
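When planning a rollback window, it can help to work out the earliest time a second modification could be attempted. The sketch below is a simplification: it assumes the six-hour wait is counted from the start of the previous modification, and the helper names are hypothetical. The volume also needs to be in a state that allows modification, which this does not check.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the wait described above, counted from the previous change.
MODIFICATION_COOLDOWN = timedelta(hours=6)


def earliest_next_modification(last_modified_at: datetime) -> datetime:
    """Earliest time another volume modification could be requested."""
    return last_modified_at + MODIFICATION_COOLDOWN


def can_modify(last_modified_at: datetime, now: datetime) -> bool:
    """True once the cooldown since the last modification has elapsed."""
    return now >= earliest_next_modification(last_modified_at)


if __name__ == "__main__":
    changed_at = datetime(2021, 1, 8, 10, 0, tzinfo=timezone.utc)
    print("Rollback possible from:", earliest_next_modification(changed_at))
```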

maximum modification rate error

AWS CLI

Using the AWS CLI, you can change the volume type with a command like that below:

aws ec2 modify-volume --volume-id vol-XXXXXXXXX --volume-type gp3

To check progress of the modification (e.g. percent complete, and start/end times), use the command below:

aws ec2 describe-volumes-modifications --volume-ids vol-XXXXXXXXX
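If you want to script the progress check, the JSON printed by that command can be summarised with a few lines of Python. This is only a sketch: the sample output below is hypothetical (not captured from a real volume) and includes only a subset of the fields, and the function name is made up.

```python
import json

# Hypothetical sample of the command's JSON output, trimmed to a few fields.
SAMPLE_OUTPUT = """
{
  "VolumesModifications": [
    {
      "VolumeId": "vol-XXXXXXXXX",
      "ModificationState": "optimizing",
      "OriginalVolumeType": "gp2",
      "TargetVolumeType": "gp3",
      "Progress": 42,
      "StartTime": "2021-01-08T10:00:00.000Z"
    }
  ]
}
"""


def summarize_modifications(cli_output: str) -> list:
    """Turn the CLI's JSON output into one summary line per modification."""
    data = json.loads(cli_output)
    lines = []
    for mod in data.get("VolumesModifications", []):
        lines.append(
            f"{mod['VolumeId']}: {mod.get('OriginalVolumeType')} -> "
            f"{mod.get('TargetVolumeType')}, {mod.get('ModificationState')} "
            f"({mod.get('Progress', 0)}%)"
        )
    return lines


if __name__ == "__main__":
    for line in summarize_modifications(SAMPLE_OUTPUT):
        print(line)
```

In practice you would pipe the real command output into the function instead of using the sample string.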

Terraform

If you use Terraform to provision and manage EBS volumes, changing the type is as simple as changing the string from gp2 to gp3.

Below is some example Terraform showing how the volume type is specified. It shows two ways of defining the volume: inline in the aws_instance, and as a separate volume that is then attached.

Note that changing the root block device from gp2 to gp3 worked for me, without needing to replace the instance (despite the Terraform documentation suggesting otherwise).

provider "aws" {
  version = "~> 3.22.0"
  region  = "eu-central-1"
}

# Option 1: root volume defined inline in the instance
resource "aws_instance" "instance_with_volume_inline" {
  ami           = "ami-0bd39c806c2335b95"
  instance_type = "t3.medium"
  subnet_id     = "subnet-xxxxxxxx"

  root_block_device {
    volume_type = "gp3" # Easy to change
    volume_size = 8
  }
}

# Option 2: separate volume, attached to the instance afterwards
resource "aws_instance" "instance_with_volume_separate" {
  ami           = "ami-0bd39c806c2335b95"
  instance_type = "t3.medium"
  subnet_id     = "subnet-xxxxxxxx"
}

resource "aws_ebs_volume" "volume" {
  availability_zone = "eu-central-1a"
  size              = 8
  type              = "gp3" # Easy to change
}

resource "aws_volume_attachment" "this_ec2" {
  device_name = "/dev/sdh"
  volume_id   = aws_ebs_volume.volume.id
  instance_id = aws_instance.instance_with_volume_separate.id
}

CloudFormation

If using CloudFormation, be sure to upgrade cfn-lint (otherwise it will say that gp3 is an invalid value for your VolumeType). If you’re not using cfn-lint when writing CloudFormation, I’d highly recommend it!

Below is an example CloudFormation template - the change is as simple as setting the VolumeType to gp3. However, there are a few things to note:

If you have specified your root volume inline in the BlockDeviceMappings, then changing it from gp2 to gp3 will cause the instance to be replaced, with a new root volume - so avoid that if you don’t want to cause an outage and lose any changes made to your root volume!
Applying the CloudFormation change waits for the volume modification to complete (see the AWS CLI section for how to query this). For an 8GB volume it consistently took about 5 mins 30 secs to complete the CloudFormation change.

If changing the volume type multiple times (e.g. to roll back), it gets extremely slow: it took about 6 hours to complete the second CloudFormation change (with the volume unchanged until near the end). Presumably CloudFormation was waiting due to the “maximum volume modification rate per volume limit”. The stack was in the UPDATE_IN_PROGRESS state all this time, meaning no other modifications could be made (unless you resort to the “cancel update stack” action) - not good!

AWSTemplateFormatVersion: "2010-09-09"
Resources:
  InstanceWithVolumeInline:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: "ami-0bd39c806c2335b95"
      InstanceType: "t3.medium"
      SubnetId: "subnet-xxxxxxxx"
      BlockDeviceMappings:
        - DeviceName: "/dev/xvda"
          Ebs:
            DeleteOnTermination: true
            VolumeSize: 8
            VolumeType: "gp3" # Don't change this, it replaces the VM

  InstanceWithVolumeSeparate:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: "ami-0bd39c806c2335b95"
      InstanceType: "t3.medium"
      SubnetId: "subnet-xxxxxxxx"

  EbsVolume:
    Type: AWS::EC2::Volume
    Properties:
      AvailabilityZone: "eu-central-1a"
      Size: 8
      VolumeType: "gp3" # Easy to change

  VolumeAttachment:
    Type: AWS::EC2::VolumeAttachment
    Properties:
      Device: "/dev/sdh"
      VolumeId: !Ref EbsVolume
      InstanceId: !Ref InstanceWithVolumeSeparate

Limitations and Gotchas

There were a few limitations reported for gp3, which might mean you need to wait before adopting gp3 for some use cases. Thanks to @QuinnyPig for the Twitter thread on this topic!

If you’re using Auto-Scaling Groups with Launch Configurations, then gp3 is not for you (yet). Launch Configurations do not support gp3 or io2. However, you can use gp3 with Launch Templates.

When modifying old VMs, there is a worrying-looking warning that the volume attachment may not support live volume modifications (the documentation links in the screenshot are to Linux and Windows). However, testing this on an old m4.large running Ubuntu 14.04 that had been up for over 4 years (!), the change worked perfectly.

A few other reported issues seem to be fixed now, based on my testing:

There was an error in pricing forecasts when using gp3. Early adopters saw projected bills with gp3 about a million times higher than they should have been.
AMI creation failed if the root volume was gp3. The workaround was to convert to gp2 and then create the AMI. However, creating an AMI from a gp3 root volume worked for me when testing in eu-central-1.
The I/O performance status check reported incorrectly for gp3 volumes. It was reporting no-data-available, but should have said “Not Applicable”. However, I didn’t see this during my testing so presume it is fixed.
Users reported errors in CloudFormation when launching EC2 instances with a gp3 volume, if the throughput parameter was not explicitly set. However, this worked for me using the CloudFormation shown earlier.

Conclusions

If you are using EC2 instances, then you should seriously think about switching to gp3 volume types. If you’re using gp2 (i.e. the previous generation of general purpose SSD) then it will always save you money (usually about 20%) and will improve or match performance. If you’re using io1 or io2 with 16,000 IOPS or lower, then you could save huge amounts.

Making the change is straightforward, so you can start saving on your AWS costs immediately!
