AWS Backup service: Pros & Cons
The AWS Backup service is a fully managed service to centralise and automate backups in AWS. The backup policies, permissions and tasks are consolidated into one place.
Customers commonly use AWS Backup for RDS (database) snapshots. When deciding whether to use the service, a common question is: what are the pros and cons of AWS Backup compared to just using native RDS snapshots?
Separate, locked down, permissions
The vault, which stores the RDS snapshots and other backups, has its own permissions model. This can be configured in its own AWS account. It is simpler to lock down and reason about access to the vault to prevent malicious deletion.
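As a rough illustration of what locking down the vault might look like, the sketch below builds a vault access policy that denies deletion of recovery points to every principal. The helper name and the vault name in the comment are made up for illustration, and a real policy would usually carve out exceptions (for example a break-glass role). The boto3 call is shown only as a comment so that the snippet runs without AWS credentials.

```python
import json


def build_vault_deny_delete_policy() -> str:
    """Return a vault access policy (as JSON) denying deletion of recovery points.

    Hypothetical helper for illustration only; adapt the statement to your needs.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyRecoveryPointDeletion",
                "Effect": "Deny",
                "Principal": "*",
                "Action": ["backup:DeleteRecoveryPoint"],
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


policy_json = build_vault_deny_delete_policy()
print(policy_json)

# With boto3 and credentials, the policy could then be attached to a vault
# ("prod-backup-vault" is an assumed name):
# boto3.client("backup").put_backup_vault_access_policy(
#     BackupVaultName="prod-backup-vault", Policy=policy_json)
```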
Ask yourself: “could someone delete my RDS database instance and all its backups?” If all the backups are stored in the same account in the RDS service, then it is harder to prevent deletion (both right now and in the future when someone might change the IAM permissions). One way is to use Service Control Policies to forbid deletion of RDS snapshots in the production account (or deletion of the KMS keys), but a malicious user could still try to delete all backups by changing the snapshot retention period.
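For comparison, a Service Control Policy of the kind mentioned above might look roughly like the following. This is a minimal sketch, not a complete control: the statement ID is invented, and it does nothing about the retention-period route described above.

```python
import json

# Hypothetical SCP: deny deleting RDS snapshots and scheduling KMS key deletion
# in the accounts it is attached to.
scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenySnapshotAndKeyDeletion",
            "Effect": "Deny",
            "Action": [
                "rds:DeleteDBSnapshot",
                "rds:DeleteDBClusterSnapshot",
                "kms:ScheduleKeyDeletion",
            ],
            "Resource": "*",
        }
    ],
}

scp_json = json.dumps(scp, indent=2)
print(scp_json)
```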
Particularly when a company is switching from on-prem or co-located to the cloud, it is mind-blowing that someone could delete all your backups with just a few API calls! Locking down the backups is vital.
Backup rotation schemes
AWS Backup supports more sophisticated backup rotation schemes, such as grandfather-father-son. Rather than a simple “keep for X days”, you could configure AWS Backup to keep daily backups for a week, weekly backups for a month, and monthly backups for a year.
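The daily/weekly/monthly scheme described above could be expressed as three rules in one backup plan. The sketch below only builds the dictionary that boto3's create_backup_plan accepts as its BackupPlan argument; the plan name, rule names, schedules and retention days are illustrative assumptions, not recommendations.

```python
def build_gfs_backup_plan(vault_name: str) -> dict:
    """Build a grandfather-father-son style plan definition (illustrative values)."""

    def rule(name: str, schedule: str, keep_days: int) -> dict:
        return {
            "RuleName": name,
            "TargetBackupVaultName": vault_name,
            "ScheduleExpression": schedule,  # AWS cron syntax, evaluated in UTC
            "Lifecycle": {"DeleteAfterDays": keep_days},
        }

    return {
        "BackupPlanName": "gfs-rotation",  # assumed name
        "Rules": [
            rule("daily-keep-1-week", "cron(0 4 * * ? *)", 7),
            rule("weekly-keep-1-month", "cron(0 4 ? * SUN *)", 31),
            rule("monthly-keep-1-year", "cron(0 4 1 * ? *)", 365),
        ],
    }


plan = build_gfs_backup_plan("prod-backup-vault")
for r in plan["Rules"]:
    print(r["RuleName"], r["ScheduleExpression"], r["Lifecycle"]["DeleteAfterDays"])

# With credentials: boto3.client("backup").create_backup_plan(BackupPlan=plan)
```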
Support for multiple services
AWS Backup can store backups from a range of AWS services, including RDS, DynamoDB, EFS (Elastic File System), EBS (Elastic Block Store) and Storage Gateway (but, at the time of writing, not S3).
This allows backups from multiple parts of a system to be stored and managed in a consistent way.
Large enterprises in particular want centralised management of backups, rather than individual application teams being responsible for their own backup strategy. This is important for compliance, risk management and consistent best practices across teams. The AWS Backup service supports this, thus simplifying backup management across your enterprise.
Where a disaster recovery strategy is needed cross-region, AWS Backup can play an important part. You can copy backups to multiple AWS Regions automatically or on-demand.
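In a backup plan, an automatic cross-region copy is configured per rule through a CopyActions entry that points at a vault in the destination Region. A minimal sketch, assuming a rule dictionary in the shape used by create_backup_plan; the helper name, the ARN and the retention value are placeholders.

```python
def add_cross_region_copy(rule: dict, destination_vault_arn: str, keep_days: int) -> dict:
    """Return a copy of a backup rule with one extra CopyActions entry."""
    updated = dict(rule)
    updated["CopyActions"] = list(rule.get("CopyActions", [])) + [
        {
            "DestinationBackupVaultArn": destination_vault_arn,
            "Lifecycle": {"DeleteAfterDays": keep_days},
        }
    ]
    return updated


daily_rule = {
    "RuleName": "daily",
    "TargetBackupVaultName": "prod-backup-vault",  # assumed vault name
    "ScheduleExpression": "cron(0 4 * * ? *)",
}
# Placeholder ARN for a vault in a second Region.
dr_vault_arn = "arn:aws:backup:eu-west-1:111122223333:backup-vault:dr-vault"
rule_with_copy = add_cross_region_copy(daily_rule, dr_vault_arn, keep_days=7)
print(rule_with_copy["CopyActions"])
```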
As per the AWS Backup pricing, you pay based on storage used. For Aurora, it is $0.021 per GB-Month; for other RDS snapshots it is $0.095 per GB-Month.
This means that the first full backup of a 500GB (non-Aurora) RDS instance would cost $47.50 per month to store (500 × $0.095).
Subsequent backups are cheaper because they are incremental - only the changed parts of the data will be saved and charged for. Therefore the total cost depends on how much your data changes over the lifetime of the backups.
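To get a feel for the numbers, here is a deliberately simplified estimate: one full backup plus a fixed amount of changed data for each further retained daily backup. The prices are the ones quoted above; the function, its parameters and the "fixed daily change" model are assumptions made for illustration, and real bills depend on how the data actually changes.

```python
PRICE_AURORA_PER_GB_MONTH = 0.021  # as quoted above
PRICE_RDS_PER_GB_MONTH = 0.095     # as quoted above


def monthly_storage_cost(full_gb: float, daily_change_gb: float,
                         retained_backups: int, price_per_gb_month: float) -> float:
    """Estimate monthly storage cost: one full backup plus incremental changes."""
    incremental_backups = max(retained_backups - 1, 0)
    stored_gb = full_gb + daily_change_gb * incremental_backups
    return stored_gb * price_per_gb_month


# A single full backup of a 500GB instance.
print(round(monthly_storage_cost(500, 0, 1, PRICE_RDS_PER_GB_MONTH), 2))
# Seven retained daily backups with roughly 10GB of change per day.
print(round(monthly_storage_cost(500, 10, 7, PRICE_RDS_PER_GB_MONTH), 2))
```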
No improvement to Recovery Point Objective (RPO)
For disaster recovery in RDS, I’d recommend that you prefer RDS snapshots, and only resort to backups in the AWS Backup vault when other backups are not available. This has been confirmed with AWS Support as a good strategy.
The Recovery Point Objective (RPO) is the maximum amount of data that could be lost during a disaster, usually measured in minutes or hours. To keep this number low, use RDS point-in-time recovery - transaction logs for DB instances are uploaded to S3 behind-the-scenes every 5 minutes. If that is not available, use RDS snapshots - if you want better than 24 hours RPO, then use a scheduled Lambda function to automatically create “manual” snapshots at the required frequency.
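A scheduled Lambda of that kind could be as small as the sketch below. It assumes the schedule passes the instance name in the event under a "db_instance_id" key (a made-up convention), and it does not prune old snapshots, which a real function would also need to do. The RDS client is passed in so the helper can be exercised without AWS access.

```python
from datetime import datetime, timezone
from typing import Optional


def snapshot_identifier(db_instance_id: str, now: datetime) -> str:
    """Build a timestamped snapshot name, e.g. orders-db-manual-20240102-0304."""
    return f"{db_instance_id}-manual-{now:%Y%m%d-%H%M}"


def create_manual_snapshot(rds_client, db_instance_id: str,
                           now: Optional[datetime] = None) -> str:
    """Create a manual RDS snapshot and return its identifier."""
    now = now or datetime.now(timezone.utc)
    snapshot_id = snapshot_identifier(db_instance_id, now)
    rds_client.create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id,
        DBInstanceIdentifier=db_instance_id,
    )
    return snapshot_id


def lambda_handler(event, context):
    # Imported here so the helpers above can be tested without boto3 installed.
    import boto3

    return create_manual_snapshot(boto3.client("rds"), event["db_instance_id"])
```

The handler would typically be triggered by a scheduled rule at whatever frequency the required RPO dictates.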
In AWS Backup, you cannot specify or estimate the exact time it will take to create the backup. I’ve seen it frequently take 4 hours to make a backup available. For a lot of that time, the API reports PercentDone=0 meaning the backup has not started or backed up any data. With a 24 hour schedule, that means the RPO could in fact be 28 hours or more.
The API configuration for a backup rule allows you to specify the ScheduleExpression (e.g. 4am each day), the StartWindowMinutes for when a scheduled job should be cancelled if it hasn’t started in time, and the CompletionWindowMinutes for when a started job should be cancelled if it hasn’t completed in time. This allows you to alert if backups have not been created in a timely manner.
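One way to act on this is to inspect recent backup jobs and flag any that ended badly or took longer than expected. The sketch below works on plain dictionaries shaped like the entries I would expect from boto3's list_backup_jobs (BackupJobId, State, CreationDate, CompletionDate); the set of "bad" states, the threshold and the sample data are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# States treated as failures in this sketch (an assumption, adjust as needed).
FAILED_STATES = {"ABORTED", "FAILED", "EXPIRED"}


def find_problem_jobs(jobs, max_age: timedelta, now: datetime):
    """Return (job id, reason) pairs for jobs that failed or took too long."""
    problems = []
    for job in jobs:
        state = job["State"]
        finished = job.get("CompletionDate")
        # Unfinished jobs are measured against the current time.
        elapsed = (finished or now) - job["CreationDate"]
        if state in FAILED_STATES:
            problems.append((job["BackupJobId"], f"ended in state {state}"))
        elif elapsed > max_age:
            problems.append((job["BackupJobId"], f"{state} after {elapsed}"))
    return problems


start = datetime(2024, 1, 1, 4, 0, tzinfo=timezone.utc)
sample_jobs = [
    {"BackupJobId": "a", "State": "COMPLETED", "CreationDate": start,
     "CompletionDate": start + timedelta(hours=1)},
    {"BackupJobId": "b", "State": "COMPLETED", "CreationDate": start,
     "CompletionDate": start + timedelta(hours=4)},
    {"BackupJobId": "c", "State": "EXPIRED", "CreationDate": start},
    {"BackupJobId": "d", "State": "RUNNING", "CreationDate": start},
]
problems = find_problem_jobs(sample_jobs, timedelta(hours=2), start + timedelta(hours=3))
for job_id, reason in problems:
    print(job_id, reason)
```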
Setting up and configuring AWS Backup will obviously require work, as will writing disaster recovery runbooks, training, and running disaster recovery tests.
However, care should be taken that we are comparing like-for-like. The work to configure AWS Backup can be reused by multiple teams in the organisation, particularly if CloudFormation or Terraform is used for automation. Also, if an application team has to figure out and implement their own backup strategy for each of RDS, EFS, EBS, DynamoDB, etc, then the total effort may be more than when using AWS Backup.
AWS Backup is a very useful service for many use cases - the advantages often outweigh the disadvantages.
For large enterprises with many applications, the benefits of centralised control and consistency are extremely compelling. For mid-sized and smaller companies, the compelling argument is often the ability to lock down backups (think: could someone delete my RDS database instance and all its backups?). For applications with persisted state in many services, the use of AWS Backup brings a consistent approach.
If you’d like to talk more and get help with AWS Backup, please reach out to Cloudsoft for a free consultation with a cloud expert.