One of the major challenges in the Cloud is controlling costs effectively. With an on-premise data center, costs are planned in advance and are usually predictable. Cloud platforms, on the other hand, usually have no upfront cost: you pay as you use. While the Cloud can be cost effective in the long run, there are elements of uncertainty that can lead to an unpleasant surprise if Cloud resources are not planned and used judiciously. The objective of this article is to show how to set up and operate non-critical database workloads while keeping their costs within reasonable limits.
This article assumes that the reader is familiar with AWS and database concepts and terminology. While it uses PostgreSQL as an example, the same approach can be applied to other Database Engines as well.
There are various ways to set up a relational database in AWS. The cost depends on how you set it up and what Database Engine you use. Note that the techniques discussed here are applicable only to non-critical workloads where high-availability, automatic failover, etc. are not required. The topics discussed here may not be relevant to critical workloads which need to be always on due to continuous development, deployment or testing.
AWS provides different options to set up a Relational Database, and the available options depend on the Database Engine you choose:
- Aurora
- Relational Database Service (RDS)
- Databases hosted on EC2 Virtual Machines
Aurora
Aurora is Amazon’s own fully managed Database Engine, compatible with either PostgreSQL or MySQL. It is a drop-in replacement for the respective database engines and adds many features on top of what the underlying Engines provide.
Of all the available options, Aurora provides the most features and is the most expensive. It provides built-in high availability by keeping two copies of your data in each of three Availability Zones (AZs) within the region. That makes six copies in total, so the data remains available even if an entire AZ goes down. When the primary goes down, Aurora fails over automatically to a secondary replica with zero data loss. The failover is transparent and client applications can continue to access the database with the same endpoint. Aurora supports up to 15 read replicas with replication latency in the order of milliseconds. The primary writer instance and the read replicas share the same underlying storage, so no data has to be copied between instances. This makes it a very good solution for read heavy workloads.
Aurora is well suited for Production workloads with variable load as well as continuous peak loads, where high availability is needed. It may not be suited for non-critical workloads due to its high cost.
Relational Database Service (RDS)
RDS is an AWS service that makes it easy to set up and operate a Relational Database in AWS at the click of a button. It supports different Database Engines like SQL Server, Oracle, PostgreSQL, MySQL and MariaDB. Unlike Aurora, a Single-AZ RDS instance does not provide high availability or automatic failover; these can be added with the Multi-AZ deployment option at additional cost, or set up manually. It comes at a relatively lower cost compared to Aurora for the same configuration.
RDS suits Production workloads with variable load as well as continuous peak loads, where high availability is not a major concern and downtime of a few minutes to a few hours is acceptable. Note that you can still set up high availability manually by keeping a cold or warm standby instance and failing over to it when the primary instance goes down. Such manual failovers may result in data loss.
RDS is only a little less expensive than Aurora and therefore may not be the right choice for non-critical workloads. It doesn’t provide any added benefits compared to Databases hosted on EC2 other than the ease with which you create and manage the database. RDS supports up to 5 read replicas, and the replication latency is a little higher than that of Aurora.
Databases hosted on EC2 Virtual Machines
While the whole point of using a Cloud Database is to take advantage of the simplified set up and operation provided by Cloud platforms, there may be times when this is overkill and does not provide a reasonable cost-benefit ratio. In such cases, you can spin up EC2 instances and manually set up and administer the databases. EC2 Virtual Machines are very cost effective when compared to Aurora and RDS. Examples of such workloads are non-critical databases which don’t need to be always on, databases that are used for development purposes, and Staging, Pre-Production and UAT databases that are used only for testing and validation.
If you are using open source databases like PostgreSQL, MySQL or MariaDB, then there is no need to worry about licensing. However, if you are using proprietary databases like Oracle or SQL Server, you can use the Express editions which are free to use but come with database size limitations. Alternatively, you can also use Licensed versions of these databases if you already own the licenses.
Hosting Databases on EC2 also allows you to use command line tools, plugins, etc. that are not fully available on Aurora and RDS. It also gives you complete control over the database and the underlying operating system. This method can also be used to host databases that are not supported by Aurora and RDS, for example MongoDB, Couchbase and Hazelcast.
Caveats
Hosting Databases on EC2 requires that you have a Database Administrator to set up, configure, monitor and maintain the Databases, whereas Aurora and RDS provide GUI tools to create and manage Databases at the click of a button. Since this article is specifically focused on non-critical workloads, this might not be an important point of consideration.
If you are using proprietary Database Engines like Oracle and SQL Server, you may have to incur additional licensing costs which are not covered here. Alternatively, you can use the free Express editions of these databases to avoid licensing costs. Express editions come with certain limitations in terms of storage and performance, but should be enough for non-critical workloads. RDS offers license-included pricing for proprietary Database Engines, so you don’t have to manage the license separately.
Cost Comparison
Let’s take an indicative price comparison for a PostgreSQL Database (US East – N. Virginia) as of April 2020. These are On-Demand prices; Reserved prices are considerably lower than the ones mentioned here. Note that there are other charges like Storage, Data transfer, etc. The comparisons are only indicative and may change drastically depending on how the services are used.
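Note: Prices mentioned are for the EC2 r5 and RDS db.r5 Instance Families, except for the 2 CPU, 4 GB RAM row, whose prices match the smaller t3.medium and db.t3.medium instance types.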
Configuration | Aurora | RDS (Single AZ) | EC2 (Linux) |
---|---|---|---|
2 CPU, 4 GB RAM | $0.082/hr | $0.072/hr | $0.0416/hr |
4 CPU, 32 GB RAM | $0.58/hr | $0.50/hr | $0.252/hr |
16 CPU, 128 GB RAM | $2.32/hr | $2.00/hr | $1.008/hr |
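To get a feel for what these hourly rates mean for an always-on instance, they can be converted to a monthly figure. The following is a minimal sketch, assuming 730 hours per month (8760 hours in a year divided by 12); the function and variable names are illustrative, and the result covers the instance charge only, not Storage or Data transfer.

```python
# Assumption: an "average" month of 730 hours (8760 hours / 12 months).
HOURS_PER_MONTH = 730

# On-Demand hourly prices in USD, copied from the table above.
HOURLY_PRICES = {
    "2 CPU, 4 GB RAM": {"Aurora": 0.082, "RDS (Single AZ)": 0.072, "EC2 (Linux)": 0.0416},
    "4 CPU, 32 GB RAM": {"Aurora": 0.58, "RDS (Single AZ)": 0.50, "EC2 (Linux)": 0.252},
    "16 CPU, 128 GB RAM": {"Aurora": 2.32, "RDS (Single AZ)": 2.00, "EC2 (Linux)": 1.008},
}


def monthly_cost(hourly_price, hours=HOURS_PER_MONTH):
    """Instance-only monthly cost in USD for a given hourly price."""
    return round(hourly_price * hours, 2)


def monthly_table(prices=HOURLY_PRICES):
    """Convert every hourly price in the table to a monthly instance cost."""
    return {
        config: {option: monthly_cost(price) for option, price in row.items()}
        for config, row in prices.items()
    }


if __name__ == "__main__":
    for config, row in monthly_table().items():
        print(config, row)
```

Passing a smaller `hours` value to `monthly_cost` gives a rough idea of what an instance costs if it only runs during working hours.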
Cost Exercise
Let’s calculate the monthly costs for roughly equivalent configurations for the three approaches using AWS Pricing Calculator.
Configuration | Aurora | RDS (Single AZ) | EC2 (Linux) |
---|---|---|---|
Instance Type | db.r5.xlarge (4 CPU, 32 GB Memory) | db.r5.xlarge (4 CPU, 32 GB Memory) | r5.xlarge (4 CPU, 32 GB Memory) |
Storage | 100 GB | 100 GB | 100 GB (EBS) |
Backup Storage | 20 GB | 20 GB | 20 GB |
Data Transfer | 10 GB | 10 GB | 10 GB |
On-Demand Price | $434 | $378 | $175 |
Standard Reserved Price (1 year – no upfront) | $288 | $224 | $114 |
Spot Instance | N/A | N/A | $49 |
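The relative savings implied by the table above can be worked out with a few lines of arithmetic. This is a small sketch using only the monthly figures from the table; the helper name `savings_percent` is illustrative.

```python
# Monthly costs in USD, copied from the Cost Exercise table above.
ON_DEMAND = {"Aurora": 434, "RDS (Single AZ)": 378, "EC2 (Linux)": 175}
RESERVED = {"Aurora": 288, "RDS (Single AZ)": 224, "EC2 (Linux)": 114}
EC2_SPOT = 49


def savings_percent(baseline, alternative):
    """Percentage saved by paying `alternative` instead of `baseline`."""
    return round((baseline - alternative) / baseline * 100, 1)


if __name__ == "__main__":
    for label, prices in (("On-Demand", ON_DEMAND), ("Reserved", RESERVED)):
        ec2 = prices["EC2 (Linux)"]
        for option in ("Aurora", "RDS (Single AZ)"):
            print(f"{label}: EC2 vs {option}: {savings_percent(prices[option], ec2)}% saved")
    # Spot compared with the On-Demand price of each option.
    for option, price in ON_DEMAND.items():
        print(f"Spot vs On-Demand {option}: {savings_percent(price, EC2_SPOT)}% saved")
```

With these figures, EC2 works out roughly 49% to 60% cheaper than RDS or Aurora at the same pricing model, and Spot comes out about 72% below On-Demand EC2 and close to 89% below On-Demand Aurora.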
Summary
As you can see, hosting on EC2 VMs costs roughly 50 to 60% less than Aurora or RDS in this exercise. If your Databases need not be available all the time, you can opt for Spot Instances, which can save up to 90% of your overall costs but can be reclaimed by AWS at short notice. Note that these comparisons are very approximate and may vary drastically in real world usage.
Choose Aurora if:
- Your Database Engine is PostgreSQL or MySQL.
- You don’t have a dedicated DBA.
- Your Database needs to be always on and no downtime is acceptable during an outage.
Choose RDS if:
- Your Database Engine is PostgreSQL, MySQL, MariaDB, Oracle or SQL Server.
- You don’t have a dedicated DBA.
- Your Database needs to be available most of the time, but it is okay to have a few minutes of downtime during an outage.
Choose EC2 if:
- Your Database Engine is anything, including those outside of what Aurora/RDS offers, e.g., Redis, Hazelcast, MongoDB, CouchDB, Couchbase, Cassandra, etc.
- You have a dedicated DBA who can create and configure the databases, set up backup scripts initially and be available when things go wrong.
- Database availability is not a major concern and it is okay to have a few hours of downtime during an outage.
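The criteria above can be condensed into a small decision helper. This is only a sketch of one possible reading of the list: the function name, the parameter names and the 60-minute threshold separating "a few minutes" from "a few hours" of downtime are all assumptions made for illustration, not part of any AWS tooling.

```python
# Engines named in the lists above, lower-cased for comparison.
AURORA_ENGINES = {"postgresql", "mysql"}
RDS_ENGINES = AURORA_ENGINES | {"mariadb", "oracle", "sql server"}

# Assumption: tolerating an hour or more of downtime counts as "a few hours".
HOURS_OF_DOWNTIME_THRESHOLD_MINUTES = 60


def recommend_platform(engine, has_dba, max_downtime_minutes):
    """Return "Aurora", "RDS" or "EC2", or None when no option clearly fits."""
    engine = engine.strip().lower()
    # A DBA plus tolerance for hours of downtime points to the cheapest option.
    if has_dba and max_downtime_minutes >= HOURS_OF_DOWNTIME_THRESHOLD_MINUTES:
        return "EC2"
    # Some downtime is acceptable and the engine is available on RDS.
    if engine in RDS_ENGINES and max_downtime_minutes > 0:
        return "RDS"
    # No downtime tolerated: only Aurora-compatible engines qualify.
    if engine in AURORA_ENGINES:
        return "Aurora"
    return None


if __name__ == "__main__":
    print(recommend_platform("PostgreSQL", has_dba=False, max_downtime_minutes=0))
    print(recommend_platform("MariaDB", has_dba=False, max_downtime_minutes=10))
    print(recommend_platform("MongoDB", has_dba=True, max_downtime_minutes=180))
```

The helper returns `None` for combinations the list does not cover, such as an engine outside Aurora/RDS with no DBA available, which is a signal that the requirements need to be revisited rather than a recommendation.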