For this article we compared small managed MySQL 5.6.x instances, each with 2 vCPUs and roughly 8 GB of RAM, from Amazon Web Services, Google Cloud SQL, and Microsoft's Azure Database for MySQL. We found remarkable variation in the performance-to-cost ratios of these providers for this configuration.

Many public cloud providers package popular database engines (MySQL, PostgreSQL, Microsoft SQL Server, MariaDB, Oracle) into fully managed instances for users. This usually means that users are shielded from the nitty-gritty of storage management, backups, inter-zone/region replication, and setting up read replicas or monitoring for their databases.

This is nothing short of a revolution for IT organizations because it takes a huge chunk of database sysadmin work off their plates, so in-house database administrators can focus more on the data and less on the infrastructure that hosts it. Emergent software and IT architectures like microservices, serverless computing and containers assume that all state is kept cleanly in databases; with managed databases, developers can be confident that they are following database-hosting best practices for their applications.

Setup

The table below shows the three configurations used in our testing on GCP, AWS, and Azure, including their hourly costs (retail, non-discounted). This data is current as of May 2018.

MySQL 5.6 Database | VM type          | Notable specs                  | $USD/hour (DB instance) | $USD/hour (+ 125GB SSD) | Region
Google GCP         | db-n1-standard-2 | 2 vCPU, 7.5 GB RAM, 125 GB SSD | $0.19                   | $0.22                   | us-central1
AWS                | db.m4.large      | 2 vCPU, 6.5 GB RAM, 125 GB gp2 | $0.175                  | $0.19                   | ca-central-1
MS Azure           | General Purpose  | 2 vCPU, 8 GB RAM, 125 GB SSD   | $0.176                  | $0.19                   | US East

We did not provision read replicas, failover nodes or backups for this set of tests.

We acknowledge there are scores of other database instance configurations across different cloud providers. We chose these relatively small instance types because we believe that many small and mid-size business applications are served by machines like these (databases containing up to a few million records, serving read-heavy workloads, for example). This size is also popular among developers and functional testers since these instances are relatively inexpensive, yet not sluggish. Moreover, public cloud providers have newer offerings for larger RDBMS use cases (AWS Aurora, Google Cloud Spanner, Azure Cosmos DB, etc.), and those will be the topic of another BigBitBus article in the future.

Workload

We used the well-known employees sample database and ran a read-heavy workload (with fewer writes). We used JMeter to run the queries against the database. The write workload was generated exactly as described in our previous article; we used 2 threads (users) for generating writes. The read workload was generated by 20 threads repeatedly running the following SQL query:

SELECT COUNT(*) FROM employees
WHERE (first_name LIKE
'${__RandomString(2,abcdefghijklmnopqrstuvwxyz,)}%'
AND last_name LIKE
'${__RandomString(2,abcdefghijklmnopqrstuvwxyz,)}%');

JMeter's __RandomString function generates a 2-character random string. The query above generates two random 2-letter strings and counts how many employee records have first_name and last_name values beginning with those random 2-letter prefixes. We added indexes on the first_name and last_name columns to speed up the reads (a sketch of the statements follows this paragraph). We co-located our JMeter host in the same region as the database server and gave it enough vCPUs that a single JMeter host could drive the database server to its CPU bottleneck.
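
For reference, this is a minimal sketch of the index statements; the index names are illustrative, and only the first_name and last_name columns of the standard employees table are assumed:

-- Indexes to speed up the LIKE 'xx%' prefix searches on names
CREATE INDEX idx_employees_first_name ON employees (first_name);
CREATE INDEX idx_employees_last_name ON employees (last_name);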

The entire database fits easily into the roughly 8 GB of RAM (so there are very few disk reads), and the low cadence of write operations makes for light disk writes. We designed the test this way because we did not want to get into the complexity of differentiating between the cloud providers' multiple storage options in this article.
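
If you want to verify this on your own instance, here is a quick sketch (assuming the schema is named employees, as in the standard sample database):

-- Approximate on-disk size (MB) of the employees schema
SELECT ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) AS size_mb
FROM information_schema.tables
WHERE table_schema = 'employees';

-- InnoDB buffer pool size (bytes) set by the platform's defaults
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';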

We have not tweaked any MySQL-specific parameters for these tests; they are all set to the defaults provided by the platform.

All BigBitBus testing is completely transparent; if you are interested in repeating our tests, the entire JMeter test file, along with the Salt formula used to automate JMeter, can be downloaded from here.

Results

We report the summary statistics (throughput, in operations per second) from the JMeter tests in the table below. Sl. 1-7 are the read and write operations for "generating" new data and writing it to the database by the 2 JMeter writer threads (as described in detail here). Sl. 8 is the read operation executed by the 20 reader threads.

Sl. | Operation                | AWS     | GCP     | Azure
1   | Get 2 random employees   | 10.37   | 10.13   | 1.44
2   | Get a random department  | 10.40   | 10.15   | 1.44
3   | Get a random title       | 10.39   | 10.15   | 1.44
4   | Insert employee          | 10.39   | 10.14   | 1.45
5   | Insert salary            | 10.39   | 10.15   | 1.45
6   | Insert title             | 10.39   | 10.15   | 1.44
7   | Insert dept_emp          | 10.39   | 10.15   | 1.44
8   | Find matching            | 6408.40 | 6768.05 | 3961.40
    | TOTAL                    | 6479.50 | 6838.90 | 3971.45

The table shows that AWS's and GCP's managed MySQL instances are neck and neck in terms of operation throughput; Azure's managed MySQL is significantly slower. We confirmed via each cloud provider's monitoring system that every database instance was CPU-bound (100% CPU utilization) during the test runs; disk I/O and memory usage were very light. Recall that all three tested database instances are similarly priced and have closely matching specifications (CPU, RAM, disk, MySQL version). We suspect that either the vCores backing the Azure managed MySQL instance in our test are significantly slower, or that Azure's default MySQL settings are not tuned to the level of AWS's and GCP's. Since we do not have terminal access to the MySQL instances, we cannot be more specific about what the differences may be.
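
Even without terminal access, the server defaults can be compared from any MySQL client. As a starting point, something along these lines lists a few variables that commonly differ between platforms; the particular variables chosen here are our guess at likely suspects, not a confirmed explanation:

-- Compare a handful of server defaults across the three providers
SHOW VARIABLES WHERE Variable_name IN
  ('innodb_buffer_pool_size',
   'innodb_flush_log_at_trx_commit',
   'query_cache_size',
   'max_connections');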

The figure below plots the total throughput (operations/sec) as well as the "value for money": the number of operations achieved on each provider's setup per dollar spent. It is interesting to note that while the GCP database outperforms the AWS database, the latter delivers more operations per dollar (since GCP's MySQL setup is pricier).
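
As a rough back-of-the-envelope check, assuming the table values are throughput in operations per second and using the hourly prices that include the 125GB SSD:

AWS:   6479.50 ops/sec x 3600 s/hour / $0.19 per hour  ≈ 123 million operations per dollar
GCP:   6838.90 ops/sec x 3600 s/hour / $0.22 per hour  ≈ 112 million operations per dollar
Azure: 3971.45 ops/sec x 3600 s/hour / $0.19 per hour  ≈  75 million operations per dollar

These rough numbers are consistent with the ordering described above: GCP leads on raw throughput, but AWS comes out ahead on operations per dollar.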

Fig.1: Managed Database Performance Comparison: AWS, GCP, and Azure

Outlook

This article is only a drop in the ocean when it comes to answering the question of which cloud provider's managed database is better or worse. But we hope it communicates how to undertake such an analysis for your own use case when choosing between platforms and providers. The key takeaway from our results is that there are very significant differences between cloud providers' managed database offerings, even when they look similar on the surface (similar vCPU and memory specs, for example).

Managed databases are pricey (compared to plain IaaS VMs), and they are increasingly being spun up for development, testing, and production use in organizations. Their sustained use, and the associated performance and cost considerations, make it worth investing time and effort in determining the right choices. Needless to say, different database schemas and application database access patterns can produce vastly different performance results. Speak to us if you would like to work with us to model your workloads on different providers, instance configurations, and database types to get accurate performance-to-cost ratios for your use cases.

*

Sachin Agarwal is a computer systems researcher and the founder of BigBitBus.

BigBitBus is on a mission to bring greater transparency in public cloud and managed big data and analytics services.