Performance and scaling of the Amazon-managed MySQL, Relational Data Store (RDS):
- Horizontal scaling
- Sharding (distribute data [tables or rows] among multiple RDS instances; Tumblr uses sharded MySQL and it worked well for them) – there is no explicit support so the applications have to handle it themselves, i.e. know which table/rows to read from which instance
- Read-replicas: RDS supports set up of read-only replicas using MySQL’s own replication; the replicas are evidently only usable for reading and may contain little stale data
- Vertical scaling (stronger EC2 instances) – there are interesting results from a benchmark of RDS with various instances/DB sizes (6/2011, complete report); key observations:
- “With hardly any dependency on the database size, MySQL reaches its optimal throughput at around 64 concurrent users. Anything above that causes throughput degradation.”
- “Throughput is improving as machines get stronger. However, there is a sweet-spot, a point where adding hardware doesn’t help performance. The sweet spot is around the XL machine, which reaches a [max] throughput of around 7000 tpm.” (transactions per minute => ~ 110 tx/sec)
Disclaimer: No banchmark proves anything generally applicable, it’s always necessary to run one’s own production load and measure that to see how in reality a DB performs for one’s actual needs.
- The number of concurrent connections is by default derived from the memory, namely 150 for a small 1.5GB instance and 650 for a large 7.5GB instance. According to one expert it’s completely OK to set it to 1000 connections without regard to memory; MySQL should handle it.