AWS Database Service Selection for System Architects
Overview
AWS offers multiple database services optimized for different workload patterns. Choosing the right database is one of the most critical architectural decisions, as migrating between database types later is expensive and complex.
This guide provides decision frameworks to help you select between:
- RDS/Aurora: Relational databases (MySQL, PostgreSQL, SQL Server, Oracle)
- DynamoDB: NoSQL key-value and document database
- ElastiCache: In-memory caching (Redis, Memcached)
- Redshift: Data warehouse for analytics
Quick Decision Tree
Start: What type of workload?
├─ OLTP (Transactional) → Relational or NoSQL?
│ ├─ Need SQL, ACID transactions, complex queries → RDS/Aurora
│ │ ├─ Need 5x-3x performance, auto-scaling → Aurora
│ │ └─ Cost-sensitive, predictable workload → RDS
│ │
│ └─ Need horizontal scaling, key-value access → DynamoDB
│ ├─ Unpredictable traffic → On-Demand mode
│ └─ Predictable traffic → Provisioned mode
│
├─ OLAP (Analytical) → Redshift
│ ├─ Variable workload → Serverless
│ └─ Steady workload → Provisioned + Reserved Instances
│
└─ Caching (Performance acceleration) → ElastiCache
├─ Need advanced features → Redis/Valkey
└─ Simple key-value → Memcached
Database Service Comparison Matrix
| Dimension | RDS/Aurora | DynamoDB | ElastiCache | Redshift |
|---|---|---|---|---|
| Workload Type | OLTP (transactional) | OLTP (NoSQL) | Caching | OLAP (analytical) |
| Data Model | Relational (SQL) | Key-value, document | Key-value | Relational (SQL) |
| Query Type | Complex SQL, joins, aggregations | Key-based lookups, simple queries | Key-based retrieval | Complex aggregations, multi-table joins |
| Scalability | Vertical (instance size) | Horizontal (unlimited) | Vertical + horizontal | Horizontal (add nodes) |
| Latency | Low milliseconds | Single-digit milliseconds | Sub-millisecond | Sub-second (analytical queries) |
| Capacity | Up to 128 TB (Aurora) | Unlimited | Up to 419 GB per node | Petabytes |
| Consistency | ACID transactions | Eventual or strongly consistent | Eventual (replica lag) | Strong |
| Cost | $0.12-$13/hour (instance-based) | $0.25-$1.25 per million requests | $0.016-$13/hour (node-based) | $0.25-$13/hour (node-based) |
| Use Case | Web apps, ERP, CRM | Mobile apps, IoT, gaming | Session storage, query caching | BI, data warehousing, reporting |
Workload Pattern Analysis
Transactional Workloads (OLTP)
Characteristics:
- Frequent reads and writes
- Low-latency requirements (<100ms)
- Small transactions (individual rows or documents)
- Concurrent users accessing different data
RDS/Aurora When:
- Complex queries with joins across multiple tables
- ACID transactions required
- Existing SQL codebase or team expertise
- Relational data model (foreign keys, normalization)
DynamoDB When:
- Predictable access patterns (key-based lookups)
- Need unlimited horizontal scaling
- Single-digit millisecond latency required
- Schemaless flexibility beneficial
Analytical Workloads (OLAP)
Characteristics:
- Complex aggregations across large datasets
- Infrequent writes, read-heavy
- Queries scan millions of rows
- BI tools and dashboards
Redshift When:
- Need SQL-based analytics
- Large datasets (>100 GB)
- Complex joins and aggregations
- BI tool integration (Tableau, Looker, QuickSight)
Athena When (not covered in detail, but worth mentioning):
- Ad-hoc queries on S3 data
- Don’t want to manage infrastructure
- Query infrequently (<once per day)
Caching Workloads
Characteristics:
- Read-heavy access to frequently requested data
- Tolerance for eventual consistency
- Performance acceleration goal
- Offload database reads
ElastiCache When:
- Database read bottleneck
- Need sub-millisecond latency
- Frequently accessed data fits in memory
- Can tolerate cache misses
CloudFront When (CDN, not database):
- Static content caching
- Global user base
- Edge-based caching
RDS vs Aurora Decision Framework
Both are relational databases, but Aurora offers cloud-native advantages at premium cost.
Feature Comparison
| Feature | RDS | Aurora |
|---|---|---|
| Performance | Standard engine performance | 5x MySQL, 3x PostgreSQL |
| Storage | Up to 64 TB | Auto-scales to 128 TB |
| Read Replicas | Up to 5 | Up to 15 |
| Failover Time | 60-120 seconds | <35 seconds |
| Replication Lag | Seconds to minutes | <100ms |
| Serverless | No | Aurora Serverless v2 |
| Global Database | Manual setup | <1 second cross-Region replication |
| Cost | Lower baseline | 20-30% premium |
Decision Criteria
Choose RDS When:
- Engine compatibility: Need Oracle, SQL Server, MariaDB, or Db2 (Aurora only supports MySQL/PostgreSQL)
- Cost-sensitive: Budget-constrained, predictable workload
- Workload fits single instance: <5 read replicas sufficient
- Development/testing: Lower-cost environments
Choose Aurora When:
- Performance critical: Need 5x-3x throughput improvement
- High availability: <35 second failover required (99.99% SLA)
- Read scaling: Need >5 read replicas (up to 15)
- Global applications: Multi-Region replication with <1 second lag
- Variable workload: Aurora Serverless v2 auto-scales capacity
- Write-heavy: Aurora I/O-Optimized eliminates per-request I/O charges
Cost Comparison Example (db.r6g.xlarge, us-east-1):
- RDS MySQL: $0.252/hour ($185/month) + storage ($0.115/GB)
- Aurora MySQL Standard: $0.29/hour ($213/month) + I/O ($0.20/M requests)
- Aurora I/O-Optimized: $0.348/hour ($255/month), no I/O charges
Breakeven: Aurora costs 15-20% more but delivers 3-5x performance. If performance improvement reduces infrastructure costs elsewhere (fewer replicas, smaller instances), Aurora is cost-effective.
DynamoDB vs RDS/Aurora Decision Framework
NoSQL vs relational database choice depends on access patterns and scalability requirements.
When to Choose DynamoDB Over RDS/Aurora
1. Scalability Beyond Relational Limits:
- Need to scale beyond single-instance vertical limits
- Workload requires millions of requests per second
- Unpredictable traffic spikes (DynamoDB on-demand auto-scales)
2. Predictable Access Patterns:
- Queries are key-based lookups (no complex joins)
- Access patterns known upfront (can model partition/sort keys)
- Single-table design feasible
3. Single-Digit Millisecond Latency:
- Need consistent sub-10ms response times at scale
- Latency more important than query flexibility
4. Serverless Architecture:
- Want zero operational overhead
- Pay-per-request pricing preferred
- No server management desired
When to Choose RDS/Aurora Over DynamoDB
1. Complex Queries Required:
- Need joins across multiple tables
- Ad-hoc queries with unpredictable filters
- Complex aggregations (GROUP BY, window functions)
2. ACID Transactions Across Tables:
- Multi-table transactions required
- Strong consistency across related entities
- Traditional relational integrity (foreign keys)
3. Existing SQL Ecosystem:
- Team expertise in SQL
- Existing SQL-based applications
- BI tools require SQL interface
4. Unknown Access Patterns:
- Exploratory analytics
- Access patterns evolving
- Need query flexibility without remodeling data
Cost Comparison Example
Scenario: 10 million requests/month, 50 GB storage, read-heavy (90% reads, 10% writes)
DynamoDB On-Demand:
- Reads: 9M × 0.5 RRUs (eventually consistent) = 4.5M RRUs × $0.25/M = $1.13
- Writes: 1M × 1 WRU = 1M WRUs × $1.25/M = $1.25
- Storage: 50 GB × $0.25 = $12.50
- Total: $14.88/month
RDS (db.t4g.micro):
- Instance: $0.016/hour = $12/month
- Storage: 50 GB × $0.115 = $5.75
- Total: $17.75/month
RDS (db.r6g.large for performance):
- Instance: $0.201/hour = $147/month
- Storage: 50 GB × $0.115 = $5.75
- Total: $152.75/month
Analysis: For this workload, DynamoDB is cheapest. RDS requires larger instance for comparable performance, making it 10x more expensive. However, RDS provides SQL flexibility that DynamoDB lacks.
ElastiCache vs DynamoDB vs RDS Decision Framework
When to use caching vs primary database.
ElastiCache Use Cases
Use ElastiCache When:
- Database read bottleneck: 50-90% of queries hit same data repeatedly
- Sub-millisecond latency needed: DynamoDB (single-digit ms) not fast enough
- Offload database reads: Reduce RDS/Aurora load without scaling instance
- Session storage: Distributed session management for stateless apps
- Real-time analytics: Leaderboards, counters, rate limiting
Don’t Use ElastiCache As Primary Database because:
- No persistence (Redis snapshots available but not primary design)
- Cache invalidation complexity
- Limited query capabilities (key-based only)
Decision Matrix
| Need | Primary Database | Caching Layer |
|---|---|---|
| Query result caching | RDS/Aurora | ElastiCache (lazy loading) |
| Session storage | DynamoDB or ElastiCache | N/A (use directly) |
| Real-time leaderboards | DynamoDB | ElastiCache Redis (sorted sets) |
| User profiles (read-heavy) | RDS/Aurora | ElastiCache (write-through) |
| Product catalog | RDS/Aurora | ElastiCache (TTL-based) |
Common Architecture: RDS/Aurora + ElastiCache
- RDS: Source of truth
- ElastiCache: Read cache (lazy loading or write-through)
- 50-90% database load reduction
Cost Example:
- RDS without cache: db.r6g.2xlarge ($0.806/hour = $588/month)
- RDS with cache: db.r6g.large ($0.201/hour = $147/month) + cache.r6g.large ($0.201/hour = $147/month) = $294/month
- Savings: $294/month (50% reduction by offloading reads to cache)
Redshift vs RDS/Aurora Decision Framework
Data warehouse vs transactional database choice depends on query patterns.
When to Choose Redshift Over RDS/Aurora
1. Analytical Workload (OLAP):
- Complex aggregations across large datasets
- Queries scan millions of rows
- Infrequent writes, read-heavy
- BI dashboards and reports
2. Large Datasets:
-
100 GB of analytical data
- Petabyte-scale data warehousing
- Historical data analysis
3. Columnar Storage Benefits:
- Queries access subset of columns (not full rows)
- Heavy use of aggregations (SUM, AVG, COUNT)
- Compression benefits from columnar format
When to Choose RDS/Aurora Over Redshift
1. Transactional Workload (OLTP):
- Frequent writes (inserts, updates, deletes)
- Low-latency point queries (<100ms)
- ACID transactions required
2. Small Datasets:
- <100 GB of data
- Redshift overkill for small datasets
3. Real-Time Requirements:
- Sub-second write-to-read latency
- Immediate consistency required
Cost Comparison Example
Scenario: 500 GB data, 1,000 complex analytical queries/month
RDS (db.r6g.2xlarge):
- Instance: $0.806/hour = $588/month
- Storage: 500 GB × $0.115 = $57.50
- Total: $645.50/month
- Query Performance: Slow (not optimized for analytics)
Redshift Serverless:
- Compute: 8 RPUs × 50 hours (1,000 queries × 3 min avg) × $0.375/RPU-hour = $150
- Storage: 500 GB × $0.024 = $12
- Total: $162/month
- Query Performance: 10-100x faster (columnar, MPP)
Analysis: Redshift is 75% cheaper and significantly faster for analytical workloads.
Hybrid Architectures
Real-world systems often combine multiple database services for different use cases.
Common Patterns
1. OLTP + OLAP (Transactional + Analytical):
- RDS/Aurora: Operational database (writes, reads, transactions)
- Redshift: Analytics (aggregations, BI dashboards)
- Data flow: RDS → S3 (daily export) → Redshift COPY
Use case: E-commerce platform with real-time transactions + daily sales reports
2. OLTP + Caching:
- RDS/Aurora: Primary database
- ElastiCache: Read cache (50-90% load reduction)
- Data flow: Application → ElastiCache (cache miss) → RDS → ElastiCache (cache write)
Use case: High-traffic web application with database read bottleneck
3. NoSQL + Caching:
- DynamoDB: Primary NoSQL database
- DAX: DynamoDB Accelerator (microsecond latency)
- Data flow: Application → DAX (cache hit) → DynamoDB (cache miss)
Use case: Mobile app with millions of users, need microsecond latency
4. Data Lake + Data Warehouse:
- S3: Data lake (raw data, Parquet format)
- Redshift Spectrum: Query S3 without loading
- Redshift: Frequently accessed data (loaded via COPY)
- Data flow: S3 (infrequent data) + Redshift (frequent data), joined in queries
Use case: Large-scale analytics with hot/cold data separation
5. Multi-Database (Polyglot Persistence):
- RDS: User accounts, orders (relational)
- DynamoDB: Session storage, real-time data
- ElastiCache: Query result caching
- Redshift: Analytics and reporting
- S3: Object storage (images, documents)
Use case: Complex enterprise application with diverse data patterns
Decision Framework Worksheet
Use this worksheet to systematically evaluate database service selection.
Step 1: Classify Workload Type
- OLTP (Transactional): Frequent reads/writes, low latency, small transactions → RDS/Aurora or DynamoDB
- OLAP (Analytical): Complex aggregations, large scans, BI dashboards → Redshift
- Caching: Performance acceleration, offload database reads → ElastiCache
Step 2: Evaluate Data Model Requirements
- Relational: SQL, joins, foreign keys, ACID → RDS/Aurora
- NoSQL: Key-value, schemaless, horizontal scaling → DynamoDB
- Columnar: Analytical queries, subset of columns → Redshift
Step 3: Assess Access Patterns
- Complex queries: Joins, ad-hoc filters, aggregations → RDS/Aurora or Redshift
- Key-based lookups: Partition key + sort key queries → DynamoDB
- Frequent repeated queries: Same data accessed repeatedly → ElastiCache
Step 4: Determine Scalability Needs
- Vertical scaling sufficient: <128 TB, predictable growth → RDS/Aurora
- Horizontal scaling required: Unlimited growth, millions of requests/sec → DynamoDB
- Analytical scaling: Petabyte-scale data warehouse → Redshift
Step 5: Performance Requirements
- Sub-millisecond: Caching required → ElastiCache (Redis/Memcached)
- Single-digit millisecond: Key-value access → DynamoDB
- Low milliseconds: Transactional queries → RDS/Aurora
- Sub-second: Analytical queries → Redshift
Step 6: Cost Optimization
- Variable workload: On-demand or Serverless → DynamoDB On-Demand, Aurora Serverless v2, Redshift Serverless
- Steady workload: Provisioned capacity + Reserved Instances → RDS/Aurora RI, DynamoDB Provisioned, Redshift RI
- Cost-sensitive: Evaluate per-request vs per-hour pricing
Step 7: Operational Overhead
- Zero management: Serverless options → DynamoDB On-Demand, Aurora Serverless v2, Redshift Serverless
- Minimal management: Managed services → RDS, Aurora, Redshift provisioned
- Full control: Self-managed → EC2 (not recommended)
Migration Considerations
Migrating Between Database Services
RDS to Aurora:
- Effort: Low (same SQL engine)
- Method: Create Aurora read replica, promote to primary
- Downtime: <5 minutes (with read replica promotion)
RDS to DynamoDB:
- Effort: High (data model redesign)
- Method: AWS Database Migration Service (DMS) + application refactor
- Downtime: Can be zero (dual-write during migration)
On-Premises to RDS/Aurora:
- Effort: Medium (lift-and-shift)
- Method: AWS DMS or native replication
- Downtime: Minimal (cutover window)
Transactional to Analytical (RDS to Redshift):
- Effort: Low (data pipeline setup)
- Method: S3 export → Redshift COPY or AWS DMS
- Downtime: None (separate systems)
Key Takeaways
RDS/Aurora:
- Use for transactional workloads requiring SQL, joins, ACID transactions
- Aurora provides 3-5x performance, <35s failover, up to 15 read replicas at 20-30% premium
- Choose RDS for cost-sensitive or engine-specific needs (Oracle, SQL Server, Db2)
DynamoDB:
- Use for horizontal scaling beyond relational limits, key-based access patterns
- On-demand mode for unpredictable traffic, Provisioned for steady workloads >70% utilization
- Single-digit millisecond latency, unlimited scalability, zero operational overhead
ElastiCache:
- Use for caching frequently accessed data, offloading database reads (50-90% reduction)
- Redis for advanced features (data structures, persistence, replication)
- Memcached for simple key-value caching with multi-threading
Redshift:
- Use for analytical workloads, BI dashboards, complex aggregations on large datasets
- Serverless for variable workloads (<50% utilization), Provisioned + RI for steady workloads
- Redshift Spectrum queries S3 data lake without loading ($5/TB scanned)
Hybrid Architectures:
- Combine services for different use cases (OLTP + OLAP, OLTP + caching)
- Use data pipelines to sync between databases (RDS → Redshift)
- Polyglot persistence: Choose best database for each data pattern
Cost Optimization:
- Reserved Instances: 64-69% savings for predictable workloads
- Serverless options: 40-95% savings for variable workloads
- Right-size instances: Monitor utilization, scale based on metrics
- Caching: Reduce database instance size by offloading reads
Decision Factors:
- Workload type (OLTP, OLAP, caching)
- Data model (relational, NoSQL, columnar)
- Access patterns (complex queries, key-based, repeated queries)
- Scalability needs (vertical, horizontal, petabyte-scale)
- Performance requirements (sub-millisecond to sub-second)
- Cost constraints (variable vs steady workload)
- Operational overhead tolerance (serverless vs managed vs self-managed)
Found this guide helpful? Share it with your team:
Share on LinkedIn