AWS Route 53 for System Architects
What Is Amazon Route 53?
Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web service that translates domain names into IP addresses and routes end users to applications.
What Problems Route 53 Solves:
- DNS reliability: Traditional DNS vulnerable to outages; Route 53 provides 100% uptime SLA
- Global traffic distribution: Manual traffic routing complex; Route 53 automates based on latency, geography, health
- Failover complexity: Detecting and routing around failures requires monitoring; Route 53 health checks automate failover
- Multi-region applications: Directing users to nearest region manually is impractical; Route 53 latency-based routing automates
- Traffic testing: A/B testing and blue/green deployments require traffic splitting; Route 53 weighted routing enables controlled rollouts
When to use Route 53:
- You need highly available DNS with 100% uptime SLA
- You require automated failover for multi-region applications
- You want to route users to the lowest-latency endpoint
- You need geolocation-based content delivery
- You want to implement blue/green deployments or A/B testing
Routing Policies
Route 53 offers several routing policies; this guide covers seven of them for different traffic management scenarios.
Simple Routing
One-to-one mapping between domain and single resource.
How It Works:
- Returns single resource (IP address, load balancer, CloudFront distribution)
- No health checks
- If multiple values specified, returns all values in random order (client chooses)
Use Cases:
- Single web server
- Static websites
- Applications without redundancy
Example:
example.com → 203.0.113.5
Weighted Routing
Distribute traffic across multiple resources in specified proportions.
How It Works:
- Assign weight to each record (0-255)
- Traffic percentage = Record weight / Sum of all weights
- Can associate health checks (skip unhealthy resources)
Use Cases:
- A/B testing: 90% production, 10% new version
- Blue/green deployments: Gradually shift traffic from blue to green
- Load distribution: Unequal capacity across regions (70% us-east-1, 30% eu-west-1)
Example:
- Record A (weight 7): 70% traffic → us-east-1 load balancer
- Record B (weight 3): 30% traffic → eu-west-1 load balancer
Cost: No additional charge beyond standard query fees.
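The weight formula above can be checked with a few lines of Python. This is a minimal sketch: the function name `traffic_shares` and the Region labels used as keys are illustrative, and the equal split when every weight is zero is an assumption about how to model that edge case.

```python
from fractions import Fraction

def traffic_shares(weights: dict[str, int]) -> dict[str, Fraction]:
    """Return each record's expected share: weight / sum of all weights."""
    for name, weight in weights.items():
        if not 0 <= weight <= 255:
            raise ValueError(f"weight for {name} must be in 0-255, got {weight}")
    total = sum(weights.values())
    if total == 0:
        # Assumption: when every weight is zero, model all records as equally likely.
        return {name: Fraction(1, len(weights)) for name in weights}
    return {name: Fraction(weight, total) for name, weight in weights.items()}

shares = traffic_shares({"us-east-1": 7, "eu-west-1": 3})
for name, share in shares.items():
    print(f"{name}: {float(share):.0%}")
```

The shares are expectations over many queries; resolver caching means the split seen by any single client can differ.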
Latency-Based Routing
Route users to lowest-latency AWS Region.
How It Works:
- Route 53 measures latency from the user's location to each AWS Region
- Returns resource in Region with lowest latency
- Based on actual latency measurements, not geographic proximity
Use Cases:
- Multi-region applications prioritizing performance
- Global user base with uneven distribution
- Applications where speed matters more than data residency
Example:
- User in Tokyo → Routes to ap-northeast-1 (lowest latency)
- User in London → Routes to eu-west-2 (lowest latency)
- User in New York → Routes to us-east-1 (lowest latency)
Performance: Typically reduces latency by 50-70% vs single-region deployment.
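The selection rule can be pictured as a lookup over a latency table. The sketch below is a toy model, not how Route 53 is implemented: the function name `pick_lowest_latency` and the millisecond values are illustrative assumptions, and the optional `healthy` set models the effect of attaching health checks.

```python
def pick_lowest_latency(latency_ms, healthy=None):
    """Return the Region with the lowest latency among healthy candidates."""
    candidates = {
        region: ms
        for region, ms in latency_ms.items()
        if healthy is None or region in healthy
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)

# Illustrative numbers only, not real measurements.
tokyo_view = {"ap-northeast-1": 12.0, "eu-west-2": 230.0, "us-east-1": 160.0}

print(pick_lowest_latency(tokyo_view))
print(pick_lowest_latency(tokyo_view, healthy={"eu-west-2", "us-east-1"}))
```

With all Regions healthy the Tokyo user lands on ap-northeast-1; if that Region is excluded, the next-lowest entry (us-east-1 in this made-up table) is returned.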
Failover Routing
Active-passive failover for high availability.
How It Works:
- Define primary and secondary resources
- Health check monitors primary resource
- If primary fails, Route 53 automatically returns secondary
- Failback when primary recovers
Use Cases:
- Disaster recovery: Primary region (us-east-1) fails → Secondary region (us-west-2)
- Maintenance windows: Direct traffic to secondary during primary maintenance
- Active-passive architecture: Database read replicas, standby servers
Example:
- Primary: us-east-1 load balancer (health check: HTTPS on /health)
- Secondary: us-west-2 load balancer (no health check)
- Primary fails → Traffic routes to us-west-2
Failover Time: 30-60 seconds (health check detection time plus record TTL expiry)
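A primary/secondary pair like the example above could be expressed as a change batch in the shape accepted by the Route 53 `ChangeResourceRecordSets` API (for instance via boto3's `change_resource_record_sets`). This is a sketch under assumptions: the record name, load balancer DNS names, and health check ID are hypothetical placeholders, and the code only builds the dictionaries without calling AWS.

```python
import json

def failover_record(name, role, dns_value, set_id, health_check_id=None, ttl=60):
    """Build one UPSERT change for a failover record ("PRIMARY" or "SECONDARY")."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "CNAME",
        "SetIdentifier": set_id,
        "Failover": role,
        "TTL": ttl,
        "ResourceRecords": [{"Value": dns_value}],
    }
    if health_check_id:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}

# Hypothetical names and IDs, for illustration only.
batch = {
    "Comment": "Active-passive failover for app.example.com",
    "Changes": [
        failover_record("app.example.com", "PRIMARY",
                        "primary-lb.us-east-1.example.net", "primary",
                        health_check_id="hypothetical-health-check-id"),
        failover_record("app.example.com", "SECONDARY",
                        "standby-lb.us-west-2.example.net", "secondary"),
    ],
}
print(json.dumps(batch, indent=2))
```

A short TTL is used here because cached answers delay the switch to the secondary.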
Geolocation Routing
Route based on the user's geographic location.
How It Works:
- Route 53 identifies the user's location (continent, country, state)
- Returns resource mapped to that location
- Can define default location (catch-all)
Use Cases:
- Content localization: Serve region-specific content (language, currency, pricing)
- Data residency: Keep EU users' data in EU regions (GDPR compliance)
- License restrictions: Block access from specific countries
- Load distribution: Regional load balancers
Example:
- Users in EU → eu-west-1 load balancer (GDPR compliance)
- Users in US → us-east-1 load balancer
- Users in Asia → ap-southeast-1 load balancer
- Default → us-east-1 load balancer
Granularity: Continent → Country → State (US only)
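The most-specific-match behavior implied by that granularity can be sketched as an ordered lookup. The key format (`subdivision:`, `country:`, `continent:`, `default`) and the function name are illustrative assumptions made for this example, not Route 53 identifiers.

```python
def resolve_geolocation(records, continent, country, subdivision=None):
    """Return the target for the most specific matching location, or None."""
    candidates = []
    if subdivision:
        candidates.append(f"subdivision:{country}-{subdivision}")
    candidates += [f"country:{country}", f"continent:{continent}", "default"]
    for key in candidates:
        if key in records:
            return records[key]
    return None  # no matching record and no default defined

records = {
    "continent:EU": "eu-west-1 load balancer",
    "country:US": "us-east-1 load balancer",
    "continent:AS": "ap-southeast-1 load balancer",
    "default": "us-east-1 load balancer",
}

print(resolve_geolocation(records, continent="EU", country="DE"))
print(resolve_geolocation(records, continent="NA", country="US", subdivision="WA"))
print(resolve_geolocation(records, continent="SA", country="BR"))
```

Removing the `default` entry makes the last lookup return `None`, which is the situation a catch-all record is meant to prevent.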
Geoproximity Routing (2024 Enhancement)
Route based on geographic location with bias adjustments.
How It Works:
- Routes to nearest resource by default
- Apply bias (+/-99) to expand or shrink geographic region
- Positive bias: Attract more traffic from farther away
- Negative bias: Reduce traffic from nearby areas
Use Cases:
- Cost optimization: Shift traffic to cheaper regions
- Capacity management: Reduce load on constrained resources
- Testing: Gradually expand new region's coverage
- Data residency with flexibility: Prefer local resources but allow overflow
Example:
- us-east-1 (bias +50): Expanded coverage, attracts traffic from wider area
- eu-west-1 (bias 0): Standard coverage
- ap-southeast-1 (bias -25): Reduced coverage, only serves nearby users
2024 Update: Expanded availability from Traffic Flow to all DNS records (public and private hosted zones).
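One way to reason about bias is as a scaling of distance before the nearest resource is chosen. The sketch below assumes the bias formula as commonly cited from the AWS documentation (positive bias: distance × (1 − bias/100); negative bias: distance / (1 + bias/100)); treat that formula, the function names, and the kilometre values as assumptions to verify rather than a definitive model.

```python
def biased_distance(actual_distance, bias):
    """Scale a distance by a geoproximity bias in the range -99..99."""
    if not -99 <= bias <= 99:
        raise ValueError("bias must be between -99 and 99")
    if bias >= 0:
        # Positive bias shrinks the effective distance, so the resource looks closer.
        return actual_distance * (1 - bias / 100)
    # Negative bias grows the effective distance, so the resource looks farther away.
    return actual_distance / (1 + bias / 100)

def nearest_with_bias(distances, biases):
    """Pick the resource with the smallest biased distance."""
    return min(distances, key=lambda r: biased_distance(distances[r], biases.get(r, 0)))

# Illustrative distances from one user to each resource, in km.
distances_km = {"us-east-1": 6000, "eu-west-1": 4000, "ap-southeast-1": 9000}
biases = {"us-east-1": 50, "eu-west-1": 0, "ap-southeast-1": -25}

print(nearest_with_bias(distances_km, {}))      # no bias applied
print(nearest_with_bias(distances_km, biases))  # bias values from the example above
```

In this made-up case the user is physically closest to eu-west-1, but the +50 bias on us-east-1 halves its effective distance and wins the traffic.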
Multivalue Answer Routing
Return multiple IP addresses with health checks.
How It Works:
- Returns up to 8 healthy records randomly selected
- Each record has own health check
- Client chooses from returned values
- Unhealthy records excluded
Use Cases:
- Simple load distribution: Multiple web servers without load balancer
- Cost optimization: Avoid load balancer costs for small applications
- DNS-based availability: Client-side failover
Example:
- 10 web server IP addresses with health checks
- Route 53 returns 8 healthy IPs
- Client connects to one randomly
vs Weighted Routing: Multivalue is simpler (equal distribution, no weight configuration).
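The selection described above (exclude unhealthy records, return at most eight at random) can be simulated in a few lines. This is a toy model with hypothetical names; the IP addresses come from the 203.0.113.0/24 documentation range.

```python
import random

def multivalue_answer(records, max_answers=8, rng=random):
    """Return up to max_answers healthy values, chosen at random."""
    healthy = [value for value, is_healthy in records.items() if is_healthy]
    return rng.sample(healthy, min(max_answers, len(healthy)))

# Ten web servers; one is currently failing its health check.
servers = {f"203.0.113.{i}": True for i in range(1, 11)}
servers["203.0.113.4"] = False

answer = multivalue_answer(servers, rng=random.Random(0))
print(len(answer), "addresses returned")
print("203.0.113.4" in answer)
```

The client still has to pick one of the returned addresses and retry another if the connection fails, which is what makes this client-side failover.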
Health Checks
Health checks monitor resource availability and enable automated failover.
Types of Health Checks
1. Endpoint Health Checks:
- Monitor HTTP/HTTPS/TCP endpoints
- Specified by IP address or domain name
- Check interval: 10 seconds (fast) or 30 seconds (standard)
- Failure threshold: 3 consecutive failures = unhealthy
2. Calculated Health Checks:
- Monitor status of other health checks
- Use Boolean logic (AND, OR, NOT)
- Example: Require 2 of 3 child health checks healthy
3. CloudWatch Alarm Health Checks:
- Monitor CloudWatch alarms
- Use any CloudWatch metric
- Example: Lambda error rate, DynamoDB throttles
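The "2 of 3" rule for calculated health checks reduces to counting healthy children against a threshold. A minimal sketch, with the function name chosen for illustration:

```python
def calculated_health(child_statuses, health_threshold):
    """Healthy when at least health_threshold child checks are healthy."""
    return sum(bool(status) for status in child_statuses) >= health_threshold

children = [True, True, False]  # two of three child checks passing

print(calculated_health(children, 2))              # "2 of 3" rule
print(calculated_health(children, len(children)))  # AND: every child must pass
print(calculated_health(children, 1))              # OR: any one child is enough
```

Setting the threshold to the number of children behaves like AND, and a threshold of 1 behaves like OR.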
Health Check Configuration
Endpoint Health Check Parameters:
- Protocol: HTTP, HTTPS, TCP
- Port: Default 80 (HTTP), 443 (HTTPS), or custom
- Path: /health or custom health check endpoint
- Interval: 30 seconds (standard, $0.50/month) or 10 seconds (fast, $1/month)
- Failure threshold: Default 3 (1-10 allowed)
- String matching: Optional HTTP response body check
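These parameters map onto the `HealthCheckConfig` structure of the Route 53 `CreateHealthCheck` API. The helper below is a sketch that only builds and validates the dictionary; the function name and the example domain are assumptions, and nothing is sent to AWS.

```python
def endpoint_health_check_config(fqdn, protocol="HTTPS", path="/health", port=None,
                                 interval=30, failure_threshold=3, search_string=None):
    """Build a HealthCheckConfig-style dict from the parameters listed above."""
    if protocol not in ("HTTP", "HTTPS", "TCP"):
        raise ValueError("protocol must be HTTP, HTTPS, or TCP")
    if interval not in (10, 30):
        raise ValueError("interval must be 10 (fast) or 30 (standard) seconds")
    if not 1 <= failure_threshold <= 10:
        raise ValueError("failure threshold must be between 1 and 10")
    if port is None:
        if protocol == "TCP":
            raise ValueError("TCP checks need an explicit port")
        port = {"HTTP": 80, "HTTPS": 443}[protocol]
    config = {
        "Type": protocol,
        "FullyQualifiedDomainName": fqdn,
        "Port": port,
        "RequestInterval": interval,
        "FailureThreshold": failure_threshold,
    }
    if protocol != "TCP":
        config["ResourcePath"] = path
        if search_string is not None:
            # String matching uses a separate check type plus the text to look for.
            config["Type"] = f"{protocol}_STR_MATCH"
            config["SearchString"] = search_string
    return config

config = endpoint_health_check_config("app.example.com")
print(config)
```

With boto3, a dictionary like this would be passed as `HealthCheckConfig` to `create_health_check`, together with a unique `CallerReference`.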
Health Check Regions:
- Route 53 checks from health checkers in multiple global locations
- Results are aggregated: an endpoint is considered healthy when more than 18% of health checkers report it healthy
- Prevents false positives from single-location issues
Failover Scenarios
Active-Passive Failover:
- Primary resource with health check
- Secondary resource (no health check, always considered healthy)
- Primary unhealthy → Traffic routes to secondary
Active-Active Failover:
- Multiple resources, each with health check
- Traffic distributed across healthy resources
- Unhealthy resources automatically excluded
Combination Failover:
- Mix routing policies (latency + failover, weighted + health checks)
- Complex multi-region architectures
- Example: Latency-based routing with failover per region
Health Check Pricing (2024)
- Standard health checks (30s interval): $0.50/month per health check
- Fast health checks (10s interval): $1.00/month per health check
- Calculated health checks: $0.50/month per health check
- CloudWatch alarm health checks: $0.50/month per health check
Example Cost:
- 10 endpoints × $0.50 = $5/month
- High availability across 3 regions (6 endpoints + 3 calculated) = $4.50/month
DNS Fundamentals
Record Types
A Record: IPv4 address
example.com → 203.0.113.5
AAAA Record: IPv6 address
example.com → 2001:0db8:85a3::8a2e:0370:7334
CNAME Record: Alias to another domain
www.example.com → example.com
- Cannot be used for zone apex (example.com)
Alias Record (AWS-specific):
- Points to AWS resources (CloudFront, ALB, S3, API Gateway)
- Can be used for zone apex
- Free queries (no charge for Alias record queries to AWS resources)
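An Alias record at the zone apex could look like the change below, in the shape used by `ChangeResourceRecordSets`. This is a sketch: the CloudFront domain name is a placeholder, and `Z2FDTNDATAQYW2` is the hosted zone ID AWS documents for CloudFront alias targets (worth confirming against current documentation before use).

```python
CLOUDFRONT_HOSTED_ZONE_ID = "Z2FDTNDATAQYW2"  # documented fixed ID for CloudFront alias targets

def alias_record(name, target_dns_name, target_hosted_zone_id,
                 record_type="A", evaluate_target_health=False):
    """Build an UPSERT change for an Alias record; note there is no TTL field."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": record_type,
            "AliasTarget": {
                "HostedZoneId": target_hosted_zone_id,
                "DNSName": target_dns_name,
                "EvaluateTargetHealth": evaluate_target_health,
            },
        },
    }

# Placeholder distribution domain, for illustration only.
apex = alias_record("example.com", "d111111abcdef8.cloudfront.net",
                    CLOUDFRONT_HOSTED_ZONE_ID)
print(apex)
```

Unlike a CNAME, this record keeps type A, so it can coexist with the other records at example.com.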
MX Record: Mail exchange
- Priority + mail server
10 mail.example.com
TXT Record: Text information
- SPF, DKIM, domain verification
TTL (Time to Live)
Controls how long DNS resolvers cache the record.
Short TTL (60-300 seconds):
- Faster propagation of changes
- Higher query costs (more queries to Route 53)
- Use for: Frequently changing IPs, testing, deployments
Long TTL (3600-86400 seconds):
- Lower query costs (fewer queries)
- Slower propagation of changes
- Use for: Stable resources, cost optimization
Alias Records: TTL managed by Route 53 (cannot be changed).
Best Practice: Start with short TTL (60s) during testing, increase to long TTL (3600s+) for production stability.
Hosted Zones
Container for DNS records for a domain.
Public Hosted Zones
DNS records accessible from the internet.
Pricing:
- $0.50/month per hosted zone
- $0.40 per million queries (first 1 billion/month)
- $0.20 per million queries (after 1 billion/month)
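The two-tier query pricing above can be turned into a small calculator. The rates are the ones quoted in this guide and may not match current AWS pricing; the function name is illustrative.

```python
TIER_LIMIT = 1_000_000_000      # first 1 billion queries per month
TIER_1_RATE = 0.40              # USD per million queries (as quoted above)
TIER_2_RATE = 0.20              # USD per million queries beyond the first billion
HOSTED_ZONE_FEE = 0.50          # USD per hosted zone per month

def monthly_cost(queries, hosted_zones=1):
    """Estimate monthly cost for standard queries plus hosted zone fees."""
    tier_1 = min(queries, TIER_LIMIT)
    tier_2 = max(queries - TIER_LIMIT, 0)
    query_cost = tier_1 / 1_000_000 * TIER_1_RATE + tier_2 / 1_000_000 * TIER_2_RATE
    return round(query_cost + hosted_zones * HOSTED_ZONE_FEE, 2)

print(monthly_cost(100_000_000))      # 100 million queries, one zone
print(monthly_cost(1_500_000_000))    # 1.5 billion queries, one zone
```

At 100 million queries the estimate is the $40 query charge plus the $0.50 zone fee; the second tier only starts to matter past one billion queries.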
Use Cases:
- Public-facing websites
- APIs accessible from internet
- Email servers
Private Hosted Zones
DNS records accessible only within VPCs.
Pricing:
- $0.50/month per hosted zone
- Queries are FREE (no per-query charges)
Use Cases:
- Internal services (databases, microservices)
- Private APIs
- Service discovery within VPC
Configuration:
- Associate with one or more VPCs
- Can span multiple AWS accounts (VPC sharing)
Best Practice: Use private hosted zones for internal services to avoid exposing internal DNS and save on query costs.
Cost Optimization
Strategies to Reduce Route 53 Costs
Cost Optimization: Use Alias Records
Queries to Alias records pointing to AWS resources are FREE. This is the single biggest cost optimization for Route 53.
Example: 100 million queries/month to CloudFront:
- Via A record: $40/month
- Via Alias record: $0/month
- Savings: 100%
1. Use Alias Records for AWS Resources:
- Queries to Alias records pointing to AWS resources are FREE
- Standard A/AAAA records: $0.40 per million queries
- Savings: 100% on queries to CloudFront, ALB, S3, API Gateway
2. Increase TTL Values:
- Higher TTL = fewer queries = lower costs
- 60s TTL: ~1.4 billion queries/month for 1,000 req/s traffic
- 3600s TTL: ~24 million queries/month for same traffic
- Savings: 98% query reduction
3. Use Private Hosted Zones for Internal Services:
- Public hosted zone: $0.50/month + $0.40 per million queries
- Private hosted zone: $0.50/month + FREE queries
- Internal services with 1 billion queries: $400/month savings
4. Consolidate Hosted Zones:
- Multiple subdomains: Use single hosted zone with multiple records
- Example: api.example.com, www.example.com, cdn.example.com → One hosted zone
- Savings: $0.50/month per consolidated domain
5. Delete Unused Hosted Zones:
- Audit monthly billing for unused zones
- Each unused zone: $0.50/month wasted
6. Share Resolver Endpoints Across Accounts:
- Resolver endpoint: $0.125/hour per ENI = $91/month per endpoint
- Share single endpoint across multiple VPCs/accounts (same region)
- Savings: $91/month per avoided endpoint
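The TTL figures in strategy 2 follow from a simple model: if each caching resolver re-queries once per TTL, query volume scales inversely with TTL. The sketch below applies that assumption to the numbers quoted above; real traffic will deviate because resolvers do not all honor TTLs identically.

```python
def queries_after_ttl_change(baseline_queries, baseline_ttl_s, new_ttl_s):
    """Scale query volume assuming it is inversely proportional to TTL."""
    return baseline_queries * baseline_ttl_s / new_ttl_s

def query_reduction(baseline_ttl_s, new_ttl_s):
    """Fraction of queries avoided by moving from the baseline TTL to the new TTL."""
    return 1 - baseline_ttl_s / new_ttl_s

after = queries_after_ttl_change(1_400_000_000, baseline_ttl_s=60, new_ttl_s=3600)
print(f"{after / 1e6:.1f} million queries/month")
print(f"{query_reduction(60, 3600):.1%} fewer queries")
```

Under this model 1.4 billion queries at a 60s TTL drop to roughly 23 million at 3600s, in line with the ~24 million and 98% figures above.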
Cost Example
Scenario: Website with 100 million requests/month, CloudFront + ALB
Before Optimization:
- Hosted zone: $0.50
- A record to CloudFront: 100M queries × $0.40/M = $40
- A record to ALB: 20M queries × $0.40/M = $8
- Total: $48.50/month
After Optimization:
- Hosted zone: $0.50
- Alias to CloudFront: FREE
- Alias to ALB: FREE
- Increased TTL (300s → 3600s): 80% fewer queries
- Total: $0.50/month
Savings: $48/month (99%)
Common Pitfalls
| Pitfall | Impact | Solution |
|---|---|---|
| 1. Using A records instead of Alias for AWS resources | $40/100M queries wasted | Use Alias records (free queries to AWS resources) |
| 2. Low TTL on stable resources | 10-60x higher query costs | Increase TTL to 3600s+ for production (98% cost reduction) |
| 3. No health checks for failover | Manual failover, downtime during outages | Configure health checks ($0.50/month, automated failover) |
| 4. Public hosted zone for internal services | $0.40/M queries + security risk | Use private hosted zones (free queries, VPC-only access) |
| 5. No default geolocation record | Users in unmapped locations get a "no answer" response | Always define default location (catch-all) |
| 6. Single-region deployment | Users far from region experience high latency | Use latency-based routing across multiple regions |
| 7. Not testing health checks | False positives/negatives, improper failover | Test health checks with intentional failures |
| 8. Unused hosted zones | $0.50/month wasted per zone | Audit and delete unused zones monthly |
| 9. Equal weighted routing for unequal capacity | Overloading smaller instances | Adjust weights based on capacity (70/30 vs 50/50) |
| 10. No bias in geoproximity routing | Traffic distribution doesn't match needs | Use bias to shift traffic to preferred regions |
| 11. Missing secondary for failover | Failover incomplete (no fallback) | Always define secondary resource |
| 12. Health check interval too slow | 30s+ detection delay | Use fast health checks (10s) for critical apps ($1/month) |
| 13. Not using private hosted zones | Exposing internal DNS, higher costs | Create private hosted zones for internal services |
| 14. Complex routing without testing | Unexpected traffic patterns | Test routing policies in dev/staging first |
| 15. No CloudWatch alarms for health checks | Unnoticed health check failures | Create alarms for HealthCheckStatus metric |
Cost Impact Examples:
- Pitfall #1 (A vs Alias): 100M queries = $40/month wasted
- Pitfall #2 (Low TTL): 60s → 3600s TTL = 98% query reduction
- Pitfall #4 (Public vs Private): 1B internal queries = $400/month wasted
Integration Patterns
Route 53 + CloudFront
Global Content Delivery:
- Route 53 Alias record → CloudFront distribution (free queries)
- Latency-based routing to multiple CloudFront distributions (multi-region)
- Geolocation routing for compliance (serve EU content from EU distribution)
Route 53 + Application Load Balancer
Multi-Region Load Balancing:
- Latency-based routing → ALB in each region
- Health checks on ALB targets
- Automatic failover if region becomes unhealthy
Blue/Green Deployments:
- Weighted routing: 90% blue ALB, 10% green ALB
- Gradually shift weights: 70/30, 50/50, 30/70, 0/100
- Rollback: Shift weight back to blue
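The gradual shift can be scripted as a sequence of change batches, one per step, each in the shape accepted by `ChangeResourceRecordSets`. This is a sketch with hypothetical record and load balancer names; it only generates the batches and leaves applying them (and waiting between steps) to the operator.

```python
def weighted_change(name, set_id, dns_value, weight, ttl=60):
    """Build one UPSERT change for a weighted CNAME record."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "CNAME",
            "SetIdentifier": set_id,
            "Weight": weight,
            "TTL": ttl,
            "ResourceRecords": [{"Value": dns_value}],
        },
    }

def blue_green_steps(name, blue_dns, green_dns, green_percentages=(10, 30, 50, 70, 100)):
    """Return one change batch per rollout step, shifting weight from blue to green."""
    steps = []
    for green in green_percentages:
        steps.append({
            "Comment": f"blue {100 - green} / green {green}",
            "Changes": [
                weighted_change(name, "blue", blue_dns, 100 - green),
                weighted_change(name, "green", green_dns, green),
            ],
        })
    return steps

# Hypothetical names, for illustration only.
steps = blue_green_steps("app.example.com",
                         "blue-alb.example.net", "green-alb.example.net")
for step in steps:
    print(step["Comment"])
```

Rolling back is the same operation in reverse: reapply an earlier batch so blue regains its weight.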
Route 53 + API Gateway
API Traffic Management:
- Alias record → API Gateway (free queries)
- Weighted routing for API version testing (v1: 95%, v2: 5%)
- Geolocation routing for regional APIs (GDPR compliance)
Route 53 + Multi-Region Databases
Read Replica Routing:
- Latency-based routing to RDS read replicas in each region
- Health checks on each replica
- Lowest-latency reads for global applications
Failover to Secondary Region:
- Primary record → us-east-1 Aurora cluster (health check)
- Secondary record → us-west-2 Aurora cluster
- Automatic failover if primary cluster fails
Key Takeaways
Routing Policies:
- Simple: Single resource, no health checks
- Weighted: A/B testing, blue/green (10% new version, 90% old)
- Latency-Based: Route to lowest-latency region (50-70% latency reduction)
- Failover: Active-passive disaster recovery (30-60s failover)
- Geolocation: Content localization, data residency (GDPR)
- Geoproximity: Location-based with bias (2024: available for all records)
- Multivalue: Simple load distribution (up to 8 healthy IPs)
Health Checks:
- Monitor HTTP/HTTPS/TCP endpoints ($0.50/month standard, $1/month fast)
- Enable automated failover (no manual intervention)
- Check from multiple global locations (results aggregated across health checkers)
- Calculated health checks combine multiple checks (Boolean logic)
Cost Optimization:
- Use Alias records for AWS resources (100% query cost savings)
- Increase TTL for stable resources (98% query reduction: 60s → 3600s)
- Private hosted zones for internal services (free queries vs $0.40/M)
- Consolidate hosted zones ($0.50/month per zone)
High Availability:
- Failover routing with health checks (30-60s automated failover)
- Multi-region latency-based routing (50-70% latency improvement)
- Active-active with weighted routing + health checks
100% Uptime SLA:
- Route 53 is the only AWS service with 100% availability SLA
- Globally distributed infrastructure (multiple edge locations)
- No single point of failure
Pricing:
- Hosted zones: $0.50/month
- Queries: $0.40 per million (first 1B), $0.20 per million (after 1B)
- Alias queries to AWS resources: FREE
- Health checks: $0.50-$1/month each