Data Management Patterns
Data Management Patterns
Data management patterns address how distributed systems handle data storage, access, and consistency across multiple services and databases.
Database per Service
Each microservice owns its data and database, ensuring loose coupling and service autonomy.
Use When:
- Building microservices architecture
- Services have different data storage requirements
- Want to enable independent service evolution
- Need to prevent database-level coupling
Challenges:
- No ACID transactions across services
- Data consistency becomes more complex
- Reporting across services requires aggregation
Example: E-commerce system where user service uses SQL database, product catalog uses document database, and recommendation engine uses graph database.
User Service → PostgreSQL
Product Service → MongoDB
Recommendation Service → Neo4j
Shared Database
Multiple services access the same database, sharing data directly.
Use When:
- Tight coupling between services is acceptable
- ACID transactions across services are required
- Migrating from monolithic applications
- Services have significant data overlap
Drawbacks:
- Creates tight coupling between services
- Database becomes a bottleneck
- Reduces service autonomy
- Schema changes affect multiple services
Event Sourcing
Pattern popularized by Martin Fowler and Greg Young in early 2000s
Stores all changes to application state as an immutable sequence of events rather than storing only current state. The current state is derived by replaying events from the beginning.
Use When:
- Need complete audit trail of all changes
- Want to replay events for testing or analytics
- Implementing complex business domains (especially with Domain-Driven Design)
- Building systems that benefit from temporal queries (“what was the state at time X?”)
Considerations:
- Event store grows continuously (implement snapshotting for performance)
- Complex queries may require rebuilding state from events
- Need to handle event schema evolution carefully
- Deleting data is complex (GDPR compliance requires special handling)
Example: Banking system that stores all account transactions as events (deposit, withdrawal, transfer) and calculates current balance by replaying events.
Events: [Deposit($100), Withdrawal($30), Deposit($50)]
Current Balance = Sum of events = $120
Can also query: "What was balance on March 1st?" by replaying events up to that date
CQRS (Command Query Responsibility Segregation)
Pattern introduced by Greg Young (2010), based on Bertrand Meyer’s Command-Query Separation principle (1988)
Separates read (query) and write (command) operations into different models, often with separate databases optimized for each operation type. Goes beyond CQS by using separate data models, not just separate methods.
Use When:
- Read and write patterns are significantly different
- Need to scale reads and writes independently
- Complex reporting requirements
- Different consistency requirements for reads and writes
- Working with Event Sourcing (natural fit)
Implementation Options:
- Simple: Separate models, same database
- Advanced: Separate databases for reads and writes
- Full: Event sourcing for writes, materialized projections for reads
Example: Social media platform with write-optimized database for posts and read-optimized database with denormalized data for feeds and searches.
Write: Post Service → Command DB (normalized)
Read: Feed Service → Query DB (denormalized, optimized for feeds)
Sync: Command DB → Events → Update Query DB
Warning: CQRS adds complexity. Don’t use unless you have a specific problem it solves.
Materialized View
Pre-computed and stored query results that are periodically updated, optimizing read performance for complex queries.
Use When:
- Complex queries are expensive to compute
- Query results don’t need to be real-time
- Read-heavy workloads with predictable query patterns
Trade-offs:
- Data freshness vs. performance
- Storage space for additional views
- Complexity of keeping views updated
Example: E-commerce analytics dashboard showing daily sales summaries, computed overnight and stored for fast display during business hours.
Nightly Job: Compute sales by category, region, time period → Store in materialized view
Dashboard: SELECT * FROM daily_sales_summary WHERE date = today
Quick Reference
Pattern Comparison
Pattern | Data Distribution | Consistency | Complexity | Use Case |
---|---|---|---|---|
Database per Service | Isolated | Eventual | High | Microservices independence |
Shared Database | Shared | Strong | Low | Monolith or tight coupling OK |
Event Sourcing | Event log | Eventual | High | Audit trail, event replay |
CQRS | Separate read/write | Varies | Medium-High | Different read/write patterns |
Materialized View | Cached | Stale | Low | Expensive query optimization |
Decision Tree
Independent services? → Database per Service Need ACID across services? → Shared Database (consider monolith instead) Need full audit trail? → Event Sourcing Different read/write patterns? → CQRS Expensive queries? → Materialized View
Consistency Trade-offs
Strong Consistency: Shared Database | CQRS (same DB) Eventual Consistency: Database per Service | Event Sourcing | CQRS (separate DBs) Stale Data OK: Materialized View
Found this guide helpful? Share it with your team:
Share on LinkedIn