IaC Implementation Patterns
Modular Design
What it is: Breaking infrastructure code into reusable, self-contained components (modules).
Purpose: Modules encapsulate resources that work together as a logical unit (e.g., a VPC with subnets, a database with backups). They're about code reuse and abstraction, not about deployment organization (that's layering).
Think of modules as: Functions or libraries in programming; reusable building blocks you can use anywhere.
Benefits
- Code reuse across projects and environments
- Easier testing (test module once, use everywhere)
- Encapsulation (module internals hidden, clear interfaces)
- Standardization (everyone uses the same VPC module, for example)
- Faster development (don't rewrite common patterns)
Module Structure
Each module is self-contained:
modules/
├── vpc/                  # Reusable VPC module
│   ├── main.tf           # VPC, subnets, routing, NAT
│   ├── variables.tf      # Inputs (CIDR, AZ count, etc.)
│   ├── outputs.tf        # Outputs (VPC ID, subnet IDs)
│   └── README.md         # How to use this module
├── rds/                  # Reusable RDS module
│   ├── main.tf           # RDS instance, subnet group, params
│   ├── variables.tf      # Inputs (engine, size, etc.)
│   ├── outputs.tf        # Outputs (endpoint, port)
│   └── README.md
└── eks/                  # Reusable EKS module
    ├── main.tf           # EKS cluster, node groups
    ├── variables.tf
    ├── outputs.tf
    └── README.md
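What lives inside a module is ordinary resource code. Here is a minimal sketch of what the VPC module's main.tf might contain, assuming the inputs shown above (arguments are illustrative, not a complete module):
# modules/vpc/main.tf (illustrative sketch)
resource "aws_vpc" "main" {
  cidr_block           = var.cidr_block
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name        = "${var.environment}-vpc"
    Environment = var.environment
  }
}

# One private subnet per requested AZ
resource "aws_subnet" "private" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.cidr_block, 4, count.index)
  availability_zone = var.azs[count.index]
}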
Using Modules
Modules are consumed by layers (foundation, platform, etc.):
# foundation/networking.tf
# Foundation layer USES the VPC module
module "vpc" {
source = "../modules/vpc"
cidr_block = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
environment = "prod"
enable_nat = true
}
# platform/shared-database.tf
# Platform layer USES the RDS module
module "shared_db" {
source = "../modules/rds"
vpc_id = data.terraform_remote_state.foundation.outputs.vpc_id
subnet_ids = data.terraform_remote_state.foundation.outputs.database_subnet_ids
engine = "postgres"
instance_class = "db.r5.xlarge"
multi_az = true
}
Module Best Practices
1. Single Responsibility
- Each module should do one thing well
- Avoid monolithic modules
- Keep modules focused and cohesive
2. Well-Defined Interfaces
# variables.tf - Clear inputs
variable "environment" {
type = string
description = "Environment name (dev, staging, prod)"
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
# outputs.tf - Clear outputs
output "vpc_id" {
description = "The ID of the VPC"
value = aws_vpc.main.id
}
3. Version Modules
# Use versioned modules
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.1.2" # Pin to specific version
}
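If an exact pin feels too rigid, Terraform's pessimistic constraint operator is a common middle ground: it accepts patch releases while blocking potentially breaking upgrades.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.1.0" # any 5.1.x patch release, never 5.2+
}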
4. Document Modules
Each module should have a README.md with:
- Purpose and description
- Usage example
- Input variables table
- Output values table
Example README.md:
# VPC Module
Creates a VPC with public and private subnets across multiple AZs.
## Usage
module "vpc" {
source = "./modules/vpc"
cidr_block = "10.0.0.0/16"
environment = "production"
}
## Inputs
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| cidr_block | VPC CIDR block | string | n/a | yes |
| environment | Environment name | string | n/a | yes |
## Outputs
| Name | Description |
|------|-------------|
| vpc_id | The ID of the VPC |
| private_subnet_ids | List of private subnet IDs |
Environment Separation
What it is: Managing separate infrastructure configurations for different environments (dev, staging, production).
Approaches
Approach 1: Directory Structure
infrastructure/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── terraform.tfvars
│   └── production/
│       ├── main.tf
│       ├── variables.tf
│       └── terraform.tfvars
└── modules/
    └── ...
Advantages:
- Clear separation
- Different configurations per environment
- Easy to see all environments
Disadvantages:
- Code duplication
- Must update all environments for changes
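Each environment directory is a complete root configuration that wires shared modules up with environment-specific values. A sketch of what environments/dev/main.tf might look like (values are illustrative):
# environments/dev/main.tf (illustrative)
module "vpc" {
  source      = "../../modules/vpc"
  cidr_block  = "10.10.0.0/16"
  azs         = ["us-east-1a", "us-east-1b"]
  environment = "dev"
  enable_nat  = false # keep dev cheap
}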
Approach 2: Workspaces (Terraform)
# Create workspaces
terraform workspace new dev
terraform workspace new staging
terraform workspace new production
# Switch workspace
terraform workspace select production
# Deploy
terraform apply
# Use workspace in configuration
resource "aws_instance" "web" {
instance_type = terraform.workspace == "production" ? "t3.large" : "t3.micro"
tags = {
Environment = terraform.workspace
}
}
Advantages:
- Single codebase
- Easy to switch environments
- Less duplication
Disadvantages:
- All environments share same code
- Harder to apply different configurations
- Risk of accidental changes to wrong environment
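The "harder to apply different configurations" problem is often mitigated with a per-workspace lookup map; a sketch, with illustrative values:
locals {
  env_config = {
    dev        = { instance_type = "t3.micro", min_size = 1 }
    staging    = { instance_type = "t3.small", min_size = 1 }
    production = { instance_type = "t3.large", min_size = 3 }
  }
  # Select the settings for the active workspace
  config = local.env_config[terraform.workspace]
}

resource "aws_instance" "web" {
  instance_type = local.config.instance_type
  # ... ami, networking, etc.
}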
Approach 3: Separate Repositories
infrastructure-dev/
infrastructure-staging/
infrastructure-production/
Advantages:
- Complete isolation
- Different access controls per environment
- No risk of cross-environment changes
Disadvantages:
- Maximum code duplication
- Hard to keep synchronized
- More repositories to manage
Recommended Approach
Hybrid: Directories + Separate State
infrastructure/
├── modules/                  # Shared modules
│   └── ...
└── environments/
    ├── dev/
    │   ├── backend.tf        # Dev state config
    │   ├── main.tf           # Uses modules
    │   └── dev.tfvars        # Dev-specific values
    ├── staging/
    │   ├── backend.tf
    │   ├── main.tf
    │   └── staging.tfvars
    └── production/
        ├── backend.tf
        ├── main.tf
        └── production.tfvars
Benefits:
- Shared modules (DRY)
- Separate state files (isolation)
- Environment-specific configurations
- Clear structure
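The state isolation comes from each environment's backend.tf pointing at its own state key. A sketch, assuming an S3 backend (bucket and table names are placeholders):
# environments/production/backend.tf (illustrative)
terraform {
  backend "s3" {
    bucket         = "my-org-terraform-state"
    key            = "environments/production/terraform.tfstate" # unique per environment
    region         = "us-east-1"
    dynamodb_table = "terraform-locks" # state locking
    encrypt        = true
  }
}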
Layered Architecture
Modules vs. Layers
Modules = Reusable code (VPC module, RDS module); modules are about code reuse.
Layers = Deployment units that USE modules (the foundation layer uses the VPC module); layers are about deployment organization and team ownership.
What it is: Organizing infrastructure into separate deployment layers based on ownership and change frequency.
Purpose: Each layer is deployed independently and has its own state, so it can be changed, rolled back, and owned separately from the others.
Think of layers as: Deployment units owned by different teams with different release schedules.
Why Layer?
- Faster deployments (deploy only the layer that changed, not everything)
- Reduced blast radius (change to application layer doesn't risk foundation)
- Clear dependencies (application depends on platform, platform depends on foundation)
- Easier rollbacks (rollback one layer without affecting others)
- Better team ownership (platform team owns platform layer, app teams own application layer)
Common Layers
Layer 1: Foundation (rarely changes, managed by platform team)
- VPCs and core networking (subnets, routing tables, NAT gateways)
- Transit gateways and VPN connections
- Base DNS zones
- Core security groups
- Network ACLs
Layer 2: Platform (changes occasionally, managed by platform/DevOps)
- Shared databases and data stores (not app-specific)
- Message queues and event buses
- Container registries
- Kubernetes/ECS clusters
- Shared load balancers
- Monitoring and logging infrastructure
- Shared caching layers
Layer 3: DevOps (changes occasionally, managed by DevOps team)
- Source code repositories
- CI/CD pipelines
- Artifact stores
- Build agents
- Secret management infrastructure
- Deployment automation tools
Layer 4: Application (changes frequently, managed by app teams)
- Application-specific compute (EC2, Lambda, containers)
- Application-owned databases
- Application-specific queues/topics
- Auto-scaling configurations
- Application load balancers
- Application-specific IAM roles
- Feature flags and config
Key principle: Layers reflect organizational ownership and change frequency, not just resource types. A database might be in Platform (shared) or Application (app-owned) depending on who manages it.
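One way to make that ownership explicit, and to enable the tag-based guardrails shown later in this guide, is to tag every resource with its layer and domain. A hypothetical example (tag keys are illustrative):
# An app-owned database belongs to the application layer,
# even though it is an RDS instance
resource "aws_db_instance" "orders" {
  identifier     = "payments-orders-db"
  engine         = "postgres"
  instance_class = "db.t3.medium"
  # ... storage, credentials, etc.

  tags = {
    layer  = "application"
    domain = "payments"
  }
}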
Implementation
infrastructure/
├── foundation/
│   ├── networking.tf
│   ├── vpc.tf
│   └── dns.tf
├── platform/
│   ├── shared-databases.tf
│   ├── message-queues.tf
│   ├── container-registry.tf
│   └── monitoring.tf
├── devops/
│   ├── pipelines.tf
│   ├── artifact-stores.tf
│   └── repositories.tf
└── applications/
    ├── webapp/
    │   ├── compute.tf
    │   ├── database.tf
    │   └── load-balancer.tf
    └── api/
        ├── lambda.tf
        └── api-gateway.tf
Dependency Management
Use outputs and data sources:
# Layer 1 (foundation/outputs.tf)
output "vpc_id" {
  value = aws_vpc.main.id
}

# Layer 2 (platform/main.tf)
data "terraform_remote_state" "foundation" {
  backend = "s3"
  config = {
    bucket = "terraform-state"
    key    = "foundation/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_eks_cluster" "main" {
  # ... name, role_arn, etc. omitted for brevity
  vpc_config {
    subnet_ids = data.terraform_remote_state.foundation.outputs.private_subnet_ids
  }
}
IaC Code Ownership
The question: Should application-specific IaC live with the application code or in a centralized infrastructure repository?
The answer: Hybrid approach based on layers; application teams own application layer IaC, DevOps owns foundation/platform/devops layers.
Option 1: IaC With Application Code (Application Layer Only)
What belongs with the app:
my-payment-service/
├── src/
│   └── ... (application code)
├── Dockerfile
├── infrastructure/
│   ├── compute.tf        # Lambda/ECS/EC2 for this app
│   ├── database.tf       # App-owned database
│   ├── queue.tf          # App-specific queue
│   └── api-gateway.tf    # App-specific API Gateway
└── .github/workflows/
    └── deploy.yml        # Deploys both app and infrastructure
Resources that belong here:
- Application-specific compute (Lambda functions, ECS tasks, EC2 instances)
- Application-owned databases (not shared with other apps)
- Application-specific queues/topics
- Auto-scaling configurations
- Application load balancers
- Application-specific IAM roles
Why this works:
- Deployment coupling: App code and infrastructure change together
- Team autonomy: App team deploys without waiting for DevOps
- Versioning: Infrastructure version matches application version
- Rollback simplicity: Roll back app AND its infrastructure together
- Clear ownership: App team owns everything related to their service
Example deployment:
# .github/workflows/deploy.yml
name: Deploy Payment Service

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production # Requires manual approval

    steps:
      - uses: actions/checkout@v3

      # Deploy infrastructure first
      - name: Deploy Infrastructure
        run: |
          cd infrastructure
          terraform init
          terraform apply -auto-approve
        env:
          AWS_ROLE_ARN: ${{ secrets.AWS_ROLE_ARN }}

      # Then deploy application
      - name: Deploy Application
        run: |
          docker build -t payments:${{ github.sha }} .
          docker push payments:${{ github.sha }}
          # Deploy to infrastructure created above
Option 2: Centralized IaC Repository (Foundation, Platform, DevOps Layers)
What stays centralized:
platform-infrastructure/
├── foundation/
│   ├── networking.tf          # VPCs, subnets, routing
│   ├── dns.tf                 # Route 53 zones
│   └── security-groups.tf     # Base security groups
├── platform/
│   ├── shared-database.tf     # Shared RDS instance
│   ├── message-bus.tf         # Shared EventBridge/SQS
│   ├── container-registry.tf  # ECR
│   └── monitoring.tf          # CloudWatch, X-Ray
└── devops/
    ├── pipelines.tf           # CodePipeline
    ├── repositories.tf        # CodeCommit
    └── artifact-stores.tf     # S3 for artifacts
Resources that stay centralized:
- Foundation layer: VPCs, networking, DNS, core security groups
- Platform layer: Shared databases, message queues, container registries, monitoring
- DevOps layer: CI/CD pipelines, repositories, artifact stores
Why centralized:
- Shared across applications: Many apps use the same VPC, shared database, etc.
- Requires deep expertise: Network architecture, security architecture
- High blast radius: Changes affect all applications
- Strict change control: Needs architecture review and approval
Security Controls That Make This Safe
The concern: "Won't app teams abuse permissions if they control IaC?"
The answer: No, because of multiple layers of preventive controls:
1. IAM Permission Boundaries
Limit what app teams can create even with their deployment role:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowApplicationLayerOnly",
      "Effect": "Allow",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:RequestTag/layer": "application",
          "aws:RequestTag/domain": "payments"
        }
      }
    },
    {
      "Sid": "DenyFoundationChanges",
      "Effect": "Deny",
      "Action": [
        "ec2:DeleteVpc",
        "ec2:DeleteSubnet",
        "ec2:ModifyVpcAttribute",
        "ec2:DeleteRouteTable"
      ],
      "Resource": "*"
    }
  ]
}
2. Service Control Policies (SCPs)
Enforce tagging at organization level:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RequireTagsOnCreate",
      "Effect": "Deny",
      "Action": [
        "ec2:RunInstances",
        "rds:CreateDBInstance",
        "lambda:CreateFunction"
      ],
      "Resource": "*",
      "Condition": {
        "Null": {
          "aws:RequestTag/environment": "true",
          "aws:RequestTag/layer": "true",
          "aws:RequestTag/domain": "true"
        }
      }
    }
  ]
}
3. CloudFormation Hooks
Validate templates before deployment:
RequireTagsHook:
  Type: AWS::CloudFormation::Hook
  Properties:
    TypeName: AWSSamples::RequireTags::Hook
    TargetStacks: ALL
    FailureMode: FAIL
    Properties:
      RequiredTags:
        - environment
        - layer
        - domain
4. Approval Gates
Require manual approval for production:
environment: production
# GitHub/GitLab requires approval from designated reviewers
5. AWS Config Rules
Detect non-compliant resources after creation:
ResourcesManagedByTeam:
  Type: AWS::Config::ConfigRule
  Properties:
    ConfigRuleName: resources-have-required-tags
    Source:
      Owner: AWS
      SourceIdentifier: REQUIRED_TAGS
    InputParameters:
      tag1Key: environment
      tag2Key: layer
      tag3Key: domain
Recommended Structure
Hybrid approach combining both:
# Application repos (owned by app teams)
payment-service/
├── src/
├── infrastructure/            # Application layer only
│   ├── compute.tf
│   ├── database.tf
│   └── queue.tf
└── .github/workflows/deploy.yml

identity-service/
├── src/
├── infrastructure/            # Application layer only
│   ├── lambda.tf
│   ├── api-gateway.tf
│   └── dynamodb.tf
└── .github/workflows/deploy.yml

# Platform repo (owned by DevOps/Platform team)
platform-infrastructure/
├── foundation/                # Foundation layer
│   ├── networking.tf
│   ├── dns.tf
│   └── security-groups.tf
├── platform/                  # Platform layer
│   ├── shared-database.tf
│   ├── message-bus.tf
│   └── monitoring.tf
└── devops/                    # DevOps layer
    ├── pipelines.tf
    ├── repositories.tf
    └── artifact-stores.tf
Decision Matrix
Use this to decide where IaC should live:
| Question | With App Code | Centralized |
|---|---|---|
| Used by only one application? | ✅ | ❌ |
| Shared across multiple apps? | ❌ | ✅ |
| Changes frequently with app? | ✅ | ❌ |
| Rarely changes (weeks/months)? | ❌ | ✅ |
| App team has expertise? | ✅ | ❌ |
| Requires deep infra expertise? | ❌ | ✅ |
| Blast radius = single app? | ✅ | ❌ |
| Blast radius = all apps? | ❌ | ✅ |
Examples:
- Lambda function for one app → With app code
- VPC used by all apps → Centralized
- App-owned DynamoDB table → With app code
- Shared RDS instance → Centralized
- Application load balancer → With app code
- Container registry (ECR) → Centralized
Benefits of Hybrid Approach
For application teams:
- Deploy infrastructure and code together
- No waiting for DevOps tickets
- Full ownership of their domain
- Faster iteration cycles
- Clear responsibility boundaries
For DevOps/Platform team:
- Focus on shared infrastructure
- Enforce standards via guardrails
- Manage high-impact changes carefully
- Provide self-service capabilities
- Reduce ticket queue
For the organization:
- Faster time to market
- Clear ownership and accountability
- Reduced bottlenecks
- Standards enforced via automation
- Auditability (all changes in Git)
GitOps Workflow
What it is: Using Git as the single source of truth for infrastructure state and changes.
Core Principles
- Declarative: Infrastructure defined declaratively
- Versioned: All changes in Git
- Immutable: Don't modify running infrastructure directly
- Automated: Changes automatically applied from Git
- Auditable: Full history in Git
Workflow
Developer
  ↓
Create branch
  ↓
Make changes to IaC
  ↓
Commit and push
  ↓
Create Pull Request
  ↓
Automated checks (lint, validate, plan)
  ↓
Code Review
  ↓
Merge to main
  ↓
CI/CD Pipeline
  ↓
Automated deployment
  ↓
Infrastructure Updated
Implementation
CI/CD Pipeline Stages:
On Pull Request:
- Checkout code
- Initialize IaC tool
- Validate syntax
- Generate plan
- Comment plan output on PR for review
- Run security/compliance scans
On Merge to Main:
- Checkout code
- Initialize IaC tool
- Generate plan (verify it matches approved PR plan)
- Require approval gate for production changes
- Apply infrastructure changes
- Report results
Key implementation considerations:
- Trigger pipelines only when infrastructure code changes
- Use separate jobs for plan vs. apply (plan runs on PR, apply runs on merge)
- Store IaC tool state remotely, not in pipeline
- Use environment protection rules for production deployments
- Implement approval gates before applying to sensitive environments
Benefits
Audit Trail:
- Every change tracked in Git
- Who made what change and when
- Easy to see change history
Easy Rollback:
- Revert Git commits to previous infrastructure version
- Push the revert commit
- CI/CD automatically applies the rollback
- Full audit trail of what was rolled back and why
Collaborative:
- Code review process enforced
- Knowledge sharing through PRs
- Documentation in commit messages
Tools
- Terraform PR automation (e.g., Atlantis): plan on PR, apply on merge, locks to prevent conflicts, self-hosted GitOps for Terraform
- Kubernetes GitOps controllers (e.g., Argo CD, Flux): continuous deployment from Git, automatic drift detection and reconciliation
Best Practices
Code Organization
1. Consistent Structure
infrastructure/
├── README.md
├── .gitignore
├── modules/
├── environments/
└── scripts/
2. Naming Conventions
# Resources: <project>-<environment>-<resource>
resource "aws_s3_bucket" "data" {
bucket = "myapp-prod-s3-data"
}
# Variables: lowercase with underscores
variable "instance_type" {
type = string
}
3. DRY (Don't Repeat Yourself)
- Use modules for repeated patterns
- Use variables for environment differences
- Use locals for computed values
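A small sketch of locals computing shared values once (names are illustrative):
locals {
  name_prefix = "${var.project}-${var.environment}"
  common_tags = {
    Project     = var.project
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

resource "aws_s3_bucket" "data" {
  bucket = "${local.name_prefix}-data"
  tags   = local.common_tags
}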
4. Documentation
- README in each directory
- Comments for complex logic
- Variable descriptions
- Output descriptions
Change Management
1. Always Review Changes
- Use terraform plan before apply
- Review plan output carefully
- Use change sets (CloudFormation)
2. Small, Incremental Changes
- One logical change per PR
- Easier to review and test
- Simpler to roll back
3. Automated Testing
- Lint on every commit
- Validate on every PR
- Integration tests before production
4. Approval Gates
# Require manual approval for production
environment: production
Version Control
What to commit:
- ✅ Infrastructure code
- ✅ Module definitions
- ✅ Documentation
- ✅ Scripts
What NOT to commit:
- ❌ State files
- ❌ Sensitive values (.tfvars with secrets)
- ❌ .terraform/ directory
- ❌ Provider plugins
.gitignore:
# Terraform
**/.terraform/*
*.tfstate
*.tfstate.*
crash.log
*.tfvars # Or be selective
# Note: .terraform.lock.hcl is usually committed (it pins provider versions)
# Sensitive
*.pem
*.key
secrets.yaml