IaC State Management
What is State
State is the record of infrastructure resources currently deployed, tracked by IaC tools to:
- Map configuration to real resources
- Track resource metadata
- Determine what changes are needed
- Manage dependencies between resources
Why State Matters
Without state, IaC tools cannot:
- Know what infrastructure currently exists
- Determine what needs to be created, updated, or deleted
- Track resource relationships
- Detect configuration drift
State contains:
- Resource IDs
- Resource attributes
- Dependencies between resources
- Provider configurations
- Sensitive data (passwords, private keys)
Do You Need State Management?
Cloud-native IaC tools handle state automatically; you don’t need to manage state yourself.
Tools That Manage State for You
AWS CloudFormation:
- AWS manages all state internally
- No state files to secure or back up
- No risk of state corruption
- No locking concerns
- Built-in drift detection
Azure Resource Manager (ARM Templates / Bicep):
- Azure manages deployment state
- Integrated with Azure portal
- Deployment history tracked automatically
Google Cloud Deployment Manager:
- GCP tracks deployment state
- Managed through GCP console
When State Management Is Your Responsibility
Terraform and Pulumi require you to manage state explicitly (unless using Terraform Cloud or Pulumi Cloud):
- You must configure remote backends
- You must implement locking mechanisms
- You must secure sensitive data in state
- You must handle backup and recovery
- You must prevent state corruption
Should You Accept This Complexity?
Only add user-managed state when you can articulate specific reasons:
Valid reasons:
- Multi-cloud requirements (managing AWS + Azure + GCP together)
- Need for Terraform/Pulumi-specific features not available in cloud-native tools
- Existing infrastructure already managed by Terraform/Pulumi
- Organizational standard requires specific tooling
Not valid reasons:
- “Terraform is popular”
- “We know Terraform already” (unless you have multi-cloud needs)
- “Terraform is industry standard” (CloudFormation is the standard for AWS-only)
The cost of user-managed state:
- Risk of state corruption causing infrastructure issues
- Complexity of securing state files containing sensitive data
- Operational overhead of managing remote backends and locking
- Recovery procedures when state issues occur
- Additional infrastructure to support state management
If you’re deploying only to AWS, CloudFormation eliminates all of these risks.
Local vs. Remote State
Never Use Local State for Teams
Local state files are not suitable for teams or production environments. They lack locking, backup, versioning, and team sharing capabilities. Always use remote state for collaborative work.
Local State
What it is: State stored on local filesystem.
terraform.tfstate
Advantages:
- Simple to get started
- No additional setup required
- Fast access
Disadvantages:
- ❌ Not suitable for teams (no sharing)
- ❌ No locking (concurrent changes dangerous)
- ❌ Risk of loss/corruption
- ❌ No backup/versioning
- ❌ Contains sensitive data locally
When to use:
- Learning and experimentation only
- Single developer, non-critical infrastructure
- Never for production
Remote State
What it is: State stored in a remote backend (S3, Azure Blob, GCS, Terraform Cloud).
Terraform S3 Backend:
terraform {
backend "s3" {
bucket = "my-terraform-state"
key = "prod/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks"
encrypt = true
}
}
Advantages:
- ✅ Shared access for teams
- ✅ State locking (prevents conflicts)
- ✅ Encryption at rest
- ✅ Versioning and backup
- ✅ Audit logging
- ✅ Centralized management
Disadvantages:
- Requires setup
- Depends on external service
- Potential costs
When to use:
- All team environments
- Production infrastructure
- Any collaborative work
Popular Remote Backends
AWS S3 + DynamoDB:
terraform {
backend "s3" {
bucket = "terraform-state"
key = "path/to/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks" # For locking
encrypt = true
kms_key_id = "arn:aws:kms:..." # Optional KMS encryption
}
}
Azure Blob Storage:
terraform {
backend "azurerm" {
resource_group_name = "terraform-state-rg"
storage_account_name = "terraformstate"
container_name = "tfstate"
key = "prod.terraform.tfstate"
}
}
Google Cloud Storage:
terraform {
backend "gcs" {
bucket = "terraform-state"
prefix = "prod"
}
}
Terraform Cloud:
terraform {
cloud {
organization = "my-org"
workspaces {
name = "production"
}
}
}
State Locking
What it is: Preventing simultaneous state modifications that could corrupt state.
How Locking Works
- Process acquires lock before modifying state
- Lock prevents other processes from modifying state
- Lock released after operation completes or fails
Implementations
DynamoDB (AWS + S3 backend):
terraform {
backend "s3" {
bucket = "terraform-state"
key = "prod/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks" # Locking table
}
}
DynamoDB table structure:
resource "aws_dynamodb_table" "terraform_locks" {
name = "terraform-locks"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
}
Consul:
terraform {
backend "consul" {
address = "consul.example.com"
path = "terraform/prod"
lock = true
}
}
Force Unlock (Use Carefully)
Most IaC tools provide a force-unlock operation for stuck locks. Use with extreme caution.
When to force unlock:
- Process crashed and left lock orphaned
- Lock is demonstrably stale (check lock timestamp)
- You’ve confirmed no one else is running operations
Never force unlock if:
- Someone else might be running operations
- Uncertain about lock state
- During normal business hours without team communication
State Management Best Practices
1. Always Use Remote State for Teams
Never rely on local state files for production or team environments.
# ❌ BAD: No backend configured
terraform {
# Uses local state
}
# ✅ GOOD: Remote backend
terraform {
backend "s3" {
bucket = "terraform-state"
key = "prod/terraform.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-locks"
}
}
2. Secure State Files
State contains sensitive data!
Enable encryption:
terraform {
backend "s3" {
bucket = "terraform-state"
key = "terraform.tfstate"
encrypt = true # Enable encryption at rest
kms_key_id = "arn:aws:kms:..." # Use KMS for extra security
}
}
Restrict access:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject"
],
"Resource": "arn:aws:s3:::terraform-state/prod/*",
"Principal": {
"AWS": "arn:aws:iam::123456789012:role/TerraformRole"
}
}
]
}
Never commit state to Git:
# .gitignore
*.tfstate
*.tfstate.*
3. Backup State
Enable versioning:
resource "aws_s3_bucket_versioning" "terraform_state" {
bucket = aws_s3_bucket.terraform_state.id
versioning_configuration {
status = "Enabled"
}
}
Regular backups:
- S3 versioning (automatic)
- Cross-region replication
- Periodic manual backups
- Test restore procedures
4. Separate State by Environment
Don’t share state across environments:
s3://terraform-state/
├── dev/terraform.tfstate
├── staging/terraform.tfstate
└── production/terraform.tfstate
Or use separate buckets:
s3://terraform-state-dev/terraform.tfstate
s3://terraform-state-staging/terraform.tfstate
s3://terraform-state-prod/terraform.tfstate
5. Review Plans Before Applying
Always preview changes before applying them:
- Generate a plan showing what will change
- Review the plan carefully for unexpected changes
- Save the plan and apply exactly what was reviewed
- Never skip the preview step, especially in production
6. State File Security Checklist
- Remote backend configured
- Encryption at rest enabled
- State locking enabled
- Versioning enabled
- Access restricted via IAM
- State files not in Git
- Regular backups tested
- Separate state per environment
Common State Operations
IaC tools provide commands for viewing and managing state. Consult your tool’s documentation for current syntax:
Viewing State
You can:
- List all resources tracked in state
- Show details of specific resources
- View output values
- Inspect resource metadata and dependencies
Modifying State
Move operations:
- Rename resources in state (update reference without recreating)
- Move resources between modules
- Reorganize infrastructure code without destroying resources
Remove operations:
- Remove resources from state tracking (doesn’t delete the actual resource)
- Useful when manually deleting resources or transferring ownership
Import operations:
- Import existing infrastructure into state management
- Add resources created outside IaC to your state
- Essential for brownfield infrastructure adoption
Recovering from State Issues
Backup and restore:
- Download current state as backup before risky operations
- Restore from backup if state becomes corrupted
- Use versioning features (S3 versioning) for automatic backups
State recovery process:
- Download backup from remote backend
- Verify backup integrity
- Restore backup to remote backend
- Validate infrastructure matches restored state
Migrating State
Backend migration:
- Change state storage location (local → remote, or remote → different remote)
- Tool-specific migration commands handle data transfer
- Always backup state before migration
- Verify state after migration completes
Found this guide helpful? Share it with your team:
Share on LinkedIn