Infrastructure as code is mandatory in 2026; doing it badly is worse than not doing it. This post is the working set of practices.
Repo structure
infra/
├── modules/              # reusable building blocks
│   ├── postgres/
│   ├── kubernetes-cluster/
│   └── service/
├── environments/
│   ├── dev/
│   │   └── main.tf       # uses modules
│   ├── staging/
│   │   └── main.tf
│   └── prod/
│       └── main.tf
└── README.md
Each environment has its own state. Modules are versioned; environments pin module versions.
Remote state with locking
# Terraform / OpenTofu
terraform {
  backend "s3" {
    bucket         = "my-tfstate"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "tfstate-locks"
    encrypt        = true
  }
}
S3 + DynamoDB lock prevents concurrent applies. Never let two engineers run apply on prod at the same time.
For Pulumi: Pulumi Cloud or self-hosted backend with similar guarantees.
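The backend block assumes the bucket and lock table already exist. A minimal bootstrap sketch, applied once from its own small configuration, might look like the following. The names match the backend block above; bucket versioning and on-demand billing are my assumptions rather than requirements. The one hard requirement is that the lock table's partition key is a string attribute named LockID.

```hcl
# Bootstrap for the remote state backend (apply once, from its own configuration).
resource "aws_s3_bucket" "tfstate" {
  bucket = "my-tfstate"
}

# Assumption: keep old state versions so a damaged state file can be restored.
resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend expects a string partition key named "LockID".
resource "aws_dynamodb_table" "tfstate_locks" {
  name         = "tfstate-locks"
  billing_mode = "PAY_PER_REQUEST" # assumption: on-demand is enough for lock traffic
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

This configuration has a chicken-and-egg problem with its own state, so it is usually kept tiny and applied by hand with local state.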
Module design
# modules/postgres/main.tf
variable "name" { type = string }
variable "instance_class" {
  type    = string
  default = "db.t4g.micro"
}
variable "allocated_storage" {
  type    = number
  default = 20
}
variable "tags" {
  type    = map(string)
  default = {}
}

resource "aws_db_instance" "this" {
  identifier        = var.name
  instance_class    = var.instance_class
  allocated_storage = var.allocated_storage
  ...
  tags = merge(var.tags, { Module = "postgres" })
}

output "endpoint" { value = aws_db_instance.this.endpoint }
Sensible defaults, overridable. Outputs explicit. Versioned.
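From the environment side, using the module could look like the sketch below. The repository URL, the v1.4.0 tag, the orders-prod name, and the Service tag are all hypothetical; the point is that the ?ref= suffix pins the module version, one default is overridden, and the other is left alone.

```hcl
# infra/environments/prod/main.tf (sketch)
module "orders_db" {
  # Hypothetical repository URL and tag; "?ref=" pins the module version.
  source = "git::https://example.com/acme/infra.git//infra/modules/postgres?ref=v1.4.0"

  name           = "orders-prod"
  instance_class = "db.m6g.large" # overrides the db.t4g.micro default
  tags           = { Service = "orders" }
  # allocated_storage is omitted, so the module default of 20 applies.
}

# Re-export the module's explicit output for other tooling to read.
output "orders_db_endpoint" {
  value = module.orders_db.endpoint
}
```

Upgrading prod then becomes a one-line diff on the ref, which shows up in review like any other change.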
Tagging discipline
Every resource is tagged, here through the AWS provider's default_tags block:
provider "aws" {
  default_tags {
    tags = {
      Owner       = "team-data"
      Environment = "prod"
      ManagedBy   = "terraform"
      CostCenter  = "engineering-data"
      Repo        = "infra"
    }
  }
}
Tags drive cost reports, ownership lookups, automated cleanup. Without tags, you can’t answer “whose is this?”
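One way to make the tagging rule hard to skip is to validate the tag map before it reaches the provider. This is a sketch under my own assumptions: the variable name default_tags and the choice of Owner and CostCenter as mandatory keys are hypothetical policy, not something the setup above requires.

```hcl
# Per-environment tag map, typically set in a tfvars file.
variable "default_tags" {
  type        = map(string)
  description = "Tags applied to every resource in this environment."

  validation {
    # Hypothetical policy: these two keys must always be present.
    condition = alltrue([
      for key in ["Owner", "CostCenter"] : contains(keys(var.default_tags), key)
    ])
    error_message = "The default_tags map must include Owner and CostCenter."
  }
}

provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = var.default_tags
  }
}
```

A plan with a missing key then fails at validation, before any resource is touched.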
Code review
Treat IaC like code:
- PRs reviewed.
- plan output included.
- No direct apply to prod outside CI.
GitOps-style: a PR triggers plan; the merge triggers apply.
CI pipeline
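on:
  pull_request:
    paths: ["infra/**"]
  push:
    branches: [main]
    paths: ["infra/**"]
jobs:
  plan:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        env: [dev, staging, prod]
    steps:
      # cloud credentials setup omitted
      - uses: actions/checkout@v4
      - uses: opentofu/setup-opentofu@v1
      - run: tofu init
        working-directory: infra/environments/${{ matrix.env }}
      - run: tofu plan -out=plan.bin
        working-directory: infra/environments/${{ matrix.env }}
      - uses: actions/upload-artifact@v4
        with:
          name: plan-${{ matrix.env }} # unique name per matrix leg
          path: infra/environments/${{ matrix.env }}/plan.bin
  apply:
    if: github.ref == 'refs/heads/main'
    needs: [plan]
    environment: production # required-reviewer gate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: opentofu/setup-opentofu@v1
      - uses: actions/download-artifact@v4
        with:
          name: plan-prod
          path: infra/environments/prod
      - run: tofu init
        working-directory: infra/environments/prod
      - run: tofu apply plan.bin
        working-directory: infra/environments/prod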
Plan on PR; apply on main only with manual approval for prod.
Drift detection
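Schedule a weekly plan and alert on drift:
on:
  schedule:
    - cron: "0 8 * * 1" # Monday 08:00 UTC
jobs:
  drift:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infra/environments/prod
    steps:
      - uses: actions/checkout@v4
      - uses: opentofu/setup-opentofu@v1
      - run: tofu init
      - run: tofu plan -detailed-exitcode
        # exit 0 = no diff; 2 = diff (drift); 1 = error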
Drift is usually someone clicking in the console. Investigate; either fix the IaC or revert the manual change.
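When the drift is a resource someone created by hand and it should stay, "fix the IaC" can mean adopting it into state rather than recreating it. A sketch using an import block, assuming Terraform 1.5+ or OpenTofu 1.6+; the bucket name and resource address are hypothetical.

```hcl
# Adopt a console-created bucket into state instead of recreating it.
import {
  to = aws_s3_bucket.reports
  id = "acme-reports-prod" # hypothetical; for S3 buckets the import ID is the bucket name
}

# The resource block must describe the bucket as it should exist going forward.
resource "aws_s3_bucket" "reports" {
  bucket = "acme-reports-prod"
}
```

The next plan shows the import next to any remaining differences, so the adoption goes through the same PR review as every other change. The import block can be deleted once it has been applied.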
Secrets, never in IaC
Don’t put secret values in *.tf. Pull them at apply time from a secret manager:
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db/password"
}
resource "aws_db_instance" "this" {
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
}
For secrets management at scale, see Secrets Management in 2026.
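One caveat with the data-source approach: the value it reads is still written to the state file, so the state bucket needs the same access controls as the secret itself. An alternative sketch, assuming an AWS provider version recent enough to support manage_master_user_password, lets RDS generate and store the password so it never passes through the configuration. The identifier and username are hypothetical.

```hcl
resource "aws_db_instance" "orders" {
  identifier        = "orders-prod" # hypothetical
  engine            = "postgres"
  instance_class    = "db.t4g.micro"
  allocated_storage = 20
  username          = "app_admin" # hypothetical

  # RDS generates the master password and stores it in Secrets Manager;
  # no password value is written in the configuration.
  manage_master_user_password = true
}

# Applications look the secret up by ARN at runtime.
output "db_secret_arn" {
  value = aws_db_instance.orders.master_user_secret[0].secret_arn
}
```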
Cost guardrails
Use Infracost in CI to flag cost-changing PRs:
- uses: infracost/actions/setup@v3
- run: infracost diff --path=infra/environments/prod --compare-to=infracost-base.json
Flags “this PR adds $5k/month” before merging. Catches accidents.
Common mistakes
1. One big state file
Everything in one state. Apply takes hours. One bad change blocks all others. Split by environment, by domain.
2. No code review on IaC
PRs ship to prod with no second pair of eyes. Disasters.
3. Hand-rolled cloud changes
“Just this once” — six months later, IaC and reality drift, deploys fail, mystery bugs. Always through IaC.
4. No tagging
Cost reports useless. Resource ownership unclear.
5. Silently mutable defaults
A module’s default changes; existing environments silently get the new behavior. Pin module versions.
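On mistake 1: once state is split by domain, stacks still need to share a few values. A sketch of one common way to do that, reading another stack's outputs with the terraform_remote_state data source. The network state key and the private_subnet_ids output are hypothetical and assume a separate network stack that publishes them.

```hcl
# In the application stack: read outputs published by a separate network stack.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "my-tfstate"
    key    = "prod/network/terraform.tfstate" # hypothetical key for the network stack
    region = "us-east-1"
  }
}

resource "aws_db_subnet_group" "orders" {
  name = "orders-prod"

  # Hypothetical output; the network stack must declare it explicitly.
  subnet_ids = data.terraform_remote_state.network.outputs.private_subnet_ids
}
```

Each stack keeps its own lock and its own blast radius, and the only coupling is the set of outputs the upstream stack chooses to expose.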
What I’d ship today
For a new IaC setup:
- OpenTofu (see Pulumi vs Terraform vs OpenTofu).
- Modular structure with versioned modules.
- Per-environment state, S3 + DynamoDB locking.
- Atlantis or GitHub Actions for plan/apply automation.
- Tags everywhere.
- Infracost in CI.
- Weekly drift detection.
- Secrets via AWS Secrets Manager / Vault.
Read this next
- Pulumi vs Terraform vs OpenTofu
- GitOps with Argo CD and Flux
- Secrets Management in 2026
- Cloud Cost Optimization in 2026
If you want my OpenTofu module library + CI templates, it’s at rajpoot.dev.