Mohammad Gufran Jahangir · February 11, 2026

You don’t really “learn Terraform” when you run terraform apply once and see something created.

You learn Terraform when you understand these four things:

  1. State (what Terraform thinks exists)
  2. Plan (what Terraform is about to change)
  3. Modules (how teams scale Terraform without chaos)
  4. Workspaces (how the same code can manage multiple environments safely)

This guide is built to make you productive fast—without assuming you already know cloud architecture or DevOps tricks.



The 60-second mental model (so everything clicks)

Terraform works in a loop:

  1. You write desired state (HCL code)
  2. Terraform reads current state (from state file + provider APIs)
  3. Terraform creates a plan (diff between desired and current)
  4. Terraform applies the plan to reach desired state
  5. Terraform updates state so it remembers what happened

If you remember only one thing:
Terraform is a state manager first, and a resource creator second.


Part 1 — Terraform State: what it is, why it matters, and how it breaks

What “state” really is

Terraform state is Terraform’s memory: a mapping of:

  • which resources exist
  • their IDs in the cloud provider
  • important attributes Terraform needs later
  • relationships between resources

Without state, Terraform cannot reliably update or delete things.

Example: why state is essential

You create an EC2 instance (or VM) using Terraform.
Later you change the instance type in code.

Terraform must know which existing instance to modify.

That “which one” is stored in state.
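As a concrete sketch (a hypothetical AWS resource; the AMI ID is a placeholder):

```hcl
resource "aws_instance" "web" {
  ami           = "ami-REPLACE_ME" # placeholder
  instance_type = "t3.micro"       # later changed to "t3.small" in code
}
```

When you edit instance_type and run plan, Terraform looks up the instance ID recorded in state under aws_instance.web and plans an update to that exact instance, instead of creating a second one.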


The #1 beginner mistake with state

Keeping state locally on your laptop and sharing the project with teammates.

Why it’s dangerous:

  • Two people apply at the same time → race condition
  • Someone applies from an old branch → drift and surprise changes
  • Your laptop dies → state is gone → recovery pain

✅ Real teams use remote state + state locking.


Remote state (what you should do from day 1)

Remote state means:

  • state is stored centrally (so your team shares one truth)
  • state is locked during apply (so nobody collides)

Common storage options:

  • AWS: S3 + DynamoDB lock
  • Azure: Storage Account
  • GCP: GCS
  • Terraform Cloud: managed state + locking

If you do only one “pro” thing as a beginner, do this one.
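A minimal sketch of that setup on AWS (the bucket and DynamoDB table are placeholders you create once, outside this configuration; the lock table needs a string partition key named LockID):

```hcl
terraform {
  backend "s3" {
    bucket         = "my-team-tfstate"            # pre-created S3 bucket (placeholder name)
    key            = "myapp/dev/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "tfstate-locks"              # pre-created lock table (placeholder name)
    encrypt        = true
  }
}
```

The full starter kit later in this guide uses exactly this pattern per environment.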


What is “drift” and how you spot it

Drift happens when someone changes infrastructure outside Terraform:

  • clicking in console
  • manual updates
  • scripts not tracked

Terraform will detect drift during plan and show differences.

✅ Habit: run terraform plan before every merge.


State safety rules (commit these to memory)

  • Never commit .tfstate to Git
  • Use remote state + locking
  • Keep state per environment (prod separate from dev)
  • Avoid manual edits to state (except emergencies)
  • Take state backups (most remotes do this)

Part 2 — Modules: the secret to scaling Terraform cleanly

What is a module (in plain English)?

A module is a reusable package of Terraform code.

Think of it like a function:

  • inputs = variables
  • outputs = values you export
  • internals = resources and logic

Terraform already uses modules even when you don’t realize it:

  • your root folder is the “root module”

Why modules matter (even when you’re small)

Without modules, Terraform projects often become:

  • one giant folder
  • copy-paste resources per environment
  • inconsistent naming
  • impossible refactoring

Modules give you:

  • consistency
  • reuse
  • easier reviews
  • safer changes

A beginner-friendly module structure

Recommended repository layout

terraform/
  envs/
    dev/
      main.tf
      backend.tf
      providers.tf
      terraform.tfvars
    prod/
      main.tf
      backend.tf
      providers.tf
      terraform.tfvars

  modules/
    network/
      main.tf
      variables.tf
      outputs.tf
    compute/
      main.tf
      variables.tf
      outputs.tf

  • envs/* = small “wiring” code per environment
  • modules/* = reusable building blocks

Real example: a tiny module (conceptually)

Let’s say you want a “network” module that creates:

  • VPC/VNet
  • subnets
  • route tables

The idea:

modules/network exposes:

  • cidr_block
  • public_subnet_cidrs
  • private_subnet_cidrs

and outputs:

  • vpc_id
  • public_subnet_ids
  • private_subnet_ids

Then envs/dev/main.tf consumes it like:

  • “create a network using these inputs”
  • “give me outputs to attach other resources”

That’s module thinking.
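In actual HCL, that consumption looks roughly like this (variable names follow the example above; the CIDRs are placeholders):

```hcl
module "network" {
  source = "../../modules/network"

  cidr_block           = "10.0.0.0/16"
  public_subnet_cidrs  = ["10.0.1.0/24"]
  private_subnet_cidrs = ["10.0.11.0/24"]
}

# Attach other resources using the module's outputs:
# module.network.vpc_id, module.network.public_subnet_ids, ...
```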


Module design rules that prevent pain later

1) Make modules boring

A good module is predictable, not clever.

  • fewer conditionals
  • fewer “magic defaults”
  • clear variable names

2) Keep inputs minimal

If a module needs 40 variables, it’s too big or too leaky.

3) Output only what consumers need

Don’t output everything. Output what’s useful.

4) Version your modules

When teams reuse a module, treat it like a product.

  • version tags
  • changelog
  • breaking change discipline

Part 3 — Workspaces: what they are, when to use them, and when NOT to

This is where beginners often get confused, so let’s make it crystal clear.

What is a workspace?

A Terraform workspace is essentially a separate state under the same configuration.

Same code, different state.

So you can do:

  • workspace dev → state A
  • workspace prod → state B

The workspace trap (important)

Workspaces are tempting for environments, but they can be risky.

Why?

Because:

  • it’s easy to run apply in the wrong workspace
  • environment differences become messy
  • secrets/variables separation can get sloppy

When workspaces ARE a good idea

Workspaces are great when environments are “same-ish,” like:

  • multiple identical stacks (per customer, per region, per feature preview)
  • temporary environments (PR environments)
  • sandboxes

Example:

  • customer-a, customer-b, customer-c all identical

When workspaces are NOT the best choice

For classic dev / stage / prod, most teams prefer:

  • separate folders (or separate repos)
  • separate remote state backends
  • separate accounts/subscriptions/projects

Because prod deserves stronger guardrails.

✅ Best beginner path:

  • Use separate env directories
  • Use separate remote state config
  • Add workspaces later if you really need them

Part 4 — Best Practices (the stuff that saves you from 2am incidents)

1) Always run Terraform like a pipeline

A safe workflow looks like:

  1. fmt (format)
  2. validate (syntax)
  3. plan (diff review)
  4. approval
  5. apply

If you apply without review, you will eventually create a costly incident.


2) Keep environments isolated

A clean model:

  • dev in one account/project/subscription
  • prod in another
  • separate state
  • separate credentials

Even if you’re solo today, this protects you tomorrow.


3) Don’t hardcode values (use variables + locals)

Good Terraform projects:

  • have a small set of inputs (terraform.tfvars)
  • compute consistent naming using locals
  • avoid repeated literals everywhere

Example:

  • you set project = "cloudopsnow"
  • use that consistently in names/tags
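A minimal sketch of that idea (the bucket resource is just an illustration):

```hcl
variable "project" { type = string } # e.g. "cloudopsnow"
variable "env"     { type = string } # e.g. "dev"

locals {
  name = "${var.project}-${var.env}" # "cloudopsnow-dev"
}

resource "aws_s3_bucket" "logs" {
  bucket = "${local.name}-logs"      # "cloudopsnow-dev-logs"
}
```

One input changes, and every derived name stays consistent.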

4) Use consistent naming and tagging everywhere

This is not “nice to have.”
It is how you:

  • find resources
  • allocate costs
  • debug issues
  • automate policies

Minimum tags:

  • app, env, owner, team, managed_by
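On AWS, one low-effort way to enforce this is the provider-level default_tags block, which stamps every taggable resource automatically (values here are placeholders):

```hcl
provider "aws" {
  region = "us-east-1"

  default_tags {
    tags = {
      app        = "cloudopsnow"
      env        = "dev"
      owner      = "devops-team"
      team       = "platform"
      managed_by = "terraform"
    }
  }
}
```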

5) Don’t fight Terraform: avoid “manual drift”

Terraform wants to be the source of truth.

If people click-change resources:

  • plans become noisy
  • rollbacks get weird
  • “why did this change?” becomes impossible

Make a rule:

  • changes go through Terraform unless there’s an emergency
  • emergency changes get backfilled into code immediately

6) Use lifecycle rules carefully (they’re sharp knives)

Terraform has lifecycle options like:

  • prevent_destroy (stop accidental deletes)
  • ignore_changes (stop Terraform from changing certain fields)

These can be useful—but if you use them blindly, you can hide real problems.

Beginner-friendly tip:

  • use prevent_destroy for critical data resources (DBs) in prod
  • avoid ignore_changes unless you’re sure why
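Syntax-wise, both live in a lifecycle block inside the resource (a hypothetical, abbreviated database resource shown):

```hcl
resource "aws_db_instance" "main" {
  # ... engine, instance_class, and other required arguments ...

  lifecycle {
    prevent_destroy = true   # destroy/replacement of this resource errors out
    ignore_changes  = [tags] # only if tags are legitimately managed elsewhere
  }
}
```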

7) Treat module outputs and dependencies like APIs

When you output something from a module, other code will depend on it.
Changing it later can break everything.

So:

  • keep output names stable
  • document what outputs mean
  • prefer additive changes over breaking changes

8) Plan for secrets from day 1 (even as a beginner)

Never store secrets directly in .tfvars committed to Git.

Instead:

  • load secrets from secret managers
  • pass them via environment variables
  • use encrypted variable stores in CI

Beginner rule:
✅ if it’s a password/key/token → it shouldn’t live in Git.
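A common pattern: declare the variable as sensitive and supply its value outside of Git, for example through a TF_VAR_ environment variable (the variable name is an illustration):

```hcl
variable "db_password" {
  description = "Database password (never committed; supplied at runtime)"
  type        = string
  sensitive   = true # redacted in plan/apply output
}

# Supply it in CI or your shell, with the value pulled from a secret manager:
#   export TF_VAR_db_password="..."
```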


9) Use “small modules” first, then compose bigger systems

Great Terraform architecture is like Lego:

  • small blocks → compose into a platform

Start with modules like:

  • network
  • compute
  • database
  • iam
  • observability

Then build higher-level modules later.


Part 5 — A full “beginner project” walkthrough (conceptual, but real)

Let’s imagine your first real Terraform project is:

Create a basic web stack

  • VPC/network
  • compute/service
  • security rules
  • outputs

Step-by-step flow

Step 1: Create remote state

  • choose backend
  • enable locking

Step 2: Create modules/network

  • vpc + subnets outputs

Step 3: Create modules/compute

  • instance/service uses subnet ids

Step 4: Create envs/dev

  • call both modules
  • set small sizes

Step 5: Add outputs

  • endpoint, ids, useful values

Step 6: Apply to dev

  • review plan carefully
  • confirm tags and naming

Step 7: Create envs/prod

  • separate backend/state
  • separate account if possible
  • bigger sizes + stronger policies (like prevent_destroy)

This is the same pattern used by real companies—just bigger.


Part 6 — The 12 mistakes beginners make (and how to avoid them)

  1. Local state in a team → use remote + locking
  2. One state for all environments → separate state per env
  3. Copy-paste code for dev/prod → use modules
  4. Huge “main.tf” file → split files logically
  5. Hardcoding values → variables + locals
  6. No consistent tags → tag policy from day 1
  7. Applying without reading plan → plan review is mandatory
  8. Manual console changes → backfill into Terraform
  9. Too many module inputs → simplify module interfaces
  10. Workspaces for prod too early → use env dirs first
  11. Ignoring drift → regular plan checks
  12. Storing secrets in tfvars → use secret managers/env vars

If you avoid just these, you’ll already be ahead of most “intermediate” Terraform users.


Part 7 — A beginner-friendly “Terraform checklist” (use this every time)

Before you merge:

  • terraform fmt
  • terraform validate
  • terraform plan reviewed by someone (even if it’s future-you)
  • tags included
  • state is remote + locked
  • no secrets in code
  • changes are isolated to the right environment

Before you apply:

  • confirm account/subscription/project is correct
  • confirm backend/state is correct
  • confirm workspace (if you use them)
  • read the plan line-by-line

After apply:

  • run a second plan (should show no changes)
  • save the plan output in CI logs
  • document what changed

Final takeaway (the “aha” moment)

Terraform is not hard because the syntax is hard.

Terraform is hard because you’re managing real infrastructure with state, and small mistakes scale fast.

So start with the right habits:

  • remote state + locking
  • modules for reuse
  • environments isolated
  • plan review always
  • boring, consistent code

Do that—and Terraform becomes one of the most powerful, calm, reliable tools in your engineering toolkit.


Here’s a copy-paste friendly starter kit you can use as your “Terraform for Beginners” baseline:

  • ✅ Clean folder skeleton (dev/prod separated)
  • ✅ Two real modules: network + compute
  • ✅ Remote state + locking pattern (safe for teams)
  • ✅ Practical defaults, tagging, and outputs
  • ✅ A PR/review checklist teams actually use

No links. Just usable content.


1) Recommended folder skeleton (simple + scalable)

terraform/
  .gitignore
  README.md

  modules/
    network/
      main.tf
      variables.tf
      outputs.tf
      versions.tf

    compute/
      main.tf
      variables.tf
      outputs.tf
      versions.tf

  envs/
    dev/
      backend.tf
      versions.tf
      providers.tf
      main.tf
      terraform.tfvars
      outputs.tf

    prod/
      backend.tf
      versions.tf
      providers.tf
      main.tf
      terraform.tfvars
      outputs.tf

Why this layout works

  • modules/ = reusable building blocks
  • envs/dev and envs/prod = tiny “wiring” + environment values
  • each env has its own state (safe separation)

2) Root .gitignore (must-have)

# Local .terraform directories
**/.terraform/*

# Terraform state files
*.tfstate
*.tfstate.*

# Crash log files
crash.log
crash.*.log

# Terraform plan files
*.tfplan
*.plan
plan.out

# CLI configuration files
.terraformrc
terraform.rc

# Sensitive variable files (keep tfvars minimal + non-secret)
*.auto.tfvars
*.auto.tfvars.json

# Mac/Windows noise
.DS_Store
Thumbs.db

3) Module: modules/network (VPC + public + private subnets, NAT optional)

modules/network/versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}

modules/network/variables.tf

variable "name" {
  description = "Base name used for resources (e.g., cloudopsnow-dev)"
  type        = string
}

variable "vpc_cidr" {
  description = "CIDR for the VPC"
  type        = string
}

variable "azs" {
  description = "List of AZs (e.g., [\"us-east-1a\", \"us-east-1b\"])"
  type        = list(string)
}

variable "public_subnet_cidrs" {
  description = "Public subnet CIDRs, one per AZ"
  type        = list(string)
}

variable "private_subnet_cidrs" {
  description = "Private subnet CIDRs, one per AZ"
  type        = list(string)
}

variable "enable_nat_gateway" {
  description = "If true, create 1 NAT Gateway (costs money). Good for private subnets needing internet egress."
  type        = bool
  default     = false
}

variable "tags" {
  description = "Common tags applied to all resources"
  type        = map(string)
  default     = {}
}

modules/network/main.tf

locals {
  common_tags = merge(var.tags, {
    "Name"       = var.name
    "managed_by" = "terraform"
  })
}

resource "aws_vpc" "this" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = merge(local.common_tags, { "Name" = "${var.name}-vpc" })
}

resource "aws_internet_gateway" "this" {
  vpc_id = aws_vpc.this.id
  tags   = merge(local.common_tags, { "Name" = "${var.name}-igw" })
}

# Public Subnets
resource "aws_subnet" "public" {
  count                   = length(var.azs)
  vpc_id                  = aws_vpc.this.id
  availability_zone       = var.azs[count.index]
  cidr_block              = var.public_subnet_cidrs[count.index]
  map_public_ip_on_launch = true

  tags = merge(local.common_tags, {
    "Name" = "${var.name}-public-${var.azs[count.index]}"
    "tier" = "public"
  })
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.this.id
  tags   = merge(local.common_tags, { "Name" = "${var.name}-public-rt" })
}

resource "aws_route" "public_internet" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.this.id
}

resource "aws_route_table_association" "public_assoc" {
  count          = length(var.azs)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# Private Subnets
resource "aws_subnet" "private" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.this.id
  availability_zone = var.azs[count.index]
  cidr_block        = var.private_subnet_cidrs[count.index]

  tags = merge(local.common_tags, {
    "Name" = "${var.name}-private-${var.azs[count.index]}"
    "tier" = "private"
  })
}

# Optional NAT (1 NAT for simplicity + cost control)
resource "aws_eip" "nat" {
  count  = var.enable_nat_gateway ? 1 : 0
  domain = "vpc"
  tags   = merge(local.common_tags, { "Name" = "${var.name}-nat-eip" })
}

resource "aws_nat_gateway" "this" {
  count         = var.enable_nat_gateway ? 1 : 0
  allocation_id = aws_eip.nat[0].id
  subnet_id     = aws_subnet.public[0].id

  tags = merge(local.common_tags, { "Name" = "${var.name}-nat" })
}

resource "aws_route_table" "private" {
  vpc_id = aws_vpc.this.id
  tags   = merge(local.common_tags, { "Name" = "${var.name}-private-rt" })
}

resource "aws_route" "private_egress" {
  count                  = var.enable_nat_gateway ? 1 : 0
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.this[0].id
}

resource "aws_route_table_association" "private_assoc" {
  count          = length(var.azs)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}

modules/network/outputs.tf

output "vpc_id" {
  value = aws_vpc.this.id
}

output "public_subnet_ids" {
  value = [for s in aws_subnet.public : s.id]
}

output "private_subnet_ids" {
  value = [for s in aws_subnet.private : s.id]
}

4) Module: modules/compute (one EC2 instance + security group)

This is intentionally simple. It teaches the basics without drowning you.

modules/compute/versions.tf

terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}

modules/compute/variables.tf

variable "name" {
  description = "Base name for resources"
  type        = string
}

variable "vpc_id" {
  description = "VPC ID for security group"
  type        = string
}

variable "subnet_id" {
  description = "Subnet to launch instance into (use a public subnet for SSH/HTTP demo)"
  type        = string
}

variable "ami_id" {
  description = "AMI ID (keep it explicit for beginners)"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
}

variable "ssh_cidr" {
  description = "CIDR allowed to SSH (use your IP/32). For demo you can use 0.0.0.0/0 but it's not recommended."
  type        = string
  default     = "0.0.0.0/0"
}

variable "user_data" {
  description = "Optional user_data script"
  type        = string
  default     = ""
}

variable "tags" {
  description = "Common tags"
  type        = map(string)
  default     = {}
}

modules/compute/main.tf

locals {
  common_tags = merge(var.tags, {
    "Name"       = var.name
    "managed_by" = "terraform"
  })
}

resource "aws_security_group" "web" {
  name        = "${var.name}-sg"
  description = "Allow SSH and HTTP for demo"
  vpc_id      = var.vpc_id

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.ssh_cidr]
  }

  ingress {
    description = "HTTP"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "All egress"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = local.common_tags
}

resource "aws_instance" "this" {
  ami                    = var.ami_id
  instance_type          = var.instance_type
  subnet_id              = var.subnet_id
  vpc_security_group_ids = [aws_security_group.web.id]

  user_data = var.user_data

  tags = merge(local.common_tags, {
    "role" = "demo-web"
  })
}

modules/compute/outputs.tf

output "instance_id" {
  value = aws_instance.this.id
}

output "public_ip" {
  value = aws_instance.this.public_ip
}

output "security_group_id" {
  value = aws_security_group.web.id
}

5) Environment wiring: envs/dev

envs/dev/versions.tf

terraform {
  required_version = ">= 1.5.0"
}

envs/dev/providers.tf

provider "aws" {
  region = var.aws_region
}

envs/dev/backend.tf (remote state + locking)

terraform {
  backend "s3" {
    bucket         = "REPLACE_ME_TFSTATE_BUCKET"
    key            = "cloudopsnow/dev/terraform.tfstate"
    region         = "REPLACE_ME_REGION"
    dynamodb_table = "REPLACE_ME_TFSTATE_LOCK_TABLE"
    encrypt        = true
  }
}

envs/dev/main.tf

locals {
  name = "${var.project}-${var.env}"

  tags = {
    app         = var.project
    env         = var.env
    team        = var.team
    owner       = var.owner
    cost_center = var.cost_center
  }
}

module "network" {
  source = "../../modules/network"

  name                = local.name
  vpc_cidr             = var.vpc_cidr
  azs                  = var.azs
  public_subnet_cidrs  = var.public_subnet_cidrs
  private_subnet_cidrs = var.private_subnet_cidrs

  enable_nat_gateway = var.enable_nat_gateway
  tags               = local.tags
}

module "compute" {
  source = "../../modules/compute"

  name          = "${local.name}-web"
  vpc_id        = module.network.vpc_id
  subnet_id     = module.network.public_subnet_ids[0]
  ami_id        = var.ami_id
  instance_type = var.instance_type
  ssh_cidr      = var.ssh_cidr

  user_data = <<-EOF
    #!/bin/bash
    set -e
    yum update -y
    yum install -y httpd
    systemctl enable httpd
    systemctl start httpd
    echo "Hello from ${local.name}" > /var/www/html/index.html
  EOF

  tags = local.tags
}

envs/dev/outputs.tf

output "vpc_id" {
  value = module.network.vpc_id
}

output "web_public_ip" {
  value = module.compute.public_ip
}

envs/dev/terraform.tfvars (example values)

project     = "cloudopsnow"
env         = "dev"
team        = "platform"
owner       = "devops-team"
cost_center = "cc-001"

aws_region = "us-east-1"

# Pick 2 AZs
azs = ["us-east-1a", "us-east-1b"]

vpc_cidr = "10.10.0.0/16"

public_subnet_cidrs  = ["10.10.1.0/24", "10.10.2.0/24"]
private_subnet_cidrs = ["10.10.11.0/24", "10.10.12.0/24"]

enable_nat_gateway = false

# Replace with a valid AMI for your region
ami_id        = "ami-REPLACE_ME"
instance_type = "t3.micro"

# Set this to YOUR public IP/32 for safer SSH
ssh_cidr = "0.0.0.0/0"

envs/dev/variables.tf (add this file; create the same file in envs/prod)

variable "project" { type = string }
variable "env" { type = string }

variable "team" { type = string }
variable "owner" { type = string }
variable "cost_center" { type = string }

variable "aws_region" { type = string }

variable "azs" { type = list(string) }

variable "vpc_cidr" { type = string }
variable "public_subnet_cidrs" { type = list(string) }
variable "private_subnet_cidrs" { type = list(string) }

variable "enable_nat_gateway" { type = bool }

variable "ami_id" { type = string }
variable "instance_type" { type = string }
variable "ssh_cidr" { type = string }

6) envs/prod (same files, different values + separate state key)

Copy envs/dev/* to envs/prod/* and change:

envs/prod/backend.tf

terraform {
  backend "s3" {
    bucket         = "REPLACE_ME_TFSTATE_BUCKET"
    key            = "cloudopsnow/prod/terraform.tfstate"
    region         = "REPLACE_ME_REGION"
    dynamodb_table = "REPLACE_ME_TFSTATE_LOCK_TABLE"
    encrypt        = true
  }
}

envs/prod/terraform.tfvars (typical prod adjustments)

project     = "cloudopsnow"
env         = "prod"
team        = "platform"
owner       = "devops-team"
cost_center = "cc-001"

aws_region = "us-east-1"

azs = ["us-east-1a", "us-east-1b"]

vpc_cidr = "10.20.0.0/16"
public_subnet_cidrs  = ["10.20.1.0/24", "10.20.2.0/24"]
private_subnet_cidrs = ["10.20.11.0/24", "10.20.12.0/24"]

enable_nat_gateway = true

ami_id        = "ami-REPLACE_ME"
instance_type = "t3.small"

ssh_cidr = "YOUR_IP/32"

7) How to run it (safe step-by-step)

From terraform/envs/dev:

terraform fmt -recursive
terraform init
terraform validate
terraform plan -out plan.out
terraform apply plan.out

Then sanity check:

terraform plan

If it shows no changes, your state + config are consistent. That’s a great sign.


8) Workspaces (optional): how to use without getting burned

For beginners, I recommend dev/prod folders first.

If later you want workspaces for multiple identical stacks (like customer-a, customer-b), then:

terraform workspace new customer-a
terraform workspace select customer-a

Safety rule: add workspace name into your resource naming convention so you don’t overlap resources by accident.
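Terraform exposes the current workspace as terraform.workspace, so you can bake it into names (project variable as used earlier in this guide):

```hcl
locals {
  # e.g. "cloudopsnow-customer-a" when workspace "customer-a" is selected
  name = "${var.project}-${terraform.workspace}"
}
```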


9) PR / Review checklist (use this every single time)

✅ Code hygiene

  • terraform fmt -recursive clean
  • terraform validate passes
  • no secrets in code or tfvars
  • variable names are clear (no mystery abbreviations)
  • tags include app, env, team, owner, cost_center, managed_by

✅ State & environment safety

  • remote state is enabled (no local tfstate)
  • state key is correct for this env (dev ≠ prod)
  • provider region/account is correct (prod mistakes hurt)

✅ Plan review (most important)

  • plan was generated and reviewed line-by-line
  • destructive changes are understood (especially replacements)
  • changes match the intent of the PR title/description
  • outputs still make sense after change

✅ Module discipline

  • module inputs kept minimal
  • outputs are stable (avoid breaking consumers)
  • naming is consistent across resources
  • no copy-paste resource blocks across envs (prefer modules)

✅ Operational readiness

  • critical resources have protection where needed (example: prevent_destroy in prod)
  • non-prod cost controls considered (scheduling, NAT usage, sizes)
  • rollback plan exists for risky changes

If you later want to manage multiple clouds (AWS, Azure, GCP) with Terraform, the pattern that scales is:

  • One repo
  • Separate root environments per cloud (because providers + backends are different)
  • Parallel module interfaces (same inputs/outputs style) so your brain doesn’t reset per cloud

⚠️ Important truth: Trying to “switch” AWS/Azure/GCP inside the same root module with conditionals usually becomes messy and unsafe. Separate roots per cloud is the practical approach.

Below is a multi-cloud Terraform starter kit you can copy and actually use.


Multi-Cloud Terraform Starter Kit (AWS / Azure / GCP)

1) Recommended folder structure (works for teams)

terraform/
  .gitignore
  README.md

  common/
    naming.tf
    variables.tf

  modules/
    aws/
      network/
        main.tf
        variables.tf
        outputs.tf
        versions.tf
      compute/
        main.tf
        variables.tf
        outputs.tf
        versions.tf

    azure/
      network/
        main.tf
        variables.tf
        outputs.tf
        versions.tf
      compute/
        main.tf
        variables.tf
        outputs.tf
        versions.tf

    gcp/
      network/
        main.tf
        variables.tf
        outputs.tf
        versions.tf
      compute/
        main.tf
        variables.tf
        outputs.tf
        versions.tf

  envs/
    aws/
      dev/
        backend.tf
        versions.tf
        providers.tf
        variables.tf
        main.tf
        terraform.tfvars
        outputs.tf
      prod/
        (same as dev)

    azure/
      dev/
        backend.tf
        versions.tf
        providers.tf
        variables.tf
        main.tf
        terraform.tfvars
        outputs.tf
      prod/
        (same as dev)

    gcp/
      dev/
        backend.tf
        versions.tf
        providers.tf
        variables.tf
        main.tf
        terraform.tfvars
        outputs.tf
      prod/
        (same as dev)

Why this is great for beginners:

  • You learn one pattern (module + env wiring) and repeat it 3 times
  • Each cloud env has its own backend + its own state (safe)
  • Modules stay reusable and reviewable

2) Common “team” variables (shared idea across clouds)

Create: common/variables.tf

variable "project"     { type = string }
variable "env"         { type = string } # dev/prod
variable "team"        { type = string }
variable "owner"       { type = string }
variable "cost_center" { type = string }

Create: common/naming.tf

locals {
  name = "${var.project}-${var.env}"

  # Tag keys differ by cloud, but the idea is the same everywhere
  base_tags = {
    project     = var.project
    env         = var.env
    team        = var.team
    owner       = var.owner
    cost_center = var.cost_center
    managed_by  = "terraform"
  }
}

3) Backend templates (remote state) — one per cloud

You must create backend storage resources once (S3 bucket, Azure storage, GCS bucket). Do that upfront, then Terraform will safely store state.

AWS backend (envs/aws/dev/backend.tf)

terraform {
  backend "s3" {
    bucket         = "REPLACE_ME_TFSTATE_BUCKET"
    key            = "cloudopsnow/aws/dev/terraform.tfstate"
    region         = "REPLACE_ME_REGION"
    dynamodb_table = "REPLACE_ME_LOCK_TABLE"
    encrypt        = true
  }
}

Azure backend (envs/azure/dev/backend.tf)

terraform {
  backend "azurerm" {
    resource_group_name  = "REPLACE_ME_RG"
    storage_account_name = "REPLACE_ME_STORAGE"
    container_name       = "tfstate"
    key                  = "cloudopsnow/azure/dev/terraform.tfstate"
  }
}

GCP backend (envs/gcp/dev/backend.tf)

terraform {
  backend "gcs" {
    bucket = "REPLACE_ME_TFSTATE_BUCKET"
    prefix = "cloudopsnow/gcp/dev"
  }
}

(Prod is identical, only the key/prefix changes.)


4) AWS: Network + Compute (minimal but real)

modules/aws/network/versions.tf

terraform {
  required_version = ">= 1.5.0"
  required_providers {
    aws = { source = "hashicorp/aws", version = ">= 5.0" }
  }
}

modules/aws/network/variables.tf

variable "name" { type = string }
variable "vpc_cidr" { type = string }
variable "azs" { type = list(string) }
variable "public_subnet_cidrs" { type = list(string) }
variable "tags" { type = map(string) default = {} }

modules/aws/network/main.tf

locals {
  tags = merge(var.tags, { Name = var.name, managed_by = "terraform" })
}

resource "aws_vpc" "this" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true
  tags                 = merge(local.tags, { Name = "${var.name}-vpc" })
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.this.id
  tags   = merge(local.tags, { Name = "${var.name}-igw" })
}

resource "aws_subnet" "public" {
  count                   = length(var.azs)
  vpc_id                  = aws_vpc.this.id
  availability_zone       = var.azs[count.index]
  cidr_block              = var.public_subnet_cidrs[count.index]
  map_public_ip_on_launch = true
  tags                    = merge(local.tags, { Name = "${var.name}-public-${var.azs[count.index]}" })
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.this.id
  tags   = merge(local.tags, { Name = "${var.name}-public-rt" })
}

resource "aws_route" "public_default" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.igw.id
}

resource "aws_route_table_association" "public_assoc" {
  count          = length(var.azs)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

modules/aws/network/outputs.tf

output "vpc_id" { value = aws_vpc.this.id }
output "public_subnet_ids" { value = [for s in aws_subnet.public : s.id] }

modules/aws/compute/main.tf (simple EC2 + SG)

variables.tf

variable "name" { type = string }
variable "vpc_id" { type = string }
variable "subnet_id" { type = string }
variable "ami_id" { type = string }
variable "instance_type" { type = string }
variable "ssh_cidr" { type = string default = "0.0.0.0/0" }
variable "tags" { type = map(string) default = {} }

main.tf

locals { tags = merge(var.tags, { Name = var.name, managed_by = "terraform" }) }

resource "aws_security_group" "this" {
  name   = "${var.name}-sg"
  vpc_id = var.vpc_id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.ssh_cidr]
  }

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = local.tags
}

resource "aws_instance" "this" {
  ami                    = var.ami_id
  instance_type          = var.instance_type
  subnet_id              = var.subnet_id
  vpc_security_group_ids = [aws_security_group.this.id]
  tags                   = local.tags
}

outputs.tf

output "public_ip" { value = aws_instance.this.public_ip }
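
The security group above opens port 80, but nothing on the instance listens there yet. If you want the EC2 instance to serve a page like the GCP example later in this guide does, one option (a sketch, assuming an Ubuntu AMI; adjust the package manager for other distros) is to add a user_data script inside the aws_instance "this" block:

  # Optional: cloud-init runs this once at first boot (Ubuntu AMI assumed)
  user_data = <<-EOF
    #!/bin/bash
    apt-get update -y
    apt-get install -y apache2
    echo "Hello from ${var.name}" > /var/www/html/index.html
  EOF

Note that changing user_data later forces instance replacement by default, so treat it as first-boot setup only.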

5) Azure: Network + Compute (minimal but real)

modules/azure/network/versions.tf

terraform {
  required_version = ">= 1.5.0"
  required_providers {
    azurerm = { source = "hashicorp/azurerm", version = ">= 3.0" }
  }
}

modules/azure/network/variables.tf

variable "name" { type = string }
variable "location" { type = string }
variable "address_space" { type = list(string) }
variable "subnet_cidr" { type = string }
variable "tags" {
  type    = map(string)
  default = {}
}

modules/azure/network/main.tf

locals { tags = merge(var.tags, { managed_by = "terraform" }) }

resource "azurerm_resource_group" "rg" {
  name     = "${var.name}-rg"
  location = var.location
  tags     = local.tags
}

resource "azurerm_virtual_network" "vnet" {
  name                = "${var.name}-vnet"
  location            = var.location
  resource_group_name = azurerm_resource_group.rg.name
  address_space       = var.address_space
  tags                = local.tags
}

resource "azurerm_subnet" "subnet" {
  name                 = "${var.name}-subnet"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = [var.subnet_cidr]
}

modules/azure/network/outputs.tf

output "resource_group_name" { value = azurerm_resource_group.rg.name }
output "subnet_id" { value = azurerm_subnet.subnet.id }
output "location" { value = azurerm_resource_group.rg.location }

modules/azure/compute/ (Linux VM + Public IP + NSG)

variables.tf

variable "name" { type = string }
variable "location" { type = string }
variable "resource_group_name" { type = string }
variable "subnet_id" { type = string }
variable "admin_username" { type = string }
variable "ssh_public_key" { type = string }
variable "vm_size" { type = string }
variable "tags" {
  type    = map(string)
  default = {}
}

main.tf

locals { tags = merge(var.tags, { managed_by = "terraform" }) }

resource "azurerm_public_ip" "pip" {
  name                = "${var.name}-pip"
  location            = var.location
  resource_group_name = var.resource_group_name
  allocation_method   = "Static"
  sku                 = "Standard"
  tags                = local.tags
}

resource "azurerm_network_security_group" "nsg" {
  name                = "${var.name}-nsg"
  location            = var.location
  resource_group_name = var.resource_group_name
  tags                = local.tags

  security_rule {
    name                       = "SSH"
    priority                   = 100
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "22"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }

  security_rule {
    name                       = "HTTP"
    priority                   = 110
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "80"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }
}

resource "azurerm_network_interface" "nic" {
  name                = "${var.name}-nic"
  location            = var.location
  resource_group_name = var.resource_group_name
  tags                = local.tags

  ip_configuration {
    name                          = "ipconfig"
    subnet_id                     = var.subnet_id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = azurerm_public_ip.pip.id
  }
}

resource "azurerm_network_interface_security_group_association" "assoc" {
  network_interface_id      = azurerm_network_interface.nic.id
  network_security_group_id = azurerm_network_security_group.nsg.id
}

resource "azurerm_linux_virtual_machine" "vm" {
  name                = "${var.name}-vm"
  location            = var.location
  resource_group_name = var.resource_group_name
  size                = var.vm_size
  admin_username      = var.admin_username
  network_interface_ids = [azurerm_network_interface.nic.id]

  admin_ssh_key {
    username   = var.admin_username
    public_key = var.ssh_public_key
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }

  tags = local.tags
}

outputs.tf

output "public_ip" { value = azurerm_public_ip.pip.ip_address }

6) GCP: Network + Compute (minimal but real)

modules/gcp/network/versions.tf

terraform {
  required_version = ">= 1.5.0"
  required_providers {
    google = { source = "hashicorp/google", version = ">= 5.0" }
  }
}

modules/gcp/network/variables.tf

variable "name" { type = string }
variable "project_id" { type = string }
variable "region" { type = string }
variable "subnet_cidr" { type = string }
variable "tags" {
  type    = map(string)
  default = {}
}

modules/gcp/network/main.tf

resource "google_compute_network" "vpc" {
  name                    = "${var.name}-vpc"
  project                 = var.project_id
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "subnet" {
  name          = "${var.name}-subnet"
  project       = var.project_id
  region        = var.region
  network       = google_compute_network.vpc.id
  ip_cidr_range = var.subnet_cidr
}

# Allow SSH + HTTP (for demo)
resource "google_compute_firewall" "allow_ssh_http" {
  name    = "${var.name}-fw-ssh-http"
  project = var.project_id
  network = google_compute_network.vpc.name

  allow {
    protocol = "tcp"
    ports    = ["22", "80"]
  }

  source_ranges = ["0.0.0.0/0"]
}

modules/gcp/network/outputs.tf

output "network_name" { value = google_compute_network.vpc.name }
output "subnet_self_link" { value = google_compute_subnetwork.subnet.self_link }

modules/gcp/compute/

variables.tf

variable "name" { type = string }
variable "project_id" { type = string }
variable "zone" { type = string }
variable "subnet_self_link" { type = string }
variable "machine_type" { type = string }
variable "tags" {
  type    = map(string)
  default = {}
}

main.tf

resource "google_compute_instance" "vm" {
  name         = "${var.name}-vm"
  project      = var.project_id
  zone         = var.zone
  machine_type = var.machine_type

  boot_disk {
    initialize_params { image = "debian-cloud/debian-12" }
  }

  network_interface {
    subnetwork = var.subnet_self_link
    access_config {} # gives external IP
  }

  metadata_startup_script = <<-EOF
    #!/bin/bash
    apt-get update -y
    apt-get install -y apache2
    systemctl enable apache2
    systemctl start apache2
    echo "Hello from ${var.name}" > /var/www/html/index.html
  EOF
}

outputs.tf

output "public_ip" {
  value = google_compute_instance.vm.network_interface[0].access_config[0].nat_ip
}

7) Environment wiring examples (dev) — per cloud

AWS dev envs/aws/dev

providers.tf

provider "aws" { region = var.aws_region }

variables.tf

variable "aws_region" { type = string }
variable "azs" { type = list(string) }
variable "vpc_cidr" { type = string }
variable "public_subnet_cidrs" { type = list(string) }
variable "ami_id" { type = string }
variable "instance_type" { type = string }
variable "ssh_cidr" { type = string }

# common vars
variable "project" { type = string }
variable "env" { type = string }
variable "team" { type = string }
variable "owner" { type = string }
variable "cost_center" { type = string }

main.tf

locals {
  name = "${var.project}-${var.env}"
  tags = {
    app         = var.project
    env         = var.env
    team        = var.team
    owner       = var.owner
    cost_center = var.cost_center
    managed_by  = "terraform"
  }
}

module "network" {
  source              = "../../../modules/aws/network"
  name                = local.name
  vpc_cidr            = var.vpc_cidr
  azs                 = var.azs
  public_subnet_cidrs = var.public_subnet_cidrs
  tags                = local.tags
}

module "compute" {
  source        = "../../../modules/aws/compute"
  name          = "${local.name}-web"
  vpc_id        = module.network.vpc_id
  subnet_id     = module.network.public_subnet_ids[0]
  ami_id        = var.ami_id
  instance_type = var.instance_type
  ssh_cidr      = var.ssh_cidr
  tags          = local.tags
}

outputs.tf

output "web_public_ip" { value = module.compute.public_ip }

terraform.tfvars (example)

project     = "cloudopsnow"
env         = "dev"
team        = "platform"
owner       = "devops"
cost_center = "cc-001"

aws_region = "us-east-1"
azs = ["us-east-1a", "us-east-1b"]

vpc_cidr = "10.10.0.0/16"
public_subnet_cidrs = ["10.10.1.0/24", "10.10.2.0/24"]

ami_id        = "ami-REPLACE_ME"
instance_type = "t3.micro"
ssh_cidr      = "YOUR_IP/32"
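
Instead of hard-coding ami_id, you can let Terraform look up a current image with the aws_ami data source. A sketch for Canonical's Ubuntu 22.04 images (the owner ID 099720109477 is Canonical's AWS account):

# Looks up the newest matching Ubuntu 22.04 AMI in the current region
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

You would then pass ami_id = data.aws_ami.ubuntu.id to the compute module instead of var.ami_id. Be aware that "most_recent" means the AMI can change between runs, which a plan will surface as an instance replacement.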

Azure dev envs/azure/dev

providers.tf

provider "azurerm" {
  features {}
}

variables.tf

variable "location" { type = string }
variable "address_space" { type = list(string) }
variable "subnet_cidr" { type = string }
variable "admin_username" { type = string }
variable "ssh_public_key" { type = string }
variable "vm_size" { type = string }

# common vars
variable "project" { type = string }
variable "env" { type = string }
variable "team" { type = string }
variable "owner" { type = string }
variable "cost_center" { type = string }

main.tf

locals {
  name = "${var.project}-${var.env}"
  tags = {
    app         = var.project
    env         = var.env
    team        = var.team
    owner       = var.owner
    cost_center = var.cost_center
    managed_by  = "terraform"
  }
}

module "network" {
  source        = "../../../modules/azure/network"
  name          = local.name
  location      = var.location
  address_space = var.address_space
  subnet_cidr   = var.subnet_cidr
  tags          = local.tags
}

module "compute" {
  source              = "../../../modules/azure/compute"
  name                = "${local.name}-web"
  location            = module.network.location
  resource_group_name = module.network.resource_group_name
  subnet_id           = module.network.subnet_id
  admin_username      = var.admin_username
  ssh_public_key      = var.ssh_public_key
  vm_size             = var.vm_size
  tags                = local.tags
}

outputs.tf

output "web_public_ip" { value = module.compute.public_ip }
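
terraform.tfvars (example)

For symmetry with the AWS environment, a matching tfvars file might look like this (values are illustrative; replace the SSH public key with your own):

project     = "cloudopsnow"
env         = "dev"
team        = "platform"
owner       = "devops"
cost_center = "cc-001"

location      = "eastus"
address_space = ["10.20.0.0/16"]
subnet_cidr   = "10.20.1.0/24"

admin_username = "azureuser"
ssh_public_key = "ssh-rsa AAAA...REPLACE_ME"
vm_size        = "Standard_B1s"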

GCP dev envs/gcp/dev

providers.tf

provider "google" {
  project = var.project_id
  region  = var.region
  zone    = var.zone
}

variables.tf

variable "project_id" { type = string }
variable "region" { type = string }
variable "zone" { type = string }
variable "subnet_cidr" { type = string }
variable "machine_type" { type = string }

# common vars
variable "project" { type = string }
variable "env" { type = string }
variable "team" { type = string }
variable "owner" { type = string }
variable "cost_center" { type = string }

main.tf

locals {
  name = "${var.project}-${var.env}"
}

module "network" {
  source          = "../../../modules/gcp/network"
  name            = local.name
  project_id      = var.project_id
  region          = var.region
  subnet_cidr     = var.subnet_cidr
}

module "compute" {
  source           = "../../../modules/gcp/compute"
  name             = "${local.name}-web"
  project_id       = var.project_id
  zone             = var.zone
  subnet_self_link = module.network.subnet_self_link
  machine_type     = var.machine_type
}

outputs.tf

output "web_public_ip" { value = module.compute.public_ip }
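
terraform.tfvars (example)

And the GCP equivalent (values are illustrative; project_id must be your real GCP project ID):

project     = "cloudopsnow"
env         = "dev"
team        = "platform"
owner       = "devops"
cost_center = "cc-001"

project_id   = "my-gcp-project-REPLACE_ME"
region       = "us-central1"
zone         = "us-central1-a"
subnet_cidr  = "10.30.1.0/24"
machine_type = "e2-micro"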

8) How to run (same mindset for all clouds)

From the environment folder (example AWS dev):

terraform fmt -recursive
terraform init
terraform validate
terraform plan -out plan.out
terraform apply plan.out
terraform plan

That last plan should show no changes. That’s your “everything is stable” check.


9) The safest beginner rule for dev/prod across 3 clouds

For dev/prod, prefer:

  • separate folders ✅
  • separate state keys ✅
  • ideally separate accounts/subscriptions/projects ✅

Workspaces are great later for “many identical stacks,” not for “prod safety.”
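
"Separate state keys" in practice means each environment folder declares its own backend path. A sketch using the S3 backend for AWS (bucket and lock-table names are placeholders you would create yourself):

# envs/aws/dev/backend.tf
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-REPLACE_ME"
    key            = "aws/dev/terraform.tfstate" # prod would use aws/prod/terraform.tfstate
    region         = "us-east-1"
    dynamodb_table = "terraform-locks" # enables state locking
    encrypt        = true
  }
}

Because dev and prod point at different keys (and ideally different buckets in different accounts), a mistake in dev can never touch prod's state.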

