Terraform: State & Modules

Learn how Terraform tracks infrastructure with state and how to create reusable modules for consistent deployments.

Understanding Terraform State

State is arguably Terraform’s most important concept. Before diving into the technical details, consider the following scenario:

You run terraform apply and create an EC2 instance. The next day, you run it again. How does Terraform know the instance already exists and should not create a duplicate?

The answer is state. Terraform maintains a JSON file that records what it created, allowing it to compare your configuration against reality.

What State Tracks

Information	Why It Matters
Resource IDs	Links configuration to real infrastructure
Attribute values	Detects configuration drift
Dependencies	Determines update and deletion order
Metadata	Provider versions, schema information

Why State Matters

Without state, Terraform would:

Create duplicate resources on every apply
Not know which resources to update or delete
Lose track of resources entirely

With state, Terraform can:

Calculate minimal changes - Only modify what is different
Detect drift - Report, at plan time, when someone has changed infrastructure outside Terraform
Enable collaboration - Teams share state to avoid conflicts

Local vs Remote State

By default, Terraform stores state in a local file called terraform.tfstate. This works fine for learning, but becomes problematic when teams collaborate.

Comparing State Storage Options

Aspect	Local State	Remote State
Storage	`terraform.tfstate` file	S3, Azure Blob, GCS, etc.
Collaboration	Single user only	Multiple team members
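Locking	Local file lock only (not shared)	Shared lock prevents concurrent changes (where the backend supports it)
Backup	Single `terraform.tfstate.backup` file	Object versioning (when enabled on the storage)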
Security	File permissions only	Encryption, access controls
Best for	Learning, experiments	Teams, production

When to Use Each

Use local state when:

Learning Terraform
Personal projects with no collaboration
Quick experiments

Use remote state when:

Working in a team
Managing production infrastructure
Needing audit trails or encryption
Running Terraform in CI/CD pipelines

Setting Up Remote State (AWS Example)

With the S3 backend shown here, remote state uses two things: storage (an S3 bucket) and locking (a DynamoDB table).

# Configure S3 backend with locking
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
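
Recent Terraform releases can also keep the lock in S3 itself, which removes the need for a DynamoDB table. The following is a minimal sketch, assuming a Terraform version whose S3 backend supports the use_lockfile argument (introduced in 1.10) and the same placeholder bucket name as above:

```hcl
# Alternative backend configuration: S3-native locking, no DynamoDB table
terraform {
  backend "s3" {
    bucket       = "my-terraform-state"
    key          = "prod/terraform.tfstate"
    region       = "us-east-1"
    use_lockfile = true # Lock file is written next to the state object in S3
    encrypt      = true
  }
}
```

If you must support older Terraform versions, keep the dynamodb_table form.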

Setting up the infrastructure for remote state:

# Create state bucket (run this separately first)
resource "aws_s3_bucket" "state" {
  bucket = "my-terraform-state"
}

resource "aws_dynamodb_table" "locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  # Blocks with more than one argument must span multiple lines
  attribute {
    name = "LockID"
    type = "S"
  }
}
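
The comparison table lists versioning and encryption as benefits of remote state, but a bare bucket has neither guaranteed. A sketch of how they might be switched on, assuming AWS provider v4 or later (where these settings are separate resources) and the aws_s3_bucket.state resource defined above:

```hcl
# Keep every previous version of the state file so it can be restored
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state objects at rest with S3-managed keys
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# State often contains secrets, so block all public access
resource "aws_s3_bucket_public_access_block" "state" {
  bucket                  = aws_s3_bucket.state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```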

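Important: Create the S3 bucket and DynamoDB table before configuring the backend. A configuration cannot store its state in a bucket that it has yet to create (chicken-and-egg problem). Either create the bucket and table from a separate bootstrap configuration, or start with local state and run terraform init -migrate-state once they exist.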

Common State Operations

As your infrastructure evolves, you will need to manipulate state directly. Here are the most common scenarios:

State Commands Quick Reference

Command	Purpose	Typical use
`terraform state list`	List all resources in state	See what Terraform manages
`terraform state show`	Inspect one resource's attributes	Debug configuration issues
`terraform state mv`	Rename/move resource addresses	Refactor without recreating
`terraform state rm`	Remove from state	Stop managing a resource without destroying it
`terraform import`	Add existing resource to state	Bring unmanaged resources under control

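Renaming and Moving Resources in State

When refactoring, you might need to rename a resource or move it into a module. terraform state mv updates the address recorded in state, so Terraform does not destroy and recreate the resource: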

# Rename a resource
terraform state mv aws_instance.old aws_instance.new

# Move to a module
terraform state mv aws_instance.web module.webserver.aws_instance.main
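
The same refactoring can also be recorded in configuration rather than run by hand, so every teammate and pipeline picks it up on the next plan. A sketch using moved blocks, assuming Terraform 1.1 or later and the same resource addresses as the commands above:

```hcl
# Rename a resource: the state entry for aws_instance.old now belongs to aws_instance.new
moved {
  from = aws_instance.old
  to   = aws_instance.new
}

# Move a resource into a module without recreating it
moved {
  from = aws_instance.web
  to   = module.webserver.aws_instance.main
}
```

Run terraform plan afterwards; the moves should show up as address changes rather than as a destroy and create.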

Importing Existing Resources

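Have infrastructure that was created outside Terraform? First write a matching resource block (here, aws_s3_bucket.data), then import the real resource into it: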

# Import an existing S3 bucket
terraform import aws_s3_bucket.data my-existing-bucket

Tip: After importing, run terraform plan to ensure your configuration matches the actual resource.
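
Imports can also be declared in configuration, which lets you preview them with terraform plan before anything is written to state. A minimal sketch, assuming Terraform 1.5 or later and the same bucket name as the command above:

```hcl
# Declare the import: "to" is the resource address, "id" is the real bucket name
import {
  to = aws_s3_bucket.data
  id = "my-existing-bucket"
}

# The resource block the imported bucket will be bound to
resource "aws_s3_bucket" "data" {
  bucket = "my-existing-bucket"
}
```

Once the import has been applied, the import block can be removed.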

Terraform Workspaces

Workspaces let you deploy the same configuration multiple times with separate state files. Think of them as parallel environments sharing the same code.

When to Use Workspaces

Consider the following scenario: You want to deploy identical infrastructure for dev, staging, and production. Workspaces let you do this without duplicating configuration files.

# Create workspaces for each environment
terraform workspace new dev
terraform workspace new staging
terraform workspace new production

# Switch between them
terraform workspace select dev
terraform apply  # Deploys to dev

terraform workspace select production
terraform apply  # Deploys to production (separate state)

Workspaces vs Separate Directories

Approach	Workspaces	Separate Directories
Configuration	Shared	Can differ per environment
State	Separate per workspace	Separate per directory
Complexity	Lower	Higher
Flexibility	Lower	Higher
Best for	Identical environments	Different configurations

Use workspaces when: Environments are nearly identical and differ only by size or count.

Use separate directories when: Environments have different resources, modules, or significant configuration differences.

Making Configuration Workspace-Aware

Use terraform.workspace to customize resources based on the current workspace:

locals {
  instance_types = {
    dev        = "t3.micro"
    staging    = "t3.small"
    production = "m5.large"
  }
}

resource "aws_instance" "app" {
  ami           = "ami-12345678" # Placeholder; use an AMI ID valid in your region
  instance_type = local.instance_types[terraform.workspace]

  tags = {
    Environment = terraform.workspace
  }
}
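
A direct map index fails when the current workspace has no entry in the map, and that includes the built-in default workspace. One possible way to handle this is a fallback value via lookup. This is a sketch only; choosing t3.micro as the fallback is an assumption, not a recommendation from the original example:

```hcl
locals {
  instance_types = {
    dev        = "t3.micro"
    staging    = "t3.small"
    production = "m5.large"
  }

  # Use the smallest size for any workspace not listed above,
  # including the built-in "default" workspace
  instance_type = lookup(local.instance_types, terraform.workspace, "t3.micro")
}
```

Resources would then reference local.instance_type instead of indexing the map directly.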

Workspace Best Practices

Use consistent naming: Include workspace in resource names to avoid conflicts
Validate workspace names: Prevent typos from creating unexpected environments
Consider cost: Non-production workspaces can use smaller, cheaper resources

# Example: consistent naming with workspace
resource "aws_s3_bucket" "data" {
  bucket = "${var.project}-${terraform.workspace}-data"
}
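
To validate workspace names, one option is a precondition that fails the plan when the current workspace is not on an approved list. The sketch below assumes Terraform 1.4 or later (for the built-in terraform_data resource); the names allowed_workspaces and workspace_guard are hypothetical:

```hcl
locals {
  # Hypothetical list of approved workspace names
  allowed_workspaces = ["dev", "staging", "production"]
}

# A resource that creates nothing in the cloud; it only carries the check
resource "terraform_data" "workspace_guard" {
  lifecycle {
    precondition {
      condition     = contains(local.allowed_workspaces, terraform.workspace)
      error_message = "Unknown workspace \"${terraform.workspace}\". Expected one of: ${join(", ", local.allowed_workspaces)}."
    }
  }
}
```

With this in place, a typo such as terraform workspace new prodution still creates the workspace, but plan and apply stop with the error above before any infrastructure is created.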

Outputs

Outputs expose values from your Terraform configuration. They serve two purposes:

Display information after apply (e.g., the IP address of a newly created server)
Share data between Terraform configurations via remote state

Defining Outputs

output "bucket_arn" {
  description = "ARN of the S3 bucket"
  value       = aws_s3_bucket.data.arn
}

output "database_endpoint" {
  description = "Database endpoint (address:port)"
  value       = aws_db_instance.main.endpoint
  sensitive   = true  # Redacted in CLI output; still stored in plain text in state
}

Reading Outputs from Other Configurations

Use terraform_remote_state to access outputs from another Terraform configuration:

# Read outputs from the networking configuration
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

# Use a subnet ID exported by the other configuration
resource "aws_instance" "app" {
  subnet_id = data.terraform_remote_state.network.outputs.private_subnet_id
}

This pattern is useful for splitting large configurations into smaller, manageable pieces while maintaining connections between them.
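
For the lookup above to work, the networking configuration has to declare private_subnet_id as an output of its root module; terraform_remote_state exposes only root-level outputs. A sketch of that producing side, where the resource name aws_subnet.private is an assumption made for illustration:

```hcl
# In the networking configuration (state stored at network/terraform.tfstate)
output "private_subnet_id" {
  description = "ID of the private subnet for application instances"
  value       = aws_subnet.private.id # Hypothetical resource name
}
```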

Terraform Modules

Modules are reusable packages of Terraform configuration. Think of them as functions for infrastructure: they accept inputs (variables), create resources, and return outputs.

Why Use Modules?

Consider the following situation: You have copied your VPC configuration to 10 different projects. Now you need to change the subnet configuration. Without modules, you update 10 files. With modules, you update once.

Benefits of modules:

Reusability: Write once, use everywhere
Consistency: Same configuration across environments
Maintainability: Update in one place, propagate everywhere
Encapsulation: Hide complexity behind a simple interface

Creating Your First Module

A module is simply a directory with Terraform files. Here is a simple web server module:

modules/
  webserver/
    main.tf       # Resources
    variables.tf  # Inputs
    outputs.tf    # Outputs

modules/webserver/variables.tf:

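variable "instance_type" {
  description = "EC2 instance type for the web server"
  type        = string
  default     = "t3.micro"
}

variable "name" {
  description = "Value for the instance's Name tag"
  type        = string
}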

modules/webserver/main.tf:

resource "aws_instance" "web" {
  ami           = "ami-12345678" # Placeholder; AMI IDs differ per region
  instance_type = var.instance_type
  tags          = { Name = var.name }
}

modules/webserver/outputs.tf:

output "public_ip" {
  value = aws_instance.web.public_ip
}
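
The AMI ID in main.tf is hard-coded, which ties the module to one image and one region. One possible refinement, sketched here with a hypothetical ami_id input that is not part of the original module, is to make it a validated variable:

```hcl
# modules/webserver/variables.tf (additional, hypothetical input)
variable "ami_id" {
  description = "AMI ID to launch, valid in the target region"
  type        = string

  validation {
    condition     = can(regex("^ami-[0-9a-f]+$", var.ami_id))
    error_message = "The ami_id value must look like \"ami-\" followed by hexadecimal characters."
  }
}

# modules/webserver/main.tf (resource updated to use the variable)
resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = var.instance_type
  tags          = { Name = var.name }
}
```

With this variant, every caller of the module would pass ami_id alongside name.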

Using Modules

Call your module from any configuration:

module "web_prod" {
  source        = "./modules/webserver"
  name          = "production-web"
  instance_type = "m5.large"
}

module "web_dev" {
  source = "./modules/webserver"
  name   = "dev-web"
  # Uses default t3.micro
}

output "prod_ip" {
  value = module.web_prod.public_ip
}

Module Sources

Modules can come from various locations:

Source	Example	Best For
Local path	`./modules/vpc`	Development, organization-specific
GitHub	`github.com/org/module`	Shared across teams
Terraform Registry	`terraform-aws-modules/vpc/aws`	Community modules
S3/GCS	`s3::https://bucket.s3.amazonaws.com/module.zip`	Private modules

# From Terraform Registry (version pinned)
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"

  name = "my-vpc"
  cidr = "10.0.0.0/16"
}
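
Pinning works differently depending on the source type. The sketch below shows a version constraint for a registry module and a ref for a Git-hosted one; the repository URL and the v1.2.0 tag are hypothetical placeholders:

```hcl
# Registry module: accept any 5.x release, but not 6.0
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "my-vpc"
  cidr = "10.0.0.0/16"
}

# Git-hosted module: the version argument is not supported here, so pin with ?ref=
module "network" {
  source = "git::https://github.com/org/module.git?ref=v1.2.0" # Hypothetical repo and tag
}
```

An exact pin such as 5.0.0 gives the most repeatable builds, while a constraint such as ~> 5.0 picks up minor and patch releases on the next terraform init -upgrade.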

When to Create vs Use Existing Modules

Create your own modules when:

You have organization-specific requirements
You need tight control over configuration
Existing modules are too complex or too simple for your needs

Use community modules when:

They match your requirements closely
They are well-maintained (check stars, recent updates)
You want to benefit from community best practices

The Terraform Registry has thousands of modules for common infrastructure patterns.