ECS + Terraform

Looking for Kubernetes? This guide covers ECS (Elastic Container Service) deployment. For Kubernetes/EKS deployment, see the EKS + Terraform guide.

Overview

Deploy Runlayer on AWS ECS (Elastic Container Service) with Fargate using Terraform. This provides a serverless container deployment without managing Kubernetes clusters. Choose between minimal configuration (5 required parameters) or enterprise configuration (full customization).

Infrastructure Repository: runlayer/runlayer-infra

This deployment creates AWS resources that incur costs. Estimated costs range from roughly $130-225/month for the minimal configuration to $360-950/month for enterprise configurations (see Cost Estimation).

Quick Start

Step 1: Get the infrastructure code


cd infra  # Infrastructure code is included as a subtree

Minimal (Recommended)

Get production-ready infrastructure with these parameters:

cp minimal.tfvars.example production.tfvars

# Edit these required values:
# region = "us-east-1"
# domain_name = "ai.yourcompany.com"  # Must own this domain
# account = 123456789012
# database_username = "postgres"

terraform init
terraform apply -var-file="production.tfvars"

That’s it! You get production-ready defaults for everything else.

You must own the domain name and be able to validate it via DNS for SSL certificate creation.

Enterprise

Full control over all infrastructure settings:

cp enterprise.tfvars.example production.tfvars

# Customize all settings as needed
nano production.tfvars

terraform init
terraform apply -var-file="production.tfvars"

160+ configuration options for enterprise customization.

Configuration Options

Minimal Configuration

Required:

environment       = "production"              # Environment name
region            = "us-east-1"              # AWS region
domain_name       = "ai.yourcompany.com"     # Your domain (required for SSL)
account           = 123456789012             # AWS account ID

Smart production-ready defaults include:

Database: Aurora PostgreSQL 16.6, 2-16 ACUs, private subnets, 7-day backups
Security: Public ALB with internet access, private database/cache
SSL: Automatic wildcard certificate lookup (e.g., *.staging.runlayer.com); set enable_acm_dns_validation = true to create a new certificate instead
Scaling: 2 backend + 2 frontend containers, auto-scale to 10 max
Resources: Backend 512 CPU/1024 MB, Frontend 512 CPU/1024 MB
Resources: Backend 2048 CPU/4096 MB, Frontend 512 CPU/1024 MB
Network: 3-AZ VPC with /16 CIDR, public/private subnets
Monitoring: CloudWatch logs for all services

Enterprise Configuration

All minimal options plus 160+ customizable parameters:

Database Configuration

database_name     = "anysource_prod"
database_username = "postgres"        # Database master username
database_config = {
  engine_version      = "16.6"        # PostgreSQL version
  min_capacity       = 4              # Min Aurora capacity (ACUs)
  max_capacity       = 32             # Max Aurora capacity (ACUs)
  publicly_accessible = false         # Keep private (recommended)
  backup_retention   = 30             # Backup retention days
  subnet_type        = "private"      # Use private subnets
}

Security Configuration

alb_access_type = "public"            # "public" or "private"
alb_allowed_cidrs = [                 # Security group IP ranges
  "0.0.0.0/0"                        # Internet (change for security)
  # "203.0.113.0/24",                # Your office IPs
  # "198.51.100.0/24"                # Your VPN IPs
]

# WAF IP Allowlisting (recommended for production)
waf_enable_ip_allowlisting = true     # Enable WAF-based IP filtering
waf_allowlist_ipv4_cidrs = [          # Allowed IPv4 CIDR blocks
  "203.0.113.0/24",                   # Office network
  "198.51.100.42/32",                 # Specific IP address
]

# SSL Certificate options (choose one; leave the other two commented out):

# Option 1: Look up existing wildcard certificate (DEFAULT)
# Module derives wildcard from domain: ai.prod.yourcompany.com → *.prod.yourcompany.com
hosted_zone_name = "prod.yourcompany.com"  # Used for DNS records and certificate lookup

# Option 2: Create new wildcard certificate with DNS validation
# enable_acm_dns_validation = true           # Creates new cert instead of lookup
# hosted_zone_name = "prod.yourcompany.com"  # Required for Route53 validation

# Option 3: Use specific certificate ARN (bypasses module)
# ssl_certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/xxx"

# DNS record control (separate from certificate management)
create_dns_records = true  # Creates Route53 ALIAS record pointing domain to ALB (default: false)
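
The wildcard derivation mentioned under Option 1 can be sketched in shell. This is only an illustration: it assumes the module replaces the leftmost DNS label with `*`, which matches both examples in this guide but has not been checked against the module source. `derive_wildcard_domain` is a hypothetical helper name, not part of the module.

```shell
# Hypothetical helper: replace the leftmost DNS label with "*".
# Assumption: this mirrors the module's certificate lookup rule.
derive_wildcard_domain() {
  # ${1#*.} strips everything up to and including the first dot.
  printf '*.%s\n' "${1#*.}"
}

derive_wildcard_domain "ai.prod.yourcompany.com"   # prints *.prod.yourcompany.com
```

Running this before `terraform apply` gives the certificate name to look for in ACM when the default lookup is used.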

Dual ALB Setup (Split-Horizon DNS)

# Enable dual ALB for split-horizon DNS
# Public ALB for internet traffic + Internal ALB for private network traffic
enable_dual_alb = true

# Optional: Use an existing private hosted zone
private_hosted_zone_id = "Z1234567890ABC"

# Optional: Associate with a specific VPC (defaults to service VPC)
private_hosted_zone_vpc_id = "vpc-1234567890abcdef0"

# Optional: Associate with additional VPCs (for VPC peering scenarios)
private_hosted_zone_additional_vpc_ids = [
  "vpc-0987654321fedcba0",  # Peered VPC 1
  "vpc-1111222233334444",   # Peered VPC 2
]

Use Case:

Public access required (e.g., ChatGPT, external integrations)
Private network access for internal services (stays within VPC/peering)
Single domain name (runlayer.example.com) resolves differently based on network context

How it works:

Public DNS → Public ALB (internet traffic)
Private DNS → Internal ALB (VPC/peered traffic)
ECS services register with both ALBs
WAF applies only to public ALB
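
One way to confirm which ALB a given network reaches is to resolve the domain from that network and check whether the answers are private addresses, since an internal ALB normally resolves to private IPs inside the VPC. A minimal sketch, assuming the VPC uses RFC 1918 address space; `is_private_ipv4` is a hypothetical helper and `runlayer.example.com` is the placeholder domain used above.

```shell
# Hypothetical helper: succeeds when the IPv4 address is in an RFC 1918 private range.
is_private_ipv4() {
  case $1 in
    10.*|192.168.*)                         return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*)  return 0 ;;
    *)                                      return 1 ;;
  esac
}

# Example (run from the network you want to test; requires dig):
#   for ip in $(dig +short runlayer.example.com A); do
#     if is_private_ipv4 "$ip"; then echo "$ip -> internal ALB"; else echo "$ip -> public ALB"; fi
#   done
```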

VPC Peering for Cross-VPC Connectivity

Security Note: VPC peering connections require peer_owner_id to be specified for all connections. This validation ensures you only accept connections from known and trusted AWS accounts. Connections are automatically accepted after validation.

Enable VPC peering to allow traffic between the Runlayer VPC and other VPCs (for example, a customer’s existing VPC or internal services VPC).

# Accept and configure VPC peering connections
vpc_peering_connections = {
  "customer-vpc" = {
    peering_connection_id = "pcx-0abc123def456"  # VPC peering connection ID from the initiating side
    peer_vpc_cidr         = "172.16.0.0/16"      # CIDR block of the peer VPC
    peer_owner_id         = "123456789012"        # AWS account ID of the peer (REQUIRED)
    peer_region           = "us-east-1"           # AWS region of the peer (optional)
  }
}

# Optional: For dual ALB setup, associate private hosted zone with peered VPC
enable_dual_alb = true
private_hosted_zone_additional_vpc_ids = ["vpc-customer123"]

Security Features:

✅ Required peer validation - All connections must specify peer_owner_id
✅ Automatic acceptance after validation - Validation is the security control
✅ Automatic routing - Routes configured only for validated connections
✅ Security group integration - Backend allows traffic from validated peer CIDRs

Prerequisites:

The peer VPC must initiate the peering connection first
You need the peering connection ID (pcx-xxxxx)
You need the peer VPC’s CIDR block
You MUST provide the peer AWS account ID for security validation
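
Before adding an entry to vpc_peering_connections, the values gathered above can be sanity-checked locally. The sketch below only checks the general shape of the values (a `pcx-` prefix and a 12-digit account ID); it does not reproduce the module's own validation, and `check_peering_inputs` is a hypothetical helper.

```shell
# Hypothetical pre-flight check for one vpc_peering_connections entry.
# Usage: check_peering_inputs <peering_connection_id> <peer_owner_id>
check_peering_inputs() {
  # Peering connection IDs start with "pcx-".
  case $1 in
    pcx-?*) ;;
    *) echo "peering_connection_id must look like pcx-xxxxx: $1" >&2; return 1 ;;
  esac
  # AWS account IDs consist of digits only...
  case $2 in
    ''|*[!0-9]*) echo "peer_owner_id must be numeric: $2" >&2; return 1 ;;
  esac
  # ...and are exactly 12 digits long.
  if [ "${#2}" -ne 12 ]; then
    echo "peer_owner_id must be 12 digits: $2" >&2
    return 1
  fi
}

check_peering_inputs "pcx-0abc123def456" "123456789012" && echo "inputs look valid"
```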

Use Cases:

Connecting to a customer’s existing VPC for internal API access
Multi-VPC architectures with centralized services
Hybrid cloud setups with on-premises connectivity

VPC peering only works when the module creates the VPC (not with existing_vpc_id).

Service Scaling

services_configurations = {
  "backend" = {
    desired_count     = 3             # Number of instances
    min_capacity      = 2             # Min for auto-scaling
    max_capacity      = 10            # Max for auto-scaling
    cpu              = 2048           # CPU units (1024 = 1 vCPU)
    memory           = 4096           # Memory in MB

    # Auto-scaling thresholds
    cpu_auto_scalling_target_value    = 70    # Scale at 70% CPU
    memory_auto_scalling_target_value = 80    # Scale at 80% memory
  }

  "frontend" = {
    desired_count = 2
    cpu          = 512
    memory       = 1024
  }
}
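
Fargate accepts only certain CPU/memory pairs, so it can help to check cpu and memory values before running terraform apply. The sketch below covers task sizes up to 4096 CPU units (4 vCPU) and rejects anything larger rather than guessing; `fargate_size_ok` is a hypothetical helper, not part of the module.

```shell
# Hypothetical helper: succeeds when <cpu units> and <memory MB> form a valid
# Fargate task size. Only sizes up to 4096 CPU units are covered.
fargate_size_ok() (
  cpu=$1 mem=$2
  case $cpu in
    256)  # 0.25 vCPU allows exactly 512, 1024, or 2048 MB
          [ "$mem" -eq 512 ] || [ "$mem" -eq 1024 ] || [ "$mem" -eq 2048 ]
          exit ;;
    512)  min=1024 max=4096 ;;
    1024) min=2048 max=8192 ;;
    2048) min=4096 max=16384 ;;
    4096) min=8192 max=30720 ;;
    *)    exit 1 ;;
  esac
  # Within each range, memory must be a multiple of 1024 MB.
  [ "$mem" -ge "$min" ] && [ "$mem" -le "$max" ] && [ $((mem % 1024)) -eq 0 ]
)

fargate_size_ok 2048 4096 && echo "backend size ok"
fargate_size_ok 512 1024 && echo "frontend size ok"
```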

Network Configuration

cidr = "10.0.0.0/16" # VPC CIDR block
region_az = ["us-east-1a", "us-east-1b", "us-east-1c"]
public_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
private_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
database_subnets = ["10.0.7.0/24", "10.0.8.0/24", "10.0.9.0/24"]
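
Every subnet must fall inside the VPC CIDR block. The following sketch checks that with plain shell arithmetic; `ip_to_int` and `cidr_contains` are hypothetical helpers written for this example, and the CIDR values are the ones shown above.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeeds when the <inner> CIDR lies entirely within the <outer> CIDR.
# Usage: cidr_contains <outer> <inner>
cidr_contains() (
  outer_ip=${1%/*}; outer_len=${1#*/}
  inner_ip=${2%/*}; inner_len=${2#*/}
  # The inner block cannot be larger than the outer one.
  [ "$inner_len" -ge "$outer_len" ] || exit 1
  # Network mask of the outer block (4294967295 is 0xFFFFFFFF).
  mask=$(( (4294967295 << (32 - outer_len)) & 4294967295 ))
  [ $(( $(ip_to_int "$outer_ip") & mask )) -eq $(( $(ip_to_int "$inner_ip") & mask )) ]
)

for subnet in 10.0.1.0/24 10.0.4.0/24 10.0.7.0/24; do
  cidr_contains "10.0.0.0/16" "$subnet" && echo "$subnet is inside the VPC CIDR"
done
```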

Monitoring & Alerting

# Enable CloudWatch monitoring and alerting
enable_monitoring = true              # Create CloudWatch alarms
slack_channel_id = "C1234567890"      # Your #alerts channel ID (enables Slack alerts)
slack_team_id = "T1234567890"         # Your Slack workspace ID

Automatic CloudWatch alarms created for:

ECS Services: CPU utilization (80%), Memory utilization (85%)
RDS Database: CPU utilization (80%), Connection count (50), Freeable Memory, Disk Queue Depth, Read/Write IOPS, Free Storage Space
Redis Cache: CPU utilization (80%), Memory utilization (85%)
Load Balancer: Response time (5s), Unhealthy targets, 5XX error count
VPC: Flow logs for network traffic analysis and security monitoring

Enhanced monitoring capabilities include:

Configurable alarm thresholds for all RDS metrics
365-day log retention for ECS services and prestart containers
VPC Flow Logs with customizable traffic type monitoring
ALB 5XX error tracking with configurable thresholds

CloudWatch monitoring - comprehensive monitoring with configurable alarms for all infrastructure components.

Advanced RDS monitoring configuration:

rds_alarm_config = {
  FreeableMemory = {
    period    = 300
    threshold = 268435456 # 256MB
    unit      = "Bytes"
  }
  DiskQueueDepth = {
    period    = 300
    threshold = 5
    unit      = "Count"
  }
  WriteIOPS = {
    period    = 300
    threshold = 1000
    unit      = "Count"
  }
  ReadIOPS = {
    period    = 300
    threshold = 1000
    unit      = "Count"
  }
  Storage = {
    period    = 300
    threshold = 107374182400 # 100GB
    unit      = "Bytes"
  }
}

# ALB 5XX error monitoring
alb_5xx_alarm_period    = 300  # 5 minute period
alb_5xx_alarm_threshold = 1    # Alert on any 5XX error
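
The FreeableMemory and Storage thresholds above are raw byte counts in binary units. A small sketch for computing such values when choosing different thresholds; `mib_to_bytes` and `gib_to_bytes` are hypothetical helpers.

```shell
# Convert human-readable sizes to byte counts (binary units).
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }

mib_to_bytes 256   # prints 268435456    (the FreeableMemory threshold above)
gib_to_bytes 100   # prints 107374182400 (the Storage threshold above)
```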

Optional Services

# Environment Variables
env_vars = {
  ENVIRONMENT = "production"
  COMPANY     = "YourCompany"
}

Prerequisites

Install Tools

# AWS CLI
brew install awscli
aws configure

# Terraform
brew install terraform
terraform version

AWS Requirements

AWS account with sufficient permissions
Domain name you control (for DNS records and SSL)
Adequate service quotas (VPC, RDS, ECS)

By default, the module looks up an existing wildcard ACM certificate (e.g., *.staging.runlayer.com). If no wildcard certificate exists, set enable_acm_dns_validation = true to create one automatically with DNS validation.

SSL Certificate (Enterprise Only)

# Optional: Request certificate in ACM
aws acm request-certificate \
  --domain-name "*.yourcompany.com" \
  --validation-method DNS \
  --region us-east-1

# Get Route53 hosted zone ID
aws route53 list-hosted-zones-by-name \
  --dns-name yourcompany.com

Deployment

1. Get Infrastructure Code

# Clone the infrastructure repository
git clone https://github.com/runlayer/runlayer-infra.git
cd runlayer-infra

# OR use the subtree in your main project
cd infra

2. Configure

# Minimal configuration (recommended)
cp minimal.tfvars.example production.tfvars

# OR Enterprise configuration
cp enterprise.tfvars.example production.tfvars

3. Deploy

terraform init
terraform plan -var-file="production.tfvars"
terraform apply -var-file="production.tfvars"

Deployment takes 15-20 minutes. SSL certificate validation may add 5-10 minutes.

4. Automated Database Setup

Database initialization is completely automated. What happens automatically:

Database Connection: Prestart container waits for Aurora PostgreSQL to be ready
Schema Migration: Runs alembic upgrade head to apply latest database schema
Logging: All setup activity logged to CloudWatch under prestart-logs-[environment]
Error Handling: Backend won’t start if database setup fails
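
If the backend does not start, the prestart logs are the first place to look. A minimal sketch, assuming the log group name follows the prestart-logs-[environment] pattern exactly as written above and that AWS CLI v2 is installed (it provides `aws logs tail`); both function names are hypothetical.

```shell
# Build the prestart log group name for an environment.
prestart_log_group() {
  printf 'prestart-logs-%s\n' "$1"
}

# Hypothetical wrapper: stream the last 30 minutes of prestart logs.
# Needs AWS CLI v2 and credentials for the target account.
tail_prestart_logs() {
  aws logs tail "$(prestart_log_group "$1")" --since 30m --follow
}

prestart_log_group "production"   # prints prestart-logs-production
# tail_prestart_logs "production"
```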

Secrets are automatically generated:

Database password and secret keys are created securely

5. Update Application Secrets (Optional)

# Secrets are automatically generated during deployment!
# Database password and secret keys are created securely.

# If you need to update any secrets after deployment:
# Note: --secret-string replaces the entire secret value, so include
# every key you want to keep, not only the one you are changing.
aws secretsmanager update-secret \
  --secret-id "anysource-production-app-secrets-PROD2024" \
  --secret-string '{
    "CUSTOM_API_KEY": "your-custom-value"
  }'

6. Verify Deployment

# Get application URL
terraform output alb_dns_name

# Test health endpoint
curl https://your-domain.com/api/v1/utils/health-check/
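
Because services can take a few minutes to become healthy after terraform apply finishes, a single curl may fail even when the deployment is fine. The sketch below polls the health endpoint instead; `wait_for_healthy` and its default retry values are hypothetical choices for this example.

```shell
# Hypothetical helper: poll a URL until it returns a successful HTTP status.
# Usage: wait_for_healthy <url> [attempts] [delay_seconds]
wait_for_healthy() (
  url=$1 attempts=${2:-30} delay=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx; -sS hides progress but keeps errors.
    if curl -fsS -o /dev/null --max-time 10 "$url"; then
      echo "healthy after $i attempt(s)"
      exit 0
    fi
    # Wait between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempts" >&2
  exit 1
)

# Example:
#   wait_for_healthy "https://your-domain.com/api/v1/utils/health-check/" 30 10
```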

Architecture

Cost Estimation

Component	Minimal	Enterprise
ECS Services	$50-80	$150-300
Aurora Database	$30-60	$100-400
ElastiCache	$15-30	$50-150
Load Balancer	$20-25	$20-25
Monitoring	$5-10	$10-25
Other (NAT, Storage)	$10-20	$30-50
Monthly Total	$130-225	$360-950

Costs vary by region and usage. Use AWS Pricing Calculator for precise estimates.
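
The monthly totals in the table are the sums of the per-component ranges. The arithmetic can be reproduced as follows, which is also a convenient starting point for substituting your own estimates; `sum_usd` is a hypothetical helper.

```shell
# Sum per-component monthly costs in USD.
# Argument order: ECS, Aurora, ElastiCache, Load Balancer, Monitoring, Other.
sum_usd() (
  total=0
  for n in "$@"; do
    total=$((total + n))
  done
  echo "$total"
)

echo "Minimal: \$$(sum_usd 50 30 15 20 5 10)-\$$(sum_usd 80 60 30 25 10 20)"          # prints Minimal: $130-$225
echo "Enterprise: \$$(sum_usd 150 100 50 20 10 30)-\$$(sum_usd 300 400 150 25 25 50)" # prints Enterprise: $360-$950
```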

Common Use Cases

Private Enterprise Deployment

alb_access_type = "private"
alb_allowed_cidrs = ["10.0.0.0/8"]  # Corporate network only

# Additional WAF protection for corporate network
waf_enable_ip_allowlisting = true
waf_allowlist_ipv4_cidrs = [
  "10.100.0.0/16",    # Corporate HQ
  "10.200.0.0/16",    # Regional offices
]

database_config = {
  publicly_accessible = false
  backup_retention = 30
}

High Availability Production

database_config = {
  min_capacity = 8
  max_capacity = 64
}
services_configurations = {
  "backend" = {
    desired_count = 4
    max_capacity = 20
  }
}

Development Environment

environment = "development"
database_config = {
  min_capacity = 2
  max_capacity = 4
  backup_retention = 1
}
services_configurations = {
  "backend" = { desired_count = 1 }
  "frontend" = { desired_count = 1 }
}
# Disable monitoring for development
enable_monitoring = false

Production with Monitoring

environment = "production"
enable_monitoring = true
slack_channel_id = "C1234567890"
slack_team_id = "T1234567890"

database_config = {
  min_capacity = 4
  max_capacity = 32
  backup_retention = 30
}
services_configurations = {
  "backend" = {
    desired_count = 3
    max_capacity = 15
  }
}

Troubleshooting

SSL Certificate Issues

Wildcard lookup fails (default behavior):

# Check if wildcard certificate exists
aws acm list-certificates --query "CertificateSummaryList[?contains(DomainName, '*')]"

# The module derives: your-domain.example.com → *.example.com
# Ensure a certificate exists for the wildcard domain

DNS validation fails (when enable_acm_dns_validation = true):

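# Check certificate status
aws acm describe-certificate --certificate-arn your-cert-arn

# Show the validation CNAME that ACM expects (ACM generates its own record name)
aws acm describe-certificate --certificate-arn your-cert-arn \
  --query "Certificate.DomainValidationOptions[].ResourceRecord"

# Verify that record exists in DNS (use the Name value returned above)
dig your-validation-record-name CNAME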

Application Not Accessible

Solution: Check security groups and target health:

# Check target group health
aws elbv2 describe-target-health --target-group-arn your-tg-arn

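# Check ECS service status (--services is required; list the service names first)
aws ecs list-services --cluster anysource-production
aws ecs describe-services --cluster anysource-production --services your-service-name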

High Costs

Solution: Optimize resource sizing:

Use environment = "development" for testing
Reduce database min_capacity and max_capacity
Lower service desired_count and CPU/memory
Set shorter backup_retention periods

Monitoring Alerts Not Working

Solution: Check CloudWatch alarm configuration:

# List alarms currently in the ALARM state (omit --state-value to list all alarms)
aws cloudwatch describe-alarms --state-value ALARM

# Check alarm actions and thresholds
aws cloudwatch describe-alarms --alarm-names "your-alarm-name"

Common issues:

CloudWatch alarm thresholds too high
Alarm actions disabled
Missing CloudWatch permissions

ECS vs EKS: Which to Choose?

Factor	ECS (This Guide)	EKS
Complexity	Lower - Simpler to manage	Higher - Kubernetes expertise needed
Cost	$130-225/month	$200-800/month
Scaling	Auto-scaling with Fargate	More granular control
Ecosystem	AWS-specific	Kubernetes ecosystem
Best For	Simpler deployments, AWS-native	Complex workloads, multi-cloud

Next Steps

Configuration

Configure application settings and integrations

EKS + Terraform

Deploy on Kubernetes using EKS with Terraform for more advanced scenarios

Helm + Kubernetes

Deploy application using Helm charts (after provisioning EKS infrastructure)
