
Overview

Deploy Runlayer on Amazon EKS (Elastic Kubernetes Service) with production-ready infrastructure provisioned by Terraform. This guide covers three deployment scenarios to fit different infrastructure requirements.

Terraform Module Location: infra/aws-helm/terraform-eks/
This deployment creates AWS resources that incur costs. Typical costs range from $200-800/month depending on configuration and usage.

Deployment Scenarios

Choose the scenario that best fits your infrastructure:

Full Stack

Create everything: new VPC + new EKS cluster + application infrastructure.
Best for: new deployments, greenfield projects

Existing VPC

Use your network: existing VPC + new EKS cluster + application infrastructure.
Best for: integrating with existing network infrastructure

Existing EKS

Minimal infrastructure: existing VPC + existing EKS + application infrastructure only.
Best for: shared clusters, platform-team-managed EKS

Prerequisites

1. Install Tools

# AWS CLI
brew install awscli
aws configure

# Terraform >= 1.7
brew install terraform
terraform version

# kubectl (optional, for cluster access)
brew install kubectl

2. AWS Requirements

  • AWS account with administrator access
  • Sufficient service quotas (VPC, EKS, RDS, ElastiCache)
  • IAM permissions to create resources

3. Prepare Secrets

Generate strong secrets for your deployment:
# Generate random secrets (32+ characters recommended)
openssl rand -base64 32  # For SECRET_KEY
openssl rand -base64 32  # For MASTER_SALT
openssl rand -base64 32  # For database_password
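
If you will run Terraform from the same shell, one convenient pattern (assuming the variable names used later in this guide) is to export the generated values as `TF_VAR_*` environment variables so they never land in a file:

```shell
# Generate and export secrets in one step; Terraform picks up TF_VAR_* automatically
export TF_VAR_database_password="$(openssl rand -base64 32)"
export TF_VAR_secret_key="$(openssl rand -base64 32)"
export TF_VAR_master_salt="$(openssl rand -base64 32)"

# Sanity check: base64 of 32 random bytes is always 44 characters
echo "${#TF_VAR_secret_key}"
```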

Scenario 1: Full Stack Deployment

Create a complete production-ready environment with new VPC and EKS cluster.

What Gets Created

  • VPC: 10.0.0.0/16 with public/private subnets across 3 AZs
  • EKS Cluster: Kubernetes 1.33 with managed node groups
  • RDS: Aurora PostgreSQL Serverless v2 (2-16 ACUs)
  • Redis: ElastiCache Redis cluster
  • IAM Roles: IRSA roles for EBS CSI, ALB Controller, CloudWatch, Application
  • Security: KMS encryption, security groups, private subnets
  • Monitoring: CloudWatch logs and metrics

Step-by-Step Deployment

1. Get the Terraform module:
# Clone the repository
git clone https://github.com/runlayer/Runlayer.git
cd runlayer/infra/aws-helm/terraform-eks
2. Configure your deployment:
cp terraform.tfvars.example terraform.tfvars
nano terraform.tfvars
Edit terraform.tfvars:
# AWS Configuration
region  = "us-east-1"
account = "123456789012"  # Your AWS Account ID

# Project Configuration
project     = "myapp"
environment = "production"

# VPC Configuration - Create new VPC
create_vpc      = true
vpc_cidr        = "10.0.0.0/16"
private_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
public_subnets  = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]

# EKS Configuration - Create new cluster
create_eks      = true
cluster_version = "1.33"

# Restrict API access (recommended)
cluster_endpoint_public_access_cidrs = ["203.0.113.1/32"]  # Your office IP

# Node Groups
node_groups = {
  default = {
    instance_types = ["m6i.2xlarge"]
    scaling_config = {
      desired_size = 4
      max_size     = 10
      min_size     = 2
    }
    disk_size = 50
  }
}

# Database Configuration
database_name     = "anysource_db"
database_username = "dbadmin"
database_password = "CHANGE_ME_STRONG_PASSWORD"  # Use generated secret

database_config = {
  engine_version      = "16.8"
  min_capacity        = 2
  max_capacity        = 16
  deletion_protection = true  # Enable for production
}

# Redis Configuration
redis_node_type = "cache.t3.medium"

# Application Secrets
secret_key   = "CHANGE_ME_SECRET_KEY"   # Use generated secret
master_salt  = "CHANGE_ME_MASTER_SALT"  # Use generated secret
auth_api_key = "your-auth-api-key"

# Monitoring
enable_monitoring = true

# EKS Namespace
eks_namespace = "anysource-production"
3. Deploy infrastructure:
# Initialize Terraform
terraform init

# Review planned changes
terraform plan

# Apply configuration (takes 15-20 minutes)
terraform apply
4. Configure kubectl:
# Update kubeconfig
aws eks update-kubeconfig \
  --region us-east-1 \
  --name myapp-production-eks

# Verify cluster access
kubectl cluster-info
kubectl get nodes
5. Deploy application with Helm:
cd ../anysource-chart

# Add Helm repositories
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add jetstack https://charts.jetstack.io
helm repo update

# Deploy application
helm upgrade --install anysource . \
  --namespace anysource-production --create-namespace \
  -f values.example.yaml \
  --set externalDatabase.host="<rds-endpoint-from-terraform-output>" \
  --set externalDatabase.password="your-database-password" \
  --set backend.secrets.SECRET_KEY="your-secret-key" \
  --set backend.secrets.MASTER_SALT="your-master-salt" \
  --set backend.secrets.AUTH_API_KEY="your-auth-api-key"

Cost Estimation

| Component | Configuration | Monthly Cost |
|---|---|---|
| EKS Cluster | Control plane | $73 |
| EC2 Nodes | 4x m6i.2xlarge | $400-500 |
| RDS Aurora | 2-16 ACUs | $60-240 |
| ElastiCache | cache.t3.medium | $40-50 |
| NAT Gateway | 3 AZs | $100-120 |
| ALB | Application Load Balancer | $20-25 |
| Data Transfer | Varies by usage | $20-50 |
| CloudWatch | Logs and metrics | $10-30 |
| **Total** | | **$723-1,088/month** |
Costs vary by region and usage patterns. Use AWS Pricing Calculator for precise estimates.
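As a quick sanity check, the low and high ends of the per-component estimates above sum to the quoted total:

```shell
# Sum the low and high ends of each component's monthly estimate
low=$(( 73 + 400 + 60 + 40 + 100 + 20 + 20 + 10 ))
high=$(( 73 + 500 + 240 + 50 + 120 + 25 + 50 + 30 ))
echo "\$${low}-\$${high}/month"   # prints $723-$1088/month
```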

Scenario 2: Existing VPC Deployment

Deploy EKS cluster into your existing VPC infrastructure.

Prerequisites

Your existing VPC must have:
  1. Private Subnets (required):
    • At least 2 private subnets across different AZs
    • NAT Gateway for internet access
    • Sufficient IP address space for pods
  2. Public Subnets (recommended):
    • At least 2 public subnets across different AZs
    • Internet Gateway attached
  3. Subnet Tags (for auto-discovery):
    # Private subnets
    kubernetes.io/role/internal-elb = "1"
    kubernetes.io/cluster/myapp-production-eks = "shared"
    
    # Public subnets
    kubernetes.io/role/elb = "1"
    kubernetes.io/cluster/myapp-production-eks = "shared"
    
  4. VPC Settings:
    • DNS hostnames enabled
    • DNS resolution enabled

Configuration

# terraform.tfvars
region  = "us-east-1"
account = "123456789012"

# Use existing VPC
create_vpc = false
vpc_id     = "vpc-0a1b2c3d4e5f67890"

# Provide existing subnet IDs
private_subnet_ids = ["subnet-0a1b2c3d", "subnet-1e2f3g4h", "subnet-2i3j4k5l"]
public_subnet_ids  = ["subnet-6m7n8o9p", "subnet-7q8r9s0t", "subnet-8u9v0w1x"]

# Create new EKS cluster
create_eks      = true
cluster_version = "1.33"

# ... rest of configuration same as Scenario 1

Tag Your Subnets

If your subnets aren’t tagged, run this script:
#!/bin/bash
CLUSTER_NAME="myapp-production-eks"
REGION="us-east-1"

# Tag private subnets
aws ec2 create-tags --region $REGION \
  --resources subnet-0a1b2c3d subnet-1e2f3g4h subnet-2i3j4k5l \
  --tags \
    Key=kubernetes.io/role/internal-elb,Value=1 \
    Key=kubernetes.io/cluster/$CLUSTER_NAME,Value=shared

# Tag public subnets
aws ec2 create-tags --region $REGION \
  --resources subnet-6m7n8o9p subnet-7q8r9s0t subnet-8u9v0w1x \
  --tags \
    Key=kubernetes.io/role/elb,Value=1 \
    Key=kubernetes.io/cluster/$CLUSTER_NAME,Value=shared

What Gets Created

✅ EKS Cluster: New Kubernetes cluster in your VPC
✅ Node Groups: Managed node groups
✅ RDS Database: Aurora PostgreSQL
✅ Redis Cache: ElastiCache Redis
✅ IAM Roles: All IRSA roles
✅ Security Groups: For EKS, RDS, Redis
❌ VPC: Uses your existing VPC
❌ Subnets: Uses your existing subnets
❌ NAT Gateway: Uses your existing NAT Gateway

Cost Savings

By using existing VPC infrastructure:
  • Save $100-120/month on NAT Gateway costs (if already provisioned)
  • Save $5-10/month on VPC Flow Logs (if already enabled)
  • Total Savings: ~$105-130/month

Scenario 3: Existing EKS Cluster

Add application infrastructure (RDS, Redis, IAM roles) to an existing EKS cluster.

When to Use This

  • ✅ Platform team manages EKS, application teams manage apps
  • ✅ Multiple applications share the same EKS cluster
  • ✅ EKS cluster managed outside Terraform
  • ✅ You only need application infrastructure

Prerequisites

  1. Existing EKS Cluster:
    • Kubernetes version 1.19+
    • OIDC provider enabled
  2. Required Add-ons (must be pre-installed):
    • VPC CNI
    • kube-proxy
    • CoreDNS
    • EBS CSI Driver with IRSA role
    • AWS Load Balancer Controller with IRSA role
  3. Cluster Information:
    • Cluster name
    • OIDC provider ARN

Get OIDC Provider ARN

# Get OIDC issuer URL
aws eks describe-cluster \
  --name your-cluster-name \
  --region us-east-1 \
  --query "cluster.identity.oidc.issuer" \
  --output text

# Output: https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE

# Convert to ARN format:
# arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE
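
The conversion above is pure string manipulation; a small sketch (using the placeholder account ID and issuer URL from this guide's examples):

```shell
# Illustrative values copied from this guide's examples
ACCOUNT_ID="123456789012"
OIDC_URL="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# Drop the https:// scheme and prepend the IAM OIDC provider prefix
OIDC_ARN="arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_URL#https://}"
echo "$OIDC_ARN"
```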

Configuration

# terraform.tfvars
region  = "us-east-1"
account = "123456789012"

# Use existing VPC (required)
create_vpc = false
vpc_id     = "vpc-0a1b2c3d4e5f67890"
private_subnet_ids = ["subnet-0a1b2c3d", "subnet-1e2f3g4h", "subnet-2i3j4k5l"]
public_subnet_ids  = ["subnet-6m7n8o9p", "subnet-7q8r9s0t", "subnet-8u9v0w1x"]

# Use existing EKS cluster
create_eks                 = false
existing_cluster_name      = "shared-production-cluster"
existing_oidc_provider_arn = "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# Database Configuration
database_name     = "anysource_db"
database_username = "dbadmin"
database_password = "your-strong-password"
database_config = {
  engine_version = "16.8"
  min_capacity   = 2
  max_capacity   = 16
}

# Redis Configuration
redis_node_type = "cache.t3.medium"

# Application Secrets
secret_key   = "your-secret-key"
master_salt  = "your-master-salt"
auth_api_key = "your-auth-api-key"

# EKS Namespace
eks_namespace = "anysource-production"

What Gets Created

✅ Application IRSA Role: IAM role for your application pods with permissions for Bedrock and Secrets Manager
✅ RDS Database: Aurora PostgreSQL Serverless v2
✅ Redis Cache: ElastiCache Redis
✅ Secrets: AWS Secrets Manager secrets
✅ Security Groups: For RDS and Redis

What Does NOT Get Created

❌ EKS Cluster: Uses your existing cluster
❌ Node Groups: Uses your existing nodes
❌ Cluster Add-ons: Uses your existing add-ons
❌ System IRSA Roles: EBS CSI, ALB Controller, CloudWatch
❌ KMS Key: Uses your existing cluster encryption

Cost Savings

By using existing EKS infrastructure:
  • Save $73/month on EKS control plane
  • Save $400-500/month on EC2 nodes (if shared)
  • Save $100-120/month on NAT Gateways (if shared)
  • Total Savings: ~$573-693/month
Estimated Cost: $150-395/month (RDS + Redis + data transfer only)

Multi-Application Example

Deploy multiple applications to the same cluster:
# Application 1
module "app1" {
  source = "./terraform-eks"
  
  create_eks                 = false
  existing_cluster_name      = "shared-cluster"
  existing_oidc_provider_arn = "arn:aws:iam::123456789012:oidc-provider/..."
  
  project       = "app1"
  eks_namespace = "app1-production"
  database_name = "app1_db"
  # ... app1 configuration
}

# Application 2
module "app2" {
  source = "./terraform-eks"
  
  create_eks                 = false
  existing_cluster_name      = "shared-cluster"
  existing_oidc_provider_arn = "arn:aws:iam::123456789012:oidc-provider/..."
  
  project       = "app2"
  eks_namespace = "app2-production"
  database_name = "app2_db"
  # ... app2 configuration
}

Using as a Terraform Module

Reference this module from your own Terraform project:
module "anysource_infrastructure" {
  source = "git::https://github.com/runlayer/Runlayer.git//infra/aws-helm/terraform-eks?ref=v1.0.0"
  
  # Core Configuration
  environment = "production"
  project     = "myapp"
  region      = "us-east-1"
  account     = "123456789012"
  
  # Choose your deployment mode
  create_vpc = true
  create_eks = true
  
  # ... rest of configuration
}

# Use module outputs
output "cluster_name" {
  value = module.anysource_infrastructure.cluster_name
}

output "database_endpoint" {
  value     = module.anysource_infrastructure.database_endpoint
  sensitive = true
}

Security Best Practices

Production Checklist

  • Restrict API Access: Use cluster_endpoint_public_access_cidrs to limit access
  • Enable Encryption: Set enable_cluster_encryption = true
  • Private Endpoints: Consider cluster_endpoint_public_access = false
  • Strong Secrets: Use 32+ character random strings
  • Deletion Protection: Enable for RDS in production
  • Backup Retention: Set appropriate retention periods
  • Monitoring: Enable CloudWatch monitoring
  • VPC Flow Logs: Enable for network monitoring
  • IAM Least Privilege: Review and restrict IAM policies
  • Regular Updates: Keep Kubernetes version current

Secrets Management

Never commit secrets to version control! Use one of these approaches:
  1. Environment Variables:
    export TF_VAR_database_password="..."
    export TF_VAR_secret_key="..."
    terraform apply
    
  2. AWS Secrets Manager:
    # Store secrets in AWS
    aws secretsmanager create-secret \
      --name anysource-production-secrets \
      --secret-string file://secrets.json
    
    # Reference in Terraform
    data "aws_secretsmanager_secret_version" "secrets" {
      secret_id = "anysource-production-secrets"
    }
    
  3. Terraform Cloud/Enterprise:
    • Store sensitive variables in Terraform Cloud
    • Mark as sensitive
    • Use workspace-specific values
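
For the Secrets Manager approach, the data source shown above can feed the module directly. A minimal sketch, assuming the secret is a JSON object whose keys match the variable names used in this guide:

```hcl
# Decode the JSON secret once and reuse its fields
# (key names below are assumptions; match them to your secrets.json)
locals {
  app_secrets = jsondecode(data.aws_secretsmanager_secret_version.secrets.secret_string)
}

module "anysource_infrastructure" {
  source = "git::https://github.com/runlayer/Runlayer.git//infra/aws-helm/terraform-eks?ref=v1.0.0"

  # ... core configuration ...

  database_password = local.app_secrets["database_password"]
  secret_key        = local.app_secrets["secret_key"]
  master_salt       = local.app_secrets["master_salt"]
}
```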

Troubleshooting

Terraform Apply Fails

Common Causes:
  • Insufficient IAM permissions
  • Service quota limits reached
  • Subnet IP address exhaustion
Solution:
# Check IAM permissions
aws sts get-caller-identity

# Check service quotas
aws service-quotas list-service-quotas \
  --service-code eks

# Verify subnet CIDR blocks
aws ec2 describe-subnets --subnet-ids subnet-xxx

Nodes Fail to Join the Cluster

Common Causes:
  • Security group misconfiguration
  • IAM role issues
  • Subnet routing problems
Solution:
# Check node group status
aws eks describe-nodegroup \
  --cluster-name myapp-production-eks \
  --nodegroup-name default

# Check node logs
kubectl logs -n kube-system -l k8s-app=aws-node

Database Connection Failures

Common Causes:
  • Security group rules
  • Wrong endpoint
  • Password mismatch
Solution:
# Test database connectivity from a pod
kubectl run -it --rm debug --image=postgres:16 --restart=Never -- \
  psql -h <rds-endpoint> -U dbadmin -d anysource_db

# Check security groups
aws ec2 describe-security-groups --group-ids sg-xxx

Cost Optimization Tips:
  1. Right-size node instances:
    • Use smaller instances for development
    • Enable cluster autoscaler
  2. Optimize database:
    • Lower min_capacity for non-production
    • Reduce backup retention
  3. Reduce NAT Gateway costs:
    • Use single NAT Gateway for development
    • Consider VPC endpoints for AWS services
  4. Monitor usage:
    # Check actual resource usage
    kubectl top nodes
    kubectl top pods --all-namespaces
    

Maintenance and Updates

Kubernetes Version Upgrades

# Check current version
kubectl version

# Update cluster version in terraform.tfvars
cluster_version = "1.33"

# Apply upgrade
terraform plan
terraform apply

# Update node groups (done automatically)

Backup and Disaster Recovery

Automated Backups:
  • RDS: Daily automated backups (configurable retention)
  • EKS: Backup using Velero or AWS Backup
Manual Backup:
# Create RDS snapshot
aws rds create-db-cluster-snapshot \
  --db-cluster-identifier myapp-production-eks-db \
  --db-cluster-snapshot-identifier manual-backup-$(date +%Y%m%d)

# Backup Kubernetes resources
kubectl get all --all-namespaces -o yaml > k8s-backup.yaml

Next Steps

Deploy Application

Deploy the Runlayer application using Helm charts

ECS Alternative

Consider ECS deployment if you prefer containers without Kubernetes

Monitoring

Set up comprehensive monitoring and alerting

SSL Certificates

Configure SSL certificates with ACM or cert-manager

Support

For issues and questions: