
Overview

Deploy Runlayer on Amazon EKS (Elastic Kubernetes Service) with production-ready infrastructure provisioned by Terraform. This guide covers three deployment scenarios to fit different infrastructure requirements.

Terraform Module Location: infra/aws-helm/terraform-eks/
This deployment creates AWS resources that incur costs. Typical costs range from $150-400/month (existing EKS scenario) to $700-1,100/month (full stack), depending on configuration and usage.

Deployment Scenarios

Choose the scenario that best fits your infrastructure:

Full Stack

Create Everything: New VPC + New EKS cluster + Application infrastructure.
Best for: New deployments, greenfield projects.

Existing VPC

Use Your Network: Existing VPC + New EKS cluster + Application infrastructure.
Best for: Integrating with existing network infrastructure.

Existing EKS

Minimal Infrastructure: Existing VPC + Existing EKS + Application infrastructure only.
Best for: Shared clusters, platform-team-managed EKS.

Prerequisites

1. Install Tools

# AWS CLI
brew install awscli
aws configure

# Terraform >= 1.7
brew install terraform
terraform version

# kubectl (for cluster access)
brew install kubectl

# Helm (for application deployment)
brew install helm

2. AWS Requirements

  • AWS account with administrator access
  • Sufficient service quotas (VPC, EKS, RDS, ElastiCache)
  • IAM permissions to create resources
3. Prepare Secrets

Generate strong secrets for your deployment:
# Generate random secrets (32+ characters recommended)
openssl rand -base64 32  # For SECRET_KEY
openssl rand -base64 32  # For MASTER_SALT
openssl rand -base64 32  # For database_password
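
Rather than copying the generated values into terraform.tfvars, they can be generated and exported as `TF_VAR_*` environment variables in one step (Terraform reads these automatically, and the secrets stay out of any file). A minimal sketch:

```shell
# Generate secrets and export them directly as Terraform variables,
# so they never need to be written to terraform.tfvars.
export TF_VAR_secret_key="$(openssl rand -base64 32)"
export TF_VAR_master_salt="$(openssl rand -base64 32)"
export TF_VAR_database_password="$(openssl rand -base64 32)"

# Sanity-check: each value should be at least 32 characters long
for v in "$TF_VAR_secret_key" "$TF_VAR_master_salt" "$TF_VAR_database_password"; do
  [ "${#v}" -ge 32 ] || echo "WARNING: secret shorter than 32 characters"
done
```

Run `terraform apply` in the same shell session so the exported variables are visible to Terraform.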

Scenario 1: Full Stack Deployment

Create a complete production-ready environment with new VPC and EKS cluster.

What Gets Created

  • VPC: 10.0.0.0/16 with public/private subnets across 3 AZs
  • EKS Cluster: Kubernetes 1.33 with managed node groups
  • RDS: Aurora PostgreSQL Serverless v2 (2-16 ACUs)
  • Redis: ElastiCache Redis cluster
  • IAM Roles: IRSA roles for EBS CSI, ALB Controller, CloudWatch, Application
  • Security: KMS encryption, security groups, private subnets
  • Monitoring: CloudWatch logs and metrics

Step-by-Step Deployment

1. Get the Terraform module:
# Clone the repository
git clone https://github.com/anysource-AI/Runlayer.git
cd Runlayer/infra/aws-helm/terraform-eks
2. Configure your deployment:
cp terraform.tfvars.example terraform.tfvars
nano terraform.tfvars
Edit terraform.tfvars:
# AWS Configuration
region  = "us-east-1"
account = "123456789012"  # Your AWS Account ID

# Project Configuration
project     = "myapp"
environment = "production"

# VPC Configuration - Create new VPC
create_vpc      = true
vpc_cidr        = "10.0.0.0/16"
private_subnets = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
public_subnets  = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]

# EKS Configuration - Create new cluster
create_eks      = true
cluster_version = "1.33"

# Restrict API access (recommended)
cluster_endpoint_public_access_cidrs = ["203.0.113.1/32"]  # Your office IP

# Node Groups
node_groups = {
  default = {
    instance_types = ["m6i.2xlarge"]
    scaling_config = {
      desired_size = 4
      max_size     = 10
      min_size     = 2
    }
    disk_size = 50
  }
}

# Database Configuration
database_name     = "anysource_db"
database_username = "dbadmin"
database_password = "CHANGE_ME_STRONG_PASSWORD"  # Use generated secret

database_config = {
  engine_version      = "16.8"
  min_capacity        = 2
  max_capacity        = 16
  deletion_protection = true  # Enable for production
}

# Redis Configuration
redis_node_type = "cache.t3.medium"

# Application Secrets
secret_key   = "CHANGE_ME_SECRET_KEY"   # Use generated secret
master_salt  = "CHANGE_ME_MASTER_SALT"  # Use generated secret
auth_api_key = "your-auth-api-key"

# Monitoring
enable_monitoring = true

# EKS Namespace
eks_namespace = "anysource-production"
3. Deploy infrastructure:
# Initialize Terraform
terraform init

# Review planned changes
terraform plan

# Apply configuration (takes 15-20 minutes)
terraform apply
4. Configure kubectl:
# Update kubeconfig
aws eks update-kubeconfig \
  --region us-east-1 \
  --name myapp-production-eks

# Verify cluster access
kubectl cluster-info
kubectl get nodes
5. Deploy application with Helm:
cd ../anysource-chart

# Add Helm repositories
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add jetstack https://charts.jetstack.io
helm repo update

# Deploy application
helm upgrade --install anysource . \
  --namespace anysource-production --create-namespace \
  -f values.example.yaml \
  --set externalDatabase.host="<rds-endpoint-from-terraform-output>" \
  --set externalDatabase.password="your-database-password" \
  --set backend.secrets.SECRET_KEY="your-secret-key" \
  --set backend.secrets.MASTER_SALT="your-master-salt" \
  --set backend.secrets.AUTH_API_KEY="your-auth-api-key"
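
Passing secrets as `--set` flags can leave them in shell history. The same values can instead live in a separate values file kept out of version control; a sketch using the keys from the command above (the file name `values-secrets.yaml` is illustrative):

```yaml
# values-secrets.yaml -- do not commit to version control
externalDatabase:
  host: "<rds-endpoint-from-terraform-output>"
  password: "your-database-password"

backend:
  secrets:
    SECRET_KEY: "your-secret-key"
    MASTER_SALT: "your-master-salt"
    AUTH_API_KEY: "your-auth-api-key"
```

Then deploy with `helm upgrade --install anysource . --namespace anysource-production --create-namespace -f values.example.yaml -f values-secrets.yaml` (later `-f` files override earlier ones).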

Architecture

Cost Estimation

| Component     | Configuration             | Monthly Cost      |
|---------------|---------------------------|-------------------|
| EKS Cluster   | Control plane             | $73               |
| EC2 Nodes     | 4x m6i.2xlarge            | $400-500          |
| RDS Aurora    | 2-16 ACUs                 | $60-240           |
| ElastiCache   | cache.t3.medium           | $40-50            |
| NAT Gateway   | 3 AZs                     | $100-120          |
| ALB           | Application Load Balancer | $20-25            |
| Data Transfer | Varies by usage           | $20-50            |
| CloudWatch    | Logs and metrics          | $10-30            |
| Total         |                           | $723-1,088/month  |
Costs vary by region and usage patterns. Use AWS Pricing Calculator for precise estimates.

Scenario 2: Existing VPC Deployment

Deploy EKS cluster into your existing VPC infrastructure.

Prerequisites

Your existing VPC must have:
  1. Private Subnets (required):
    • At least 2 private subnets across different AZs
    • NAT Gateway for internet access
    • Sufficient IP address space for pods
  2. Public Subnets (recommended):
    • At least 2 public subnets across different AZs
    • Internet Gateway attached
  3. Subnet Tags (for auto-discovery):
    # Private subnets
    kubernetes.io/role/internal-elb = "1"
    kubernetes.io/cluster/myapp-production-eks = "shared"
    
    # Public subnets
    kubernetes.io/role/elb = "1"
    kubernetes.io/cluster/myapp-production-eks = "shared"
    
  4. VPC Settings:
    • DNS hostnames enabled
    • DNS resolution enabled

Configuration

# terraform.tfvars
region  = "us-east-1"
account = "123456789012"

# Use existing VPC
create_vpc = false
vpc_id     = "vpc-0a1b2c3d4e5f67890"

# Provide existing subnet IDs
private_subnet_ids = ["subnet-0a1b2c3d", "subnet-1e2f3g4h", "subnet-2i3j4k5l"]
public_subnet_ids  = ["subnet-6m7n8o9p", "subnet-7q8r9s0t", "subnet-8u9v0w1x"]

# Create new EKS cluster
create_eks      = true
cluster_version = "1.33"

# ... rest of configuration same as Scenario 1

Tag Your Subnets

If your subnets aren’t tagged, run this script:
#!/bin/bash
CLUSTER_NAME="myapp-production-eks"
REGION="us-east-1"

# Tag private subnets
aws ec2 create-tags --region $REGION \
  --resources subnet-0a1b2c3d subnet-1e2f3g4h subnet-2i3j4k5l \
  --tags \
    Key=kubernetes.io/role/internal-elb,Value=1 \
    Key=kubernetes.io/cluster/$CLUSTER_NAME,Value=shared

# Tag public subnets
aws ec2 create-tags --region $REGION \
  --resources subnet-6m7n8o9p subnet-7q8r9s0t subnet-8u9v0w1x \
  --tags \
    Key=kubernetes.io/role/elb,Value=1 \
    Key=kubernetes.io/cluster/$CLUSTER_NAME,Value=shared

What Gets Created

  • ✅ EKS Cluster: New Kubernetes cluster in your VPC
  • ✅ Node Groups: Managed node groups
  • ✅ RDS Database: Aurora PostgreSQL
  • ✅ Redis Cache: ElastiCache Redis
  • ✅ IAM Roles: All IRSA roles
  • ✅ Security Groups: For EKS, RDS, Redis
  • ❌ VPC: Uses your existing VPC
  • ❌ Subnets: Uses your existing subnets
  • ❌ NAT Gateway: Uses your existing NAT Gateway

Cost Savings

By using existing VPC infrastructure:
  • Save $100-120/month on NAT Gateway costs (if already provisioned)
  • Save $5-10/month on VPC Flow Logs (if already enabled)
  • Total Savings: ~$105-130/month

Scenario 3: Existing EKS Cluster

Add application infrastructure (RDS, Redis, IAM roles) to an existing EKS cluster.

When to Use This

  • ✅ Platform team manages EKS, application teams manage apps
  • ✅ Multiple applications share the same EKS cluster
  • ✅ EKS cluster managed outside Terraform
  • ✅ You only need application infrastructure

Prerequisites

  1. Existing EKS Cluster:
    • Kubernetes version 1.19+
    • OIDC provider enabled
  2. Required Add-ons (must be pre-installed):
    • VPC CNI
    • kube-proxy
    • CoreDNS
    • EBS CSI Driver with IRSA role
    • AWS Load Balancer Controller with IRSA role
  3. Cluster Information:
    • Cluster name
    • OIDC provider ARN

Get OIDC Provider ARN

# Get OIDC issuer URL
aws eks describe-cluster \
  --name your-cluster-name \
  --region us-east-1 \
  --query "cluster.identity.oidc.issuer" \
  --output text

# Output: https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE

# Convert to ARN format:
# arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE
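
The URL-to-ARN conversion can be scripted rather than done by hand. A small sketch using the example account ID and issuer URL from above (in practice, the issuer comes from the `aws eks describe-cluster` call shown earlier):

```shell
# Build the OIDC provider ARN from the cluster's OIDC issuer URL
ACCOUNT_ID="123456789012"
OIDC_ISSUER="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# Strip the https:// scheme and prepend the IAM oidc-provider prefix
OIDC_PROVIDER_ARN="arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_ISSUER#https://}"
echo "$OIDC_PROVIDER_ARN"
```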

Configuration

# terraform.tfvars
region  = "us-east-1"
account = "123456789012"

# Use existing VPC (required)
create_vpc = false
vpc_id     = "vpc-0a1b2c3d4e5f67890"
private_subnet_ids = ["subnet-0a1b2c3d", "subnet-1e2f3g4h", "subnet-2i3j4k5l"]
public_subnet_ids  = ["subnet-6m7n8o9p", "subnet-7q8r9s0t", "subnet-8u9v0w1x"]

# Use existing EKS cluster
create_eks                 = false
existing_cluster_name      = "shared-production-cluster"
existing_oidc_provider_arn = "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"

# Database Configuration
database_name     = "anysource_db"
database_username = "dbadmin"
database_password = "your-strong-password"
database_config = {
  engine_version = "16.8"
  min_capacity   = 2
  max_capacity   = 16
}

# Redis Configuration
redis_node_type = "cache.t3.medium"

# Application Secrets
secret_key   = "your-secret-key"
master_salt  = "your-master-salt"
auth_api_key = "your-auth-api-key"

# EKS Namespace
eks_namespace = "anysource-production"

What Gets Created

  • ✅ Application IRSA Role: IAM role for your application pods with permissions for Bedrock and Secrets Manager
  • ✅ RDS Database: Aurora PostgreSQL Serverless v2
  • ✅ Redis Cache: ElastiCache Redis
  • ✅ Secrets: AWS Secrets Manager secrets
  • ✅ Security Groups: For RDS and Redis

What Does NOT Get Created

  • ❌ EKS Cluster: Uses your existing cluster
  • ❌ Node Groups: Uses your existing nodes
  • ❌ Cluster Add-ons: Uses your existing add-ons
  • ❌ System IRSA Roles: EBS CSI, ALB Controller, CloudWatch
  • ❌ KMS Key: Uses your existing cluster encryption

Cost Savings

By using existing EKS infrastructure:
  • Save $73/month on EKS control plane
  • Save $400-500/month on EC2 nodes (if shared)
  • Save $100-120/month on NAT Gateways (if shared)
  • Total Savings: ~$573-693/month
Estimated Cost: $150-395/month (RDS + Redis + data transfer only)

Multi-Application Example

Deploy multiple applications to the same cluster:
# Application 1
module "app1" {
  source = "./terraform-eks"
  
  create_eks                 = false
  existing_cluster_name      = "shared-cluster"
  existing_oidc_provider_arn = "arn:aws:iam::123456789012:oidc-provider/..."
  
  project       = "app1"
  eks_namespace = "app1-production"
  database_name = "app1_db"
  # ... app1 configuration
}

# Application 2
module "app2" {
  source = "./terraform-eks"
  
  create_eks                 = false
  existing_cluster_name      = "shared-cluster"
  existing_oidc_provider_arn = "arn:aws:iam::123456789012:oidc-provider/..."
  
  project       = "app2"
  eks_namespace = "app2-production"
  database_name = "app2_db"
  # ... app2 configuration
}

Using as a Terraform Module

Reference this module from your own Terraform project:
module "anysource_infrastructure" {
  source = "git::https://github.com/anysource-AI/Runlayer.git//infra/aws-helm/terraform-eks?ref=v1.0.0"
  
  # Core Configuration
  environment = "production"
  project     = "myapp"
  region      = "us-east-1"
  account     = "123456789012"
  
  # Choose your deployment mode
  create_vpc = true
  create_eks = true
  
  # ... rest of configuration
}

# Use module outputs
output "cluster_name" {
  value = module.anysource_infrastructure.cluster_name
}

output "database_endpoint" {
  value     = module.anysource_infrastructure.database_endpoint
  sensitive = true
}

Security Best Practices

Production Checklist

  • Restrict API Access: Use cluster_endpoint_public_access_cidrs to limit access
  • Enable Encryption: Set enable_cluster_encryption = true
  • Private Endpoints: Consider cluster_endpoint_public_access = false
  • Strong Secrets: Use 32+ character random strings
  • Deletion Protection: Enable for RDS in production
  • Backup Retention: Set appropriate retention periods
  • Monitoring: Enable CloudWatch monitoring
  • VPC Flow Logs: Enable for network monitoring
  • IAM Least Privilege: Review and restrict IAM policies
  • Regular Updates: Keep Kubernetes version current
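
The checklist items above translate into a handful of Terraform variables. A hedged terraform.tfvars fragment (variable names follow those used elsewhere in this guide; confirm them against the module's variables.tf):

```hcl
# terraform.tfvars -- hardened production settings

# Limit Kubernetes API access to known CIDRs
cluster_endpoint_public_access_cidrs = ["203.0.113.1/32"]

# Encrypt Kubernetes secrets at rest with KMS
enable_cluster_encryption = true

# Protect the database from accidental deletion
database_config = {
  engine_version      = "16.8"
  min_capacity        = 2
  max_capacity        = 16
  deletion_protection = true
}

# Keep CloudWatch monitoring enabled
enable_monitoring = true
```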

Secrets Management

Never commit secrets to version control! Use one of these approaches:
  1. Environment Variables:
    export TF_VAR_database_password="..."
    export TF_VAR_secret_key="..."
    terraform apply
    
  2. AWS Secrets Manager:
    # Store secrets in AWS
    aws secretsmanager create-secret \
      --name anysource-production-secrets \
      --secret-string file://secrets.json
    
    # Reference in Terraform
    data "aws_secretsmanager_secret_version" "secrets" {
      secret_id = "anysource-production-secrets"
    }
    
  3. Terraform Cloud/Enterprise:
    • Store sensitive variables in Terraform Cloud
    • Mark as sensitive
    • Use workspace-specific values
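
Continuing the Secrets Manager approach above, the retrieved JSON can be decoded and wired into the module inputs. A sketch, assuming secrets.json contains the keys shown (the key names and module wiring are illustrative):

```hcl
# Fetch the secret created with the AWS CLI above
data "aws_secretsmanager_secret_version" "secrets" {
  secret_id = "anysource-production-secrets"
}

locals {
  # Parse the JSON secret string into a map (assumed key layout)
  app_secrets = jsondecode(data.aws_secretsmanager_secret_version.secrets.secret_string)
}

module "anysource_infrastructure" {
  source = "git::https://github.com/anysource-AI/Runlayer.git//infra/aws-helm/terraform-eks?ref=v1.0.0"

  # ... other configuration ...

  database_password = local.app_secrets["database_password"]
  secret_key        = local.app_secrets["secret_key"]
  master_salt       = local.app_secrets["master_salt"]
}
```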

Troubleshooting

Terraform Apply Fails

Common Causes:
  • Insufficient IAM permissions
  • Service quota limits reached
  • Subnet IP address exhaustion
Solution:
# Check IAM permissions
aws sts get-caller-identity

# Check service quotas
aws service-quotas list-service-quotas \
  --service-code eks

# Verify subnet CIDR blocks
aws ec2 describe-subnets --subnet-ids subnet-xxx
Nodes Fail to Join the Cluster

Common Causes:
  • Security group misconfiguration
  • IAM role issues
  • Subnet routing problems
Solution:
# Check node group status
aws eks describe-nodegroup \
  --cluster-name myapp-production-eks \
  --nodegroup-name default

# Check node logs
kubectl logs -n kube-system -l k8s-app=aws-node
Database Connection Failures

Common Causes:
  • Security group rules
  • Wrong endpoint
  • Password mismatch
Solution:
# Test database connectivity from a pod
kubectl run -it --rm debug --image=postgres:16 --restart=Never -- \
  psql -h <rds-endpoint> -U dbadmin -d anysource_db

# Check security groups
aws ec2 describe-security-groups --group-ids sg-xxx
Cost Optimization Tips:
  1. Right-size node instances:
    • Use smaller instances for development
    • Enable cluster autoscaler
  2. Optimize database:
    • Lower min_capacity for non-production
    • Reduce backup retention
  3. Reduce NAT Gateway costs:
    • Use single NAT Gateway for development
    • Consider VPC endpoints for AWS services
  4. Monitor usage:
    # Check actual resource usage
    kubectl top nodes
    kubectl top pods --all-namespaces
    
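Gateway-type VPC endpoints carry no hourly charge, so routing S3 traffic through one (instead of the NAT Gateway) directly reduces data-processing costs. A hedged Terraform sketch of the "VPC endpoints" tip above (VPC ID, region, and route table IDs are placeholder values, not outputs of this module):

```hcl
# Gateway endpoint for S3: S3 traffic from private subnets bypasses
# the NAT Gateway (gateway endpoints have no hourly charge)
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = "vpc-0a1b2c3d4e5f67890"
  service_name      = "com.amazonaws.us-east-1.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = ["rtb-0example1", "rtb-0example2"] # private route tables
}
```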

Maintenance and Updates

Kubernetes Version Upgrades

# Check current version
kubectl version

# Update cluster version in terraform.tfvars
cluster_version = "1.33"

# Apply upgrade
terraform plan
terraform apply

# Managed node groups are upgraded by Terraform as part of the apply

Backup and Disaster Recovery

Automated Backups:
  • RDS: Daily automated backups (configurable retention)
  • EKS: Backup using Velero or AWS Backup
Manual Backup:
# Create RDS snapshot
aws rds create-db-cluster-snapshot \
  --db-cluster-identifier myapp-production-eks-db \
  --db-cluster-snapshot-identifier manual-backup-$(date +%Y%m%d)

# Backup Kubernetes resources
kubectl get all --all-namespaces -o yaml > k8s-backup.yaml

Support

For issues and questions, open an issue on the Runlayer GitHub repository.