Hey there! In this project documentation, I’m going to walk you through how I deployed the DeepSeek Model R1 on AWS using Terraform and GitHub Actions. If you’ve ever tried deploying a machine learning model, you know it can get pretty complicated—especially when you’re juggling multiple AWS services. To make things easier, I decided to automate the whole process using Terraform for infrastructure as code and GitHub Actions for CI/CD. Spoiler alert: it worked like a charm!
This project involved setting up an EC2 instance, an Application Load Balancer (ALB), security groups, IAM roles, and even a custom domain using Route 53. The best part? Everything was automated, so I didn’t have to manually configure resources every time I made a change. Whether you’re a seasoned DevOps pro or just getting started with cloud deployments, I hope this walkthrough gives you some useful insights (and maybe saves you a few headaches along the way).
See the video below for the project breakdown and result
In this update, I’ve migrated my EC2 instance to a private subnet for improved security while still ensuring seamless access for application configuration and management. Here’s what changed compared to my first deployment:
✅ Private Subnet Deployment – The EC2 instance now runs in a private subnet instead of a public one.
✅ Internet Access via NAT Gateway – The instance can pull updates and dependencies while remaining inaccessible from the public internet.
✅ AWS Systems Manager (SSM) for Secure Access – Instead of SSH, I’m using SSM to manage and interact with the instance securely.
✅ AWS WAF for Backend Protection – Added AWS Web Application Firewall (WAF) to safeguard against malicious traffic and attacks.
This approach enhances security while maintaining full operational control over the deployment. Next steps? Further optimizing performance and security!
Table of Contents
- Introduction
- Project Overview
- Terraform Configuration
- Provider Configuration
- Security Groups
- Load Balancer (ALB)
- EC2 Instance
- IAM Roles and Instance Profile
- Route 53 DNS Record
- Terraform Backend (S3)
- AWS WAF Configuration
- GitHub Actions Workflow
- Workflow Triggers
- Setup Job
- Apply Job
- Post-Apply Job
- Destroy Job
- The Application in Action (Result)
- Challenges Faced & Lessons Learned
- Future Improvements
- Conclusion
1. Introduction
Deploying machine learning models in production can be a complex task, especially when it involves multiple AWS services. To streamline this process, I used Terraform to define the infrastructure as code and GitHub Actions to automate the deployment pipeline. This approach ensures consistency, scalability, and repeatability.
2. Project Overview
The goal of this project was to deploy the DeepSeek Model R1 on AWS, making it accessible via a web interface (OpenWebUI) and an API (Ollama). The infrastructure includes:
- EC2 Instance: Hosts the DeepSeek model in a Docker container and associated services in a private subnet.
- AWS Systems Manager (SSM): Provides secure access to the EC2 instance in the private subnet via VPC endpoints.
- Application Load Balancer (ALB): Distributes traffic to the EC2 instance and handles SSL termination.
- Security Groups: Control inbound and outbound traffic to the ALB and EC2 instance.
- IAM Roles: Provide the permissions the EC2 instance needs for AWS Systems Manager access.
- Route 53: Manages DNS records for the ALB. I reused a certificate I already had in us-east-1 and an existing public hosted zone in the same region.
- Terraform Backend: Stores the Terraform state file in an S3 bucket for team collaboration.
- AWS WAF: Protects the backend against malicious requests.
3. Terraform Configuration & Data
The Terraform configuration is the backbone of this project. It defines all the AWS resources required for the deployment. Below is a breakdown of the key components:
AWS Provider & VPC Configuration
The first step in the Terraform configuration is to define the AWS provider and specify the region:
provider "aws" {
region = "us-east-1"
}
# Fetch existing VPC
data "aws_vpc" "main_vpc" {
id = var.vpc_id
}
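One addition I'd suggest (not part of the original setup, and the version constraints here are my own assumptions) is pinning the Terraform and AWS provider versions, so the CI pipeline doesn't silently pick up a breaking provider release:
terraform {
  required_version = ">= 1.5.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}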
Security Groups
I created three security groups: one for the ALB, one for the EC2 instance, and one for the VPC endpoints. The ALB security group allows HTTPS traffic (port 443) from anywhere. The EC2 security group only accepts traffic from the ALB on ports 8080 (OpenWebUI) and 11434 (Ollama), and the VPC endpoint security group only accepts traffic from the EC2 instance's security group.
## Security Group for EC2 (Only ALB can access it)
resource "aws_security_group" "deepseek_ec2_sg" {
name = "deepseek_ec2_sg"
description = "Security group for EC2 instance"
vpc_id = data.aws_vpc.main_vpc.id
# Allow traffic from ALB
ingress {
from_port = 8080
to_port = 8080
protocol = "tcp"
security_groups = [aws_security_group.alb_sg.id]
}
ingress {
from_port = 11434
to_port = 11434
protocol = "tcp"
security_groups = [aws_security_group.alb_sg.id]
}
# Allow all outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
## Security Group for ALB (Allows direct access)
resource "aws_security_group" "alb_sg" {
name = "deepseek_alb_sg"
description = "Security group for ALB"
vpc_id = data.aws_vpc.main_vpc.id
# Allow HTTPS from anywhere
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# Allow all outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
## Security Group for VPC Endpoints
resource "aws_security_group" "endpoint_sg" {
name = "vpc-endpoint-sg"
description = "Security group for VPC Endpoints"
vpc_id = data.aws_vpc.main_vpc.id
# Allow traffic from EC2 to VPC Endpoints
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
security_groups = [aws_security_group.deepseek_ec2_sg.id]
}
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
security_groups = [aws_security_group.deepseek_ec2_sg.id]
}
# Allow all outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
Load Balancer (ALB) & Listener
The ALB listens on port 443 (HTTPS) and forwards traffic to the target group that contains the EC2 instance.
# Load Balancer
resource "aws_lb" "deepseek_lb" {
name = "deepseek-alb"
internal = false # Public ALB
load_balancer_type = "application"
security_groups = [aws_security_group.alb_sg.id]
subnets = var.public_subnet_ids # ALB must be in public subnets
}
## Listener for ALB (HTTPS) forwards traffic to OpenWebUI
resource "aws_lb_listener" "https_listener" {
load_balancer_arn = aws_lb.deepseek_lb.arn
port = 443
protocol = "HTTPS"
ssl_policy = "ELBSecurityPolicy-2016-08"
certificate_arn = var.certificate_arn
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.deepseek_tg.arn
}
}
# Target Groups
resource "aws_lb_target_group" "deepseek_tg" {
name = "deepseek-target-group"
port = 8080
protocol = "HTTP"
vpc_id = data.aws_vpc.main_vpc.id
health_check {
path = "/"
interval = 30
timeout = 5
healthy_threshold = 2
unhealthy_threshold = 2
}
}
EC2, IAM & VPC Endpoints
The EC2 instance is configured with a 48 GB gp3 root volume and an IAM instance profile that grants the permissions AWS Systems Manager needs. The instance is placed in a private subnet and associated with the EC2 security group. Note: a GPU instance such as p3.2xlarge or g4dn.xlarge would handle a bigger model and generate responses faster, but I couldn't get one approved by AWS at the time of the project, so I used c4.4xlarge.
# IAM Role for SSM
resource "aws_iam_role" "ssm_role" {
name = "EC2SSMRole"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Principal = {
Service = "ec2.amazonaws.com"
}
Action = "sts:AssumeRole"
}]
})
}
# Attach AmazonSSMManagedInstanceCore policy
resource "aws_iam_role_policy_attachment" "ssm_policy" {
role = aws_iam_role.ssm_role.name
policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
# IAM Instance Profile
resource "aws_iam_instance_profile" "ssm_instance_profile" {
name = "EC2SSMInstanceProfile"
role = aws_iam_role.ssm_role.name
}
# EC2 Instance
resource "aws_instance" "deepseek_ec2" {
ami = var.ami_id
instance_type = var.instance_type
subnet_id = var.private_subnet_ids[0]
vpc_security_group_ids = [aws_security_group.deepseek_ec2_sg.id]
iam_instance_profile = aws_iam_instance_profile.ssm_instance_profile.name
root_block_device {
volume_size = 48
volume_type = "gp3"
delete_on_termination = true
}
tags = {
Name = "DeepSeekModelInstance"
}
}
# Attach EC2 Instance to Target Group
resource "aws_lb_target_group_attachment" "deepseek_tg_attachment" {
target_group_arn = aws_lb_target_group.deepseek_tg.arn
target_id = aws_instance.deepseek_ec2.id
port = 8080
}
# VPC Endpoints for SSM
resource "aws_vpc_endpoint" "ssm" {
vpc_id = data.aws_vpc.main_vpc.id
service_name = "com.amazonaws.us-east-1.ssm"
vpc_endpoint_type = "Interface"
subnet_ids = var.private_subnet_ids
security_group_ids = [aws_security_group.endpoint_sg.id]
private_dns_enabled = true
}
# VPC Endpoint for EC2 Messages (Used by SSM)
resource "aws_vpc_endpoint" "ec2_messages" {
vpc_id = data.aws_vpc.main_vpc.id
service_name = "com.amazonaws.us-east-1.ec2messages"
vpc_endpoint_type = "Interface"
subnet_ids = var.private_subnet_ids
security_group_ids = [aws_security_group.endpoint_sg.id]
private_dns_enabled = true
}
# VPC Endpoint for SSM Messages (Used by SSM)
resource "aws_vpc_endpoint" "ssm_messages" {
vpc_id = data.aws_vpc.main_vpc.id
service_name = "com.amazonaws.us-east-1.ssmmessages"
vpc_endpoint_type = "Interface"
subnet_ids = var.private_subnet_ids
security_group_ids = [aws_security_group.endpoint_sg.id]
private_dns_enabled = true
}
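Outbound internet access from the private subnet (for apt packages and Docker image pulls) goes through the NAT Gateway mentioned in the update notes. That piece already existed in my VPC, so it isn't part of this project's Terraform, but a minimal sketch of what it involves looks roughly like this (variable names such as private_route_table_id are placeholders, not real variables from this project):
# NAT Gateway sketch (not part of this project's Terraform; my VPC already had one)
resource "aws_eip" "nat_eip" {
  domain = "vpc"
}

resource "aws_nat_gateway" "nat" {
  allocation_id = aws_eip.nat_eip.id
  subnet_id     = var.public_subnet_ids[0] # NAT Gateway must live in a public subnet
}

# Send internet-bound traffic from the private subnets through the NAT Gateway
resource "aws_route" "private_internet" {
  route_table_id         = var.private_route_table_id # placeholder variable
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.nat.id
}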
Route 53 DNS Record
A Route 53 alias record maps the custom domain to the ALB's DNS name. As mentioned above, I used an existing certificate for SSL and an existing public hosted zone in us-east-1.
resource "aws_route53_record" "deepseek_dns" {
zone_id = var.hosted_zone_id
name = "deepseek.fozdigitalz.com"
type = "A"
alias {
name = aws_lb.deepseek_lb.dns_name
zone_id = aws_lb.deepseek_lb.zone_id
evaluate_target_health = true
}
}
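If you'd rather not pass the certificate ARN and hosted zone ID around as variables, a hedged alternative is to look up the existing resources with data sources (the domain name here is illustrative):
data "aws_route53_zone" "main" {
  name         = "fozdigitalz.com"
  private_zone = false
}

data "aws_acm_certificate" "deepseek" {
  domain      = "fozdigitalz.com"
  statuses    = ["ISSUED"]
  most_recent = true
}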
AWS WAF Configuration
The web ACL below combines a rate-limiting rule with three AWS-managed rule groups and is associated with the ALB so traffic is filtered before it reaches the backend.
#AWS Web Application Firewall
resource "aws_wafv2_web_acl" "deepseek_waf" {
name = "deepseek-waf"
description = "WAF for ALB protecting backend"
scope = "REGIONAL"
default_action {
allow {}
}
# Rate Limiting Rule
rule {
name = "RateLimitRule"
priority = 1
action {
block {}
}
statement {
rate_based_statement {
limit = 150
aggregate_key_type = "IP"
}
}
visibility_config {
cloudwatch_metrics_enabled = true
metric_name = "RateLimit"
sampled_requests_enabled = true
}
}
# Amazon IP Reputation List (Blocks known bad IPs, reconnaissance, DDoS)
rule {
name = "AmazonIPReputationRule"
priority = 2
override_action {
none {}
}
statement {
managed_rule_group_statement {
vendor_name = "AWS"
name = "AWSManagedRulesAmazonIpReputationList"
# OPTIONAL: Override specific rules inside the group
rule_action_override {
action_to_use {
block {}
}
name = "AWSManagedIPReputationList"
}
rule_action_override {
action_to_use {
block {}
}
name = "AWSManagedReconnaissanceList"
}
rule_action_override {
action_to_use {
count {}
}
name = "AWSManagedIPDDoSList"
}
}
}
visibility_config {
cloudwatch_metrics_enabled = true
metric_name = "AmazonIPReputationRule"
sampled_requests_enabled = true
}
}
# AWS Managed Known Bad Inputs Rule Set
rule {
name = "KnownBadInputsRule"
priority = 3
override_action {
none {}
}
statement {
managed_rule_group_statement {
vendor_name = "AWS"
name = "AWSManagedRulesKnownBadInputsRuleSet"
}
}
visibility_config {
cloudwatch_metrics_enabled = true
metric_name = "KnownBadInputsRule"
sampled_requests_enabled = true
}
}
# AWS Managed Common Rule Set
rule {
name = "CommonRuleSet"
priority = 4
override_action {
none {} # Ensures AWS WAF applies its built-in block actions
}
statement {
managed_rule_group_statement {
vendor_name = "AWS"
name = "AWSManagedRulesCommonRuleSet"
# Override specific rules that are set to "Count" by default, so they actually block bad traffic.
rule_action_override {
action_to_use {
block {}
}
name = "CrossSiteScripting_URIPATH_RC_COUNT"
}
rule_action_override {
action_to_use {
block {}
}
name = "CrossSiteScripting_BODY_RC_COUNT"
}
rule_action_override {
action_to_use {
block {}
}
name = "CrossSiteScripting_QUERYARGUMENTS_RC_COUNT"
}
rule_action_override {
action_to_use {
block {}
}
name = "CrossSiteScripting_COOKIE_RC_COUNT"
}
}
}
visibility_config {
cloudwatch_metrics_enabled = true
metric_name = "CommonRuleSet"
sampled_requests_enabled = true
}
}
visibility_config {
cloudwatch_metrics_enabled = true
metric_name = "deepseek-waf"
sampled_requests_enabled = true
}
}
#WAF Attachment to ALB
resource "aws_wafv2_web_acl_association" "deepseek_waf_alb" {
resource_arn = aws_lb.deepseek_lb.arn
web_acl_arn = aws_wafv2_web_acl.deepseek_waf.arn
depends_on = [aws_lb.deepseek_lb,
aws_wafv2_web_acl.deepseek_waf
]
}
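One thing the web ACL above doesn't configure is logging. A sketch of what that could look like with a CloudWatch Logs destination, assuming a log group of my own naming (AWS WAF requires the name to start with aws-waf-logs-):
resource "aws_cloudwatch_log_group" "waf_logs" {
  name              = "aws-waf-logs-deepseek" # prefix mandated by WAF logging
  retention_in_days = 30
}

resource "aws_wafv2_web_acl_logging_configuration" "deepseek_waf_logging" {
  resource_arn            = aws_wafv2_web_acl.deepseek_waf.arn
  log_destination_configs = [aws_cloudwatch_log_group.waf_logs.arn]
}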
Terraform Backend (S3)
The Terraform state file is stored in an S3 bucket to enable team collaboration and state management.
terraform {
backend "s3" {
bucket = "foz-terraform-state-bucket"
key = "infra.tfstate"
region = "us-east-1"
encrypt = true
}
}
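As written, the backend doesn't lock the state, so two workflow runs kicked off at the same time could race each other. A hedged improvement is DynamoDB state locking (the table name is a placeholder, and a table with a LockID string hash key has to exist before terraform init):
terraform {
  backend "s3" {
    bucket         = "foz-terraform-state-bucket"
    key            = "infra.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock" # placeholder table used for state locking
  }
}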
Variables Configuration
The variables.tf file defines all the input variables required for the Terraform configuration. These variables make the configuration reusable and customizable.
variable "vpc_id" {
description = "The VPC ID where resources will be deployed"
type = string
}
variable "public_subnet_ids" {
description = "List of public subnet IDs for ALB"
type = list(string)
}
variable "private_subnet_ids" {
description = "List of private subnet IDs for EC2 instances"
type = list(string)
}
variable "ami_id" {
description = "AMI ID for the EC2 instance"
type = string
}
variable "instance_type" {
description = "Instance type for the EC2 instance"
type = string
}
variable "certificate_arn" {
description = "ARN of the SSL certificate for HTTPS"
type = string
}
variable "hosted_zone_id" {
description = "Route 53 hosted zone ID for the domain"
type = string
}
Terraform.tfvars
The terraform.tfvars file assigns values to the variables defined in variables.tf. This file is typically not committed to version control (e.g., Git) for security reasons, as it may contain sensitive information like AWS credentials, so I added terraform.tfvars to .gitignore so it won't be tracked. Instead, I provided the values through GitHub environment variables and secrets. Also, the values below are made up and not real.
vpc_id = "vpc-012345678910"
private_subnet_ids = ["subnet-0012345678910", "subnet-0012345678910", "subnet-0012345678910"]
instance_type = "c4.4xlarge" # bigger for production but t2.2xlarge test
ami_id = "ami-04b4f1a9cf54c11d0"
certificate_arn = "arn:aws:acm:us-east-1:012345678910:certificate/697cf89b-9931-435f-a5f0-c8f012345678910c"
hosted_zone_id = "Z012345678960X9UZZVPUYYW0H"
public_subnet_ids = ["subnet-0012345678910","subnet-012345678910"]
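Those same values live in GitHub as secrets. If you prefer to seed them from the terminal instead of the UI, here's a sketch with the GitHub CLI (same made-up values as above); note that the subnet-ID secrets must contain the full HCL list literal, because the workflow interpolates them into terraform.tfvars unquoted:
gh secret set VPC_ID --body "vpc-012345678910"
gh secret set INSTANCE_TYPE --body "c4.4xlarge"
gh secret set PUBLIC_SUBNET_IDS --body '["subnet-0012345678910","subnet-012345678910"]'
gh secret set PRIVATE_SUBNET_IDS --body '["subnet-0012345678910","subnet-0012345678910"]'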
Output Configuration
The output.tf file defines the outputs Terraform displays after applying the configuration. I used these outputs, such as the EC2 instance ID and the ALB's DNS name, in my GitHub Actions workflow to configure the EC2 instance.
output "ec2_public_ip" {
description = "Public IP of the EC2 instance"
value = aws_instance.deepseek_ec2.public_ip
}
output "lb_url" {
description = "DNS name of the ALB"
value = aws_lb.deepseek_lb.dns_name
}
output "deepseek_ec2_sg_id" {
value = aws_security_group.deepseek_ec2_sg.id
}
output "deepseek_ec2_id" {
value = aws_instance.deepseek_ec2.id
}
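One note: because the instance now lives in a private subnet, the ec2_public_ip output will come back empty. A small tweak I'd consider is exposing the private IP instead:
output "ec2_private_ip" {
  description = "Private IP of the EC2 instance"
  value       = aws_instance.deepseek_ec2.private_ip
}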
4. GitHub Actions Workflow
The GitHub Actions workflow automates the deployment process. It consists of four main jobs: setup, apply, post-apply, and destroy. My workflow will be triggered by pushing to the main branch or manually through the GitHub UI using workflow dispatch with an input to choose the action (apply/destroy).
Workflow Triggers
The workflow is triggered on a push to the main branch or manually via the workflow_dispatch event. The manual trigger lets you choose between apply (to create resources) and destroy (to tear down resources).
on:
push:
branches:
- main
workflow_dispatch:
inputs:
action:
description: "Choose action (apply/destroy)"
required: true
default: "apply"
type: choice
options:
- apply
- destroy
Setup Job
The setup job initializes the environment by checking out the code, setting up Terraform, and configuring AWS credentials.
name: Deploying DeepSeek Model R1 on AWS via Terraform & GitHub Actions
on:
push:
branches:
- main
workflow_dispatch:
inputs:
action:
description: "Choose action (apply/destroy)"
required: true
default: "apply"
type: choice
options:
- apply
- destroy
jobs:
setup:
runs-on: ubuntu-latest
steps:
- name: Checkout Code
uses: actions/checkout@v4
- name: Set up Terraform
uses: hashicorp/setup-terraform@v3
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ secrets.AWS_DEFAULT_REGION }}
Apply Job
The apply job runs Terraform to create the infrastructure. It generates a terraform.tfvars file from GitHub Secrets, initializes Terraform, applies the configuration, and exposes the EC2 instance ID as a job output. The application itself is set up afterwards by the post-apply job, which connects to the instance over SSM to install Docker and deploy Ollama and OpenWebUI.
apply:
runs-on: ubuntu-latest
outputs:
ec2_instance_id: ${{ steps.get_ec2_id.outputs.ec2_id }}
needs: setup
if: |
(github.event_name == 'workflow_dispatch' && github.event.inputs.action == 'apply') ||
(github.event_name == 'push' && !contains(github.event.head_commit.message, 'destroy'))
steps:
- name: Checkout Code
uses: actions/checkout@v4
- name: Set up Terraform
uses: hashicorp/setup-terraform@v3
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ secrets.AWS_DEFAULT_REGION }}
- name: Create terraform.tfvars
run: |
cat <<EOF > terraform.tfvars
ami_id = "${{ secrets.AMI_ID }}"
certificate_arn = "${{ secrets.CERTIFICATE_ARN }}"
vpc_id = "${{ secrets.VPC_ID }}"
hosted_zone_id = "${{ secrets.HOSTED_ZONE_ID }}"
instance_type = "${{ secrets.INSTANCE_TYPE }}"
aws_access_key_id = "${{ secrets.AWS_ACCESS_KEY_ID }}"
aws_secret_access_key = "${{ secrets.AWS_SECRET_ACCESS_KEY }}"
aws_region = "${{ secrets.AWS_DEFAULT_REGION }}"
public_subnet_ids = ${{ secrets.PUBLIC_SUBNET_IDS }}
private_subnet_ids = ${{ secrets.PRIVATE_SUBNET_IDS }}
EOF
- name: Mask AWS Account ID in Logs
run: echo "::add-mask::${{ secrets.AWS_ACCOUNT_ID }}"
- name: Terraform Init & Apply
run: |
terraform init \
-backend-config="bucket=${{ secrets.TERRAFORM_STATE_BUCKET }}" \
-backend-config="key=infra.tfstate" \
-backend-config="region=${{ secrets.AWS_DEFAULT_REGION }}"
terraform apply -auto-approve -var-file=terraform.tfvars
- name: Retrieve EC2 Instance ID
id: get_ec2_id
run: |
echo "Retrieving EC2 Instance ID..."
EC2_ID=$(terraform output -raw deepseek_ec2_id)
echo "EC2_INSTANCE_ID=$EC2_ID" >> $GITHUB_ENV
echo "::set-output name=ec2_id::$EC2_ID"
- name: Verify EC2 Instance ID
run: |
echo "EC2_INSTANCE_ID=${{ env.EC2_INSTANCE_ID }}"
if [ -z "${{ env.EC2_INSTANCE_ID }}" ]; then
echo "EC2 instance ID is empty or invalid."
exit 1
fi
- name: Wait for EC2
run: sleep 60
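The fixed sleep 60 is good enough here, but a hedged alternative is to let the AWS CLI poll until the instance actually passes its status checks:
      - name: Wait for EC2 status checks (alternative to the fixed sleep)
        run: |
          aws ec2 wait instance-status-ok \
            --instance-ids "${{ env.EC2_INSTANCE_ID }}" \
            --region ${{ secrets.AWS_DEFAULT_REGION }}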
Post-Apply Job
The post-apply job configures the EC2 instance after Terraform provisions it. It verifies the instance's SSM connection, checks that the SSM agent is running, and installs Docker along with the necessary dependencies. After rebooting the instance, it deploys the DeepSeek Model R1 inside an Ollama container and sets up Open WebUI for interaction. Finally, it confirms that the WebUI is accessible via https://deepseek.fozdigitalz.com. This gives a fully automated deployment of the model and web interface on AWS.
post_apply:
runs-on: ubuntu-latest
needs: apply
if: success()
env:
EC2_INSTANCE_ID: ${{ needs.apply.outputs.ec2_instance_id }}
steps:
- name: Checkout Code
uses: actions/checkout@v4
- name: Set up Terraform
uses: hashicorp/setup-terraform@v3
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ secrets.AWS_DEFAULT_REGION }}
- name: Create terraform.tfvars
run: |
cat <<EOF > terraform.tfvars
ami_id = "${{ secrets.AMI_ID }}"
certificate_arn = "${{ secrets.CERTIFICATE_ARN }}"
vpc_id = "${{ secrets.VPC_ID }}"
hosted_zone_id = "${{ secrets.HOSTED_ZONE_ID }}"
instance_type = "${{ secrets.INSTANCE_TYPE }}"
aws_access_key_id = "${{ secrets.AWS_ACCESS_KEY_ID }}"
aws_secret_access_key = "${{ secrets.AWS_SECRET_ACCESS_KEY }}"
aws_region = "${{ secrets.AWS_DEFAULT_REGION }}"
public_subnet_ids = ${{ secrets.PUBLIC_SUBNET_IDS }}
private_subnet_ids = ${{ secrets.PRIVATE_SUBNET_IDS }}
EOF
- name: Verify SSM Connection
run: |
echo "Verifying SSM Connection..."
aws ssm describe-instance-information --region ${{ secrets.AWS_DEFAULT_REGION }} \
--query "InstanceInformationList[?InstanceId=='${{ env.EC2_INSTANCE_ID }}']" \
--output json
- name: Check SSM Agent Status
run: |
aws ssm send-command \
--document-name "AWS-RunShellScript" \
--instance-ids "${{ env.EC2_INSTANCE_ID }}" \
--parameters '{"commands":["sudo systemctl status amazon-ssm-agent"]}' \
--region ${{ secrets.AWS_DEFAULT_REGION }}
- name: Install Docker via SSM
run: |
aws ssm send-command \
--document-name "AWS-RunShellScript" \
--targets "[{\"Key\":\"InstanceIds\",\"Values\":[\"${{ env.EC2_INSTANCE_ID }}\"]}]" \
--parameters commands='[
"sudo apt-get update",
"sudo apt-get install -y docker.io docker-compose",
"sudo systemctl enable docker",
"sudo systemctl start docker",
"sudo usermod -aG docker ubuntu",
"sudo sed -i s/^ENABLED=1/ENABLED=0/ /etc/apt/apt.conf.d/20auto-upgrades",
"sleep 10",
"sudo reboot"
]' \
--region ${{ secrets.AWS_DEFAULT_REGION }}
- name: Wait for EC2 instance to reboot
run: sleep 50
- name: Run DeepSeek Model and WebUI via SSM
run: |
aws ssm send-command \
--document-name "AWS-RunShellScript" \
--targets "[{\"Key\":\"InstanceIds\",\"Values\":[\"${{ env.EC2_INSTANCE_ID }}\"]}]" \
--parameters commands='[
"docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama",
"sleep 20",
"docker exec ollama ollama pull deepseek-r1:8b",
"sleep 15",
"docker exec -d ollama ollama serve",
"sleep 15",
"docker run -d -p 8080:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main",
"sleep 15"
]' \
--region ${{ secrets.AWS_DEFAULT_REGION }}
echo "Waiting for WebUI to start..."
sleep 30
- name: Confirm WebUI is Running
run: |
aws ssm send-command \
--document-name "AWS-RunShellScript" \
--targets '[{"Key":"instanceIds","Values":["${{ env.EC2_INSTANCE_ID }}"]}]' \
--parameters '{"commands":["curl -I https://deepseek.fozdigitalz.com"]}' \
--region ${{ secrets.AWS_DEFAULT_REGION }}
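One caveat with send-command: it returns as soon as the command is queued, not when it finishes, so the confirmation step can pass before anything has actually run on the instance. A sketch of how the run block could instead capture the command ID and block until it completes:
COMMAND_ID=$(aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --instance-ids "${{ env.EC2_INSTANCE_ID }}" \
  --parameters '{"commands":["curl -I https://deepseek.fozdigitalz.com"]}' \
  --region ${{ secrets.AWS_DEFAULT_REGION }} \
  --query "Command.CommandId" --output text)

# Block until the command finishes, then print its output
aws ssm wait command-executed \
  --command-id "$COMMAND_ID" \
  --instance-id "${{ env.EC2_INSTANCE_ID }}" \
  --region ${{ secrets.AWS_DEFAULT_REGION }}

aws ssm get-command-invocation \
  --command-id "$COMMAND_ID" \
  --instance-id "${{ env.EC2_INSTANCE_ID }}" \
  --region ${{ secrets.AWS_DEFAULT_REGION }}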
Destroy Job
The destroy job tears down the infrastructure when triggered manually or via a commit message containing "destroy".
destroy:
runs-on: ubuntu-latest
needs: setup
if: |
(github.event_name == 'workflow_dispatch' && github.event.inputs.action == 'destroy') ||
(github.event_name == 'push' && contains(github.event.head_commit.message, 'destroy'))
steps:
- name: Checkout Code
uses: actions/checkout@v4
- name: Set up Terraform
uses: hashicorp/setup-terraform@v3
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ secrets.AWS_DEFAULT_REGION }}
- name: Create terraform.tfvars
run: |
cat <<EOF > terraform.tfvars
ami_id = "${{ secrets.AMI_ID }}"
certificate_arn = "${{ secrets.CERTIFICATE_ARN }}"
vpc_id = "${{ secrets.VPC_ID }}"
hosted_zone_id = "${{ secrets.HOSTED_ZONE_ID }}"
instance_type = "${{ secrets.INSTANCE_TYPE }}"
aws_access_key_id = "${{ secrets.AWS_ACCESS_KEY_ID }}"
aws_secret_access_key = "${{ secrets.AWS_SECRET_ACCESS_KEY }}"
aws_region = "${{ secrets.AWS_DEFAULT_REGION }}"
public_subnet_ids = ${{ secrets.PUBLIC_SUBNET_IDS }}
private_subnet_ids = ${{ secrets.PRIVATE_SUBNET_IDS }}
EOF
- name: Mask AWS Account ID in Logs
run: echo "::add-mask::${{ secrets.AWS_ACCOUNT_ID }}"
- name: Terraform Destroy
run: |
terraform init -reconfigure \
-backend-config="bucket=${{ secrets.TERRAFORM_STATE_BUCKET }}" \
-backend-config="key=infra.tfstate" \
-backend-config="region=${{ secrets.AWS_DEFAULT_REGION }}"
terraform destroy -auto-approve -var-file=terraform.tfvars
Completed workflow run
5. The Application in Action (Result)
After successfully deploying the DeepSeek Model R1 on AWS, I was able to access the OpenWebUI and interact with the model. Below are some screenshots demonstrating the setup and functionality:
1. OpenWebUI Interface
The OpenWebUI provides a user-friendly interface for interacting with the DeepSeek Model R1. Here’s a screenshot of the dashboard:
The OpenWebUI dashboard, accessible via the custom domain deepseek.fozdigitalz.com.
2. Model Interaction
I tested the model by asking it a few questions. Here’s an example of the model’s response:
Sample model responses (screenshots 1.1, 1.2, 2.1, 2.2): the DeepSeek Model R1 generating responses to sample queries.
3. Infrastructure Clean Up
I can destroy the infrastructure by triggering the workflow in a few different ways:
- By manually triggering the workflow from the GitHub UI and selecting the destroy action.
- By pushing to the main branch with a commit message that contains the word destroy.
- By running gh workflow run "workflow name" --field action=destroy locally (this requires the GitHub CLI to be installed).
Infrastructure cleanup using terraform
6. Challenges Faced & Lessons Learned
Having learned from the challenges of my first deployment, modifying the setup was easier this time: I moved the EC2 instance to a private subnet, switched from SSH to AWS SSM for secure access, and integrated AWS WAF for better protection. I still had to research best practices for the AWS-managed WAF rules to apply them effectively.
7. Future Improvements
While the deployment process is now functional, there are still opportunities to enhance scalability, security, and cost-efficiency.
In my first deployment, I considered improvements like moving EC2 to a private subnet, using SSM instead of SSH, and adding WAF, which I have now implemented. Additionally, I’m exploring CloudWatch for monitoring, auto-scaling for better resource management, and GPU instances for enhanced performance, though AWS approval for GPU is still pending. These refinements aim to make the deployment more secure, scalable, and cost-efficient.
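To make the CloudWatch idea concrete, here's a minimal sketch of where I'd likely start: a CPU alarm on the instance (the threshold and names are my own assumptions, and no alarm action is wired up yet):
resource "aws_cloudwatch_metric_alarm" "deepseek_cpu_high" {
  alarm_name          = "deepseek-ec2-cpu-high"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  comparison_operator = "GreaterThanThreshold"
  threshold           = 80
  period              = 300
  evaluation_periods  = 2
  dimensions = {
    InstanceId = aws_instance.deepseek_ec2.id
  }
}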
8. Conclusion
This deployment has evolved significantly from the initial setup, incorporating better security, automation, and scalability based on lessons learned. Moving EC2 to a private subnet, using AWS SSM instead of SSH, and integrating AWS WAF have strengthened security, while CloudWatch and auto-scaling are now key considerations for future improvements. As I continue refining the architecture, I'm also exploring GPU-powered instances for better performance once AWS approval is granted. These enhancements ensure a more resilient, efficient, and secure deployment of the DeepSeek Model R1 on AWS.