Introduction

Hey builders! πŸ‘‹

So I recently took on an exciting challenge: transforming a full-stack FastAPI and React application into a production-ready system with robust monitoring. While the application itself was already well-structured, my role was to bring it to production standards through proper containerization, orchestration, and monitoring/observability. Let me walk you through this journey.

Want to see the complete code? Check out the GitHub repository.

πŸ“‹ Prerequisites

Some things you'll need to have for this project:

  • Docker and Docker Compose installed. You can check the official documentation on how to install them on Ubuntu.

  • A domain name (for the SSL/TLS setup). I recommend getting one from Hostinger

  • Basic understanding of:
      • Docker containers and images
      • Reverse proxies
      • Monitoring concepts
      • Linux command line

  • Sufficient system resources:
      • At least 4GB RAM
      • 2 CPU cores
      • 20GB storage

The Challenge 🎯

When I first looked at the project, I saw a typical full-stack application with:

  • A FastAPI backend
  • A React frontend
  • PostgreSQL database
  • Basic authentication

My mission? Transform this into a production-grade system with:

  • Proper containerization
  • Automated SSL/TLS
  • Comprehensive monitoring
  • Efficient log management
  • Zero-downtime deployments

The Solution Architecture

I designed a modern DevOps architecture that looks like this (if you've been following my articles recently, you'll know I like these kinds of diagrams now πŸ˜…):

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                           Client Requests                               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                β”‚
                        β”Œβ”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”
                        β”‚    Traefik    β”‚
                        β”‚  Reverse Proxyβ”‚
                        β”‚   (SSL/TLS)   β”‚
                        β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
                                β”‚
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β”‚                       β”‚                       β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”       β”Œβ”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”       β”Œβ”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”
β”‚    Frontend   β”‚       β”‚    Backend    β”‚       β”‚    Adminer    β”‚
β”‚  (React/Nginx)│◄─────►│   (FastAPI)   │◄─────►│  (DB Admin)   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜       β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜       β””β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
        β”‚                       β”‚                       β”‚
        β”‚                       β”‚                       β”‚
        β”‚                β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”               β”‚
        └───────────────►│  PostgreSQL  β”‚β—„β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                         β”‚   Database   β”‚
                         β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
                                β”‚
                         β”Œβ”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”
                         β”‚  Monitoring  β”‚
                         β”‚    Stack     β”‚
                         β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
                                β”‚
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚             β”‚             β”‚             β”‚            β”‚
β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”    β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”
β”‚cAdvisor β”‚   β”‚Promtail β”‚   β”‚Loki   β”‚    β”‚Prometheusβ”‚  β”‚Grafana   β”‚
β”‚Containerβ”‚   β”‚Logs     β”‚   β”‚Log    β”‚    β”‚Metrics   β”‚  β”‚Dashboardsβ”‚
β”‚Metrics  β”‚   β”‚Collectorβ”‚   β”‚Storageβ”‚    β”‚Database  β”‚  β”‚& Alerts  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Application in Action

[Screenshots of the application running in production]

Containerization Strategy: Building Efficient Images 🐳

Multi-stage Builds: Optimizing Image Size

So let's look at how I achieved proper containerization.

One of the first challenges I faced was keeping the Docker images small and efficient. That's where multi-stage builds came in. Let me show you how I implemented this for both the frontend and backend:

Frontend Container

# Build stage
FROM node:18-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Production stage
FROM nginx:alpine
COPY --from=build /app/dist /usr/share/nginx/html
COPY nginx.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]

Why did I adopt multi-stage builds?

  • They reduce the final image size by excluding build tools.
    For the frontend, imagine the image size dropping from a whopping ~590MB with a single-stage build to just 50.1MB with the multi-stage build.

  • They separate build dependencies from runtime dependencies

  • They improve security by reducing the attack surface

  • They make for faster deployments thanks to smaller images

Single-stage build...

[Screenshot: single-stage image at ~590MB]

vs Multi-stage build...

[Screenshot: multi-stage image at 50.1MB]

Backend Container

I also applied a multi-stage build here:

# Build stage
FROM python:3.11-slim AS builder

WORKDIR /app

# Install only necessary build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Install specific version of poetry with export support
RUN pip install --no-cache-dir poetry==1.5.1

# Copy only dependency files first for better caching
COPY pyproject.toml poetry.lock* ./

# Generate requirements.txt - using the correct syntax for poetry export
RUN poetry export --without-hashes --format=requirements.txt > requirements.txt

# Copy the rest of the application
COPY . .

# Production stage
FROM python:3.11-slim
WORKDIR /app

# Install runtime dependencies only
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq5 \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements and install directly with pip
COPY --from=builder /app/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY --from=builder /app .

# Make startup script executable
RUN chmod +x /app/prestart.sh

# Set correct PYTHONPATH to ensure app imports work properly
ENV PYTHONPATH="${PYTHONPATH}:/app"

EXPOSE 8000
CMD ["bash", "-c", "cd /app && ./prestart.sh && uvicorn app.main:app --host 0.0.0.0 --port 8000"]

Database Management with Adminer

I added Adminer to the application stack to manage and inspect the database, and I configured secure access to it through Traefik.
I did this by simply adding the Traefik labels to the adminer service in the docker-compose file:

adminer:
  image: adminer
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.adminer.rule=Host(`michaeloxo.tech`) && PathPrefix(`/adminer`)"
    - "traefik.http.routers.adminer.entrypoints=websecure"
    - "traefik.http.routers.adminer.tls=true"

Adminer also supports multiple database types, so feel free to try it in your own application stacks too.

Volume Management

For data persistence for Postgres, Prometheus, and Grafana, I implemented named volumes:

volumes:
  postgres_data:
    driver: local
  prometheus_data:
    driver: local
  grafana_data:
    driver: local

This helps persist data across container restarts.
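
Each service then mounts its named volume at the path where the image keeps its state. For example, for Postgres (I'm assuming the service is called db here; the path is the image's default data directory):

db:
  volumes:
    - postgres_data:/var/lib/postgresql/data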

Network Isolation

I implemented proper network isolation using Docker networks:

networks:
  app-network:
    driver: bridge

This gives the containers a shared private network to communicate over while keeping them isolated from anything else running on the host.
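
Each service then opts into the network in its own definition. A trimmed sketch (service names assumed):

services:
  backend:
    networks:
      - app-network
  db:
    networks:
      - app-network

If I wanted stricter isolation, the database could even live on a separate internal network that only the backend joins.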

Health Checks

Every service includes health checks. The database, for example, has this health check:

healthcheck:
  test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
  interval: 5s
  timeout: 5s
  retries: 5

These health checks are important as they ensure that all the services are available, and we can detect problems early.
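
They also pair nicely with depends_on to control startup order. A minimal sketch of how the backend can wait for the database to be genuinely ready (again assuming the service is named db):

backend:
  depends_on:
    db:
      condition: service_healthy

With this, the backend only starts once the pg_isready check above passes, instead of racing the database at boot.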


Detailed Component Breakdown: The Monitoring Stack πŸ”

Now let's talk about observability and monitoring.

When I first approached this project, I knew I needed a monitoring solution that would be both powerful and maintainable.

After careful consideration, I settled on a modern stack that combines the best tools for container monitoring, metrics collection, and log aggregation. Let me walk you through each component and why I chose them.

The Foundation: cAdvisor and Container Metrics

The first piece of the puzzle was finding a way to monitor the containers effectively. That's where cAdvisor came in. What makes cAdvisor special is its zero-configuration approach - just mount the right volumes, and it starts collecting metrics automatically. In my setup, it watches over every container, tracking CPU usage, memory consumption, network I/O, and disk usage in real-time.

cadvisor:
  image: gcr.io/cadvisor/cadvisor:v0.47.0
  volumes:
    - /:/rootfs:ro
    - /var/run:/var/run:rw
    - /sys:/sys:ro
    - /var/lib/docker/:/var/lib/docker:ro

The beauty of cAdvisor lies in its simplicity. It exposes metrics in Prometheus format out of the box, making it a perfect fit for my monitoring stack. Every container's performance is now visible at a glance, helping us identify potential issues before they become problems.

Prometheus and Metrics Storage

For storing and querying our metrics, I chose Prometheus. Yeah I know, Prometheus is pretty much everyone's go-to when it comes to collecting and storing metrics. That's because of its pull-based architecture, which is more reliable than push-based systems, especially in containerized environments. My Prometheus configuration is quite clean and straightforward:

global:
  scrape_interval: 15s
  evaluation_interval: 15s     

scrape_configs:
  - job_name: 'prometheus'
    metrics_path: '/prometheus/metrics'
    static_configs:
      - targets: ['prometheus:9090']

  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

The Log Aggregation Duo: Loki and Promtail

For log management, I implemented a combination of Loki and Promtail. This choice was driven by the need for a lightweight yet powerful logging solution. Unlike traditional ELK stacks that can be resource-intensive, Loki and Promtail provide efficient log aggregation with minimal overhead.

loki:
  image: grafana/loki:latest
  volumes:
    - ./loki/loki-config.yml:/etc/loki/config.yml

promtail:
  image: grafana/promtail:latest
  volumes:
    - ./promtail/promtail-config.yml:/etc/promtail/config.yml
    - /var/log:/var/log:ro
    - /var/lib/docker/containers:/var/lib/docker/containers:ro

The synergy between these tools is impressive. Promtail collects logs from both the system and containers, while Loki stores them efficiently. What's particularly useful is how they use the same labeling system as Prometheus, making it easy to correlate logs with metrics.
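
To make that concrete, here's a minimal sketch of what the promtail-config.yml could look like. This assumes Docker's default json-file log driver and stock ports, so the version in the repo may differ:

server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: containers
    static_configs:
      - targets:
          - localhost
        labels:
          job: docker-logs
          __path__: /var/lib/docker/containers/*/*-json.log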

The Visualization Layer: Grafana

To bring all this data to life, I chose and implemented Grafana. I mean, Grafana just wonderfully ties everything together, providing beautiful dashboards and powerful querying capabilities. My Grafana setup is configured to work seamlessly with both Prometheus and Loki:

grafana:
  image: grafana/grafana:latest
  volumes:
    - grafana_data:/var/lib/grafana
  environment:
    - GF_SERVER_ROOT_URL=http://grafana:3000/grafana
    - GF_SERVER_SERVE_FROM_SUB_PATH=true
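
Under the hood, the "seamless" part is mostly datasource provisioning. Here's a sketch of a file you could drop into /etc/grafana/provisioning/datasources/ (the URLs assume the Prometheus sub-path from the scrape config earlier and Loki's default port):

apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090/prometheus
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100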

Here's what the Grafana dashboards look like in production:

[Screenshots of the Grafana dashboards in production]

The Traffic Manager: Traefik

Finally, to tie everything together, I implemented Traefik as my reverse proxy. This modern reverse proxy stands out for its automatic service discovery and dynamic configuration capabilities. My Traefik setup ensures secure access to all my monitoring tools. Just add these labels to the desired service in the docker-compose file:

labels:
  - "traefik.enable=true"
  - "traefik.http.routers.[service].rule=Host(`michaeloxo.tech`) && PathPrefix(`/[service]`)"
  - "traefik.http.routers.[service].entrypoints=websecure"
  - "traefik.http.routers.[service].tls=true"
  - "traefik.http.routers.[service].middlewares=global-middleware@file"

What makes this setup particularly effective is how all components work together.

cAdvisor collects metrics, Prometheus stores them, Loki and Promtail handle logs, Grafana visualizes everything, and Traefik ensures secure access. It's a well-oiled machine where each part plays its role perfectly.

The result? A comprehensive monitoring solution that provides real-time insights into our application's performance, helps us identify and troubleshoot issues quickly, and ensures we have the data we need to make informed decisions about our infrastructure.

(Btw, I recently implemented a similar but more complex monitoring stack on a product we're building, and of course I wrote an article about it. Read it here.)


Lessons Learned πŸ“š

Throughout this DevOps implementation journey, I've gathered some valuable insights that are worth noting:

Traefik was like magic!

  • Automatic SSL Was a Game-Changer. Before Traefik, I spent hours manually configuring and renewing SSL certificates. Now I never have to worry about certificate renewals again.

  • Middleware Chains Simplified Security. Creating reusable security configurations with middleware chains was a marvel. I could easily apply consistent security headers across all services with a single reference.

Containerization best practices

Multi-Stage Builds Transformed My Images

  • The satisfaction of seeing the frontend image size drop from 590MB to 50MB (over 90% smaller) was incredible
  • Eliminated unnecessary build tools from production images
  • Significantly improved deployment speed and reduced bandwidth usage

Implementing Proper Service Dependencies Using Healthchecks

One thing that made the setup even better was implementing proper service dependencies, using healthchecks for startup order (the depends_on snippet shown earlier).

This ensures services start in the right order and only when their dependencies are truly ready, not just running. It eliminated those annoying "connection refused" errors during startup and makes the system much more resilient to restarts.

The biggest takeaway from this project was how the right tooling can transform complex tasks into manageable ones. Traefik turned what would have been several hours of reverse proxy configuration into minutes, while the monitoring stack gave me insights I didn't even know I needed until I had them.


Conclusion πŸŽ‰

Phew! What a rewarding journey this has been! Taking a regular app and turning it into a containerized, monitored production system was quite the adventure.

Is it perfect? Nah, nothing ever is. But that's the beauty of DevOps - it's all about continuous improvement. I'm still tinkering with container sizes, playing with alert thresholds, and learning new security tricks. And tbh, that's the fun part!

The biggest win for me wasn't just getting everything up and running (though that felt amazing!), but seeing how all these pieces work together. Watching container metrics flow into Prometheus, visualizing them in Grafana, and catching issues before they become problems - it's like having superpowers lol.

I still have a list of improvements I want to make. I just thought of adding a CI/CD pipeline and integrating Terraform & Ansible to automate deployments, and I know I'll think of more.

But for now, I'm pretty good with this setup. It's running smoothly, it's secure, and most importantly - it's giving us the insights we need to keep getting better.

I do hope sharing my experience helps you in your own containerization and monitoring journey. Again, don't forget to check out my article focused exclusively on monitoring here.

If you'd like to explore the complete implementation, the entire project is available on GitHub. I welcome your feedback, issues, and pull requests!

Till the next one, happy building!