I was paying $20/month for ChatGPT Plus to help with support workflows—until I hit another rate limit during a printer ticket crisis.

So I built my own GPT-powered assistant on a dusty Windows server.

No tokens. No API keys. No cloud.

What I Built

  • Self-hosted chatbot using Ollama and Mistral 7B
  • Flask backend and React frontend
  • Custom JSON knowledge base that updates from closed helpdesk tickets (rough sketch below)
  • Hosted entirely offline on local hardware
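
The knowledge base piece is simpler than it sounds. Here's a rough sketch of the idea, not the exact script, and the ticket field names are hypothetical (yours will depend on your helpdesk's export format):

# build_kb.py - sketch: fold closed helpdesk tickets into a JSON knowledge base.
# "subject", "resolution", and "status" are hypothetical field names; adjust them
# to whatever your helpdesk export actually produces.
import json
from pathlib import Path

TICKET_EXPORT = Path("closed_tickets.json")   # hypothetical helpdesk export
KB_PATH = Path("knowledge_base.json")

def build_kb():
    tickets = json.loads(TICKET_EXPORT.read_text(encoding="utf-8"))

    # Key existing entries by subject so re-runs don't create duplicates.
    kb = {}
    if KB_PATH.exists():
        kb = {e["subject"]: e for e in json.loads(KB_PATH.read_text(encoding="utf-8"))}

    for t in tickets:
        if t.get("status") != "closed" or not t.get("resolution"):
            continue
        kb[t["subject"]] = {"subject": t["subject"], "resolution": t["resolution"]}

    KB_PATH.write_text(json.dumps(list(kb.values()), indent=2), encoding="utf-8")

if __name__ == "__main__":
    build_kb()

Re-running it after each batch of closed tickets keeps the knowledge base current without duplicating entries.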

Stack Overview

  • Flask – backend routing and LLM integration
  • React – UI for support prompts and feedback
  • Ollama – local LLM runner with a REST API (glue code sketched after this list)
  • Windows 10 server with 32GB RAM and GTX 1660 Super
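
To give a feel for the Flask-to-Ollama glue, here's a stripped-down sketch rather than the actual backend. It assumes Ollama is serving Mistral on its default port and uses the /api/generate endpoint; the /api/ask route name is just a placeholder:

# app.py - stripped-down Flask-to-Ollama glue (not the full backend).
# Assumes Ollama is running locally and the model is pulled: `ollama pull mistral`.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default REST endpoint

@app.post("/api/ask")  # placeholder route name
def ask():
    question = request.get_json(force=True).get("question", "")
    resp = requests.post(
        OLLAMA_URL,
        json={"model": "mistral", "prompt": question, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    # With streaming off, Ollama returns the full answer in the "response" field.
    return jsonify({"answer": resp.json()["response"]})

if __name__ == "__main__":
    app.run(port=5000)

The React side only needs to POST the user's question to that route and render the answer.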

Why I Did It

  • No more rate limits or usage caps
  • Complete control over data and prompts
  • Avoid vendor lock-in
  • And honestly? I just wanted to prove it could be done

The full write-up lands on Medium this Tuesday:

Read the full article on Medium

If you're curious about the setup, prompt wrappers, or how the JSON knowledge base works, I'm happy to share.
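
As a teaser on the prompt-wrapper idea: the gist is prepending the most relevant knowledge-base entries to the question before it hits the model. A rough sketch with naive keyword matching, assuming the knowledge_base.json format sketched above; real retrieval could be anything from keyword overlap to embeddings:

# prompt_wrapper.py - sketch: prepend relevant KB entries to the user's question.
# Matching here is naive keyword overlap on ticket subjects; swap in whatever
# retrieval you prefer.
import json
from pathlib import Path

def wrap_prompt(question: str, kb_path: str = "knowledge_base.json", top_k: int = 3) -> str:
    entries = json.loads(Path(kb_path).read_text(encoding="utf-8"))
    words = set(question.lower().split())

    # Rank KB entries by how many question words appear in the ticket subject.
    ranked = sorted(
        entries,
        key=lambda e: len(words & set(e["subject"].lower().split())),
        reverse=True,
    )
    context = "\n".join(f"- {e['subject']}: {e['resolution']}" for e in ranked[:top_k])

    return (
        "You are an internal IT support assistant. Use the past ticket resolutions "
        "below when they are relevant.\n\n"
        f"Past resolutions:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )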

#Flask #React #SelfHostedAI #Ollama #Chatbot #OpenSource #GPT