I was paying $20/month for ChatGPT Plus to help with support workflows—until I hit another rate limit during a printer ticket crisis.
So I built my own ChatGPT-style assistant on a dusty Windows server.
No tokens. No API keys. No cloud.
What I Built
- Self-hosted chatbot using Ollama and Mistral 7B
- Flask backend and React frontend
- Custom JSON knowledge base that updates from closed helpdesk tickets
- Hosted entirely offline on local hardware
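The core loop is simpler than it sounds: the Flask backend wraps the user's question with context from the knowledge base, then POSTs it to Ollama's local REST API (`/api/generate` on port 11434, which is Ollama's real default endpoint). Here's a rough sketch of that glue code — the function names and the knowledge-base schema (`issue`/`resolution` pairs) are my own illustration, not lifted from the repo:

```python
import json
import urllib.request

# Ollama's default local REST endpoint
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_prompt(question, kb_entries):
    """Wrap the user's question with context pulled from past tickets.
    kb_entries is assumed to be a list of {"issue": ..., "resolution": ...} dicts."""
    context = "\n".join(f"- {e['issue']}: {e['resolution']}" for e in kb_entries)
    return (
        "You are an IT helpdesk assistant. Use this context from past tickets:\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def ask_mistral(prompt):
    """Send a non-streaming generate request to the local Ollama server."""
    payload = json.dumps(
        {"model": "mistral", "prompt": prompt, "stream": False}
    ).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because everything talks over localhost, there's no auth layer and no outbound traffic — the React frontend just hits the Flask route, which calls `ask_mistral` and returns the text.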
Stack Overview
- Flask – backend routing and LLM integration
- React – UI for support prompts and feedback
- Ollama – local LLM runner with REST API
- Windows 10 server – 32GB RAM and a GTX 1660 Super, running everything locally
Why I Did It
- No more rate limits or usage caps
- Complete control over data and prompts
- Avoid vendor lock-in
- And honestly? I just wanted to prove it could be done
The full write-up drops on Medium this Tuesday:
Read the full article on Medium
If you're curious about the setup, prompt wrappers, or how the JSON knowledge base works, I'm happy to share.