If you post car content on Instagram, you know the pain of writing fresh captions every time. I built an AI-powered caption generator that helps you auto-generate captions, hashtags, and even TikTok sound ideas — just from a car photo.
In this article, I’ll show you how I built it using:
- 🧠 OpenAI GPT-4 Vision
- 🖼️ Image analysis
- 🌐 Streamlit app interface
- 🛠️ My personal caption history as training context
🔧 What We'll Build
🚀 Tech Stack
- Streamlit – for the UI
- OpenAI GPT-4 Vision – to understand car photos
- Python (with
openai
,streamlit
,dotenv
)
📁 Project Structure
car-caption-generator-ai/
├── app.py # Streamlit app logic
├── utils/
│ ├── vision.py # GPT-4 Vision logic
│ ├── captions.py # Captions and hashtags generator
│ └── prompts.py # Stores the prompt template
├── requirements.txt # Dependencies
└── README.md
Setup Instructions
1 .Clone the repo:
git clone https://github.com/Navashub/caption_generator_ai.git
cd car-caption-generator-ai
2.Create and activate a virtual environment:
python -m venv myvenv
source myvenv/bin/activate # Windows: myvenv\Scripts\activate
3.Install dependencies:
pip install -r requirements.txt
4.Add your OpenAI API key in a .env file
OPENAI_API_KEY=""
5.Run the app:
streamlit run app.py
🧠 How It Works
1.The app uses GPT-4 Vision to describe your uploaded car image.
2.That description is passed into a prompt template (along with your past captions).
3.The model returns:
- An Instagram caption
- Hashtags
- TikTok sound vibes
Here’s a sample output for an Audi RS5:
"Brutal beauty in carbon black. The RS5 doesn’t speak — it growls. Welcome to the autobahn attitude. 💨🔥
#RS5Power #FavouriteFourRings #AudiLife"