Update (2025/05/04): Wrote about an image generation and editing MCP:
🧠🥷How to make Image generation and editing MCP (Gemini API + Cline and Cursor)
Intro
Hello! I'm a Ninja Web Developer. Hi-Yah!🥷
Lately, I have been playing with AI.
🧠🥷How to make AI controlled Avatar 2 (Vroid MCP + Cline and Cursor + Unity)
🧠🥷How to make cool Ninja game (Unity MCP + Blender MCP (Cline and Cursor))
🧠🥷How to make cool Ninja (Blender MCP (Cline and Cursor))
By the way, I tried the Gemini API before, and it handled text with AI well.
🧠🤖Gemini API for free (by Super Mario and ChatGPT)
Then I got news that the Gemini API had a great update. To my big surprise, the Gemini API can generate and edit images for free.
As a poor web developer with no money, I have been introducing a lot of IT tech that can be used for free.🤑
So I tried the new Gemini API in a hurry.💨
What is Gemini API?
The Gemini API is an AI API made by Google, and it has features like the following.
https://ai.google.dev/gemini-api/docs
1️⃣ First of all, and most importantly, you can use it for free.🤑
2️⃣ Second, there are many ways to use AI locally for free, but if you don't have a high-spec PC, it takes a long time to generate images.
However, the Gemini API runs on the web, so its response is fast.
If you want to generate images locally, fast and free and without a GPU, please try this one.↓
🧠🤖Image generative AI on PC without GPU (free and fast (FastSD CPU))
3️⃣ Third, of course, the Gemini API can handle all kinds of things with AI, like text, chat, and images.
4️⃣ However, the free Gemini API has a serious side effect: if you use it for free, Google will use your data for training.
https://ai.google.dev/gemini-api/terms#unpaid-services
Therefore, if you want to keep your data secret, use the paid plan instead.
I (with ChatGPT) made a text-to-image and image-editing app using React, Next.js, and the Gemini API as a sample.
Let's begin!🚀
How to set up the Gemini API app
1️⃣ Create an API key for the Gemini API
https://aistudio.google.com/app/apikey
2️⃣ Make a Next.js project
npx create-next-app@latest
https://nextjs.org/docs/app/getting-started/installation
3️⃣ Install the Gemini API library
npm install @google/genai
4️⃣ Install formidable for uploading images
npm install formidable
5️⃣ Write the code
I asked ChatGPT to turn the Gemini API sample code into an app with a frontend.↓
https://ai.google.dev/gemini-api/docs/text-generation
Code of frontend (app/page.tsx)
"use client";
import { useState } from "react";
export default function HomePage() {
const [generatePrompt, setGeneratePrompt] = useState(
"Hi, can you create a 3d rendered image of a pig with wings and a top hat flying over a happy futuristic scifi city with lots of greenery?"
);
const [editPrompt, setEditPrompt] = useState("Add a llama next to the image");
const [uploadedImage, setUploadedImage] = useState<File | null>(null);
const [results, setResults] = useState<any[]>([]);
const [loading, setLoading] = useState(false);
const handleGenerate = async () => {
setLoading(true);
setResults([]);
const res = await fetch("/api/generate-image", {
method: "POST",
body: JSON.stringify({ prompt: generatePrompt }),
headers: { "Content-Type": "application/json" },
});
const data = await res.json();
setResults(data);
setLoading(false);
};
const handleEdit = async () => {
if (!uploadedImage) return;
setLoading(true);
setResults([]);
const formData = new FormData();
formData.append("prompt", editPrompt);
formData.append("image", uploadedImage);
const res = await fetch("/api/edit-image", {
method: "POST",
body: formData,
});
const data = await res.json();
setResults(data);
setLoading(false);
};
const exportImage = (base64: string, index: number) => {
const link = document.createElement("a");
link.href = `data:image/png;base64,${base64}`;
link.download = `gemini-image-${index}.png`;
link.click();
};
return (
<main className="p-6 max-w-2xl mx-auto space-y-8">
<h1 className="text-3xl font-bold text-center mb-4">
Gemini Image Playground
</h1>
{/* --- Image Generation Section --- */}
<section className="border p-4 rounded-lg shadow">
<h2 className="text-xl font-semibold mb-2">🎨 Generate New Image</h2>
<textarea
value={generatePrompt}
onChange={(e) => setGeneratePrompt(e.target.value)}
className="w-full p-2 border rounded mb-3"
rows={4}
/>
<button
onClick={handleGenerate}
disabled={loading}
className="px-4 py-2 bg-gray-600 text-white rounded"
>
{loading ? "Generating..." : "Generate Image"}
</button>
</section>
{/* --- Image Editing Section --- */}
<section className="border p-4 rounded-lg shadow">
<div>
<label
htmlFor="file-upload"
className="inline-block px-4 py-2 bg-gray-600 text-white rounded cursor-pointer"
>
Upload Image
</label>
<input
id="file-upload"
type="file"
accept="image/*"
onChange={(e) => setUploadedImage(e.target.files?.[0] || null)}
className="hidden"
/>
{uploadedImage && (
<p className="mt-2 text-sm text-gray-700">
Selected: {uploadedImage.name}
</p>
)}
</div>
<textarea
value={editPrompt}
onChange={(e) => setEditPrompt(e.target.value)}
className="w-full p-2 border rounded mb-3"
rows={3}
/>
<button
onClick={handleEdit}
disabled={loading}
className="px-4 py-2 bg-gray-600 text-white rounded"
>
{loading ? "Editing..." : "Edit Image"}
</button>
</section>
{/* --- Results --- */}
{results.length > 0 && (
<section className="border p-4 rounded-lg shadow">
<h2 className="text-xl font-semibold mb-2">🖼️ Result</h2>
<div className="space-y-6">
{results.map((res, i) =>
res.type === "text" ? (
<p key={i} className="text-gray-700">
{res.data}
</p>
) : (
<div key={i} className="space-y-2">
<img
src={`data:image/png;base64,${res.data}`}
alt="Generated or Edited"
className="w-full rounded"
/>
<button
onClick={() => exportImage(res.data, i)}
className="px-3 py-1 bg-gray-600 text-white rounded"
>
⬇️ Export Image
</button>
</div>
)
)}
</div>
</section>
)}
</main>
);
}
Code of image generation (app/api/generate-image/route.ts)
import { NextRequest, NextResponse } from "next/server";
import { GoogleGenAI, Modality } from "@google/genai";
export async function POST(req: NextRequest) {
const { prompt } = await req.json();
const ai = new GoogleGenAI({
apiKey: process.env.GEMINI_API_KEY!,
});
const response = await ai.models.generateContent({
model: "gemini-2.0-flash-exp-image-generation",
contents: prompt,
config: {
responseModalities: [Modality.TEXT, Modality.IMAGE],
},
});
const result = [];
for (const part of response.candidates?.[0]?.content?.parts ?? []) {
if (part.text) {
result.push({ type: "text", data: part.text });
} else if (part.inlineData) {
result.push({ type: "image", data: part.inlineData.data }); // base64 image
}
}
return NextResponse.json(result);
}
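The route responds with a JSON array of parts, which the frontend maps over. As a rough sketch of that contract (the `ResultPart` type name is my own, not from the SDK), the shape can be modeled and the images filtered out like this:

```typescript
// Hypothetical model of the JSON array returned by /api/generate-image.
type ResultPart =
  | { type: "text"; data: string }   // commentary from the model
  | { type: "image"; data: string }; // base64-encoded image

// Pick out only the image parts, e.g. to render <img> tags.
function imageParts(parts: ResultPart[]): string[] {
  return parts
    .filter((p): p is { type: "image"; data: string } => p.type === "image")
    .map((p) => p.data);
}

const sample: ResultPart[] = [
  { type: "text", data: "Here is your pig!" },
  { type: "image", data: "iVBORw0KGgo..." },
];
console.log(imageParts(sample)); // → [ "iVBORw0KGgo..." ]
```

Typing the parts this way keeps the frontend's `res.type === "text"` branch honest without pulling any SDK types into the client bundle.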
Code of image editing (app/api/edit-image/route.ts)
import { NextRequest, NextResponse } from "next/server";
import { GoogleGenAI, Modality } from "@google/genai";
// Note: in the App Router, multipart bodies are parsed with req.formData(),
// so formidable and a custom body-parser config are not needed here.
export async function POST(req: NextRequest) {
try {
const formData = await req.formData();
const prompt = formData.get("prompt") as string;
const file = formData.get("image") as File;
const buffer = Buffer.from(await file.arrayBuffer());
const ai = new GoogleGenAI({
apiKey: process.env.GEMINI_API_KEY!,
});
const contents = [
{ text: prompt },
{
inlineData: {
mimeType: file.type || "image/png", // respect the uploaded file's actual type
data: buffer.toString("base64"),
},
},
];
const response = await ai.models.generateContent({
model: "gemini-2.0-flash-exp-image-generation",
contents,
config: {
responseModalities: [Modality.TEXT, Modality.IMAGE],
},
});
const result = [];
for (const part of response.candidates?.[0]?.content?.parts ?? []) {
if (part.text) {
result.push({ type: "text", data: part.text });
} else if (part.inlineData) {
result.push({ type: "image", data: part.inlineData.data });
}
}
return NextResponse.json(result);
} catch (error: any) {
console.error("Error:", error);
return NextResponse.json({ error: error.message }, { status: 500 });
}
}
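The key step in the edit route is converting the uploaded bytes into the `inlineData` part that `generateContent` expects. That conversion can be isolated into a tiny helper (a sketch with my own naming; the SDK just takes a plain object):

```typescript
// Build an inlineData part from raw image bytes, in the shape
// accepted by ai.models.generateContent's `contents` array.
function toInlinePart(bytes: Buffer, mimeType: string) {
  return {
    inlineData: {
      mimeType,                       // e.g. "image/png" or "image/jpeg"
      data: bytes.toString("base64"), // the SDK wants base64, not raw bytes
    },
  };
}

// Demo: the first four bytes of a PNG file, purely as sample input.
const demo = Buffer.from([0x89, 0x50, 0x4e, 0x47]);
console.log(toInlinePart(demo, "image/png").inlineData.data); // → "iVBORw=="
```

The same helper could be reused for other image sources (a file on disk, a fetched URL) since everything ends up as a base64 string either way.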
Environment Variable (.env.local)
GEMINI_API_KEY=your_real_api_key_here
6️⃣ Run the App
npm run dev
7️⃣ Access http://localhost:3000
8️⃣ Hooray! The app is ready!🎉
How to use Gemini API App
This app has image generation and image editing functions.
How to generate image
Enter a description of the image you want to generate and press the Generate Image button.
This time I entered a prompt to make just an ordinary plumber seen everywhere in our daily lives (wearing red overalls and a red hat, with a mustache).
You can export the image you made by pressing the Export Image button if you need it.
How to edit image
First, upload the image you want to edit with the Upload Image button.
Enter a description of how to edit the image and press the Edit Image button.
This time I entered a prompt to change the color of the clothes and hat.
You can export the edited image by pressing the Export Image button if you need it.
Outro
As far as I know, there aren't any good APIs that can generate and edit images for free except the Gemini API.
From now on, we can make memes as much as we like for free.
There is a marvelous meme thread called Meme Monday in this DEV Community that every software developer must check and post to.
I hope you learn something from this post, or at least enjoy it a little.
Thank you for reading.
Happy AI coding!🤖 Hi-Yah!🥷