Introduction

AI is transforming image editing by allowing users to enhance and modify images based on text prompts. In this post, I’ll walk through how I built the Gemini Image Editor, a Node.js application that uses Google’s Gemini API to edit images from a text description.

This project allows users to upload an image, describe modifications, and receive an AI-enhanced version.

Project Overview

The Gemini Image Editor is a REST API built around:

  • Image upload with prompt-based modification.
  • Google Gemini API integration for AI-powered editing.
  • Multer for file upload handling.
  • An Express.js backend with easy-to-use API endpoints.

Tech Stack

  • Node.js - Backend runtime
  • Express.js - Web framework
  • Google Generative AI SDK - Image modification
  • Multer - File upload handling
  • dotenv - Environment variables

Getting Started

1. Clone the Repository

git clone https://github.com/manthanank/gemini-image-editor.git
cd gemini-image-editor

2. Install Dependencies

npm install

3. Configure Environment Variables

Create a .env file in the project root and add your Google Gemini API key and the port the server should listen on:

GEMINI_API_KEY=your_google_gemini_api_key
PORT=5000
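
The server and the Gemini service both import from ./config/env, which this post does not show. Here is a minimal sketch of what such a module might look like, assuming it simply reads the two variables above. The loadConfig name and the error messages are illustrative, not taken from the repository:

```javascript
// Hypothetical sketch of config/env.js; the real file is not shown in this post.
// In the app, call require("dotenv").config() first so .env is loaded into process.env.
function loadConfig(env = process.env) {
  const geminiApiKey = env.GEMINI_API_KEY;
  if (!geminiApiKey) {
    throw new Error("GEMINI_API_KEY is not set");
  }

  // Fall back to 5000, the port used in the .env example above.
  const port = Number.parseInt(env.PORT ?? "5000", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return { port, geminiApiKey };
}
```

Failing at startup on a missing key tends to be easier to debug than a failed Gemini call on the first request.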

4. Start the Server

npm start

Your server will run at http://localhost:5000 🚀


API Endpoints

Modify an Image

📌 Endpoint: POST /api/image/modify

📌 Request Body (multipart/form-data):

  • prompt (string): Modification instructions
  • image (file): The image to modify

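📌 Response (200 OK):

{
  "message": "Image modified successfully",
  "imagePath": "uploads/modified_1710342456.png"
}

If the prompt or the image is missing, the API responds with 400 and an error field. Failures during modification return 500.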
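
As an illustration of calling this endpoint, here is a minimal client sketch, assuming Node.js 18 or later (which provides global fetch, FormData, and Blob). The helper names buildModifyForm and requestModification are hypothetical, and the Blob type is hard-coded to PNG for brevity:

```javascript
// Builds the multipart form the endpoint expects:
// a "prompt" text field and an "image" file field.
function buildModifyForm(prompt, imageBytes, filename = "input.png") {
  const form = new FormData();
  form.append("prompt", prompt);
  form.append("image", new Blob([imageBytes], { type: "image/png" }), filename);
  return form;
}

// Posts the form and returns the imagePath from the JSON response.
async function requestModification(baseUrl, form) {
  const res = await fetch(`${baseUrl}/api/image/modify`, {
    method: "POST",
    body: form,
  });
  const body = await res.json();
  if (!res.ok) {
    throw new Error(body.error || `Request failed with status ${res.status}`);
  }
  return body.imagePath;
}
```

With the server running locally, requestModification("http://localhost:5000", buildModifyForm("Add a sunset sky", fs.readFileSync("photo.png"))) would resolve to the path of the saved image on the server.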

Project Structure

gemini-image-editor/
├── controllers/       # Business logic
├── middleware/        # Multer file upload
├── routes/            # API endpoints
├── services/          # Google Gemini AI logic
├── uploads/           # Stores images
├── server.js          # Entry point
├── package.json       # Dependencies
└── .env               # Environment variables

Core Implementation

1. Setting Up the Express Server

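The server.js file imports the Express app from ./app and the port from ./config/env (neither file is shown in this post), ensures the uploads/ directory exists, and starts the server: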

const app = require("./app");
const { port } = require("./config/env");
const fs = require("fs");

// Ensure uploads directory exists
if (!fs.existsSync("uploads")) {
  fs.mkdirSync("uploads");
}

app.listen(port, () => {
  console.log(`Server running on http://localhost:${port}`);
});

2. Handling Image Upload & Modification

The imageController.js file validates the request, calls the Gemini service, and returns the path of the modified image:

const { modifyImage } = require("../services/geminiService");

// Handles POST /api/image/modify: expects a "prompt" text field and an uploaded image file.
async function modifyImageController(req, res) {
  // req.body can be undefined when the request is not multipart/form-data.
  const { prompt } = req.body || {};
  const imageFile = req.file;

  if (!prompt || !imageFile) {
    return res.status(400).json({ error: "Prompt and image are required" });
  }

  try {
    const modifiedImagePath = await modifyImage(prompt, imageFile.path);
    res.status(200).json({ message: "Image modified successfully", imagePath: modifiedImagePath });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
}

module.exports = { modifyImageController };
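
The route that connects this controller to POST /api/image/modify is not shown in the post. Here is one way it might be wired, as a sketch. The function name registerImageRoutes, the /modify sub-path (which assumes the router is mounted at /api/image), and the "image" field name (taken from the request body description) are assumptions rather than the repository's actual code. The router, upload instance, and controller are passed in as arguments, so the snippet runs without Express installed:

```javascript
// Hypothetical route wiring. In the app, router would be express.Router(),
// upload the Multer instance, and controller the modifyImageController function.
function registerImageRoutes(router, upload, controller) {
  // upload.single("image") parses the multipart body and populates
  // req.file and req.body before the controller runs.
  router.post("/modify", upload.single("image"), controller);
  return router;
}
```

In app.js this might be used as app.use("/api/image", registerImageRoutes(express.Router(), upload, modifyImageController)).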

3. Multer Middleware for File Uploads

The uploadMiddleware.js file sets up Multer to store images in uploads/ under a timestamp-based filename that keeps the original extension:

const multer = require("multer");
const path = require("path");

const storage = multer.diskStorage({
  destination: (req, file, cb) => {
    cb(null, "uploads/");
  },
  filename: (req, file, cb) => {
    cb(null, Date.now() + path.extname(file.originalname));
  },
});

const upload = multer({ storage });

module.exports = upload;
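
The middleware above accepts any file of any size. Multer supports a fileFilter callback and a limits option for restricting uploads. The sketch below shows how they might be used here; the allow-list, the 5 MB cap, and the names imageFileFilter and uploadOptions are illustrative choices, not part of the original project:

```javascript
// Hypothetical allow-list; adjust to the formats you want to accept.
const ALLOWED_MIME_TYPES = new Set(["image/png", "image/jpeg", "image/webp"]);

// Matches Multer's fileFilter signature: call cb(null, true) to accept the file,
// or cb(error) to reject the upload with an error.
function imageFileFilter(req, file, cb) {
  if (ALLOWED_MIME_TYPES.has(file.mimetype)) {
    cb(null, true);
  } else {
    cb(new Error(`Unsupported file type: ${file.mimetype}`));
  }
}

// Passed to multer() alongside storage; fileSize is in bytes (5 MB here).
const uploadOptions = {
  fileFilter: imageFileFilter,
  limits: { fileSize: 5 * 1024 * 1024 },
};
```

These options would be merged into the existing call, for example multer({ storage, ...uploadOptions }). Note that file.mimetype comes from the client's request, so this is a convenience check rather than proof of the file's contents.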

4. Google Gemini API Integration

The geminiService.js file sends the prompt and the base64-encoded image to Gemini, then writes the first image part of the response to uploads/:

const { GoogleGenerativeAI } = require("@google/generative-ai");
const fs = require("fs");
const path = require("path");
const { geminiApiKey } = require("../config/env");

const genAI = new GoogleGenerativeAI(geminiApiKey);

// The upload middleware keeps the original file extension, so use it to pick the MIME type.
const MIME_TYPES = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
};

async function modifyImage(prompt, imagePath) {
  const base64Image = fs.readFileSync(imagePath).toString("base64");
  // Fall back to PNG for unknown extensions.
  const mimeType = MIME_TYPES[path.extname(imagePath).toLowerCase()] || "image/png";

  const contents = [
    { text: prompt },
    { inlineData: { mimeType, data: base64Image } },
  ];

  const model = genAI.getGenerativeModel({
    model: "gemini-2.0-flash-exp-image-generation",
    generationConfig: {
      responseModalities: ["Text", "Image"],
    },
  });

  let result;
  try {
    result = await model.generateContent(contents);
  } catch (error) {
    console.error("Error modifying image:", error);
    throw new Error("Failed to modify image");
  }

  // The response can mix text and image parts; save the first image part.
  const parts = result.response.candidates?.[0]?.content?.parts ?? [];
  for (const part of parts) {
    if (part.inlineData) {
      const buffer = Buffer.from(part.inlineData.data, "base64");
      const outputPath = `uploads/modified_${Date.now()}.png`;
      fs.writeFileSync(outputPath, buffer);
      return outputPath;
    }
  }

  // No image part came back (for example, a text-only reply), so report a failure.
  throw new Error("Gemini did not return an image");
}

module.exports = { modifyImage };
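
The loop over response parts can be pulled into a small helper, which makes the "no image returned" case explicit and easy to test without calling the API. This is a sketch: extractFirstImage is a hypothetical name, and the response shape is assumed to be the one modifyImage reads above (response.candidates[0].content.parts, with image data under inlineData):

```javascript
// Returns the first image part of a generateContent result as { buffer, mimeType },
// or null when the reply contains no inline image (for example, text only).
function extractFirstImage(result) {
  const parts = result?.response?.candidates?.[0]?.content?.parts ?? [];
  const imagePart = parts.find((part) => part.inlineData?.data);
  if (!imagePart) {
    return null;
  }
  return {
    buffer: Buffer.from(imagePart.inlineData.data, "base64"),
    // Assume PNG when the part does not state its type.
    mimeType: imagePart.inlineData.mimeType ?? "image/png",
  };
}
```

A caller can then treat a null return as an error instead of responding with an undefined image path.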

Conclusion

With Node.js, Express, Multer, and the Google Gemini API, we’ve built an AI-powered image editor that lets users upload images, apply modifications using text prompts, and receive AI-enhanced versions. 🚀

🔹 Potential Enhancements:

🔹 Add a frontend UI with Angular for an interactive user experience.

🔹 Support cloud storage for better image management.

🔹 Allow multiple modifications in one request.

👉 Ready to explore AI-powered image editing? Try the GitHub repository: https://github.com/manthanank/gemini-image-editor