Introduction
This past weekend, fueled by curiosity and the "learn by doing" mantra, I embarked on a mission: to build a more intelligent way to navigate the vast landscape of the Next.js documentation. The result? An AI-powered semantic search and Retrieval-Augmented Generation (RAG) tool specifically designed for the official Next.js docs.
The Problem with Simple Keyword Search
We've all experienced the frustration of sifting through numerous search results, many only tangentially related to our actual query. Traditional keyword-based search often struggles to understand the nuances of language and the context of our questions. This is particularly true for the extensive documentation of a powerful framework like Next.js.
My Solution: Semantic Search + RAG
To tackle this, I decided to leverage the power of semantic search and Retrieval-Augmented Generation (RAG). Here's the core idea:
- Semantic Search: Instead of just matching keywords, the system understands the meaning behind your question. This allows for more relevant results, even if your query doesn't use the exact terminology present in the documentation (there's a small illustration of this right after the list).
- Retrieval-Augmented Generation (RAG): Once relevant documentation snippets are found, a language model uses this context to generate a more informed and accurate answer to your question, complete with citations.
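To make "understands the meaning" concrete: an embedding model maps text to a vector, and semantically related texts land close together in that vector space. Here's a minimal sketch, separate from the actual project code, that embeds two keyword-disjoint phrases with the same embedding model I use later and compares them with a hand-rolled cosine similarity:

```typescript
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";
import "dotenv/config";

const embeddings = new GoogleGenerativeAIEmbeddings({
  model: "models/text-embedding-004",
  apiKey: process.env.GOOGLE_API_KEY,
});

// Cosine similarity: values near 1 mean "same meaning", near 0 mean unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const normA = Math.sqrt(a.reduce((sum, v) => sum + v * v, 0));
  const normB = Math.sqrt(b.reduce((sum, v) => sum + v * v, 0));
  return dot / (normA * normB);
}

async function demo() {
  // These two phrases share almost no keywords, but their meanings are close:
  const [query, snippet] = await Promise.all([
    embeddings.embedQuery("How do I load content from an API?"),
    embeddings.embedQuery("Data fetching, caching, and revalidating in Next.js"),
  ]);
  console.log("Similarity:", cosineSimilarity(query, snippet).toFixed(3));
}

demo();
```

A vector store like Chroma performs essentially this comparison against every stored chunk, just far more efficiently.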
Diving into the Implementation
Here's a peek under the hood at the technologies and processes involved:
1. Populating the Knowledge Base:
The first crucial step was to ingest and process the Next.js documentation. This involved:
- Fetching Navigation Links: I started by scraping the main Next.js documentation page to extract all the links to individual documentation pages.
- Crawling and Extracting Content: For each of these links, I used `CheerioWebBaseLoader` to fetch the content of the page.
- Chunking the Text: Large documents were split into smaller, manageable chunks using `RecursiveCharacterTextSplitter`. This is important for efficient embedding and retrieval.
- Generating Embeddings: The heart of semantic search! I used `GoogleGenerativeAIEmbeddings` to create vector embeddings for each chunk of the documentation. These embeddings capture the semantic meaning of the text.
- Storing in a Vector Database: The generated embeddings were stored in Chroma, an efficient vector store that allows for fast similarity searches.
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";
import { load } from "cheerio";
import "dotenv/config";
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";
import { Document } from "@langchain/core/documents";
const embeddings = new GoogleGenerativeAIEmbeddings({
model: "models/text-embedding-004",
apiKey: process.env.GOOGLE_API_KEY,
});
const textSplitter = new RecursiveCharacterTextSplitter({
chunkSize: 1000,
chunkOverlap: 200,
});
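// Note: this assumes a Chroma server is reachable. By default the LangChain
// Chroma client targets a local instance; pass a `url` in this config object
// to point it elsewhere.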
const vectorStore = new Chroma(embeddings, {
collectionName: "next-js-docs",
});
export async function PopulateDatabase() {
const navs = await GetNavLinks();
  if (!Array.isArray(navs) || navs.length === 0) {
throw new Error("Failed to populate database");
}
  // Skip the first link and cap the run at the remaining docs pages.
  const testNavs = navs.slice(1, 375);
for (let index = 0; index < testNavs.length; index++) {
const i = testNavs[index];
try {
console.info(
`🌐 Processing link ${index + 1} of ${
testNavs.length
}: https://nextjs.org/${i}`
);
const loader = new CheerioWebBaseLoader(`https://nextjs.org/${i}`);
const docs = await loader.load();
const splitDocs = await textSplitter.splitDocuments(docs);
await InsertIntoDB(splitDocs);
} catch (error) {
console.error(`Error processing page ${i}:`, error);
throw new Error("Failed to populate database \n " + error);
}
}
console.log("✅ Successfully Completed inserting into db");
}
async function InsertIntoDB(validDocs: Document[]) {
const batchSize = 1000;
const totalBatches = Math.ceil(validDocs.length / batchSize);
console.info(
`📄 Inserting ${validDocs.length} documents in ${totalBatches} batch(es)...`
);
for (let i = 0; i < validDocs.length; i += batchSize) {
const batch = validDocs.slice(i, i + batchSize);
try {
await vectorStore.addDocuments(batch);
console.log(
`✅ Successfully added batch ${
Math.floor(i / batchSize) + 1
} of ${totalBatches}`
);
} catch (error) {
console.error(
`❌ Error adding batch ${
Math.floor(i / batchSize) + 1
} of ${totalBatches}:`,
error
);
console.error(
"🔍 First few items in failing batch:",
batch
.slice(0, 2)
.map((doc) => doc.pageContent.substring(0, 150) + "...")
);
}
}
}
async function GetNavLinks(): Promise<string[]> {
try {
const response = await fetch("https://nextjs.org/docs");
const html = await response.text();
const $ = load(html);
const routes: string[] = [];
$("main nav ul li a").map((_, el) => {
const href = $(el).attr("href");
if (href) {
routes.push(href);
}
});
return routes;
} catch (error) {
console.error(error);
    throw new Error("Failed to fetch navigation links: " + error);
}
}
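Running the ingestion is then a one-off script. Here's a minimal sketch of how I'd invoke it — the file names and the assumption that a Chroma server is already running locally are illustrative, not fixed parts of the project:

```typescript
// ingest.ts — hypothetical entry point for the one-off ingestion run.
// Assumes the code above lives in ./populate and a Chroma server is up.
import { PopulateDatabase } from "./populate";

PopulateDatabase()
  .then(() => process.exit(0))
  .catch((error) => {
    console.error("Ingestion failed:", error);
    process.exit(1);
  });
```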
2. The Search and Answer Generation Pipeline:
When a user asks a question, the following steps occur:
- Keyword Generation (Optimization): To potentially refine the search, the `ChatGoogleGenerativeAI` model is used with a specific prompt to generate a concise list of relevant Next.js keywords from the user's question. This helps focus the subsequent semantic search.
- Semantic Retrieval: The generated keywords (or the original question) are used to perform a similarity search in the Chroma vector store using the `similaritySearch` function. This retrieves the most semantically similar documentation snippets.
- Answer Generation with Context: The retrieved documentation snippets are then passed to the `ChatGoogleGenerativeAI` model (specifically `gemini-2.0-flash`) along with the original question and a carefully crafted prompt. This prompt instructs the model to act as a knowledgeable Next.js assistant, using the provided context to answer the question accurately and cite the source.
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { Annotation, StateGraph } from "@langchain/langgraph";
import { Document } from "@langchain/core/documents";
import 'dotenv/config'
const llm = new ChatGoogleGenerativeAI({
model: "gemini-2.0-flash",
apiKey: process.env.GOOGLE_API_KEY,
});
const embeddings = new GoogleGenerativeAIEmbeddings({
model: "models/text-embedding-004",
apiKey: process.env.GOOGLE_API_KEY,
});
const vectorStore = new Chroma(embeddings, {
collectionName: "next-js-docs",
});
const promptTemplate = ChatPromptTemplate.fromMessages([
["system", `You are a knowledgeable and confident Next.js assistant.
Important behavior rules:
- Your top priority is to answer the user's question as accurately and helpfully as possible.
- Use the provided context to support your answer **only if it is relevant**.
- NEVER reference, mention, or comment on the existence, quality, or amount of the context.
- If helpful information is found in the context, cite it properly in markdown format (e.g., [link](url)).
- If context is not helpful, answer based on your own knowledge.
- Always respond in professional, clear Markdown format.
`],
["user", `
User Question:
{question}
Additional Context (if useful):
{context}
`],
]);
const keywordGenerationPrompt = ChatPromptTemplate.fromTemplate(
`You are a semantic search assistant.
Given a user question, generate a concise list of 5-10 keywords that are highly relevant for searching within NEXT.JS official documentation only.
Important rules:
- Focus ONLY on Next.js concepts, features, methods, APIs, and relevant terminology.
- Do NOT include general cloud providers, hosting platforms, or unrelated services (e.g., AWS, Azure, Netlify, Vercel).
- Prefer specific technical terms over generic words.
- If the question is very general, generate keywords that stay within the context of building or deploying a Next.js app.
Return the keywords as a comma-separated list. No extra explanations.
User Question:
{question}
Keywords:
`
);
// Define state for application
const InputStateAnnotation = Annotation.Root({
  question: Annotation<string>,
});
const StateAnnotation = Annotation.Root({
  question: Annotation<string>,
  context: Annotation<Document[]>,
  answer: Annotation<string>,
  keywords: Annotation<string[]>,
});
// Define application steps
const generateKeywords = async (state: typeof InputStateAnnotation.State) => {
const keywordsResponse = await llm.invoke(await keywordGenerationPrompt.invoke(state));
const keywords = keywordsResponse.content?.toString().split(',').map((k: string) => k.trim()) || [state.question];
console.log("Generated Keywords:", keywords);
return { keywords };
};
const retrieve = async (state: typeof StateAnnotation.State) => {
const searchQuery = (state.keywords || [state.question]).join(" ");
const retrievedDocs = await vectorStore.similaritySearch(searchQuery, 20);
return { context: retrievedDocs };
};
const generateAnswer = async (state: typeof StateAnnotation.State) => {
const docsContent = state.context
.map((doc: Document) => `Content: ${doc.pageContent}\nSource: ${doc.metadata.source}`)
.join("\n\n---\n\n");
const messages = await promptTemplate.invoke({
question: state.question,
context: docsContent,
});
const response = await llm.invoke(messages);
return { answer: response.content };
};
// Compile application and test
const graph = new StateGraph(StateAnnotation)
.addNode("generate_keywords", generateKeywords)
.addNode("retrieve", retrieve)
.addNode("generate_answer", generateAnswer)
.addEdge("__start__", "generate_keywords")
.addEdge("generate_keywords", "retrieve")
.addEdge("retrieve", "generate_answer")
.addEdge("generate_answer", "__end__")
.compile();
export default async function Generate(question: string) {
console.info("🔍 Performing a sample search...");
  const inputs = { question };
const result = await graph.invoke(inputs);
console.log("🤖 Gemini's Response:");
console.log(result.answer);
return result.answer
}
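To actually use this from a UI, `Generate` can be wrapped in an API endpoint. A minimal sketch using a Next.js App Router route handler — the `app/api/ask/route.ts` path and `@/lib/generate` import reflect my assumed layout, not necessarily the project's actual structure:

```typescript
// app/api/ask/route.ts — hypothetical endpoint exposing the RAG pipeline.
import { NextResponse } from "next/server";
import Generate from "@/lib/generate";

export async function POST(request: Request) {
  const { question } = await request.json();
  if (!question || typeof question !== "string") {
    return NextResponse.json({ error: "A question is required" }, { status: 400 });
  }
  const answer = await Generate(question);
  return NextResponse.json({ answer });
}
```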
3. Orchestration with LangGraph:
To manage the flow of information through these steps, I utilized LangGraph. This powerful tool allows you to define a graph of interconnected steps, making it easier to build complex AI workflows. In this case, the graph includes nodes for keyword generation, document retrieval, and answer generation.
Initial Impressions and Learnings
Building this tool over the weekend has been an incredibly rewarding learning experience. Here are some key takeaways:
- Semantic search significantly improves relevance: The ability to understand the meaning behind a query leads to far more accurate and helpful results compared to simple keyword matching.
- RAG provides grounded and trustworthy answers: By grounding the language model's responses in the official documentation, the answers are more reliable and can be easily verified through the provided citations.
- LangChain simplifies complex AI pipelines: It provides a fantastic set of abstractions and integrations that make building these kinds of tools much more manageable.
- Vector databases are crucial for efficient semantic search: Chroma's performance in storing and querying embeddings is essential for a responsive search experience (see the sketch after this list).
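If you want to sanity-check that retrieval quality yourself, LangChain's vector stores can return a raw score alongside each hit via `similaritySearchWithScore`. A quick sketch reusing the `vectorStore` defined earlier (the query string is just an example):

```typescript
// Inspect retrieval quality: each result is a [document, score] pair.
// For Chroma the score is a distance, so lower generally means a closer match.
async function inspectRetrieval() {
  const results = await vectorStore.similaritySearchWithScore(
    "How does incremental static regeneration work?",
    5
  );
  for (const [doc, score] of results) {
    console.log(score.toFixed(3), doc.metadata.source);
  }
}

inspectRetrieval();
```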
The Journey Continues...
This weekend project is just the beginning. I'm excited to keep exploring semantic search in more depth and applying it to real problems.
👉 Check out the prototype here.
Stay tuned for future updates as this project evolves!