🧭 Part 3: Implementing Vector Search with Pinecone

In this part, we'll integrate Pinecone, a vector database that enables semantic search. This lets the chatbot match user queries to relevant order information even when the phrasing varies.


✅ What We'll Cover

  • Introduction to vector databases and embeddings
  • Setting up Pinecone
  • Creating embeddings from order data
  • Storing and retrieving vectors from Pinecone

🧠 1. Why Vector Search?

Traditional keyword search only matches exact terms: a query like "Where's my package?" won't match an order record whose status field says "shipped". With vector search, we embed data (like order summaries) into high-dimensional vectors using an embedding model. The user's query is embedded the same way, and the vectors are compared to find semantically similar matches.
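
Under the hood, "semantically similar" typically means a high cosine similarity between two vectors. Pinecone computes this at scale for us, but a toy sketch makes the idea concrete (the three-element vectors below stand in for real embeddings, which have around 1,536 dimensions):

// Toy illustration of the comparison a vector database performs
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const orderVector = [0.9, 0.1, 0.3];  // pretend embedding of an order summary
const queryVector = [0.8, 0.2, 0.25]; // pretend embedding of a user query
console.log(cosineSimilarity(orderVector, queryVector)); // ~1 means "similar"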


🔧 2. Set Up Pinecone

Install the Pinecone client (you may have already done this in Part 1):

npm install @pinecone-database/pinecone

Add your Pinecone API key to .env if you haven't already:

PINECONE_API_KEY=your_pinecone_api_key

Older releases of the client also required a PINECONE_ENVIRONMENT value; the current SDK authenticates with the API key alone.

Initialize Pinecone in your config:

// backend/langchain/config.js
const { Pinecone } = require('@pinecone-database/pinecone');

// The current SDK is initialized with just an API key (the legacy
// PineconeClient also needed an environment value). We keep the
// function async so call sites can simply await it.
const initPinecone = async () => {
  return new Pinecone({
    apiKey: process.env.PINECONE_API_KEY,
  });
};

module.exports = { initPinecone };
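
The scripts below also assume an index named ecommerce-orders already exists. You can create it in the Pinecone console, or with a one-off script like this sketch (the serverless cloud and region values are assumptions; adjust them for your account). The dimension must match your embedding model: text-embedding-ada-002, the long-standing default for LangChain's OpenAIEmbeddings, produces 1536-dimensional vectors.

// backend/langchain/createIndex.js — one-off setup sketch
const { Pinecone } = require('@pinecone-database/pinecone');

async function createIndex() {
  const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
  await pinecone.createIndex({
    name: 'ecommerce-orders',
    dimension: 1536, // must match the embedding model's output size
    metric: 'cosine',
    spec: {
      // Assumed serverless settings; change cloud/region for your account
      serverless: { cloud: 'aws', region: 'us-east-1' },
    },
  });
}

createIndex().catch(console.error);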

🧬 3. Generate Embeddings

We’ll create vector representations of order data using LangChain’s OpenAI embeddings.

// backend/langchain/embedOrders.js
const { OpenAIEmbeddings } = require('@langchain/openai');
const { initPinecone } = require('./config');
const { connectToDatabase } = require('../database/connection');

async function embedOrders() {
  const db = await connectToDatabase();
  const orders = await db.collection('orders').find().toArray();

  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: process.env.OPENAI_API_KEY,
  });

  const pinecone = await initPinecone();
  const index = pinecone.index("ecommerce-orders");

  for (const order of orders) {
    const orderSummary = `
      Order ID: ${order.orderId}
      Customer: ${order.customerName}
      Items: ${order.items.map(i => i.productName).join(', ')}
      Status: ${order.status}
    `;

    // embedQuery returns a single vector (number[]), so no destructuring
    const embedding = await embeddings.embedQuery(orderSummary);

    await index.upsert([
      {
        id: String(order.orderId), // Pinecone record IDs must be strings
        values: embedding,
        metadata: {
          orderId: order.orderId,
          customerName: order.customerName,
          status: order.status,
        },
      },
    ]);
  }

  console.log("All orders embedded and stored in Pinecone.");
}

embedOrders()
  .then(() => process.exit(0)) // the open MongoDB connection would otherwise keep the process alive
  .catch((err) => {
    console.error(err);
    process.exit(1);
  });

Run the script:

node backend/langchain/embedOrders.js
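
The loop above issues one embedding request and one upsert per order, which is fine for a small demo. For a larger catalog you can batch both steps: LangChain's embedDocuments embeds many texts in a single call, and Pinecone's upsert accepts an array of records (batches of about 100 are a common guideline). Here is a sketch of what the body of embedOrders could look like instead of the per-order loop; buildOrderSummary is a hypothetical helper that produces the same summary string as the template literal above.

// Inside embedOrders(): batched variant of the per-order loop (sketch)
const summaries = orders.map(buildOrderSummary); // hypothetical helper, same text as above
const vectors = await embeddings.embedDocuments(summaries); // one call, returns number[][]

const records = orders.map((order, i) => ({
  id: String(order.orderId),
  values: vectors[i],
  metadata: {
    orderId: order.orderId,
    customerName: order.customerName,
    status: order.status,
  },
}));

// Upsert in chunks so each request stays comfortably sized
for (let i = 0; i < records.length; i += 100) {
  await index.upsert(records.slice(i, i + 100));
}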

🔍 4. Perform a Semantic Search

This function embeds the user's query and asks Pinecone for the closest order vectors; in Part 4 we'll wire it into the chatbot.

// backend/langchain/searchOrder.js
const { OpenAIEmbeddings } = require('@langchain/openai');
const { initPinecone } = require('./config');

async function searchOrders(userQuery) {
  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: process.env.OPENAI_API_KEY,
  });

  const pinecone = await initPinecone();
  const index = pinecone.index("ecommerce-orders");

  // embedQuery returns a single vector (number[]) for the query string
  const queryVector = await embeddings.embedQuery(userQuery);

  const results = await index.query({
    vector: queryVector,
    topK: 3,
    includeMetadata: true,
  });

  return results.matches;
}

module.exports = { searchOrders };
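
To smoke-test the search before wiring it into the chatbot, a small script like this works (the query string is just an example):

// backend/langchain/testSearch.js — quick manual test
const { searchOrders } = require('./searchOrder');

searchOrders('Where is the order with the red sneakers?')
  .then((matches) => {
    for (const match of matches) {
      // Each match carries a similarity score and the metadata we stored
      console.log(match.score, match.metadata);
    }
  })
  .catch(console.error);

Run it with node backend/langchain/testSearch.js and confirm that the returned metadata points at plausible orders.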

✅ Next Steps (Part 4)

In the next part, we will:

  • Connect LangChain components
  • Load data from MongoDB
  • Split and embed text dynamically
  • Retrieve and generate responses

🚀 Stay tuned for Part 4!