In this tutorial, you'll learn how to build a production-ready AI agent that can answer questions using both structured data from a PostgreSQL database and unstructured content from PDF files. We'll use MindsDB, an open-source platform designed to integrate LLM-powered agents with databases and external knowledge sources like documents.
Whether you're developing internal tools, research assistants, or customer support bots, this step-by-step guide shows how to combine retrieval-augmented generation (RAG), text-to-SQL, and language models in a single agent.
Tools Used
- MindsDB – agent orchestration and SQL-based pipeline
- PostgreSQL – structured data source
- PDF documents – unstructured data
- OpenAI GPT (via LangChain) – response generation
- Hugging Face embeddings – document retrieval for the knowledge base
Objective
We’ll build an AI agent that can:
- Answer natural language questions by converting them into SQL queries
- Retrieve and summarize content from uploaded PDFs
- Integrate with platforms such as Slack or web interfaces
Step 1: Install MindsDB
You can run MindsDB locally using Docker or use the cloud-hosted version.
Run with Docker:
docker run -p 47334:47334 mindsdb/mindsdb
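Once the container is running, the built-in SQL Editor and the REST API are available at http://127.0.0.1:47334. Connectors to external data sources are configured from there with SQL statements.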
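Statements can also be sent over HTTP instead of through the editor. The sketch below assumes the container started above is listening on 127.0.0.1:47334 and posts to MindsDB's `/api/sql/query` endpoint. The helper names `build_sql_request` and `run_sql` are illustrative and not part of MindsDB.

```python
import json
import urllib.request

# Assumption: MindsDB is reachable locally on the port published by `docker run`.
MINDSDB_URL = "http://127.0.0.1:47334/api/sql/query"


def build_sql_request(query: str, url: str = MINDSDB_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one SQL statement."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(query: str, url: str = MINDSDB_URL, timeout: float = 30.0) -> dict:
    """Send the statement and return the decoded JSON response."""
    request = build_sql_request(query, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With MindsDB running, `run_sql("SHOW DATABASES;")` is a quick way to confirm the instance responds before moving on.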
Step 2: Create a Conversational Model
We start by creating a conversational model through MindsDB's LangChain engine. This example uses OpenAI's gpt-4; replace YOUR_OPENAI_API_KEY with your own key.
CREATE MODEL conversational_model
PREDICT answer
USING
engine = 'langchain',
openai_api_key = 'YOUR_OPENAI_API_KEY',
model_name = 'gpt-4',
mode = 'conversational',
user_column = 'question',
assistant_column = 'answer',
prompt_template = 'Answer the user input in a helpful way',
max_tokens = 100,
temperature = 0,
verbose = true;
To check the model's status and confirm it is ready to use:
DESCRIBE conversational_model;
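To avoid pasting the API key into a saved script, the statement can be assembled in code with the key read from the environment. This is a minimal sketch: the helper names and the `OPENAI_API_KEY` variable name are assumptions made for the example, and the quote-doubling escape follows standard SQL, so verify it against your MindsDB version.

```python
import os


def sql_literal(value) -> str:
    """Render a Python value as a literal for a USING clause."""
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Assumption: single quotes are escaped by doubling, as in standard SQL.
    return "'" + str(value).replace("'", "''") + "'"


def create_model_sql(name: str, predict: str, params: dict) -> str:
    """Assemble a CREATE MODEL statement from a parameter mapping."""
    using = ",\n".join(f"{key} = {sql_literal(val)}" for key, val in params.items())
    return f"CREATE MODEL {name}\nPREDICT {predict}\nUSING\n{using};"


def conversational_model_sql(env_var: str = "OPENAI_API_KEY") -> str:
    """Build the statement from this step, reading the key from the environment."""
    api_key = os.environ[env_var]  # raises KeyError if the variable is unset
    return create_model_sql(
        "conversational_model",
        "answer",
        {
            "engine": "langchain",
            "openai_api_key": api_key,
            "model_name": "gpt-4",
            "mode": "conversational",
            "user_column": "question",
            "assistant_column": "answer",
            "prompt_template": "Answer the user input in a helpful way",
            "max_tokens": 100,
            "temperature": 0,
            "verbose": True,
        },
    )
```

The returned string can be pasted into the SQL Editor or sent through whatever client you use to talk to MindsDB.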
Step 3: Add Skills to Your AI Agent
3.1: Text-to-SQL Skill for Structured Data
Connect MindsDB to your PostgreSQL database:
CREATE DATABASE datasource
WITH ENGINE = "postgres",
PARAMETERS = {
"user": "demo_user",
"password": "demo_password",
"host": "samples.mindsdb.com",
"port": "5432",
"database": "demo",
"schema": "demo_data"
};
Define a skill for querying this data:
CREATE SKILL text2sql_skill
USING
type = 'text2sql',
database = 'datasource',
tables = ['house_sales'],
description = 'Contains US house sales data from 2007 to 2015';
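To make the idea of text-to-SQL concrete, here is a toy illustration of what the skill does conceptually: a natural language question becomes a SQL query that is run against the table. The column names, the rows, and the generated query are all made up for this sketch; they do not describe the real house_sales table or the exact SQL the agent will produce.

```python
import sqlite3

# Toy stand-in for the real table: columns and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE house_sales (saledate TEXT, state TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO house_sales VALUES (?, ?, ?)",
    [
        ("2014-11-03", "CA", 410000),
        ("2015-02-17", "CA", 455000),
        ("2015-06-30", "TX", 298000),
        ("2015-09-12", "NY", 620000),
    ],
)

# A query that a text-to-SQL step could plausibly generate for
# "How many houses were sold in 2015?"
generated_sql = """
SELECT COUNT(*) FROM house_sales
WHERE saledate >= '2015-01-01' AND saledate < '2016-01-01'
"""
sold_in_2015 = conn.execute(generated_sql).fetchone()[0]
print(sold_in_2015)  # 3 for the toy rows above
```

The description passed to the skill matters because the language model uses it to decide when this table is relevant to a question.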
3.2: Knowledge Base Skill for PDFs
First, create an embedding model:
CREATE MODEL embedding_model_hf
PREDICT embedding
USING
engine = 'langchain_embedding',
class = 'HuggingFaceEmbeddings',
input_columns = ["content"];
Next, create a knowledge base that uses the embedding model:
CREATE KNOWLEDGE BASE my_knowledge_base
USING
model = embedding_model_hf;
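A knowledge base answers questions by comparing the embedding of a query with the embeddings of stored text chunks. The sketch below shows that idea with hand-written three-dimensional vectors and cosine similarity. It is an illustration only: real embeddings have hundreds of dimensions, and the similarity measure MindsDB's vector store actually uses is not something this example asserts.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=2):
    """Return the k chunk texts whose vectors are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, chunk[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Hand-written vectors standing in for real embeddings (hypothetical values).
chunks = [
    ("Prices rose steadily after 2012.", [0.9, 0.1, 0.0]),
    ("The report was printed on recycled paper.", [0.0, 0.2, 0.9]),
    ("Median sale price by region.", [0.8, 0.3, 0.1]),
]
query = [1.0, 0.2, 0.0]  # pretend embedding of "How did prices change?"

print(top_k(query, chunks))
```

The two price-related chunks rank above the unrelated one, and those are the passages that would be handed to the language model as context.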
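Then insert the PDF content. This assumes the PDF has already been uploaded to MindsDB as a file named my_file_name, for example through the file upload in the editor: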
INSERT INTO my_knowledge_base
SELECT * FROM files.my_file_name;
Finally, create a skill linked to the knowledge base:
CREATE SKILL kb_skill
USING
type = 'knowledge_base',
source = 'my_knowledge_base',
description = 'PDF report with analysis on house pricing trends';
Step 4: Create the AI Agent
Now, connect the conversational model with the two skills.
CREATE AGENT ai_agent
USING
model = 'conversational_model',
skills = ['text2sql_skill', 'kb_skill'];
Test the agent:
SELECT question, answer
FROM ai_agent
WHERE question = 'How many houses were sold in 2015?';
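When questions come from users rather than being typed by hand, the query string has to be built safely. The helper below is a small sketch that wraps a question in the SELECT shown above. The function name is hypothetical, and doubling single quotes is the standard SQL escape, assumed here to be accepted by MindsDB.

```python
def agent_query(question: str, agent: str = "ai_agent") -> str:
    """Build the SELECT statement that asks the agent one question."""
    # Assumption: single quotes inside the question are escaped by doubling them.
    escaped = question.replace("'", "''")
    return (
        "SELECT question, answer\n"
        f"FROM {agent}\n"
        f"WHERE question = '{escaped}';"
    )


print(agent_query("How many houses were sold in 2015?"))
```

Only the question text is escaped; the agent name is inserted as-is, so it should come from your own code and not from user input.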
Optional: Deploy the Agent to Slack
CREATE DATABASE mindsdb_slack
WITH ENGINE = 'slack',
PARAMETERS = {
"token": "xoxb-xxx",
"app_token": "xapp-xxx"
};
CREATE CHATBOT ai_chatbot
USING
database = 'mindsdb_slack',
agent = 'ai_agent';
The agent is now available in Slack, where your team can message it directly.
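A common setup mistake is swapping the two Slack credentials. Bot tokens start with `xoxb-` and app-level tokens start with `xapp-`, so a quick prefix check before running CREATE DATABASE can catch this. The function below is an illustrative helper, not part of MindsDB or the Slack SDK.

```python
def check_slack_tokens(token: str, app_token: str) -> list:
    """Return a list of problems found with the two Slack credentials."""
    problems = []
    if not token.startswith("xoxb-"):
        problems.append("token should be a bot token starting with 'xoxb-'")
    if not app_token.startswith("xapp-"):
        problems.append("app_token should be an app-level token starting with 'xapp-'")
    return problems


print(check_slack_tokens("xoxb-xxx", "xapp-xxx"))  # [] means both prefixes look right
```

This only checks the prefixes; it does not verify that the tokens are valid or have the required scopes.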
Automating Knowledge Base Updates
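To refresh your knowledge base regularly, schedule a job. In the statement below, data_source is a placeholder for the table that receives new rows, and the LAST keyword restricts each run to rows added since the previous run: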
CREATE JOB update_kb_hourly AS (
INSERT INTO my_knowledge_base (
SELECT * FROM data_source WHERE id > LAST
)
) EVERY hour;
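The effect of `id > LAST` can be pictured as remembering the highest id already loaded and picking up only rows above it on the next run. The class below is a toy model of that idea, not MindsDB's implementation. In particular, starting from id 0 on the first run is an assumption made for this sketch, and MindsDB's own first-run behaviour may differ.

```python
class IncrementalLoader:
    """Toy model of 'WHERE id > LAST': track the highest id already loaded."""

    def __init__(self, last_id: int = 0):
        # Assumption for the sketch: ids are positive integers, so 0 means "nothing loaded".
        self.last_id = last_id

    def run(self, source_rows):
        """Return rows newer than the remembered id and advance the marker."""
        new_rows = [row for row in source_rows if row["id"] > self.last_id]
        if new_rows:
            self.last_id = max(row["id"] for row in new_rows)
        return new_rows


loader = IncrementalLoader()
rows = [{"id": 1, "content": "first"}, {"id": 2, "content": "second"}]
print(len(loader.run(rows)))   # both rows are new on the first run
rows.append({"id": 3, "content": "third"})
print(len(loader.run(rows)))   # only the row with id 3 is picked up
```

This pattern relies on ids that only increase; rows that are updated in place without a new id would not be picked up.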
Example Questions You Can Ask
- "How many houses were sold in California in 2012?"
- "Summarize the key insights from the PDF report."
- "What is the average price of houses sold per state in 2015?"
Use Cases
- Internal AI assistants for querying data systems
- Customer support bots with product manuals in PDF format
- AI dashboards for real-time analytics and summaries
- Slack bots that act as knowledge agents for your team
Conclusion
By combining MindsDB with OpenAI or Hugging Face models, you can rapidly develop intelligent agents that understand both structured and unstructured data sources. With just a few SQL-like commands, we’ve created an end-to-end AI workflow that can be deployed across business-critical applications.
The same pattern extends to other enterprise use cases such as document understanding, sales analytics, and research summarization.