Build a RAG Chatbot for Your Documentation With Next.js and OpenAI
Admin
March 21, 2026
8 min read
Retrieval Augmented Generation (RAG) lets an AI answer questions about your specific content. Build a docs chatbot that cites sources and stays grounded in your actual documentation.
What Is RAG and Why It Matters for Docs
RAG (Retrieval Augmented Generation) solves the hallucination problem: instead of asking an LLM to answer from training data alone, you first retrieve relevant chunks from your own documents, then ask the LLM to answer based on those chunks. The result is accurate, citable answers grounded in your actual content.
Architecture Overview
The pipeline has two phases:
- Indexing: Parse docs → chunk text → generate embeddings → store in vector DB
- Querying: Embed question → find similar chunks → send chunks + question to LLM → stream response
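Under the hood, "find similar chunks" is a nearest-neighbour search over embeddings. pgvector's <=> operator (used in the retrieval query later in this post) returns cosine distance, and similarity is 1 minus that distance. A minimal sketch of the math, with toy 2-dimensional vectors standing in for real 1536-dimensional embeddings:

```typescript
// Cosine similarity between two embedding vectors: 1 means same direction,
// 0 means unrelated. pgvector's <=> operator returns the distance
// (1 - similarity), which is why the SQL later computes 1 - (a <=> b).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

cosineSimilarity([1, 0], [1, 0]); // identical direction → 1
cosineSimilarity([1, 0], [0, 1]); // orthogonal → 0
```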
Step 1: Chunk and Embed Your Docs
// scripts/index-docs.ts
import { embedMany } from 'ai';
import { openai } from '@ai-sdk/openai';
import { db } from '../lib/db';
import { readFileSync, readdirSync } from 'fs';
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
const words = text.split(' ');
const chunks: string[] = [];
for (let i = 0; i < words.length; i += chunkSize - overlap) {
chunks.push(words.slice(i, i + chunkSize).join(' '));
}
return chunks;
}
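To make the overlap concrete, here is the same chunkText logic run with toy sizes (a 10-word string, chunkSize 4, overlap 1); in the real script the 500/50 defaults apply:

```typescript
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const words = text.split(' ');
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += chunkSize - overlap) {
    chunks.push(words.slice(i, i + chunkSize).join(' '));
  }
  return chunks;
}

// Each chunk repeats the last word of the previous one, so a sentence that
// straddles a boundary is never lost entirely. The loop also emits a short
// trailing fragment once the window passes the end of the text.
chunkText('a b c d e f g h i j', 4, 1);
// → ['a b c d', 'd e f g', 'g h i j', 'j']
```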
async function indexDocuments() {
const files = readdirSync('./docs').filter(f => f.endsWith('.md'));
for (const file of files) {
const content = readFileSync(`./docs/${file}`, 'utf-8');
const chunks = chunkText(content);
const { embeddings } = await embedMany({
model: openai.embedding('text-embedding-3-small'),
values: chunks,
});
for (let i = 0; i < chunks.length; i++) {
await db.$executeRaw`
INSERT INTO doc_chunks (source, content, embedding)
VALUES (${file}, ${chunks[i]}, ${JSON.stringify(embeddings[i])}::vector)
`;
}
console.log(`Indexed ${file}: ${chunks.length} chunks`);
}
}
indexDocuments().catch(console.error);
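The script above assumes a doc_chunks table with a pgvector column already exists. A one-time schema sketch (the table and index names are assumptions; vector(1536) matches the output dimension of text-embedding-3-small):

```typescript
// DDL to run once against Postgres (e.g. via psql or a migration tool).
const SETUP_SQL = [
  // pgvector provides the vector type and the <=> distance operator.
  `CREATE EXTENSION IF NOT EXISTS vector`,
  // text-embedding-3-small returns 1536-dimensional embeddings.
  `CREATE TABLE IF NOT EXISTS doc_chunks (
    id BIGSERIAL PRIMARY KEY,
    source TEXT NOT NULL,
    content TEXT NOT NULL,
    embedding vector(1536)
  )`,
  // An HNSW index keeps nearest-neighbour lookups fast as the corpus grows.
  `CREATE INDEX IF NOT EXISTS doc_chunks_embedding_idx
    ON doc_chunks USING hnsw (embedding vector_cosine_ops)`,
];
```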
Step 2: The RAG Query API
// app/api/docs-chat/route.ts
import { streamText, embed } from 'ai';
import { openai } from '@ai-sdk/openai';
import { db } from '@/lib/db';
export async function POST(request: Request) {
const { messages } = await request.json();
const latestQuestion = messages[messages.length - 1].content;
// 1. Embed the question
const { embedding } = await embed({
model: openai.embedding('text-embedding-3-small'),
value: latestQuestion,
});
// 2. Retrieve relevant chunks
const chunks = await db.$queryRaw<
  { source: string; content: string; similarity: number }[]
>`
  SELECT source, content,
         1 - (embedding <=> ${JSON.stringify(embedding)}::vector) AS similarity
  FROM doc_chunks
  ORDER BY embedding <=> ${JSON.stringify(embedding)}::vector
  LIMIT 5
`;
const context = chunks
  .map(c => `[${c.source}]\n${c.content}`)
  .join('\n\n---\n\n');
// 3. Generate answer grounded in retrieved context
const result = streamText({
model: openai('gpt-4o-mini'),
system: `You are a documentation assistant. Answer questions using ONLY the provided context.
If the answer is not in the context, say so. Always cite which document you found the answer in.`,
messages: [
{ role: 'user', content: `Context:\n${context}\n\nQuestion: ${latestQuestion}` }
],
});
return result.toDataStreamResponse();
}
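The context assembly inside the route is a pure transformation, which makes it easy to pull out and unit-test on its own (the chunk shape mirrors the columns selected by the retrieval query; the function name is an assumption, not part of the route code):

```typescript
type Chunk = { source: string; content: string };

// Mirrors the route's context building: each chunk is prefixed with its
// source file so the model can cite it, and chunks are separated by ---.
function buildContext(chunks: Chunk[]): string {
  return chunks.map(c => `[${c.source}]\n${c.content}`).join('\n\n---\n\n');
}

buildContext([
  { source: 'setup.md', content: 'Run npm install first.' },
  { source: 'deploy.md', content: 'Deploy with vercel --prod.' },
]);
// → '[setup.md]\nRun npm install first.\n\n---\n\n[deploy.md]\nDeploy with vercel --prod.'
```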
Step 3: The Chat UI With Source Citations
'use client';
import { useChat } from 'ai/react';
export function DocsChat() {
const { messages, input, handleInputChange, handleSubmit } = useChat({
api: '/api/docs-chat',
});
return (
<div className="flex flex-col h-screen max-w-2xl mx-auto p-4">
<div className="flex-1 overflow-y-auto space-y-4">
{messages.map(m => (
<div key={m.id} className={m.role === 'user' ? 'text-right' : 'text-left'}>
<div className={`inline-block p-3 rounded-lg text-sm max-w-lg ${
m.role === 'user' ? 'bg-blue-600 text-white' : 'bg-gray-100'
}`}>
{m.content}
</div>
</div>
))}
</div>
<form onSubmit={handleSubmit} className="mt-4 flex gap-2">
<input
value={input}
onChange={handleInputChange}
placeholder="Ask about the docs..."
className="flex-1 border rounded p-2"
/>
<button type="submit" className="px-4 py-2 bg-blue-600 text-white rounded">
Ask
</button>
</form>
</div>
);
}
Production Considerations
- Re-index automatically when docs are updated (webhook or CI step)
- Cache embeddings to avoid re-embedding the same content
- Add a confidence threshold — only show answers above 0.7 similarity
- Log queries and low-confidence answers to identify documentation gaps
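The confidence-threshold idea from the list above can be a one-line filter on the retrieved rows before they reach the model (the names here are hypothetical, not part of the route code):

```typescript
type RetrievedChunk = { source: string; content: string; similarity: number };

// Drop chunks below the similarity cutoff; if nothing survives, the route
// can answer "not found" instead of letting the model guess.
function filterConfident(chunks: RetrievedChunk[], threshold = 0.7): RetrievedChunk[] {
  return chunks.filter(c => c.similarity >= threshold);
}

const retrieved: RetrievedChunk[] = [
  { source: 'auth.md', content: '...', similarity: 0.82 },
  { source: 'billing.md', content: '...', similarity: 0.41 },
];
filterConfident(retrieved);
// Only auth.md passes the 0.7 cutoff; a query where nothing passes is a
// good signal to log as a potential documentation gap.
```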