
Postgres RAG Stack: Embedding, Chunking & Vector Search


This is Part 2 of a three-part series (links at the bottom). The GitHub repo can be checked out here.

Postgres RAG Stack brings together Postgres, pgVector, and TypeScript to power fast, semantic search. In Part One, we covered the theory behind semantic search: how embeddings convert meaning into vectors, how vector databases and indexes enable fast similarity search, and how RAG combines retrieval with language models for grounded, accurate responses. In this guide, you’ll scaffold your project, set up Docker with pgVector, and build ingestion and query scripts for embedding and chunking your data.

Now we will begin setting up the foundation for our RAG application:

  • Next.js 15 project scaffold with environment files and directory layout
  • PostgreSQL 17 + pgvector in a Docker container
  • content_chunks table with an HNSW index
  • An ingestion script that chunks any text corpus and stores embeddings
  • Commands to validate cosine search plus troubleshooting tips

Create the project directory

mkdir rag-chatbot-demo && cd rag-chatbot-demo

Scaffold a new Next.js app (optional)

If you plan to build the chatbot UI in Part 3, scaffold the Next.js app now so the generated files don't conflict with the ones you'll add later. Skip this step if you only need the RAG basics:

npx create-next-app@latest . \
  --typescript \
  --app \
  --tailwind \
  --eslint \
  --import-alias "@/*"

Set up folder structure

mkdir -p scripts postgres input

Create docker-compose.yml

# ./docker-compose.yml
services:
  db:
    image: pgvector/pgvector:pg17
    container_name: rag-chatbot-demo
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
      POSTGRES_DB: ragchatbot_db
    ports:
      - '5432:5432'
    volumes:
      - ./pgdata:/var/lib/postgresql/data
      - ./postgres/schema.sql:/docker-entrypoint-initdb.d/schema.sql
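
One thing to keep in mind: the Postgres image only runs scripts in docker-entrypoint-initdb.d on first boot, when the data directory is empty. If you change the mounted schema.sql later, the simplest way to re-apply it in development (this wipes all stored data) is:

docker compose down
rm -rf pgdata
docker compose up -d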

Create schema

Add a SQL file that Docker runs automatically on first boot:

-- ./postgres/schema.sql

-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE content_chunks (
  id         bigserial PRIMARY KEY,
  content    text,
  embedding  vector(1536),      -- OpenAI text-embedding-3-small
  source     text,              -- optional file name, URL, etc.
  added_at   timestamptz DEFAULT now()
);

-- High-recall ANN index for cosine similarity
-- Note: Adding index before inserting data slows down the insert process

CREATE INDEX ON content_chunks
USING hnsw (embedding vector_cosine_ops);
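
This index is all the demo needs, but if you later bulk-load a large corpus, one option is to insert all rows first and build the index afterwards. You can also tune the HNSW build parameters; the sketch below spells out pgvector's defaults as a starting point for experimentation (optional, not used elsewhere in this series):

-- Optional: explicit HNSW build parameters (pgvector's defaults shown;
-- raising them trades longer build time for higher recall)
CREATE INDEX ON content_chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);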

Launch Docker Compose

After creating the schema file, start the container:

docker compose up -d
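
To verify that the container came up and the init script created the schema, you can run psql inside the container (the container name is set in docker-compose.yml):

docker exec -it rag-chatbot-demo psql -U postgres -d ragchatbot_db -c '\d content_chunks'

You should see the content_chunks table definition, including the hnsw index. If the table is missing, the init script probably didn't run; see the note above about the pgdata directory.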

Create your .env file

In the project root, add:

# .env
DATABASE_URL=postgresql://postgres:password@localhost:5432/ragchatbot_db
# Get your key from https://platform.openai.com/account/api-keys
OPENAI_API_KEY=your-openai-key-here
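
Neither the .env file (credentials) nor the pgdata directory (raw database files) belongs in version control. If your .gitignore doesn't already cover them, add:

# .gitignore
.env
pgdata/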

Preparing the Embedding Pipeline

Sample data file

Create input/data.txt with sample documentation or FAQs. Download the full file here. Keep each question and its answer on adjacent lines, with a blank line between entries: the ingestion script below splits on blank lines, so each question stays attached to its answer in a single chunk.

# AcmeCorp Subscription Guide
## How do I renew my plan?
Log in to your dashboard, select Billing → Renew, and confirm payment. Your new cycle starts immediately.

## How can I cancel my subscription?
Navigate to Billing → Cancel Plan. Your access remains active until the end of the current billing period.

## Do you offer student discounts?
Yes. Email support@acmecorp.com with proof of enrollment to receive a 25% discount code.

---
# Troubleshooting Connectivity

## The app cannot reach the server
Check your internet connection and verify the service URL in Settings → API Host.

Install dependencies

npm install pg langchain ai @ai-sdk/openai dotenv

Dependencies:

  • ai: toolkit for AI models and streaming responses
  • @ai-sdk/openai: OpenAI adapter for embeddings and chat
  • pg: PostgreSQL client for Node.js
  • langchain: splits documents into semantically meaningful chunks
  • dotenv: loads environment variables from .env
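
The scripts below are run with tsx, which executes TypeScript directly. npx will fetch it on demand, but you can also pin it as a dev dependency:

npm install -D tsx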

Create scripts/embed.ts

This script reads text, splits it into chunks, generates embeddings via OpenAI, and stores them in content_chunks:

// scripts/embed.ts
import 'dotenv/config';
import fs from 'node:fs';
import path from 'node:path';
import { Pool } from 'pg';
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
import { openai } from '@ai-sdk/openai'; // OpenAI adapter
import { embedMany } from 'ai'; // generic AI interface

const BATCH_SIZE = 50;
const MAX_CHUNK_LENGTH = 512; // max characters per chunk
const pool = new Pool({
    connectionString: process.env.DATABASE_URL
});

/**
 * Ingest a plain-text file of Q&A entries separated by blank lines.
 * Splits on blank lines; if a block exceeds MAX_CHUNK_LENGTH characters,
 * it is further chunked by RecursiveCharacterTextSplitter.
 */
async function ingest(file: string) {
    const raw = fs.readFileSync(file, 'utf8');
    console.log(`Loaded ${file}`);

    // Split on blank lines into blocks, drop empty entries
    const lines = raw
        .split(/\r?\n\s*\r?\n/)
        .map((l) => l.trim())
        .filter(Boolean);

    // Prepare overflow splitter for any long lines
    const overflowSplitter = new RecursiveCharacterTextSplitter({
        chunkSize: MAX_CHUNK_LENGTH,
        chunkOverlap: 50,
    });

    // Build final list of chunks
    const chunks: string[] = [];
    for (const line of lines) {
        if (line.length <= MAX_CHUNK_LENGTH) {
            chunks.push(line);
        } else {
            // Further split long lines into smaller chunks if needed
            const sub = await overflowSplitter.splitText(line);
            chunks.push(...sub);
        }
    }

    console.log(`Processing ${chunks.length} chunks in batches of ${BATCH_SIZE}`);

    // Process chunks in batches using embedMany
    for (let i = 0; i < chunks.length; i += BATCH_SIZE) {
        const batch = chunks.slice(i, i + BATCH_SIZE);
        console.log(`Processing batch ${Math.floor(i / BATCH_SIZE) + 1}/${Math.ceil(chunks.length / BATCH_SIZE)}`);

        // Embed the entire batch at once
        const {
            embeddings
        } = await embedMany({
            model: openai.embedding('text-embedding-3-small'),
            values: batch,
        });

        // Insert all embeddings from this batch into the database
        for (let j = 0; j < batch.length; j++) {
            const chunk = batch[j];
            const embedding = embeddings[j];
            const vectorString = `[${embedding.join(',')}]`;
            console.log(`Inserting chunk ${i + j + 1}/${chunks.length}: ${chunk.slice(0, 60)}...`);
            await pool.query('INSERT INTO content_chunks (content, embedding, source) VALUES ($1,$2,$3)', [chunk, vectorString, path.basename(file)]);
        }
    }
}

async function main() {
    console.log('Starting embedding ingestion…');
    await ingest('./input/data.txt');
    await pool.end();
}

main().catch((err) => {
    console.error('Ingestion error:', err);
    process.exit(1);
});

Run the embedding script

npx tsx scripts/embed.ts
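
When the script finishes, a quick sanity check is to count the rows it stored (the exact number depends on how your corpus was chunked):

docker exec -it rag-chatbot-demo psql -U postgres -d ragchatbot_db -c 'SELECT source, count(*) FROM content_chunks GROUP BY source;'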

Testing Retrieval Functionality

Create scripts/query.ts to embed a query, fetch the top-N chunks, and print them:

/* scripts/query.ts */
import 'dotenv/config';
import { Pool } from 'pg';
import { openai } from '@ai-sdk/openai';
import { embed } from 'ai';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const TOP_N = 5; // number of chunks to retrieve

async function query(query: string) {
  console.log(`Embedding query: "${query}"`);
  const { embedding: qVec } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: query,
  });
  const qVecString = `[${qVec.join(',')}]`;

  console.log(`Fetching top ${TOP_N} similar chunks from database...`);
  const { rows } = await pool.query<{ content: string; source: string; score: number; }>(
    `SELECT content, source,
            1 - (embedding <=> $1) AS score
       FROM content_chunks
   ORDER BY embedding <=> $1
      LIMIT $2`,
    [qVecString, TOP_N]
  );

  console.log('Results:');
  rows.forEach((row, i) => {
    console.log(`\n#${i + 1} (score: ${row.score.toFixed(3)}, source: ${row.source})`);
    console.log(row.content);
  });

  await pool.end();
}

(async () => {
  try {
    await query('How can I change my billing information?');
  } catch (err) {
    console.error('Error testing retrieval:', err);
  }
})();

Run the query script

npx tsx scripts/query.ts
Output:
Embedding query: "How can I change my billing information?"
Fetching top 5 similar chunks from database...
Results:
#1 (score: 0.774, source: data.txt)
How do I update my billing information?
Navigate to Billing → Payment Methods and click “Edit” next to your stored card.

#2 (score: 0.512, source: data.txt)
How do I change my account password?
Go to Profile → Security, enter your current password, then choose a new one.

#3 (score: 0.417, source: data.txt)
How do I delete my account?
Please contact support to request account deletion; it cannot be undone.

🔎 The score column reflects cosine similarity: 1.0 is a perfect match, and values closer to 0 indicate less similar content.
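
If retrieval seems slow or off on a larger corpus, a useful troubleshooting step is to wrap the retrieval query in EXPLAIN ANALYZE and confirm the planner uses the HNSW index rather than a sequential scan. As a one-off check, you could adapt query.ts like this (a sketch; the plan variable and logging are illustrative, not part of the series code):

const { rows: plan } = await pool.query(
  `EXPLAIN ANALYZE
   SELECT content FROM content_chunks
   ORDER BY embedding <=> $1
   LIMIT $2`,
  [qVecString, TOP_N]
);
// Each row of EXPLAIN output comes back under the "QUERY PLAN" column
plan.forEach((r) => console.log(r['QUERY PLAN']));

On a dataset as tiny as this demo, the planner may still choose a sequential scan, which is expected; the index pays off as the table grows.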

Conclusion

At this point, we’ve built a vector search backend: scaffolded a Next.js project, spun up Postgres with pgvector, created a schema optimized for similarity search, and built a TypeScript pipeline to embed and store content. We validated our setup with real cosine-similarity queries. In Part 3, we’ll build a user-friendly chatbot interface powered by GPT-4 and streaming responses using the ai SDK.


Matthew Aberham

Matthew Aberham is a solutions architect and full-stack engineer focused on building scalable web platforms and intuitive front-end experiences. He works at the intersection of performance engineering, interface design, and applied AI systems.
