In recent years, AI chatbots like ChatGPT have gone from fun tools for answering questions to serious helpers in workplaces, education, and even personal decision-making. With ChatGPT-5 now being the latest and most advanced version, it’s no surprise that people are asking a critical question:
“Is my personal data safe when I use ChatGPT-5?”
ChatGPT-5 is an AI language model created by OpenAI. You can think of it as a super-smart digital assistant that can answer questions, draft and summarize text, and help with everyday tasks.
It learns from patterns in data, but here’s an important point – it doesn’t “remember” your conversations unless the developer has built a special memory feature and you’ve agreed to it.
When you chat with ChatGPT-5, your messages are processed to generate a response. Depending on the app or platform you use, your conversations may be stored, logged, or used to improve the service.
This is why reading the privacy policy is not just boring legal stuff – it’s how you find out precisely what happens to your data.
The concerns about ChatGPT-5 (and similar AI tools) are less about it being “evil” and more about how your data could be exposed if not appropriately handled.
Here are the main risks:
Many users unknowingly type personal details – such as their full name, home address, phone number, passwords, or banking information – into AI chat windows. While the chatbot itself may not misuse this data, it is still transmitted over the internet and may be temporarily stored by the platform. If the platform suffers a data breach or if the information is accessed by unauthorized personnel, your sensitive data could be exposed or exploited.
Best Practice: Treat AI chats like public forums – never share confidential or personally identifiable information.
AI chatbots are often integrated into third-party platforms, such as browser extensions, productivity tools, or mobile apps. These integrations may collect and store your chat data on their own servers, sometimes without clearly informing you. Unlike official platforms with strict privacy policies, third-party services may lack robust security measures or transparency.
Risk Example: A browser extension that logs your AI chats could be hacked, exposing all stored conversations.
Best Practice: Use only trusted, official apps and review their privacy policies before granting access.
In rare but serious cases, malicious AI integrations or compromised platforms could capture login credentials you enter during a conversation. If you share usernames, passwords, or OTPs (one-time passwords), these could be used to access your accounts and perform unauthorized actions – such as placing orders, transferring money, or changing account settings.
Real-World Consequence: You might wake up to find that someone used your credentials to order expensive items or access private services.
Best Practice: Never enter login details into any AI chat, and always use two-factor authentication (2FA) for added protection.
If chat logs containing personal information are accessed by cybercriminals, they can use that data to craft highly convincing phishing emails or social engineering attacks. For example, knowing your name, location, or recent purchases allows attackers to impersonate trusted services and trick you into clicking malicious links or revealing more sensitive data.
Best Practice: Be cautious of unsolicited messages and verify the sender before responding or clicking links.
AI chatbots are trained on vast datasets, but they can still generate inaccurate, outdated, or misleading information. Relying on AI responses without verifying facts can lead to poor decisions, especially in areas like health, finance, or legal advice.
Risk Example: Acting on incorrect medical advice or sharing false information publicly could have serious consequences.
Best Practice: Always cross-check AI-generated content with reputable sources before taking action or sharing it.
Here are simple steps you can take: avoid sharing personal or login details, stick to official apps and integrations, enable two-factor authentication, and verify AI-generated information before acting on it.
ChatGPT-5 is a tool, and like any tool, it can be used for good or misused. The AI itself isn’t plotting to steal your logins or credentials, but if you use it carelessly or through untrusted apps, your data could be at risk.
Golden rule: Enjoy the benefits of AI, but treat it like a stranger online – don’t overshare, and keep control of your personal data.
In modern enterprise systems, stability and fault tolerance are not optional; they are essential. One proven approach to ensure robustness is the Circuit Breaker pattern, widely used in API development to prevent cascading failures. HCL Commerce takes this principle further by embedding circuit breakers into its HCL Cache to effectively manage Redis failures.
What Is a Circuit Breaker?
The Circuit Breaker is a design pattern commonly used in API development to stop continuous requests to a service that is currently failing, thereby protecting the system from further issues. It helps maintain system stability by detecting failures and stopping the flow of requests until the issue is resolved.
The circuit breaker typically operates in three main (or “normal”) states, which are part of the standard Circuit Breaker design pattern.
Normal States:
Circuit breaker pattern with normal states
Special States:
Circuit breaker pattern with special states
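The state lists themselves aren’t reproduced above, but the standard flow is Closed → Open → Half-Open. As a rough, vendor-neutral illustration (not HCL Cache’s actual implementation, and with made-up threshold and wait-time names), a minimal breaker could look like this in TypeScript:

// Minimal sketch of the three normal circuit breaker states (illustrative only).
type State = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class CircuitBreaker {
  private state: State = 'CLOSED';
  private failures = 0;
  private openedAt = 0;

  constructor(
    private failureThreshold = 5,  // consecutive failures before opening (assumed name)
    private retryWaitMs = 60_000   // how long to stay open before probing (assumed name)
  ) {}

  async call<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === 'OPEN') {
      if (Date.now() - this.openedAt < this.retryWaitMs) {
        throw new Error('Circuit is open; failing fast');
      }
      this.state = 'HALF_OPEN'; // allow a trial request through
    }
    try {
      const result = await operation();
      this.state = 'CLOSED';    // success closes the circuit
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      if (this.state === 'HALF_OPEN' || this.failures >= this.failureThreshold) {
        this.state = 'OPEN';    // trip the breaker and start the wait timer
        this.openedAt = Date.now();
      }
      throw err;
    }
  }
}

A successful call while Half-Open closes the circuit again; another failure re-opens it and restarts the wait timer.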
Circuit Breaker in HCL Cache (for Redis)
In HCL Commerce, the HCL Cache layer interacts with Redis for remote caching. But what if Redis becomes unavailable or slow? HCL Cache uses circuit breakers to detect issues and temporarily stop calls to Redis, thus protecting the rest of the system from being affected.
Behavior Overview:
Configuration Snapshot
To manage Redis outages effectively, HCL Commerce provides fine-grained configuration settings for both Redis client behavior and circuit breaker logic. These settings are defined in the Cache YAML file, allowing teams to tailor fault-handling based on their system’s performance and resilience needs.
Redis Request Timeout Configuration
Slow Redis responses are not treated as failures unless they exceed the defined timeout threshold. The Redis client in HCL Cache supports timeout and retry configurations to control how persistent the system should be before declaring a failure:
timeout: 3000        # Max time (in ms) to wait for a Redis response
retryAttempts: 3     # Number of retry attempts on failure
retryInterval: 1500  # Specifies the delay (in milliseconds) between each retry attempt
With the above configuration, the system will spend up to 16.5 seconds (3000 + 3 × (3000 + 1500)) trying to get a response before returning a failure. While these settings offer robustness, overly long retries can result in delayed user responses or log flooding, so tuning is essential.
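To make that arithmetic easy to re-check for other settings, here is a tiny helper that simply encodes the formula from the previous paragraph; the parameter names mirror the configuration keys above:

// Worst-case time spent before a Redis call is declared failed:
// the initial attempt plus each retry (its own timeout + the retry interval).
function worstCaseWaitMs(timeout: number, retryAttempts: number, retryInterval: number): number {
  return timeout + retryAttempts * (timeout + retryInterval);
}

console.log(worstCaseWaitMs(3000, 3, 1500)); // 16500 ms = 16.5 seconds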
Circuit Breaker Configuration
Circuit breakers are configured under the redis.circuitBreaker section of the Cache YAML file. Here’s an example configuration:
redis:
  circuitBreaker:
    scope: auto
    retryWaitTimeMs: 60000
    minimumFailureTimeMs: 10000
    minimumConsecutiveFailures: 20
    minimumConsecutiveFailuresResumeOutage: 2
cacheConfigs:
  defaultCacheConfig:
    localCache:
      enabled: true
      maxTimeToLiveWithRemoteOutage: 300
Explanation of Key Fields:
Real-world Analogy
Imagine you have a web service that fetches data from an external API. Here’s how the circuit breaker would work:
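The individual steps of the analogy aren’t listed above, but the flow is easy to sketch by reusing the hypothetical CircuitBreaker class from the earlier example: failures accumulate while calling the API, the breaker opens and fails fast, and a later half-open probe decides whether to close it again. The endpoint below is invented purely for illustration:

// Hypothetical usage: wrap calls to an external API with the CircuitBreaker sketch above.
const breaker = new CircuitBreaker(5, 60_000);

async function getProductData(id: string) {
  try {
    // While the circuit is CLOSED, requests flow through normally.
    return await breaker.call(() =>
      fetch(`https://api.example.com/products/${id}`).then((r) => r.json())
    );
  } catch {
    // Once it is OPEN, calls fail fast here and a cached/fallback value can be served instead.
    return { id, source: 'fallback-cache' };
  }
}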
Final Thought
By combining the classic circuit breaker pattern with HCL Cache’s advanced configuration, HCL Commerce ensures graceful degradation during Redis outages. It’s not just about availability—it’s about intelligent fault recovery.
For more detailed information, you can refer to the official documentation here:
HCL Commerce Circuit Breakers – Official Docs
Ready to go from “meh” to “whoa” with your AI coding assistant? Here’s how to get started.
You’ve installed GitHub Copilot. Now what?
Here’s how to actually get it to work for you – not just with you.
In the blog Using GitHub Copilot in VS Code, we have already seen how to use GitHub Copilot in VS Code.
Copilot is like a teammate who’s really fast at coding but only understands what you clearly explain.
Use descriptive comments or function names to guide Copilot.
// Fetch user data from API and cache it locally
function fetchUserData() {
Copilot will often generate useful logic based on that. It works best when you think one step ahead.
Copilot shines when your code is modular.
Instead of writing:
function processEverything() {
  // 50 lines of logic
}
Break it down:
// Validate form input
function validateInput(data) { }

// Submit form to backend
function submitForm(data) { }
This way, you get smarter, more accurate completions.
Speed = flow. These shortcuts help you ride Copilot without breaking rhythm:
Action | Shortcut (Windows) | Shortcut (Mac) |
---|---|---|
Accept Suggestion | Tab | Tab |
Next Suggestion | Alt + ] | Option + ] |
Previous Suggestion | Alt + [ | Option + [ |
Dismiss Suggestion | Esc | Esc |
Open Copilot Panel | Ctrl + Enter | Cmd + Enter |
Power Tip: Hold Tab to preview the full suggestion before accepting it.
Don’t settle for the first suggestion. Try giving Copilot:
Copilot might generate multiple versions. Pick or tweak the one that fits best.
Copilot is smart, but not perfect.
Think of Copilot as your fast-thinking intern. You still need to double-check their work.
Copilot isn’t just for JS or Python. Try it in:
Write a comment like # Dockerfile for Node.js app – and watch the magic.
Use Copilot to write your test cases too:
// Test case for addTwoNumbers function
describe('addTwoNumbers', () => {
It will generate a full Jest test block. Use this to write tests faster – especially for legacy code.
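As a rough idea of the kind of block Copilot tends to produce from that comment (your actual suggestion will differ, and addTwoNumbers is assumed to already exist in the project):

// Hypothetical Copilot-style output for the comment above.
describe('addTwoNumbers', () => {
  it('adds two positive numbers', () => {
    expect(addTwoNumbers(2, 3)).toBe(5);
  });

  it('handles negative numbers', () => {
    expect(addTwoNumbers(-4, 1)).toBe(-3);
  });
});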
Treat Copilot suggestions as learning opportunities:
It’s like having a senior dev whispering best practices in your ear.
If you have access to GitHub Copilot Chat, try it. Ask questions like:
It works like a Stack Overflow built into your IDE.
Tip | Benefit |
---|---|
Write clear comments | Better suggestions |
Break logic into chunks | Modular, reusable code |
Use shortcuts | Stay in flow |
Cycle suggestions | Explore better options |
Review output | Avoid bugs |
Test case generation | Faster TDD |
Learn as you go | Level up coding skills |
To truly master Copilot:
You’ll slowly build trust – and skill.
Machine Learning (ML) is no longer limited to research labs — it’s actively driving decisions in real estate, finance, healthcare, and more. But deploying and managing ML models in production is a different ballgame. That’s where MLOps comes in.
In this blog, we’ll walk through a practical MLOps learning project — building a House Price Predictor using Azure DevOps as the CI/CD backbone. We’ll explore the evolution from DevOps to MLOps, understand the model development lifecycle, and see how to automate and manage it effectively.
MLOps (Machine Learning Operations) is the discipline of combining Machine Learning, DevOps, and Data Engineering to streamline the end-to-end ML lifecycle.
It aims to:
MLOps ensures that your model doesn’t just work in Jupyter notebooks but continues to deliver accurate predictions in production environments over time.
DevOps revolutionized software engineering by integrating development and operations through automation, CI/CD, and infrastructure as code (IaC). However, ML projects add new complexity:
Aspect | Traditional DevOps | MLOps |
---|---|---|
Artifact | Source code | Code + data + models |
Version Control | Git | Git + Data Versioning (e.g., DVC) |
Testing | Unit & integration tests | Data validation + model validation |
Deployment | Web services, APIs | ML models, pipelines, batch jobs |
Monitoring | Logs, uptime, errors | Model drift, data drift, accuracy decay |
So, MLOps builds on DevOps but extends it with data-centric workflows, experimentation tracking, and model governance.
Our goal is to build an ML model that predicts house prices based on input features like square footage, number of bedrooms, location, etc. This learning project is structured to follow MLOps best practices, using Azure DevOps pipelines for automation.
house-price-predictor/
├── configs/                # Model configurations stored in YAML format
├── data/                   # Contains both raw and processed data files
├── deployment/
│   └── mlflow/             # Docker Compose files to set up MLflow tracking
├── models/                 # Saved model artifacts and preprocessing objects
├── notebooks/              # Jupyter notebooks for exploratory analysis and prototyping
├── src/
│   ├── data/               # Scripts for data preparation and transformation
│   ├── features/           # Logic for generating and engineering features
│   ├── models/             # Code for model building, training, and validation
├── k8s/
│   ├── deployment.yaml     # Kubernetes specs to deploy the Streamlit frontend
│   └── fast_model.yaml     # Kubernetes specs to deploy the FastAPI model service
├── requirements.txt        # List of required Python packages
Before getting started, make sure the following tools are installed on your machine:
# Replace 'xxxxxx' with your GitHub username or organization
git clone https://github.com/xxxxxx/house-price-predictor.git
cd house-price-predictor
uv venv --python python3.11
source .venv/bin/activate
uv pip install -r requirements.txt
To enable experiment and model run tracking with MLflow:
cd deployment/mlflow
docker compose -f mlflow-docker-compose.yml up -d
docker compose ps
podman compose -f mlflow-docker-compose.yml up -d
podman compose ps
Access the MLflow UI. Once running, open your browser and navigate to http://localhost:5555
Perform cleaning and preprocessing on the raw housing dataset:
python src/data/run_processing.py --input data/raw/house_data.csv --output data/processed/cleaned_house_data.csv
Perform data transformations and feature generation:
python src/features/engineer.py --input data/processed/cleaned_house_data.csv --output data/processed/featured_house_data.csv --preprocessor models/trained/preprocessor.pkl
Train the model and track all metrics using MLflow:
python src/models/train_model.py --config configs/model_config.yaml --data data/processed/featured_house_data.csv --models-dir models --mlflow-tracking-uri http://localhost:5555
The source code for both applications — the FastAPI backend and the Streamlit frontend — is already available in the src/api and streamlit_app directories, respectively. To build and launch these applications:
Once both services are up and running, you can access the Streamlit web UI in your browser to make predictions.
You can also test the prediction API directly by sending requests to the FastAPI endpoint.
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "sqft": 1500,
    "bedrooms": 3,
    "bathrooms": 2,
    "location": "suburban",
    "year_built": 2000,
    "condition": "fair"
  }'
Be sure to replace http://localhost:8000/predict with the actual endpoint based on where it’s running.
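If you’d rather call the endpoint from code than from curl, a minimal TypeScript equivalent of the same request might look like this; the URL and field names are copied from the curl example above and may differ in your deployment:

// Hypothetical client for the FastAPI /predict endpoint shown above.
async function predictPrice() {
  const response = await fetch('http://localhost:8000/predict', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      sqft: 1500,
      bedrooms: 3,
      bathrooms: 2,
      location: 'suburban',
      year_built: 2000,
      condition: 'fair',
    }),
  });
  console.log(await response.json()); // e.g. the predicted price returned by the model service
}

predictPrice().catch(console.error);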
At this stage, your project is running locally. Now it’s time to implement the same workflow using Azure DevOps.
To implement a similar MLOps pipeline using Azure DevOps, the following prerequisites must be in place:
Start by cloning the existing GitHub repository into your Azure Repos. Inside the repository, you’ll find the azure-pipeline.yaml file, which defines the Azure DevOps CI/CD pipeline consisting of the following four stages:
This pipeline automates the end-to-end ML workflow from raw data to production deployment.
The CI/CD pipeline is already defined in the existing YAML file and is configured to run manually based on the parameters specified at runtime.
This pipeline is manually triggered (no automatic trigger on commits or pull requests) and supports the conditional execution of specific stages using parameters.
It consists of four stages, each representing a step in the MLOps lifecycle:
Condition: Runs if run_all or run_data_processing is set to true.
Depends on: DataProcessing
Condition: Runs if run_all or run_model_training is set to true.
Depends on: ModelTraining
Condition: Runs if run_all or run_build_and_publish is set to true.
Depends on: BuildAndPublish
Condition: Runs only if the previous stages succeed.
Both deployment and service YAML files for these components are already present in the k8s/ folder and will be used for deploying to Azure Kubernetes Service (AKS).
In short, it deploys the Streamlit frontend and makes it publicly accessible while connecting it to the FastAPI backend for predictions.
In short, it runs the backend API in Kubernetes and makes it accessible for predictions.
Now it’s time for the final run to verify the deployment on the AKS cluster. Trigger the pipeline by selecting the run_all parameter.
After the pipeline completes successfully, all four stages and their corresponding jobs will be executed, confirming that the application has been successfully deployed to the AKS cluster.
Now, log in to the Azure portal and retrieve the external IP address of the Streamlit app service. Once accessed in your browser, you’ll see the House Price Prediction Streamlit application up and running.
Now, go ahead and perform model inference by selecting the appropriate parameter values and clicking on “Predict Price” to see how the model generates the prediction.
In this blog, we explored the fundamentals of MLOps and how it bridges the gap between machine learning development and scalable, production-ready deployment. We walked through a complete MLOps workflow—from data processing and feature engineering to model training, packaging, and deployment—using modern tools like FastAPI, Streamlit, and MLflow.
Using Azure DevOps, we implemented a robust CI/CD pipeline to automate each step of the ML lifecycle. Finally, we deployed the complete House Price Predictor application on an Azure Kubernetes Service (AKS) cluster, enabling a user-friendly frontend (Streamlit) to interact seamlessly with a predictive backend (FastAPI).
This end-to-end project not only showcases how MLOps principles can be applied in real-world scenarios but also provides a strong foundation for deploying scalable and maintainable ML solutions in production.
Let’s be honest – coding isn’t always easy. Some days, you’re laser-focused, knocking out feature after feature. Other days, you stare at your screen, wondering,
“What’s the fastest way to write this function?”
“Is there a cleaner way to loop through this data?”
That’s where GitHub Copilot comes in.
If you haven’t tried it yet, you’re seriously missing out on one of the biggest productivity boosters available to developers today. In this blog, I’ll walk you through how to use GitHub Copilot with Visual Studio Code (VS Code), share my personal experience, and help you decide if it’s worth adding to your workflow.
Think of GitHub Copilot as your AI pair programmer.
It’s trained on billions of lines of public code from GitHub repositories and can:
It’s like having a coding buddy that never sleeps, doesn’t get tired, and is always ready to assist.
Getting started is easy. Here’s a step-by-step guide:
If you don’t have VS Code installed yet, you can install it from here.
Or directly visit here to find the extension.
After installing, you’ll be prompted to sign in using your GitHub account.
Note: GitHub Copilot is a paid service (currently), but there’s usually a free trial to test it out.
Once set up, Copilot starts making suggestions as you code. It’s kind of magical.
Here’s how it typically works:
// Function to reverse a string
Copilot will automatically generate the function for you!
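For example, after typing that comment you might get a completion along these lines (the exact suggestion varies from session to session):

// Function to reverse a string
function reverseString(str: string): string {
  return str.split('').reverse().join('');
}

console.log(reverseString('Copilot')); // "tolipoC"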
Press Tab to accept a suggestion, or use Alt + [ / Alt + ] to browse different options.

Here’s how I personally use Copilot in my day-to-day coding:
Use Case | Why I Use Copilot |
---|---|
Boilerplate Code | Saves time writing repetitive patterns |
API Calls | Auto-completes fetch or axios calls quickly |
Learning New Syntax | Helps with unfamiliar frameworks like Rust or Go |
Unit Tests | Suggests test cases faster than starting from scratch |
Regular Expressions | Generates regex patterns (saves Googling!) |
Short answer? No.
Copilot is a tool, not a replacement for developers.
It speeds up the boring parts, but:
Think of Copilot as an assistant, not a boss. It helps you code faster, but you’re still in charge of the logic and creativity.
If you’re someone who:
Then GitHub Copilot is absolutely worth trying out.
Personally, I’ve found it to be a game-changer for productivity. It doesn’t write all my code, but it takes away the mental fatigue of boilerplate so I can focus on solving real problems.
GitHub Copilot is an AI-powered programming assistant that assists developers by generating code based on their comments and existing code. Now natively integrated into Visual Studio 2022, it supports multiple languages, including C#.
After installation, sign in with your GitHub account that has Copilot access. You’ll see a GitHub Copilot icon in the top-right corner of the IDE indicating its status.
If GitHub Copilot is installed but in an inactive state, it may be because:
Or…
Once signed in, the Copilot status icon will change to indicate it’s active.
Step 1: Add a comment, e.g.
// Prompt: Write a method to calculate the factorial of a number using recursion
Step 2: GitHub Copilot will suggest code based on your comment after a few seconds.
Step 3: Accept or Modify
Please watch the video below, which shows how to use GitHub Copilot in C#.
You can even prompt GitHub Copilot to generate unit tests:
// Unit test for Factorial method using MSTest
GitHub Copilot will suggest:

[TestMethod]
public void TestFactorial()
{
    var calc = new Calculator();
    Assert.AreEqual(120, calc.Factorial(5));
}
If we need a list of customers from the database with some filters, we can prompt like this:
// Method to get active customers from a specific city using DbContext
GitHub Copilot will generate something like the sample code below:
public class CustomerService
{
    private readonly AppDbContext _context;

    public CustomerService(AppDbContext context)
    {
        _context = context;
    }

    public async Task<List<Customer>> GetActiveCustomersByCityAsync(string cityName)
    {
        return await _context.Customers
            .Where(c => c.IsActive && c.City == cityName)
            .OrderBy(c => c.LastName)
            .ToListAsync();
    }
}
For C# developers, GitHub Copilot in Visual Studio 2022 is revolutionary. It decreases complexity and increases productivity in everything from creating methods to writing tests. Experience the future of AI-assisted development by giving it a try.
Running a Sitecore Docker instance is a game-changer for developers. It streamlines deployments, accelerates local setup, and ensures consistency across environments. However, performance can suffer – even on high-end laptops – if Docker resources aren’t properly optimized, especially after a hardware upgrade.
I recently faced this exact issue. My Sitecore XP0 instance, running on Docker, became noticeably sluggish after I upgraded my laptop. Pages loaded slowly, publishing dragged on forever, and SQL queries timed out.
The good news? The fix was surprisingly simple: allocate more memory to the proper containers using docker-compose.override.yml
After the upgrade, I noticed:
At first, this was puzzling because my new laptop had better specs. However, I then realized that Docker was still running with outdated memory limits for containers. By default, these limits are often too low for heavy workloads, such as Sitecore.
Docker containers run with memory constraints either from:
When memory is too low, Sitecore roles such as CM and MSSQL can’t perform optimally. They need significant RAM for caching, pipelines, and database operations.
To fix the issue, I updated the memory allocation for key containers (mssql and cm) in the docker-compose.override.yml file.
Here’s what I did:
mssql:
  mem_limit: 2G
mssql:
  mem_limit: 4GB
cm:
  image: ${REGISTRY}${COMPOSE_PROJECT_NAME}-xp0-cm:${VERSION:-latest}
  build:
    context: ./build/cm
    args:
      BASE_IMAGE: ${SITECORE_DOCKER_REGISTRY}sitecore-xp0-cm:${SITECORE_VERSION}
      SPE_IMAGE: ${SITECORE_MODULE_REGISTRY}sitecore-spe-assets:${SPE_VERSION}
      SXA_IMAGE: ${SITECORE_MODULE_REGISTRY}sitecore-sxa-xp1-assets:${SXA_VERSION}
      TOOLING_IMAGE: ${SITECORE_TOOLS_REGISTRY}sitecore-docker-tools-assets:${TOOLS_VERSION}
      SOLUTION_IMAGE: ${REGISTRY}${COMPOSE_PROJECT_NAME}-solution:${VERSION:-latest}
      HORIZON_RESOURCES_IMAGE: ${SITECORE_MODULE_REGISTRY}horizon-integration-xp0-assets:${HORIZON_ASSET_VERSION}
  depends_on:
    - solution
  mem_limit: 8GB
  volumes:
    - ${LOCAL_DEPLOY_PATH}\platform:C:\deploy
    - ${LOCAL_DATA_PATH}\cm:C:\inetpub\wwwroot\App_Data\logs
    - ${HOST_LICENSE_FOLDER}:c:\license
    - ${LOCAL_ITEM_PATH}:c:\items-mounted
docker-compose.override.yml
mssql → 4GB
cm → 8GB
docker compose down
docker compose up --build -d
docker stats
After increasing memory:
Sitecore roles (especially CM) and SQL Server are memory-hungry. If Docker allocates too little memory:
By increasing memory:
A simple tweak in docker-compose.override.yml can drastically improve your Sitecore Docker instance performance. If your Sitecore CM is sluggish or SQL queries are slow, try increasing the memory limit for critical containers.
In this post I’d like to share a workflow “attacher” implementation I built on a recent Sitecore XM Cloud project. The solution attaches workflows to new items based on a configurable list of template and path rules. It was fun to build and ended up involving a couple of Sitecore development mechanisms I hadn’t used in a while:
This implementation provided our client with a semi-extensible way of attaching workflows to items without writing any additional code themselves. “But, Nick, Sitecore already supports attaching workflows to items, why write any custom code to do this?” Great question.
The go-to method of attaching workflows to new items in Sitecore is to set the workflow fields on Standard Values for the template(s) in question. For example, on a Page template in a headless site called Contoso (/sitecore/templates/Project/Contoso/Page/__Standard Values). This is documented in the Accelerate Cookbook for XM Cloud here. Each time a new page is created using that template, the workflow is associated to (and is usually started on) the new page.
Setting workflow Standard Values fields on site-specific or otherwise custom templates is one thing, but what about on out-of-the-box (OOTB) templates like media templates? On this particular project, there was a requirement to attach a custom workflow to any new versioned media items.
I didn’t want to edit Standard Values on any of the media templates that ship with Sitecore. However unlikely, those templates could change in a future Sitecore version. Also, worrying about configuring Sitecore to treat any new, custom media templates in the same way as the OOTB media templates just felt like a bridge too far.
I thought it would be better to “listen” for new media items being created and then check to see if a workflow should be attached to the new item or not. And, ideally, it would be configurable and would allow the client’s technical resources to enumerate one or more workflow “attachments,” each independently configurable to point to a specific workflow, one or more templates, and one or more paths.
Disclaimer: Okay, real talk for a second. Before I describe the solution, broadly speaking, developers should try to avoid customizing the XM Cloud content management (CM) instance altogether. This is briefly mentioned in the Accelerate Cookbook for XM Cloud here. The less custom code deployed to the CM the better; that means fewer points of failure, better performance, more expedient support ticket resolution, etc. As Robert Galanakis once wrote, “The fastest code is the code which does not run. The code easiest to maintain is the code that was never written.”
With that out of the way, in the real world of enterprise XM Cloud solutions, you may find yourself building customizations. In the case of this project, I didn’t want to commit to the added overhead and complexity of building out custom media templates, wiring them up in Sitecore, etc., so I instead built a configurable workflow attachment mechanism to allow technical resources to enumerate which workflows should start on which items based on the item’s template and some path filters.
Assuming it’s enabled and not otherwise bypassed, the addFromTemplate pipeline processor is invoked when an item is created using a template, regardless of where or how the item was created. For example:
In years past, the item:added event handler may have been used in similar situations; however, it isn’t as robust and doesn’t fire as consistently given all the different ways an item can be created in Sitecore.
To implement an addFromTemplate pipeline processor, developers implement a class inheriting from AddFromTemplateProcessor (via Sitecore.Pipelines.ItemProvider.AddFromTemplate). Here’s the implementation for the workflow attacher:
using Contoso.Platform.Extensions;
using Sitecore.Pipelines.ItemProvider.AddFromTemplate;
...

namespace Contoso.Platform.Workflow
{
    public class AddFromTemplateGenericWorkflowAttacher : AddFromTemplateProcessor
    {
        private List<WorkflowAttachment> WorkflowAttachments = new List<WorkflowAttachment>();

        public void AddWorkflowAttachment(XmlNode node)
        {
            var attachment = new WorkflowAttachment(node);
            if (attachment != null)
            {
                WorkflowAttachments.Add(attachment);
            }
        }

        public override void Process(AddFromTemplateArgs args)
        {
            try
            {
                Assert.ArgumentNotNull(args, nameof(args));

                if (args.Aborted || args.Destination.Database.Name != "master")
                {
                    return;
                }

                // default to previously resolved item, if available
                Item newItem = args.ProcessorItem?.InnerItem;

                // use previously resolved item, if available
                if (newItem == null)
                {
                    try
                    {
                        Assert.IsNotNull(args.FallbackProvider, "Fallback provider is null");

                        // use the "base case" (the default implementation) to create the item
                        newItem = args.FallbackProvider.AddFromTemplate(args.ItemName, args.TemplateId, args.Destination, args.NewId);
                        if (newItem == null)
                        {
                            return;
                        }

                        // set the newly created item as the result and downstream processor item
                        args.ProcessorItem = args.Result = newItem;
                    }
                    catch (Exception ex)
                    {
                        Log.Error($"{nameof(AddFromTemplateGenericWorkflowAttacher)} failed. Removing partially created item, if it exists", ex, this);

                        var item = args.Destination.Database.GetItem(args.NewId);
                        item?.Delete();

                        throw;
                    }
                }

                // iterate through the configured workflow attachments
                foreach (var workflowAttachment in WorkflowAttachments)
                {
                    if (workflowAttachment.ShouldAttachToItem(newItem))
                    {
                        AttachAndStartWorkflow(newItem, workflowAttachment.WorkflowId);

                        // an item can only be in one workflow at a time
                        break;
                    }
                }
            }
            catch (Exception ex)
            {
                Log.Error($"There was a processing error in {nameof(AddFromTemplateGenericWorkflowAttacher)}.", ex, this);
            }
        }

        private void AttachAndStartWorkflow(Item item, string workflowId)
        {
            item.Editing.BeginEdit();

            // set default workflow
            item.Fields[Sitecore.FieldIDs.DefaultWorkflow].Value = workflowId;

            // set workflow
            item.Fields[Sitecore.FieldIDs.Workflow].Value = workflowId;

            // start workflow
            var workflow = item.Database.WorkflowProvider.GetWorkflow(workflowId);
            workflow.Start(item);

            item.Editing.EndEdit();
        }
    }
}
Notes:
The implementation of the ShouldAttachToItem() extension method is as follows:
...
namespace Contoso.Platform
{
    public static class Extensions
    {
        ...

        public static bool ShouldAttachToItem(this WorkflowAttachment workflowAttachment, Item item)
        {
            if (item == null)
                return false;

            // check exclusion filters
            if (workflowAttachment.PathExclusionFilters.Any(exclusionFilter => item.Paths.FullPath.IndexOf(exclusionFilter, StringComparison.OrdinalIgnoreCase) > -1))
                return false;

            // check inclusion filters
            if (workflowAttachment.PathFilters.Any() && !workflowAttachment.PathFilters.Any(includeFilter => item.Paths.FullPath.StartsWith(includeFilter, StringComparison.OrdinalIgnoreCase)))
                return false;

            var newItemTemplate = TemplateManager.GetTemplate(item);

            // check for template match or template inheritance
            return workflowAttachment.TemplateIds.Any(id => ID.TryParse(id, out ID templateId) && (templateId.Equals(item.TemplateID) || newItemTemplate.InheritsFrom(templateId)));
        }
    }
    ...
}
Notes:
Here’s the WorkflowAttachment POCO that defines the workflow attachment object and facilitates the Sitecore configuration factory’s initialization of objects:
using Sitecore.Diagnostics;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

namespace Contoso.Platform.Workflow
{
    public class WorkflowAttachment
    {
        public string WorkflowId { get; set; }

        public List<string> TemplateIds { get; set; }

        public List<string> PathFilters { get; set; }

        public List<string> PathExclusionFilters { get; set; }

        public WorkflowAttachment(XmlNode workflowAttachmentNode)
        {
            TemplateIds = new List<string>();
            PathFilters = new List<string>();
            PathExclusionFilters = new List<string>();

            if (workflowAttachmentNode == null)
                throw new ArgumentNullException(nameof(workflowAttachmentNode), $"The workflow attachment configuration node is null; unable to create {nameof(WorkflowAttachment)} object.");

            // parse nodes
            foreach (XmlNode childNode in workflowAttachmentNode.ChildNodes)
            {
                if (childNode.NodeType != XmlNodeType.Comment)
                    ParseNode(childNode);
            }

            // validate
            Assert.IsFalse(string.IsNullOrWhiteSpace(WorkflowId), $"{nameof(WorkflowId)} must not be null or whitespace.");
            Assert.IsTrue(TemplateIds.Any(), "The workflow attachment must enumerate at least one (1) template ID.");
        }

        private void ParseNode(XmlNode node)
        {
            switch (node.LocalName)
            {
                case "workflowId":
                    WorkflowId = node.InnerText;
                    break;
                case "templateIds":
                    foreach (XmlNode childNode in node.ChildNodes)
                    {
                        if (childNode.NodeType != XmlNodeType.Comment)
                            TemplateIds.Add(childNode.InnerText);
                    }
                    break;
                case "pathFilters":
                    foreach (XmlNode childNode in node.ChildNodes)
                    {
                        if (childNode.NodeType != XmlNodeType.Comment)
                            PathFilters.Add(childNode.InnerText);
                    }
                    break;
                case "pathExclusionFilters":
                    foreach (XmlNode childNode in node.ChildNodes)
                    {
                        if (childNode.NodeType != XmlNodeType.Comment)
                            PathExclusionFilters.Add(childNode.InnerText);
                    }
                    break;
                default:
                    break;
            }
        }
    }
}
The following patch configuration file is defined to A. wire-up the addFromTemplate pipeline processor and B. describe the various workflow attachments. In the sample file below, for brevity, there’s only one (1) attachment defined, but multiple attachments are supported.
<configuration>
  <sitecore>
    ...
    <pipelines>
      <group name="itemProvider" groupName="itemProvider">
        <pipelines>
          <addFromTemplate>
            <processor type="Contoso.Platform.Workflow.AddFromTemplateGenericWorkflowAttacher, Contoso.Platform" mode="on">
              <!-- Contoso Media Workflow attachment for versioned media items and media folders -->
              <workflowAttachmentDefinition hint="raw:AddWorkflowAttachment">
                <workflowAttachment>
                  <!-- /sitecore/system/Workflows/Contoso Media Workflow -->
                  <workflowId>{88839366-409A-4E57-86A4-167150ED5559}</workflowId>
                  <templateIds>
                    <!-- /sitecore/templates/System/Media/Versioned/File -->
                    <templateId>{611933AC-CE0C-4DDC-9683-F830232DB150}</templateId>
                    <!-- /sitecore/templates/System/Media/Media folder -->
                    <templateId>{FE5DD826-48C6-436D-B87A-7C4210C7413B}</templateId>
                  </templateIds>
                  <pathFilters>
                    <!-- Contoso Media Library Folder -->
                    <pathFilter>/sitecore/media library/Project/Contoso</pathFilter>
                  </pathFilters>
                  <pathExclusionFilters>
                    <pathExclusionFilter>/sitecore/media library/System</pathExclusionFilter>
                    <pathExclusionFilter>/Sitemap</pathExclusionFilter>
                    <pathExclusionFilter>/Sitemaps</pathExclusionFilter>
                    <pathExclusionFilter>/System</pathExclusionFilter>
                    <pathExclusionFilter>/_System</pathExclusionFilter>
                  </pathExclusionFilters>
                </workflowAttachment>
              </workflowAttachmentDefinition>
              ...
            </processor>
          </addFromTemplate>
        </pipelines>
      </group>
    </pipelines>
    ...
  </sitecore>
</configuration>
Notes:
While certainly not a one-size-fits-all solution, this approach was a good fit for this particular project considering the requirements and a general reticence for modifying Standard Values on OOTB Sitecore templates. Here are some pros and cons for this solution:
Pros
Cons
Takeaways:
Thanks for the read!
This is Part 3 of a three-part series (links at the bottom).
In Part Two, we moved from concept to execution by building the foundation of a Retrieval‑Augmented Generation (RAG) system. We set up a Postgres database with pgvector, defined a schema, wrote a script to embed and chunk text, and validated vector search with cosine similarity.
In this final installment, we’ll build a Next.js chatbot interface that streams GPT‑4 responses powered by your indexed content and demonstrates how to use GPT‑4 function‑calling (“tool calling”) for type‑safe, server‑side operations. Along the way, we’ll integrate polished components from shadcn UI to level‑up the front‑end experience.
Prerequisite: You should already have the rag‑chatbot‑demo repo from Parts 1 & 2, with Dockerised PostgreSQL 17 + pgvector, the `content_chunks` schema, and embeddings ingested. Link to the repo here.
By the end of this guide you will:
If you already have the project folder from earlier parts, skip straight to Install Dependencies.
Before diving into implementation, it’s helpful to understand what tool calling is and why it matters for a robust RAG-based chatbot.
Tool calling lets your LLM not only generate free-form text but also invoke predefined functions, or “tools”, with strictly validated arguments. By exposing only a controlled set of server-side capabilities (for example, looking up the current time, querying an external API, or managing user sessions), you:
In our setup, we register each tool with a name, description, and a Zod schema that describes the allowed parameters. When GPT-4 decides to call a tool, the AI SDK intercepts that intent, validates the arguments against the Zod schema, runs the tool’s `execute` function on the server, and then feeds the result back into the model’s next generation step. This orchestration happens entirely within the streaming response, so the user sees a seamless, conversational experience even when live data or actions are involved.
npm install ai @ai-sdk/openai @openai/agents zod shadcn-ui pg
Package | Purpose |
---|---|
`@openai/agents` | Registers functions as callable tools |
`zod` | Runtime schema validation |
`shadcn-ui` | Tailwind‑friendly React components |
`ai` & `@ai-sdk/openai` | Manage LLM calls & streaming |
`pg` | PostgreSQL client |
Initialise shadcn UI and select a few components:
npx shadcn@latest init
npx shadcn@latest add button input card scroll-area
Create a vectorSearch tool with the AI SDK’s `tool()` helper that embeds user queries, searches your Postgres vector store, and returns results:
// tools/vectorSearch.ts
import { embed, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
import { Pool } from 'pg';
const db = new Pool({ connectionString: process.env.DATABASE_URL });
// Define the vector search tool
export const vectorSearchTool = tool({
description: 'Search for relevant information in the knowledge base',
parameters: z.object({
query: z.string().describe('The search query to find relevant information'),
}),
execute: async ({ query }) => {
console.log('Searching for:', query);
// Embed the search query
const { embedding: qVec } = await embed({
model: openai.embedding('text-embedding-3-small'),
value: query,
});
const qVecString = `[${qVec.join(',')}]`;
// Retrieve top-5 most similar chunks
const { rows } = await db.query<{ content: string; source: string }>(
`SELECT content, source
FROM content_chunks
ORDER BY embedding <=> $1
LIMIT 5`,
[qVecString]
);
const results = rows.map((r, i) => ({
content: r.content,
source: r.source,
rank: i + 1,
}));
return { results };
},
});
Modify `app/api/chat/route.ts` to register the tool and let the model decide when to call it:
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { NextRequest } from 'next/server';
import { vectorSearchTool } from '@/tools/vectorSearch';
export const POST = async (req: NextRequest) => {
const { messages } = await req.json();
const systemMsg = {
role: 'system',
content: `You are a helpful support assistant.
When users ask questions, use the vector search tool to find relevant information from the knowledge base.
Base your answers on the search results.
Always provide a response after using the tool.
If the user asks a question that is not related to the knowledge base, say that you are not sure about the answer.`,
};
try {
// Stream GPT-4's response with tool calling
const result = streamText({
model: openai('gpt-4.1'),
messages: [systemMsg, ...messages],
tools: {
vectorSearch: vectorSearchTool,
},
maxSteps: 5, // Allow multiple tool calls and responses
});
return result.toDataStreamResponse();
} catch (error) {
console.error('Error in chat API:', error);
return new Response('Internal Server Error', { status: 500 });
}
};
Create `app/chat/page.tsx`, selectively importing Shadcn components and wiring up `useChat`:
'use client';
import { useChat } from '@ai-sdk/react';
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
import { ScrollArea } from '@/components/ui/scroll-area';
import { Input } from '@/components/ui/input';
import { Button } from '@/components/ui/button';
import { useState } from 'react';
export default function Chat() {
const { messages, input, handleInputChange, handleSubmit } = useChat({
api: '/api/chat',
});
const customHandleSubmit = async (e: React.FormEvent) => {
e.preventDefault();
await handleSubmit(e); // Call the handleSubmit from useChat
};
const renderMessage = (message: any, index: number) => {
const isUser = message.role === 'user';
const hasToolInvocations = message.toolInvocations && message.toolInvocations.length > 0;
return (
<div className={`mb-4 ${isUser ? 'text-right' : 'text-left'}`}>
<div className={`inline-block p-2 rounded-lg ${isUser ? 'bg-primary text-primary-foreground' : 'bg-muted'}`}>{message.content}</div>
{/* Debug section for tool calls */}
{!isUser && hasToolInvocations && <ToolCallDebugSection toolInvocations={message.toolInvocations} />}
</div>
);
};
return (
<Card className="w-full max-w-2xl mx-auto">
<CardHeader>
<CardTitle>Chat with AI</CardTitle>
</CardHeader>
<CardContent>
<ScrollArea className="h-[60vh] mb-4 p-4 border rounded">
{messages.map((message, index) => (
<div key={index}>{renderMessage(message, index)}</div>
))}
</ScrollArea>
<form onSubmit={customHandleSubmit} className="flex space-x-2">
<Input type="text" value={input} onChange={handleInputChange} placeholder="Type your message here..." className="flex-1" />
<Button type="submit">Send</Button>
</form>
</CardContent>
</Card>
);
}
function ToolCallDebugSection({ toolInvocations }: { toolInvocations: any[] }) {
const [isExpanded, setIsExpanded] = useState(false);
return (
<div className="mt-2 text-left">
<button onClick={() => setIsExpanded(!isExpanded)} className="text-xs text-gray-500 hover:text-gray-700 flex items-center gap-1">
<span>{isExpanded ? '▼' : '▶'}</span>
<span>Debug: Tool calls ({toolInvocations.length})</span>
</button>
{isExpanded && (
<div className="mt-2 space-y-2 text-xs bg-gray-50 dark:bg-gray-900 p-2 rounded border">
{toolInvocations.map((tool: any, index: number) => (
<div key={index} className="bg-white dark:bg-gray-800 p-2 rounded border">
<div className="font-semibold text-blue-600 dark:text-blue-400 mb-1">🔧 {tool.toolName}</div>
<div className="text-gray-600 dark:text-gray-300 mb-2">
<strong>Query:</strong> {tool.args?.query}
</div>
{tool.result && (
<div>
<div className="font-semibold text-green-600 dark:text-green-400 mb-1">Results:</div>
<div className="space-y-1 max-h-32 overflow-y-auto">
{tool.result.results?.map((result: any, idx: number) => (
<div key={idx} className="bg-gray-100 dark:bg-gray-700 p-1 rounded">
<div className="text-gray-800 dark:text-gray-200 text-xs">{result.content}</div>
<div className="text-gray-500 text-xs mt-1">
Source: {result.source} | Rank: {result.rank}
</div>
</div>
))}
</div>
</div>
)}
</div>
))}
</div>
)}
</div>
);
}
Your chatbot now features type‑safe tool calling, a vector‑powered knowledge base, and a refined shadcn UI front‑end—ready for users.
Part 1: Vector Search Embeddings and RAG
Part 2: Postgres RAG Stack: Embedding, Chunking & Vector Search
This blog post will discuss MultiSite validation for either ContentArea or LinkItemCollection, which are both capable of storing multiple items. Although we can use the custom MaxItem attribute to validate the ContentArea or LinkItemCollection, the problem arises when the same property is used for multiple sites with different validation limits.
In a recent project, we were tasked with migrating multiple websites into a single platform using Optimizely. These sites shared common ContentTypes wherever applicable, though their behavior varied slightly depending on the site.
One of the main challenges involved a ContentType used as the StartPage with the same properties across different sites. While the structure remained the same, the validation rules for its properties differed based on the specific site requirements. A common issue was enforcing a maximum item validation limit on a property like a ContentArea, where each site had a different limit—for example, Site A allowed a maximum of 3 items, while Sites B and C allowed 4 and 5 items, respectively.
To solve this multisite validation scenario, we implemented a custom validation attribute that dynamically validated the maximum item limit based on the current site context.
Below are the steps we followed to achieve this.
[AttributeUsage(AttributeTargets.Property, AllowMultiple = true)]
public class MaxItemsBySitesAttribute : ValidationAttribute
{
    private readonly string[] _siteName;
    private int _max;

    public MaxItemsBySitesAttribute(int max, params string[] siteName)
    {
        _max = max;
        _siteName = siteName;
    }

    protected override ValidationResult IsValid(object value, ValidationContext validationContext)
    {
        var siteSpecificLimit = GetSiteSpecificMaxLimitByFieldName(validationContext.MemberName, validationContext?.ObjectType?.BaseType);
        string errorMsg = $"{validationContext.DisplayName}, exceeds the maximum limit of {siteSpecificLimit} items for site {SiteDefinition.Current.Name}";

        if (value is ContentArea contentArea)
        {
            if (contentArea.Count > siteSpecificLimit)
            {
                return new ValidationResult(errorMsg);
            }
        }
        else if (value is LinkItemCollection linkItems)
        {
            if (linkItems.Count > siteSpecificLimit)
            {
                return new ValidationResult(errorMsg);
            }
        }

        return ValidationResult.Success;
    }

    private int GetSiteSpecificMaxLimitByFieldName(string fieldName, Type type)
    {
        var propertyInfo = type.GetProperty(fieldName);
        if (propertyInfo != null)
        {
            var attributes = propertyInfo.GetCustomAttributes<MaxItemsBySitesAttribute>()?.ToList();
            var siteMaxLimit = attributes.FirstOrDefault(x => x._siteName != null && x._siteName.Any(site => site == SiteDefinition.Current.Name));
            return siteMaxLimit == null ? 0 : siteMaxLimit._max;
        }
        return 0;
    }
}
public class StartPage : SitePageData
{
    [Display(
        GroupName = SystemTabNames.Content,
        Order = 320)]
    [CultureSpecific]
    [MaxItemsBySites(2, "AlloyBlog")]
    [MaxItemsBySites(3, "AlloyDemo", "AlloyEvents")]
    public virtual ContentArea MainContentArea { get; set; }
}
By implementing site-specific maximum item validation in your Optimizely CMS multisite solution, content authors can ensure content consistency, enhance user experience, and maintain precise control over content areas and link collections across diverse properties in different sites.
If you need other site-specific validations, you can use the same approach and adjust the code accordingly.
If you’re using Sitecore Docker containers on Windows, you’ve probably noticed your disk space mysteriously shrinking over time. I recently encountered this issue myself and was surprised to discover the culprit: orphaned Docker layers – leftover chunks of data that no longer belong to any container or image.
This happened while I was working with Sitecore XP 10.2 in a Dockerized environment. After several rounds of running 'docker-compose up' and rebuilding custom images, Docker started hoarding storage, and the usual 'docker system prune' didn’t fully resolve the issue.
That’s when I stumbled upon a great blog post by Vikrant Punwatkar: Regain disk space occupied by Docker
Inspired by his approach, I automated the cleanup process with PowerShell, and it worked like a charm. Let me walk you through it.
Docker uses layers to build and manage images. Over time, when images are rebuilt or containers removed, some layers are left behind. These “orphan” layers hang around in your system, specifically under:
C:\ProgramData\Docker\windowsfilter
They’re not in use, but they still consume gigabytes of space. If you’re working with large containers, such as Sitecore’s, these can add up quickly.
I broke the cleanup process into two simple scripts:
This script compares the layers used by active images and containers against what’s actually on your disk. Anything extra is flagged as an orphan. You can choose to rename those orphan folders (we add -removing at the end) for safe deletion.
A. Download the PowerShell script and execute it (as Administrator) with the -RenameOrphanLayers parameter
B. To Run:
.\Find-OrphanDockerLayers.ps1 -RenameOrphanLayers
C. Sample Output:
WARNING: YOUR-PC - Found orphan layer: C:\ProgramData\Docker\windowsfilter\abc123 with size: 500 MB
...
YOUR-PC - Layers on disk: 130
YOUR-PC - Image layers: 90
YOUR-PC - Container layers: 15
WARNING: YOUR-PC - Found 25 orphan layers with total size 4.8 GB
This provides a clear picture of the space you can recover.
Stop Docker completely first using the below PowerShell command, or you can manually stop the Docker services:
Stop-Service docker
Once you’ve renamed the orphan layers, this second script deletes them safely. It first fixes folder permissions using takeown and icacls, which are crucial for system directories like these.
A. Download the PowerShell script and execute (as Administrator)
B. To Run:
.\Delete-OrphanDockerLayers.ps1
C. Sample Output:
Fixing permissions and deleting: C:\ProgramData\Docker\windowsfilter\abc123-removing ...
Simple and effective — no manual folder browsing or permission headaches.
After running these scripts, I was able to recover multiple gigabytes of storage, and you’ll definitely benefit from this cleanup. If you’re frequently working with:
Huge thanks to Vikrant Punwatkar for the original idea and guidance. His blog post was the foundation for this automated approach.
Check out his post here: Regain disk space occupied by Docker
If your Docker setup is bloated and space is mysteriously disappearing, try this approach. It’s quick, safe, and makes a noticeable difference – especially on Windows, where Docker’s cleanup isn’t always as aggressive as we’d like.
Have you tried it? Got a different solution? Feel free to share your thoughts or suggestions for improvement.
This is Part 2 of a three-part series (links at the bottom). The GitHub repo can be checked out here.
Postgres RAG Stack brings together Postgres, pgVector, and TypeScript to power fast, semantic search. In Part One, we covered the theory behind semantic search: how embeddings convert meaning into vectors, how vector databases and indexes enable fast similarity search, and how RAG combines retrieval with language models for grounded, accurate responses. In this guide, you’ll scaffold your project, set up Docker with pgVector, and build ingestion and query scripts for embedding and chunking your data.
Now we will begin setting up the foundation for our RAG application:
A `content_chunks` table with an HNSW index

mkdir rag-chatbot-demo && cd rag-chatbot-demo
To create a chatbot using Next.js, scaffold now to avoid conflicts. Skip if you only need the RAG basics:
npx create-next-app@latest . \
  --typescript \
  --app \
  --tailwind \
  --eslint \
  --import-alias "@/*"
mkdir -p scripts postgres input
docker-compose.yml
# ./docker-compose.yml
services:
  db:
    image: pgvector/pgvector:pg17
    container_name: rag-chatbot-demo
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
      POSTGRES_DB: ragchatbot_db
    ports:
      - '5432:5432'
    volumes:
      - ./pgdata:/var/lib/postgresql/data
      - ./postgres/schema.sql:/docker-entrypoint-initdb.d/schema.sql
volumes:
  pgdata:
Add a SQL file that Docker runs automatically on first boot:
-- ./postgres/schema.sql
-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE content_chunks (
  id bigserial PRIMARY KEY,
  content text,
  embedding vector(1536),          -- OpenAI text-embedding-3-small
  source text,                     -- optional file name, URL, etc.
  added_at timestamptz DEFAULT now()
);

-- High-recall ANN index for cosine similarity
-- Note: Adding index before inserting data slows down the insert process
CREATE INDEX ON content_chunks USING hnsw (embedding vector_cosine_ops);
After creating the schema file, start the container:
docker compose up -d
Create a `.env` file. In the project root, add:
# .env
DATABASE_URL=postgresql://postgres:password@localhost:5432/ragchatbot_db

# Get your key from https://platform.openai.com/account/api-keys
OPENAI_API_KEY=your-openai-key-here
Create `input/data.txt` with sample documentation or FAQs. Download the full file here.
# AcmeCorp Subscription Guide

## How do I renew my plan?
Log in to your dashboard, select Billing → Renew, and confirm payment. Your new cycle starts immediately.

## How can I cancel my subscription?
Navigate to Billing → Cancel Plan. Your access remains active until the end of the current billing period.

## Do you offer student discounts?
Yes. Email support@acmecorp.com with proof of enrollment to receive a 25% discount code.

---

# Troubleshooting Connectivity

## The app cannot reach the server
Check your internet connection and verify the service URL in Settings → API Host.
npm install pg langchain ai @ai-sdk/openai dotenv
Dependencies:
scripts/embed.ts
This script reads text, splits it into chunks, generates embeddings via OpenAI, and stores them in `content_chunks`:
// scripts/embed.ts
import 'dotenv/config';
import fs from 'node:fs';
import path from 'node:path';
import { Pool } from 'pg';
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
import { openai } from '@ai-sdk/openai'; // OpenAI adapter
import { embedMany } from 'ai'; // generic AI interface

const BATCH_SIZE = 50;
const MAX_CHUNK_LENGTH = 512; // max characters per chunk

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

/**
 * Ingest a plain-text Q&A file where each line is either a question or an answer.
 * Splits on single newlines; if a line exceeds MAX_CHUNK_LENGTH, it is further
 * chunked by RecursiveCharacterTextSplitter.
 */
async function ingest(file: string) {
  const raw = fs.readFileSync(file, 'utf8');
  console.log(`Loaded ${file}`);

  // Split into lines, drop empty lines
  const lines = raw
    .split(/\r?\n\s*\r?\n/)
    .map((l) => l.trim())
    .filter(Boolean);

  // Prepare overflow splitter for any long lines
  const overflowSplitter = new RecursiveCharacterTextSplitter({
    chunkSize: MAX_CHUNK_LENGTH,
    chunkOverlap: 50,
  });

  // Build final list of chunks
  const chunks: string[] = [];
  for (const line of lines) {
    if (line.length <= MAX_CHUNK_LENGTH) {
      chunks.push(line);
    } else {
      // Further split long lines into smaller chunks if needed
      const sub = await overflowSplitter.splitText(line);
      chunks.push(...sub);
    }
  }

  console.log(`Processing ${chunks.length} chunks in batches of ${BATCH_SIZE}`);

  // Process chunks in batches using embedMany
  for (let i = 0; i < chunks.length; i += BATCH_SIZE) {
    const batch = chunks.slice(i, i + BATCH_SIZE);
    console.log(`Processing batch ${Math.floor(i / BATCH_SIZE) + 1}/${Math.ceil(chunks.length / BATCH_SIZE)}`);

    // Embed the entire batch at once
    const { embeddings } = await embedMany({
      model: openai.embedding('text-embedding-3-small'),
      values: batch,
    });

    // Insert all embeddings from this batch into the database
    for (let j = 0; j < batch.length; j++) {
      const chunk = batch[j];
      const embedding = embeddings[j];
      const vectorString = `[${embedding.join(',')}]`;
      console.log(`Inserting chunk ${i + j + 1}/${chunks.length}: ${chunk.slice(0, 60)}...`);
      await pool.query('INSERT INTO content_chunks (content, embedding, source) VALUES ($1,$2,$3)', [chunk, vectorString, path.basename(file)]);
    }
  }
}

async function main() {
  console.log('Starting embedding ingestion…');
  await ingest('./input/data.txt');
  await pool.end();
}

main().catch((err) => {
  console.error('Ingestion error:', err);
  process.exit(1);
});
npx tsx scripts/embed.ts
Create scripts/query.ts
to embed a query, fetch the top-N chunks, and print them:
/* scripts/query.ts */
import 'dotenv/config';
import { Pool } from 'pg';
import { openai } from '@ai-sdk/openai';
import { embed } from 'ai';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const TOP_N = 5; // number of chunks to retrieve

async function query(query: string) {
  console.log(`Embedding query: "${query}"`);
  const { embedding: qVec } = await embed({
    model: openai.embedding('text-embedding-3-small'),
    value: query,
  });
  const qVecString = `[${qVec.join(',')}]`;

  console.log(`Fetching top ${TOP_N} similar chunks from database...`);
  const { rows } = await pool.query<{
    content: string;
    source: string;
    score: number;
  }>(
    `SELECT content, source, 1 - (embedding <=> $1) AS score
     FROM content_chunks
     ORDER BY embedding <=> $1
     LIMIT $2`,
    [qVecString, TOP_N]
  );

  console.log('Results:');
  rows.forEach((row, i) => {
    console.log(` #${i + 1} (score: ${row.score.toFixed(3)}, source: ${row.source})`);
    console.log(row.content);
  });

  await pool.end();
}

(async () => {
  try {
    await query('How can I change my billing information?');
  } catch (err) {
    console.error('Error testing retrieval:', err);
  }
})();
npx tsx scripts/query.ts
Output:
Embedding query: "How can I change my billing information?"
Fetching top 5 similar chunks from database...
Results:
 #1 (score: 0.774, source: data.txt)
How do I update my billing information? Navigate to Billing → Payment Methods and click “Edit” next to your stored card.
 #2 (score: 0.512, source: data.txt)
How do I change my account password? Go to Profile → Security, enter your current password, then choose a new one.
 #3 (score: 0.417, source: data.txt)
How do I delete my account? Please contact support to request account deletion; it cannot be undone.
The `score` reflects cosine similarity: 1.0 is a perfect match; closer to 0 = less similar.
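pgvector computes this distance server-side via the <=> operator, but for intuition, cosine similarity is just the normalized dot product of two vectors. A toy TypeScript version (illustrative only, not part of the pipeline) looks like this:

// Cosine similarity: 1 means identical direction, 0 means unrelated (orthogonal).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0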
At this point, we’ve built a vector search backend: scaffolded a Next.js project, spun up Postgres with `pgvector`, created a schema optimized for similarity search, and built a TypeScript pipeline to embed and store content. We validated our setup with real cosine-similarity queries. In Part 3, we’ll build a user-friendly chatbot interface powered by GPT-4 and streaming responses using the `ai` SDK.