Matthew Aberham

Matthew Aberham is a solutions architect and full-stack engineer focused on building scalable web platforms and intuitive front-end experiences. He works at the intersection of performance engineering, interface design, and applied AI systems.

Blogs from this Author

vLLM v0.16 Adds WebSocket Realtime API and Faster Scheduling

vLLM v0.16.0: Throughput Scheduling and a WebSocket Realtime API. Date: February 24, 2026. Source: vLLM Release Notes. Release Context: This is a version upgrade. vLLM v0.16.0 is the latest release of the popular open-source inference server. The WebSocket Realtime API is a new feature that mirrors the functionality of OpenAI’s Realtime API, providing a self-hosted […]

LLM Concept Vectors: MIT Research on Steering AI Behavior

Date: February 23, 2026. Source: Science. Researchers from MIT and UC San Diego published a paper in Science describing LLM concept vectors and a new algorithm called the Recursive Feature Machine (RFM) that can extract these concept vectors from large language models. Essentially, these are patterns of neural activity corresponding to specific ideas or behaviors. […]

Anthropic Accuses DeepSeek of Distillation Attacks on Claude

Date: February 23, 2026. Source: Anthropic Blog. Anthropic published a detailed post describing what it calls a distillation attack at industrial scale, accusing three Chinese AI labs (DeepSeek, Moonshot AI/Kimi, and MiniMax) of systematically extracting Claude’s capabilities. According to Anthropic, the labs created over 24,000 fraudulent accounts and generated more than 16 million exchanges […]

Minimax M2: Innovative Reasoning Strategy from Open-Source Model Showing Big Results

In the fast-paced world of artificial intelligence, a new open-source model from Chinese AI firm Minimax is making a significant impact. Released in late October 2025, Minimax M2 has rapidly gained acclaim for its innovative approach to reasoning, impressive performance, and cost-effectiveness, positioning it as a formidable competitor to established proprietary models. A New Architecture for a […]

Chandra OCR: The BEST in Open-Source AI Document Parsing

In the specialized field of Optical Character Recognition (OCR), a new open-source model from Datalab is setting a new benchmark for accuracy and versatility. Chandra OCR, released in October 2025, has rapidly ascended to the top of the leaderboards, outperforming even proprietary giants like GPT-4o and Gemini Pro on key benchmarks. Beyond Simple Text Extraction Chandra is not […]

Request Hedging: Accelerate Your App by Racing Duplicate Calls

Users notice slow requests: even if 99% finish quickly, the remaining 1% of “long‑tail” latency can make your app feel sluggish. Request hedging addresses this by speculatively firing a duplicate request after a short delay and racing the two, so outliers are beaten before they ever impact the UI. Why the slowest 1% of requests matter The time it takes […]

Tool‑Augmented RAG Chatbot: GPT‑4, pgVector & Next.js

This is Part 3 of a three-part series (links at the bottom). In Part Two, we moved from concept to execution by building the foundation of a Retrieval‑Augmented Generation (RAG) system. We set up a Postgres database with pgvector, defined a schema, wrote a script to embed and chunk text, and validated vector search with cosine similarity. In […]

Postgres RAG Stack: Embedding, Chunking & Vector Search

This is Part 2 of a three-part series (links at the bottom). The GitHub repo can be checked out here. Postgres RAG Stack brings together Postgres, pgvector, and TypeScript to power fast, semantic search. In Part One, we covered the theory behind semantic search: how embeddings convert meaning into vectors, how vector databases and indexes enable […]

Vector Search Embeddings and Retrieval-Augmented Generation

This is Part 1 of a three-part series (links at the bottom). Traditional search engines and databases match based on keywords. These systems are fine when you’re looking for an exact or partial string match but fail when the goal is to find content that’s conceptually similar, not just textually identical. Vector search bridges this […]