vLLM v0.16.0: Throughput Scheduling and a WebSocket Realtime API
Date: February 24, 2026 Source: vLLM Release Notes Release Context: This is a version upgrade. vLLM v0.16.0 is the latest release of the popular open-source inference server. The WebSocket Realtime API is a new feature that mirrors the functionality of OpenAI’s Realtime API, providing a self-hosted […]
Matthew Aberham
Matthew Aberham is a solutions architect and full-stack engineer focused on building scalable web platforms and intuitive front-end experiences. He works at the intersection of performance engineering, interface design, and applied AI systems.
Blogs from this Author
LLM Concept Vectors: MIT Research on Steering AI Behavior
Date: February 23, 2026 Source: Science Researchers from MIT and UC San Diego published a paper in Science describing LLM concept vectors and a new algorithm called the Recursive Feature Machine (RFM) that can extract these concept vectors from large language models. Essentially, these are patterns of neural activity corresponding to specific ideas or behaviors. […]
Anthropic Accuses DeepSeek of Distillation Attacks on Claude
Date: February 23, 2026 Source: Anthropic Blog Anthropic published a detailed post revealing what it describes as a distillation attack at industrial scale, accusing three Chinese AI labs (DeepSeek, Moonshot AI/Kimi, and MiniMax) of systematically extracting Claude’s capabilities. According to Anthropic, the labs created over 24,000 fraudulent accounts and generated more than 16 million exchanges […]
Minimax M2: Innovative Reasoning Strategy from Open-Source Model Showing Big Results
In the fast-paced world of artificial intelligence, a new open-source model from Chinese AI firm Minimax is making a significant impact. Released in late October 2025, Minimax M2 has rapidly gained acclaim for its innovative approach to reasoning, impressive performance, and cost-effectiveness, positioning it as a formidable competitor to established proprietary models. A New Architecture for a […]
Chandra OCR: The BEST in Open-Source AI Document Parsing
In the specialized field of Optical Character Recognition (OCR), a new open-source model from Datalab is setting a new benchmark for accuracy and versatility. Chandra OCR, released in October 2025, has rapidly ascended to the top of the leaderboards, outperforming even proprietary giants like GPT-4o and Gemini Pro on key benchmarks. Beyond Simple Text Extraction Chandra is not […]
Request Hedging: Accelerate Your App by Racing Duplicate Calls
Users notice slow requests; even if 99% finish quickly, that 1% “long-tail” latency can make your app feel sluggish. Request hedging solves this by speculatively firing a duplicate request after a short delay and racing the two, beating out outliers before they ever impact the UI. Why the slowest 1% of requests matter The time it takes […]
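The hedging pattern this post describes can be sketched in a few lines. This is a minimal illustration, not the post's implementation: `hedged` and `hedgeDelayMs` are names invented here, and a production version would also want cancellation of the losing request.

```typescript
// Minimal request-hedging sketch: fire a duplicate of the request if the
// first hasn't resolved within `hedgeDelayMs`, and take whichever wins.
async function hedged<T>(fetchFn: () => Promise<T>, hedgeDelayMs: number): Promise<T> {
  const primary = fetchFn();
  const backup = new Promise<T>((resolve, reject) => {
    // Only issue the hedge if the primary is still in flight after the delay.
    const timer = setTimeout(() => fetchFn().then(resolve, reject), hedgeDelayMs);
    // If the primary settles first, cancel the pending hedge.
    primary.finally(() => clearTimeout(timer)).catch(() => {});
  });
  return Promise.race([primary, backup]);
}
```

The delay is typically set near the request's p95 latency, so the duplicate fires only for the slow tail rather than doubling total traffic.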
Tool‑Augmented RAG Chatbot: GPT‑4, pgVector & Next.js
This is Part 3 of a three-part series (links at the bottom). In Part Two, we moved from concept to execution by building the foundation of a Retrieval‑Augmented Generation (RAG) system. We set up a Postgres database with pgvector, defined a schema, wrote a script to embed and chunk text, and validated vector search with cosine similarity. In […]
Postgres RAG Stack: Embedding, Chunking & Vector Search
This is Part 2 of a three-part series (links at the bottom). The GitHub repo can be checked out here. Postgres RAG Stack brings together Postgres, pgVector, and TypeScript to power fast, semantic search. In Part One, we covered the theory behind semantic search: how embeddings convert meaning into vectors, how vector databases and indexes enable […]
Vector Search Embeddings and Retrieval-Augmented Generation
This is Part 1 of a three-part series (links at the bottom). Traditional search engines and databases match based on keywords. These systems are fine when you’re looking for an exact or partial string match but fail when the goal is to find content that’s conceptually similar, not just textually identical. Vector search bridges this […]
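The distinction this teaser draws, keyword matching versus conceptual similarity, comes down to comparing embedding vectors. A standalone sketch of cosine similarity (the measure the series validates against), using made-up low-dimensional vectors in place of real embeddings:

```typescript
// Cosine similarity measures the angle between two vectors: conceptually
// similar embeddings score near 1 even when they share no keywords.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

In the series itself this comparison happens inside Postgres via pgvector's distance operators; the function above just makes the underlying math concrete.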