llm - Abrarqasim Blogs

LLM Context Windows: The Anti-Hallucination Prompt That Backfired

My long-context RAG bot kept refusing questions whose answers were right there. The fix was in my anti-hallucination prompt, not my retriever.

Rayyan | July 19, 2026 | 8 min

AI

Claude Prompt Caching in 2026: The Setup That Actually Cut My API Bill

I ran Claude's prompt caching in production for six months. Here's the cache_control setup, TTL trade-offs, and the anti-patterns that burned me.

Rayyan | July 16, 2026 | 8 min

AI

pgvector in 2026: Why I Deleted Pinecone From Half My Side Projects

pgvector plus HNSW replaced Pinecone on half my side projects. The schema, the index picks, hybrid search with FTS, and where I still reach for…

Rayyan | July 10, 2026 | 10 min

AI

LLM Evaluation Metrics: What I Actually Trust After Judges Failed Me

I ran an LLM judge for six weeks before I noticed it kept rewarding longer answers. Here is the evaluation metrics stack I actually trust…

Rayyan | July 5, 2026 | 8 min

AI

When a Fine-Tuned Small LLM Beats GPT-5 (and When It Doesn’t)

A fine-tuned small open model can beat GPT-5 on a narrow task, and cost less doing it. A look at when QLoRA wins, when it loses,…

Rayyan | June 17, 2026 | 7 min

AI

Context Engineering vs Prompt Engineering: Why My Agents Failed

Prompt engineering tunes what you ask. Context engineering decides what the model sees at all. Where my agents kept dying on long runs, and the…

Rayyan | June 13, 2026 | 7 min

AI

LLM Hallucination Examples (And How I Catch Them)

Real LLM hallucination examples from my own apps, why retrieval doesn't fully fix them, and the cheap code checks I run to catch made-up answers…

Rayyan | June 11, 2026 | 8 min

AI

AI Debugging: Why I Stopped Letting One Model Fix Itself

A new paper on autonomous programming found that separating code generation from debugging across models nearly doubled solved problems. Here's the takeaway.

Rayyan | May 28, 2026 | 8 min

AI

Agentic Coding: Why the Feedback Loop Beats a Smarter Model

A smarter model still writes the same broken first draft. What actually moved my output was wrapping a cheaper model in a feedback loop that…

Rayyan | May 27, 2026 | 7 min

AI

Local LLMs in 2026: What I Actually Run, and What I Don’t

I spent a month running local LLMs on my own GPU instead of an API. Here is where self-hosted models earned their place, and where…

Rayyan | May 23, 2026 | 8 min