llm - Abrarqasim Blogs

RAG Chunking Strategies in 2026: What I Actually Use

After two years of running RAG in production, here are the chunking strategies I actually use, the ones I dropped, and why chunk size is…

Rayyan | May 4, 2026 | 8 min

AI

LLM Safety Has a Stateless Multi-Turn Blind Spot (And What It Means for Your App)

A new attack exploits the fact that most LLM safety layers evaluate turns in isolation. Here's what the Transient Turn Injection result means for anyone…

Rayyan | April 26, 2026 | 7 min

AI

LLM limitations on spatial tasks: what devs need to know

Why LLMs fail at spatial tasks isn't always about model size. The input format matters more than you'd expect, and here's what to do about…

Rayyan | April 19, 2026 | 7 min

AI

LLM self-repair: how many retries actually help?

New numbers on iterative LLM self-repair: where extra tries earn their keep, where they quietly waste tokens, and the retry loop I'll actually defend.

Rayyan | April 17, 2026 | 8 min

AI

Your reasoning model is more fragile than the benchmarks say

A new paper tested 8 reasoning models with 14 formatting perturbations. Open-weight models lost up to 55% accuracy. Here's what that means for production use.

Rayyan | April 15, 2026 | 6 min

AI

Which 2026 Open-Weight LLMs Actually Run on One Workstation?

A practical look at which new open-weight LLMs you can actually run on a single workstation with llama.cpp in 2026, and which ones are just…

Rayyan | April 14, 2026 | 7 min

AI

The skills LLMs are best at replacing aren’t the ones you’d expect

New research scores 263 tasks across four LLMs. Math and coding top the automation list, but the jobs most exposed to AI demand the opposite…

Rayyan | April 14, 2026 | 6 min

AI

Your LLM Evals Are Testing the Wrong Thing

New research shows LLM API testing misses how chatbots actually behave. Here's what the data says about the gap and how to fix your eval…

Rayyan | April 14, 2026 | 7 min

AI

Your LLM context window is lying to you about numbers

LLMs claim massive context windows but choke on long number sequences. A training-free trick called SepSeq fixes attention dispersion and boosts accuracy 35%.

Rayyan | April 14, 2026 | 6 min

AI

LLMs Do Worse at Math When the Names Aren’t English

A new benchmark shows LLMs drop 0.3 to 5.9% accuracy on grade-school math when names and places go non-Western. Same arithmetic, different answers.

Rayyan | April 11, 2026 | 6 min