Insights
Insights.
Notes from real engagements. We write when we have something specific to say — not on a content schedule.
-
2026-04-15
Evals
Stop benchmarking models on benchmarks
Public benchmarks tell you almost nothing about how a model will behave on your problem. Build your own evals.
-
2026-03-18
RAG
Your RAG isn't broken. Your chunking is.
Most teams pick a chunk size, ship it, and then spend three months tuning the wrong layer.
-
2026-02-22
Agents
Ship the agent with a human in the loop. Always.
Every agentic project we've shipped has gone live with a human in the loop. Always.