ai2025-08· 8 min

Building a RAG eval harness that actually catches regressions

Beyond hit-rate: how to measure the things that matter in retrieval-augmented systems.

draft · long-form essay in progress

This piece is in the editing queue. The argument is sketched, the evidence is logged, the diagrams are half-finished. When it ships it ships — not before. In the meantime, the headline blurb above captures the thesis.

If you'd like to be notified when it's published, drop a line via the contact section.

next read

Trust-as-a-Service: the missing primitive in cross-border settlement

All writing