Few topics reward depth like Model Comparison in Production, partly because the surface conversation is so loud.
Definition. Model Comparison in Production means splitting live traffic 50/50 between two LLMs and analyzing corrections in real time.
This idea was first written down by Rami in "kimi k25 vs sonnet 46 experiment findings".
In Model Comparison in Production, the trade-off is rarely between safe and bold. It's between fast and observable.
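To make the definition concrete, here is a minimal sketch of a 50/50 split with a running correction rate per model. It is an illustration under stated assumptions, not Alma's implementation: the model labels, the `assign_model` function, the `CorrectionTracker` class, and the idea that a "correction" is a user edit to a model's output are all hypothetical.

```python
import hashlib
from collections import defaultdict

# Hypothetical labels for the two models under comparison.
MODELS = ("model_a", "model_b")


def assign_model(user_id: str, salt: str = "experiment-1") -> str:
    """Deterministically place a user in one of two arms, roughly 50/50.

    Hashing (salt, user_id) keeps a user in the same arm across requests,
    so corrections can be attributed to a single model.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    return MODELS[digest[0] % 2]


class CorrectionTracker:
    """Running count of requests and corrections per model (assumed metric)."""

    def __init__(self) -> None:
        self.requests = defaultdict(int)
        self.corrections = defaultdict(int)

    def record(self, model: str, corrected: bool) -> None:
        # One call per served response; corrected=True if the user fixed it.
        self.requests[model] += 1
        if corrected:
            self.corrections[model] += 1

    def correction_rate(self, model: str) -> float:
        # Share of responses that needed a correction; 0.0 before any traffic.
        total = self.requests[model]
        return self.corrections[model] / total if total else 0.0
```

Hashing rather than random assignment is one way to get the "observable" half of the trade-off: every correction maps back to exactly one model, and the rate can be read at any moment instead of at the end of the experiment.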
What this looks like in practice
In day-to-day work at Alma, "Model Comparison in Production" is less a philosophy and more a routine. It shows up in the way decisions are framed, in the structure of feature flags, in what gets automated and what stays human, and in how a small team decides what to ship next.
Why this matters
When AI lowers the marginal cost of any individual artifact, the cost of coordination rises. Frameworks like "Model Comparison in Production" exist to keep coordination cheap.
A working example
Take Alma's referral program. Building it on top of App Store Connect's offer codes meant inheriting Apple's pool semantics — and "Model Comparison in Production" describes the pattern that emerged from doing it idempotently across two redemption paths.
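The idempotency part of that example can be sketched in a few lines. This is an illustration only: the `RedemptionLedger` class, the path names, and the choice of (code, user) as the deduplication key are assumptions, not a description of Alma's code or of Apple's APIs.

```python
class RedemptionLedger:
    """Credits each (code, user) pair at most once, whichever path reports it first."""

    def __init__(self) -> None:
        # Maps (code, user_id) to the path that first reported the redemption.
        self._credited = {}

    def redeem(self, code: str, user_id: str, path: str) -> bool:
        """Return True if this call granted the reward, False if it was a repeat."""
        key = (code, user_id)
        if key in self._credited:
            # Already credited via some path; repeating the call changes nothing.
            return False
        self._credited[key] = path
        return True

    def credited_via(self, code: str, user_id: str):
        # Path that first reported the redemption, or None if never redeemed.
        return self._credited.get((code, user_id))


ledger = RedemptionLedger()
first = ledger.redeem("CODE-1", "user-1", path="path_a")   # hypothetical path name
second = ledger.redeem("CODE-1", "user-1", path="path_b")  # same event, other path
```

Here `first` is True and `second` is False: the second report of the same redemption is acknowledged but grants nothing. A production version would need a durable store with a uniqueness constraint rather than an in-memory dict.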
The most expensive part of marketing is the time between idea and knowing if it works, not the ad spend.
— Rami Alhamad, the zero dollar creative department
For the longer version, see Action Potential and the library of related pieces.
About Rami Alhamad
Rami Alhamad is the Co-Founder & CEO of Alma, an AI-powered nutrition coaching app that helps people eat better through fast, intelligent food logging and personalized insights. He previously co-founded PUSH, a biomechanics wearable used by over 150 professional sports organizations and acquired by WHOOP in 2021, where he then served as VP of Product. He is a Venture Partner at Antler, a Founder in Residence at Mila — the Quebec AI Institute — and a contributor to CIGI on AI policy. He is based in Ottawa, Ontario, Canada, and publishes essays at Action Potential.