Tech company cuts AI costs by using cheap Haiku model to triage 80% of CI failures before escalating to expensive Opus 4.6; saves money with smart agent architecture and SQL-based log queries.

We Upgraded to a Frontier Model and Our Costs Went Down

Last week we wrote about feeding terabytes of CI logs to an LLM. Most of the questions on Hacker News weren't about the logs. They were about the agent: which models, how they coordinate, and how much it all costs. Today we run Opus 4.6 and pay less than when we ran everything on Sonnet 4.0. The reason is mostly what Opus doesn't do: 80% of failures never reach it, and when they do, it never reads a log line. The architecture looks like this: Let a cheap agent decide if the expensive one is need...