Ask HN: What's your biggest LLM cost multiplier?
4 by teilom | 3 comments on Hacker News.
"Tokens per request" has been a misleading cost model for us in production. The real drivers seem to be multipliers: retries/429s, tool fanout, P95 context growth, and safety passes. What’s been the biggest cost multiplier in your prod LLM systems, and what policies worked (caps, degraded mode, fallback, hard fail)?
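To make the multiplier framing concrete, here is a rough sketch of how these drivers might compound. Everything in it is an assumption for illustration: the `RequestProfile` fields, the example numbers, and the simplification that each safety pass re-reads the full context. It is not a measured model of any real system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RequestProfile:
    """Hypothetical per-request cost drivers; all values are illustrative."""
    base_tokens: int       # tokens for one clean model call
    avg_attempts: float    # mean attempts per call, counting retries after 429s
    tool_fanout: float     # mean model calls spawned per user request
    context_growth: float  # P95 context size relative to the base call
    safety_passes: int     # extra guardrail passes per call (assumed full re-read)


def effective_tokens(p: RequestProfile) -> float:
    """Tokens consumed per user request once every multiplier is applied."""
    calls = p.tool_fanout * p.avg_attempts
    tokens_per_call = p.base_tokens * p.context_growth
    return calls * tokens_per_call * (1 + p.safety_passes)


def cost_multiplier(p: RequestProfile) -> float:
    """Ratio of effective tokens to the naive 'tokens per request' figure."""
    return effective_tokens(p) / p.base_tokens


def with_fanout_cap(p: RequestProfile, max_fanout: float) -> RequestProfile:
    """One possible policy: cap tool fanout and leave the other drivers alone."""
    return replace(p, tool_fanout=min(p.tool_fanout, max_fanout))


profile = RequestProfile(
    base_tokens=1000,
    avg_attempts=1.5,
    tool_fanout=4,
    context_growth=2.0,
    safety_passes=1,
)
print(cost_multiplier(profile))                      # 24.0
print(cost_multiplier(with_fanout_cap(profile, 2)))  # 12.0
```

With these made-up inputs, a request budgeted at 1,000 tokens consumes 24,000, and capping fanout at 2 halves that. The point is only that the drivers multiply rather than add, so a cap on any single one scales the whole bill.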
