2025-03-02 · Diego Fontes
Smoke Prompts After Deploy: A Field Note
What we log when a smoke prompt fails, and how we keep logs readable for humans.
automation · deploy · observability
Smoke prompts should behave like concise unit tests: one intent, one expected shape, and a clear failure string. We avoid dumping entire completions into logs; we log only the diff-friendly slice that proves drift.
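As a rough illustration of the "unit test" framing, here is a minimal Python sketch. The names `SmokePrompt`, `check`, and `drift_slice` are hypothetical and not part of any tooling described in this note; the shape check is assumed to be a plain predicate on the completion text, and the diff slice uses the standard library's `difflib`.

```python
import difflib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SmokePrompt:
    """One smoke prompt: one intent, one expected shape, one failure string."""
    name: str
    prompt: str
    expected_shape: Callable[[str], bool]  # hypothetical predicate on the completion
    failure_message: str


def check(smoke: SmokePrompt, completion: str) -> Optional[str]:
    """Return None when the shape matches, otherwise the failure string."""
    if smoke.expected_shape(completion):
        return None
    return f"{smoke.name}: {smoke.failure_message}"


def drift_slice(expected: str, actual: str, context: int = 1, max_lines: int = 20) -> str:
    """Return a short unified diff rather than the full completion."""
    diff = difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="expected",
        tofile="actual",
        n=context,
        lineterm="",
    )
    # Cap the number of lines so a badly drifted completion cannot flood the log.
    return "\n".join(list(diff)[:max_lines])
```

A caller would log the string returned by `check` plus the output of `drift_slice`, and nothing else from the completion.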
Teams attach the prompt hash, the model route, and a wall-clock timestamp to each failure entry. If a provider rotates a minor version, the hash helps correlate failures across runs without exposing customer content.
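One possible shape for such a log entry, sketched in Python. The field names, the truncated SHA-256 digest, and the JSON layout are assumptions made for illustration; the note does not prescribe a hash function or a record format.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def prompt_hash(prompt: str) -> str:
    """Short, stable identifier for a prompt (assumed: first 12 hex chars of SHA-256)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]


def failure_record(
    prompt: str,
    model_route: str,
    failure: str,
    now: Optional[datetime] = None,
) -> str:
    """Build one JSON log line; the prompt text itself is never written out."""
    now = now or datetime.now(timezone.utc)
    record = {
        "prompt_hash": prompt_hash(prompt),
        "model_route": model_route,
        "wall_clock": now.isoformat(),
        "failure": failure,
    }
    # sort_keys keeps the line stable so entries diff cleanly across runs.
    return json.dumps(record, sort_keys=True)
```

Because the hash depends only on the prompt text, the same prompt produces the same identifier before and after a provider-side version change, which is what makes the failures easy to line up.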
We also recommend a “green path” archive: three golden completions stored locally for offline comparison when VPNs fail.
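A local archive like this could be as simple as one JSON file per prompt. The sketch below is an assumption about layout, not a description of an existing tool: `save_goldens`, `load_goldens`, and `matches_any_golden` are hypothetical names, the file is keyed by a prompt identifier string, and comparison is a whitespace-normalized exact match.

```python
import json
from pathlib import Path
from typing import List


def save_goldens(archive_dir: Path, prompt_id: str, completions: List[str], keep: int = 3) -> None:
    """Store up to `keep` golden completions for one prompt on local disk."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    path = archive_dir / f"{prompt_id}.json"
    path.write_text(json.dumps(completions[:keep], indent=2), encoding="utf-8")


def load_goldens(archive_dir: Path, prompt_id: str) -> List[str]:
    """Read the goldens back; an absent file means no archive for that prompt."""
    path = archive_dir / f"{prompt_id}.json"
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def matches_any_golden(completion: str, goldens: List[str]) -> bool:
    """Compare after collapsing whitespace so trivial formatting noise is ignored."""
    normalized = " ".join(completion.split())
    return any(normalized == " ".join(g.split()) for g in goldens)
```

Since everything lives on the local filesystem, the comparison still works when the network path to shared storage is unavailable.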
This approach will not catch every subtle regression, but it gives release managers a binary signal within minutes. That is often enough to stop a rollout before noisy user reports arrive.