2024-11-03 · Vivian Costa
Routing Tables for Models That Change Mid-Sprint
A lightweight routing table template we use when providers tweak tokenizers between sprints.
operations · routing · reliability
Routing is not only about latency. When a provider silently adjusts tokenization, prompts that felt stable can drift. We teach operations groups to keep a routing table with three columns: task type, acceptable latency band, and fallback prompt.
Fallback prompts are written like incident runbooks—short, imperative, and tested on archived samples. That discipline prevents teams from improvising under pressure.
We also log provider notices in the same sheet. It sounds bureaucratic until the first midnight page arrives and the on-call engineer knows exactly which route to disable.
Finally, we encourage a monthly “route retirement” review. If a path has not been used in thirty days, it should leave the table or earn a written reason to stay.