2025-02-12
Pairwise Grids Beat Lone Scores for Draft Reviews
Why we stopped asking reviewers for single numbers and started forcing comparisons between outputs.
evaluation · cohorts · quality
Longer reads for teams who want the messy details, not keynote summaries.