Live Eval Roadmap

This roadmap is the practical build plan for adding live eval to the framework.

It is written for people deciding what to build next, not only for framework maintainers.

Phase 1: Shadow Mode

Goal:

Build:

Questions answered:

Phase 2: Assist Mode

Goal:

Build:

Questions answered:

Phase 3: Gate Mode

Goal:

Build:

Questions answered:

What Comes Last

These can wait until the framework has real live-eval history:

If telemetry is added before dashboards, it should stay downstream from canonical artifacts and use command-level spans or events rather than a synthetic end-to-end lifecycle trace.
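
As a rough sketch of that guidance, the example below derives one telemetry event per command from the stored artifact records. It assumes the canonical artifact is a JSON-lines log with one record per command run. The field names (`command`, `started_at`, `duration_ms`, `status`) and the `CommandEvent` type are hypothetical, chosen only for illustration, and are not the framework's actual schema. The artifacts remain the source of truth, and nothing here tries to stitch the events into a single lifecycle trace.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class CommandEvent:
    """One telemetry event per command run (hypothetical shape)."""
    name: str
    started_at: str
    duration_ms: int
    status: str


def events_from_artifacts(lines: Iterable[str]) -> Iterator[CommandEvent]:
    """Project canonical artifact records into command-level events.

    The artifact lines are read, never modified; telemetry is a
    downstream view of them. Field names are assumptions.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the artifact log
        record = json.loads(line)
        yield CommandEvent(
            name=f"command.{record['command']}",
            started_at=record["started_at"],
            duration_ms=int(record["duration_ms"]),
            status=record["status"],
        )


if __name__ == "__main__":
    # Hypothetical artifact records, one JSON object per line.
    sample = [
        '{"command": "run", "started_at": "2024-01-01T00:00:00Z", "duration_ms": 120, "status": "ok"}',
        '{"command": "check", "started_at": "2024-01-01T00:00:01Z", "duration_ms": 45, "status": "failed"}',
    ]
    for event in events_from_artifacts(sample):
        print(event)
```

Because each event is computed from a single artifact record, telemetry can be regenerated or dropped at any time without touching the artifacts themselves.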

The early goal is not complexity. The early goal is to learn what guidance actually works.