Expand description
Fixture-based reflection evaluation harness.
This module lets us score the current reflection heuristics against canned scenarios without requiring a live model/provider call.
Structs§
- Reflection
Eval Case - A canned scenario for evaluating reflection-guided retry quality.
- Reflection
Eval Report - Report for one reflection evaluation case.
- Reflection
Eval Summary - Summary report for a batch of reflection evaluation cases.
- Reflection
Eval Tool Result - Compact fixture representing one tool result for an evaluation turn.
- Reflection
Eval Turn - Compact fixture representing an initial or revised answer.
Enums§
- Reflection
Eval Tool Outcome - Tool outcome used in a fixture-based reflection evaluation turn.
Functions§
- builtin_
reflection_ eval_ cases - Built-in scenarios for regression-checking reflection effectiveness.
- evaluate_
reflection_ case - Evaluate one canned reflection case.
- evaluate_
reflection_ cases - Evaluate a batch of reflection cases.