Module reflection_eval

gestura_core_pipeline

Fixture-based reflection evaluation harness.

This module lets us score the current reflection heuristics against canned scenarios without requiring a live model/provider call.

Structs

ReflectionEvalCase: A canned scenario for evaluating reflection-guided retry quality.
ReflectionEvalReport: Report for one reflection evaluation case.
ReflectionEvalSummary: Summary report for a batch of reflection evaluation cases.
ReflectionEvalToolResult: Compact fixture representing one tool result for an evaluation turn.
ReflectionEvalTurn: Compact fixture representing an initial or revised answer.

Enums

ReflectionEvalToolOutcome: Tool outcome used in a fixture-based reflection evaluation turn.

Functions

builtin_reflection_eval_cases: Built-in scenarios for regression-checking reflection effectiveness.
evaluate_reflection_case: Evaluate one canned reflection case.
evaluate_reflection_cases: Evaluate a batch of reflection cases.
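
Example (illustrative sketch)

The listing above gives only item names and one-line summaries, so the following is a minimal, self-contained sketch of how a fixture-based evaluation of this shape might fit together. Every type, field, and function in it is a hypothetical local stand-in, not the crate's actual API: `Case`, `Turn`, `ToolResult`, `ToolOutcome`, `Report`, and `Summary` loosely mirror the listed structs and enum, and `sample_cases`, `evaluate_case`, and `evaluate_cases` loosely mirror the three functions. The scoring rule (a retry counts as improved when the revised turn has fewer failed tool results than the initial turn) is an assumption chosen for illustration; the real heuristics may differ.

```rust
// Hypothetical stand-ins for the items listed above. The real field names,
// signatures, and scoring rules in `reflection_eval` are not shown on this
// page and may differ.

/// Stand-in for `ReflectionEvalToolOutcome`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToolOutcome {
    Success,
    Failure,
}

/// Stand-in for `ReflectionEvalToolResult`.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct ToolResult {
    tool: &'static str,
    outcome: ToolOutcome,
}

/// Stand-in for `ReflectionEvalTurn`: an initial or revised answer.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Turn {
    answer: &'static str,
    tool_results: Vec<ToolResult>,
}

/// Stand-in for `ReflectionEvalCase`: one canned scenario.
#[derive(Debug, Clone)]
struct Case {
    name: &'static str,
    initial: Turn,
    revised: Turn,
}

/// Stand-in for `ReflectionEvalReport`.
#[derive(Debug)]
struct Report {
    name: &'static str,
    improved: bool,
}

/// Stand-in for `ReflectionEvalSummary`.
#[derive(Debug)]
struct Summary {
    total: usize,
    improved: usize,
    reports: Vec<Report>,
}

/// Count the failed tool results in a turn.
fn failures(turn: &Turn) -> usize {
    turn.tool_results
        .iter()
        .filter(|r| r.outcome == ToolOutcome::Failure)
        .count()
}

/// Assumed rule: the retry improved if it has fewer tool failures.
fn evaluate_case(case: &Case) -> Report {
    Report {
        name: case.name,
        improved: failures(&case.revised) < failures(&case.initial),
    }
}

/// Evaluate a batch of cases and aggregate the per-case reports.
fn evaluate_cases(cases: &[Case]) -> Summary {
    let reports: Vec<Report> = cases.iter().map(evaluate_case).collect();
    let improved = reports.iter().filter(|r| r.improved).count();
    Summary { total: reports.len(), improved, reports }
}

/// Two canned scenarios: one where the retry recovers, one where it does not.
fn sample_cases() -> Vec<Case> {
    let turn = |answer, outcome| Turn {
        answer,
        tool_results: vec![ToolResult { tool: "read_file", outcome }],
    };
    vec![
        Case {
            name: "retry_recovers",
            initial: turn("first attempt", ToolOutcome::Failure),
            revised: turn("second attempt", ToolOutcome::Success),
        },
        Case {
            name: "retry_repeats_failure",
            initial: turn("first attempt", ToolOutcome::Failure),
            revised: turn("second attempt", ToolOutcome::Failure),
        },
    ]
}

fn main() {
    let summary = evaluate_cases(&sample_cases());
    for report in &summary.reports {
        println!("{}: improved = {}", report.name, report.improved);
    }
    println!("{} of {} cases improved", summary.improved, summary.total);
}
```

Because the fixtures are plain in-memory values, a run like this needs no live model or provider call, which is the property the module description emphasizes.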