Evaluation Schema Reference
Schema definitions for unit test runs, rubric scoring, and evaluation result payloads.
Unit test runs
`EvalUnitTestRequest` describes the commands to run, the environment, the timeout, and the evidence expected from the run. Responses return structured pass/fail summaries.
- `sandbox_profile` selects a containerized environment (on the roadmap).
- `artifact_paths` captures generated files such as coverage reports.
- `retry_policy` tells the orchestrator when to rerun automatically.
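The request described above could be modeled roughly as follows. This is a minimal sketch, not the actual schema: only `EvalUnitTestRequest`, `sandbox_profile`, `artifact_paths`, and `retry_policy` come from this reference. The field names `commands`, `environment`, and `timeout_seconds`, the `RetryPolicy` class, its `max_attempts` field, and the `should_retry` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RetryPolicy:
    # Hypothetical shape: the reference only says the policy governs
    # automatic reruns, so a simple attempt cap is assumed here.
    max_attempts: int = 1

    def should_retry(self, attempts_so_far: int, passed: bool) -> bool:
        """Rerun only when the last attempt failed and attempts remain."""
        return (not passed) and attempts_so_far < self.max_attempts


@dataclass
class EvalUnitTestRequest:
    # Field names below are assumed, apart from sandbox_profile,
    # artifact_paths, and retry_policy, which the reference names.
    commands: List[str]
    environment: Dict[str, str] = field(default_factory=dict)
    timeout_seconds: float = 300.0
    sandbox_profile: Optional[str] = None  # roadmap feature, so optional here
    artifact_paths: List[str] = field(default_factory=list)
    retry_policy: RetryPolicy = field(default_factory=RetryPolicy)


request = EvalUnitTestRequest(
    commands=["pytest -q"],
    environment={"CI": "1"},
    timeout_seconds=120.0,
    artifact_paths=["coverage.xml"],
    retry_policy=RetryPolicy(max_attempts=3),
)
```

Leaving `sandbox_profile` as `None` by default reflects its roadmap status: callers can omit it until containerized environments are available.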
Rubric scoring
`EvalRubricRequest` references rubric templates, scoring weights, and evaluation prompts. Responses include rubric items, scores, and rationales.
- `threshold` sets the score cutoff used for pass/fail gating.
- `explanations` stores the evaluator's justification for each score, for transparency.
- `reference_outputs` optionally supplies ground-truth examples to compare against.
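One plausible way weights and `threshold` might combine is a weighted average compared against the cutoff. The sketch below assumes that rule; the reference does not state how scores are aggregated. `RubricItemScore`, `weighted_score`, and `passes` are hypothetical names, and scores are assumed to lie in the range 0 to 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RubricItemScore:
    # Hypothetical per-item result: name, weight, score, and rationale.
    name: str
    weight: float
    score: float  # assumed to be normalized to [0, 1]
    explanation: str


def weighted_score(items: List[RubricItemScore]) -> float:
    """Weighted average of item scores (assumed aggregation rule)."""
    total_weight = sum(item.weight for item in items)
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive value")
    return sum(item.weight * item.score for item in items) / total_weight


def passes(items: List[RubricItemScore], threshold: float) -> bool:
    """Gate on the aggregate score meeting or exceeding the threshold."""
    return weighted_score(items) >= threshold


items = [
    RubricItemScore("correctness", 3.0, 1.0, "All required cases handled."),
    RubricItemScore("style", 1.0, 0.5, "Inconsistent naming in two places."),
]
overall = weighted_score(items)  # (3.0 * 1.0 + 1.0 * 0.5) / 4.0 = 0.875
```

With these sample values, `passes(items, 0.8)` is true and `passes(items, 0.9)` is false, which shows how the cutoff alone changes the gating outcome for the same scores.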