MCP Documentation

Evaluation Schema Reference

Schema definitions for unit test runs, rubric scoring, and evaluation result payloads.

Unit test runs

`EvalUnitTestRequest` describes the commands to run, the execution environment, the timeout, and the evidence expected from the run. Responses return structured pass/fail summaries.

  • `sandbox_profile` selects a containerized environment (roadmap item).
  • `artifact_paths` captures generated files such as coverage reports.
  • `retry_policy` tells the orchestrator when to rerun a test automatically.
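
As a rough illustration of how these fields might fit together, the sketch below models a request in Python. Only `EvalUnitTestRequest`, `sandbox_profile`, `artifact_paths`, and `retry_policy` come from this reference. The remaining field names (`commands`, `environment`, `timeout_seconds`), the shape of `RetryPolicy`, and the `should_retry` helper are assumptions made for the example, not the actual schema.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class RetryPolicy:
    # Hypothetical shape: the reference names `retry_policy` but does not define it.
    max_attempts: int = 1
    retry_on_timeout: bool = False


@dataclass
class EvalUnitTestRequest:
    # `commands`, `environment`, and `timeout_seconds` are assumed field names.
    commands: list[str]
    environment: dict[str, str] = field(default_factory=dict)
    timeout_seconds: int = 300
    sandbox_profile: Optional[str] = None  # roadmap item, so left unset here
    artifact_paths: list[str] = field(default_factory=list)
    retry_policy: RetryPolicy = field(default_factory=RetryPolicy)


def should_retry(policy: RetryPolicy, attempts_made: int, timed_out: bool) -> bool:
    """Decide whether a failed attempt should be rerun under the policy."""
    if attempts_made >= policy.max_attempts:
        return False  # attempt budget exhausted
    if timed_out and not policy.retry_on_timeout:
        return False  # timeouts are not retried unless the policy allows it
    return True


request = EvalUnitTestRequest(
    commands=["pytest -q"],
    environment={"CI": "1"},
    timeout_seconds=120,
    artifact_paths=["coverage.xml"],
    retry_policy=RetryPolicy(max_attempts=3, retry_on_timeout=True),
)

# asdict() converts nested dataclasses too, giving a JSON-ready dictionary.
payload = asdict(request)
```

Keeping the retry decision in a small pure function makes the policy easy to test separately from whatever actually executes the commands.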

Rubric scoring

`EvalRubricRequest` references rubric templates, scoring weights, and evaluation prompts. Responses include rubric items, scores, and rationales.

  • `threshold` sets the cutoff used for pass/fail gating.
  • `explanations` stores the evaluator's justification for each score, for transparency.
  • `reference_outputs` optionally supplies ground-truth examples to compare against.
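
One way to picture threshold gating is the sketch below, which combines per-item scores into a single value and compares it with `threshold`. The reference does not say how scores are aggregated, so the weighted average, the 0.0 to 1.0 score range, and the names `RubricItemResult`, `weighted_score`, and `passes` are all assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RubricItemResult:
    # Hypothetical result record: one rubric item with its weight, score, and rationale.
    name: str
    weight: float
    score: float  # assumed to lie in the range 0.0 to 1.0
    explanation: str


def weighted_score(items: list[RubricItemResult]) -> float:
    """Return the weight-normalized average score (an assumed aggregation rule)."""
    total_weight = sum(item.weight for item in items)
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive value")
    return sum(item.weight * item.score for item in items) / total_weight


def passes(items: list[RubricItemResult], threshold: float) -> bool:
    """Gate on the aggregate score: pass when it meets or exceeds the threshold."""
    return weighted_score(items) >= threshold


items = [
    RubricItemResult("correctness", weight=3.0, score=1.0,
                     explanation="All required cases handled."),
    RubricItemResult("style", weight=1.0, score=0.5,
                     explanation="Inconsistent naming in two functions."),
]

overall = weighted_score(items)          # (3.0 * 1.0 + 1.0 * 0.5) / 4.0
gate = passes(items, threshold=0.8)
```

Normalizing by the total weight means the weights do not have to sum to 1, at the cost of rejecting rubrics whose weights sum to zero or less.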