Generation & Embedding API
Formal schema for `POST /generate` and `POST /embed`, including safety controls, caching hints, and evaluation hooks.
Generate endpoint
Requests accept prompt messages, stop sequences, budget hints, and policy context. Responses return streaming chunks plus the final completion with token usage.
- Set `safety_mode` to `strict` to automatically redact risky outputs.
- `model_ref` identifies the local model hash used for the run.
- `evaluation_plan` attaches rubric or unit-test follow-ups for orchestrator workflows.
Embed endpoint
Embedding requests contain documents, metadata, and retention tags. Responses return vectors alongside checksum references for deduplication.
- Supports streaming large corpora through pagination tokens.
- `vector_namespace` segments embeddings for different projects.
- `retention_ttl` enforces automatic purging aligned with compliance rules.