How Personal Assistant System (PAS) runs autonomous work safely
The orchestrator plans directed workflows, uses approved tools, and verifies results—all on your machine by default.
The Orchestrator
The orchestrator converts natural-language goals into a directed acyclic graph (DAG) of steps. It enforces autonomy budgets, handles retries, and automatically pauses for approvals at high-risk nodes. Every decision is logged to episodic memory so you can replay a run step by step.
Directed planning
Goals become DAGs with explicit dependencies, retries, and fallbacks.
Autonomy budgets
Bound each run with time, token, and cost budgets that cannot be exceeded without approval.
Approvals
Pause at high-risk nodes. Continue only when an approver confirms the action.
Audit trail
Every step writes structured logs to episodic memory and emits metrics.
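The orchestrator concepts above can be sketched as a dependency-ordered plan with a simple budget check. This is an illustrative sketch only: `PlanNode`, `topo_order`, and `BUDGET` are hypothetical names, not the PAS API.

```python
# Illustrative sketch only: PlanNode, topo_order, and BUDGET are
# hypothetical names, not the PAS API.
from dataclasses import dataclass, field

@dataclass
class PlanNode:
    name: str
    depends_on: list = field(default_factory=list)
    requires_approval: bool = False   # pause here until an approver confirms

BUDGET = {"seconds": 300, "tokens": 50_000, "cost_usd": 1.0}

def within_budget(spent):
    """A run halts for approval once any budget dimension is exhausted."""
    return all(spent[k] <= BUDGET[k] for k in BUDGET)

def topo_order(nodes):
    """Order plan nodes so every dependency runs first (Kahn's algorithm)."""
    indegree = {n.name: len(n.depends_on) for n in nodes}
    ready = [n for n in nodes if indegree[n.name] == 0]
    order = []
    while ready:
        node = ready.pop()
        order.append(node.name)
        for other in nodes:
            if node.name in other.depends_on:
                indegree[other.name] -= 1
                if indegree[other.name] == 0:
                    ready.append(other)
    return order

plan = [
    PlanNode("search_docs"),
    PlanNode("summarize", depends_on=["search_docs"]),
    PlanNode("publish", depends_on=["summarize"], requires_approval=True),
]
print(topo_order(plan))  # ['search_docs', 'summarize', 'publish']
print(within_budget({"seconds": 12, "tokens": 900, "cost_usd": 0.02}))  # True
```

Ordering the DAG up front is what lets the orchestrator know, before execution begins, which nodes will require an approval pause.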
Tools & Memory
Personal Assistant System (PAS) ships with reader tools for code, documents, vector memory, and data. Writer tools are gated with approvals. Vector and knowledge graph stores give the assistant durable recall.
Reader tools first
Search docs, inspect code, query data sources—all scoped by allowlists.
Writer tools (gated)
Diff proposals, doc writes, and outbound comms require explicit approval.
Vector memory
Embed content locally with llm-local and retrieve top-K results without network calls.
Knowledge graph & episodic
Structured facts and chronological run logs preserve what the assistant learned and did across sessions.
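Local top-K retrieval over a vector store reduces to cosine similarity against embedded content, with no network calls. A minimal sketch, assuming vectors are already embedded; the `store` layout and function names here are illustrative, not the llm-local API.

```python
# Minimal local top-K retrieval sketch; store layout and function
# names are illustrative, not the llm-local API.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, store, k=5):
    """Rank stored (id, vector) pairs by cosine similarity to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in store]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

store = [("note-1", [1.0, 0.0]), ("note-2", [0.7, 0.7]), ("note-3", [0.0, 1.0])]
print(top_k([1.0, 0.1], store, k=2))  # note-1 first, then note-2
```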
Retrieval + recall example
- Goal
- "Summarize latest sprint notes and highlight blockers."
- Steps
1. docs.search_docs("sprint notes") → returns markdown files
2. vector.search(top_k=5) → fetch prior decision context
3. llm-local.generate → create summary & blocker table
4. docs.write_doc (gated) → approval required before publishing
- Artifacts
- Markdown summary, blocker checklist, audit log with hashes.
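The four steps above can be wired together as a small pipeline with a gated final write. The tool functions below are stand-ins, not the real docs/vector/llm-local APIs, and the approval callback is a placeholder for PAS's approval flow.

```python
# Stand-in tool functions; not the real docs/vector/llm-local APIs.
def search_docs(query): return ["sprint-12.md", "sprint-13.md"]
def vector_search(query, top_k=5): return ["decision: defer billing refactor"]
def generate(prompt): return "Summary: sprint on track.\nBlockers: billing refactor deferred."

def run(goal, approve):
    notes = search_docs("sprint notes")                # 1. find markdown files
    context = vector_search(goal, top_k=5)             # 2. recall prior decisions
    summary = generate(f"{goal}\n{notes}\n{context}")  # 3. draft summary + blockers
    if not approve("docs.write_doc", summary):         # 4. gated write: needs approval
        return ("held", summary)
    return ("published", summary)

status, _ = run("Summarize latest sprint notes", approve=lambda tool, art: False)
print(status)  # held
```

The run yields the same artifacts either way; approval only decides whether the summary is published or held.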
Verification with mcp-eval
Every plan can include evaluation nodes that run before results are published. Failed checks trigger revision steps or hold the plan for approval, ensuring risky actions are reviewed before release.
mcp-eval
Bundle rubric checks, assertions, and unit-test stubs to validate outputs.
Self-check loops
The orchestrator can branch into revision steps if evaluations fail.
Policy integration
Mark specific tools as requiring eval passes before the plan can complete.
Eval flow example
- Prepare: orchestrator registers eval nodes in the DAG.
- Run: eval.run_unit_tests executes containerized tests (CI-friendly).
- Score: eval.score_qa checks rubric adherence for natural language outputs.
- Decide: pass → continue; fail → branch into revision or hold for approval.
- Record: episodic.append stores outcomes with artifacts for audit.
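The decide step above amounts to a bounded check-and-revise loop. A minimal sketch, assuming named checks and a revision callback; `eval_flow` is an illustrative name, not part of mcp-eval.

```python
# Bounded self-check loop sketch; eval_flow is illustrative, not mcp-eval API.
def eval_flow(output, checks, revise, max_revisions=2):
    """Run checks; on failure, branch into revision a bounded number of times."""
    for attempt in range(max_revisions + 1):
        failures = [name for name, check in checks if not check(output)]
        if not failures:
            return ("pass", output, attempt)          # continue the plan
        if attempt == max_revisions:
            return ("hold_for_approval", output, attempt)  # escalate to a human
        output = revise(output, failures)             # branch into revision

checks = [("mentions_blockers", lambda o: "blockers" in o.lower())]
revise = lambda o, fails: o + "\nBlockers: none reported."
print(eval_flow("Sprint summary.", checks, revise)[0])  # pass
```

Bounding revisions matters: without a cap, a failing check could loop indefinitely instead of escalating to an approver.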
Safety modes & policy controls
Choose the autonomy posture that fits your environment. Switch modes instantly and audit every change.
Constrained (default)
Manual approvals required for writer tools, outbound connections disabled, budgets enforced.
Autonomous
Pre-approved plans can run unattended but still respect budgets and risk classes.
Rate limits & allowlists
Throttle tool usage and restrict access by path, domain, or schema.
Tool risk classes
Tag tools with Low/Medium/High risk and tailor approval rules accordingly.
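Risk classes and approval rules can be expressed as a small policy table. A sketch under assumed names: the `RISK` and `POLICY` tables and the tool identifiers are illustrative, not PAS's actual configuration format.

```python
# Illustrative policy table; tables and tool names are assumptions,
# not PAS's actual configuration format.
RISK = {
    "docs.search_docs": "low",
    "vector.search": "low",
    "docs.write_doc": "high",
    "comms.send_email": "high",
}
POLICY = {"low": "auto", "medium": "rate_limited", "high": "approval_required"}

def gate(tool):
    """Unknown tools default to high risk: fail closed, not open."""
    return POLICY[RISK.get(tool, "high")]

print(gate("docs.search_docs"))  # auto
print(gate("docs.write_doc"))    # approval_required
```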
Plan → Execute → Verify loop
Plan
Create DAG with explicit edges, retries, and budgets.
Execute
Call approved tools with automatic telemetry and guardrails.
Verify
Run eval harnesses and gather artifacts for review.
Approve
Pause for consent when policy requires it or when eval signals fail.
Publish
Deliver results, log everything, and notify subscribers.
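The five stages above can be compressed into one control-flow sketch. The callbacks (`exec_fn`, `verify_fn`, `approver`) are illustrative stand-ins, not the actual PAS interfaces.

```python
# Five-stage loop sketch; the callbacks are illustrative stand-ins,
# not the actual PAS interfaces.
def run_stages(plan, exec_fn, verify_fn, approver):
    artifacts = [exec_fn(step) for step in plan]   # Execute: call approved tools
    verified = verify_fn(artifacts)                # Verify: run eval harness
    if not verified and not approver(artifacts):   # Approve: pause on eval failure
        return ("held", artifacts)
    return ("published", artifacts)                # Publish: deliver and log

plan = ["fetch_notes", "summarize"]                # Plan: ordered, budgeted steps
status, _ = run_stages(plan, exec_fn=str.upper,
                       verify_fn=lambda a: True,
                       approver=lambda a: False)
print(status)  # published
```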
Ready to see it in action?
Download Personal Assistant System (PAS), run the verification script, and explore the sample plans included in the repo.