Architecture
Overview
Janus sits between the LLM agent and its tools. Every tool call is intercepted, evaluated against the policy, and either allowed or blocked before execution.
Design Decisions
Tool-level enforcement: We enforce at the tool-call boundary, not on the prompt or the model output. Prompt-level filters operate on text, so they cannot stop a shell command from running. This is the same insight as Progent; we implement it with JSON Schema conditions and an optional SpiceDB engine with taint tracking.
JSON Schema for conditions: Policies use JSON Schema to restrict arguments. We chose this over a custom DSL because (a) developers already know JSON Schema, (b) it integrates with existing tool definitions (OpenAI, etc.), and (c) validation is deterministic and well-tested. The tradeoff: complex logic (e.g. cross-argument dependencies) requires multiple rules or the SpiceDB engine.
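To make the argument-restriction idea concrete, here is a minimal sketch. It assumes a hypothetical policy shape (argument name mapped to a JSON Schema fragment) and uses a hand-rolled check covering only the `type`, `enum`, and `pattern` keywords so that it runs on the standard library alone. It is not Janus's validator; a real deployment would rely on a full JSON Schema implementation.

```python
import re

# Hypothetical conditions for a `run_command`-style tool: each key is an
# argument name, each value a JSON Schema fragment restricting it.
CONDITIONS = {
    "command": {"type": "string", "enum": ["ls", "cat", "git"]},
    "path": {"type": "string", "pattern": r"^[\w-]+\.md$"},
}


def check_argument(schema, value):
    """Check one value against a tiny subset of JSON Schema keywords."""
    if schema.get("type") == "string" and not isinstance(value, str):
        return False
    if "enum" in schema and value not in schema["enum"]:
        return False
    # JSON Schema `pattern` applies to strings only and is unanchored,
    # hence re.search rather than re.fullmatch.
    if "pattern" in schema and isinstance(value, str):
        if not re.search(schema["pattern"], value):
            return False
    return True


def arguments_match(conditions, arguments):
    """True if every restricted argument that is present satisfies its schema."""
    return all(
        check_argument(conditions[name], value)
        for name, value in arguments.items()
        if name in conditions
    )
```

With these conditions, `{"command": "cat", "path": "README.md"}` passes, while `{"command": "rm", ...}` fails the `enum` and `{"path": "../secret.md"}` fails the `pattern`. Each check looks at a single argument in isolation, which is why cross-argument dependencies need multiple rules or the SpiceDB engine.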
Two engines: The default JSON enforcer is stateless and policy-file driven. The SpiceDB engine adds role-based ACLs and taint tracking. We keep them separate because taint requires session state and SpiceDB; many deployments need only the JSON enforcer. The SpiceDB engine is optional and can be omitted at install time.
Deny-by-default posture: Tools not listed in the policy are blocked.
Competitive Landscape
| Tool | Focus | How Janus Differs |
|---|---|---|
| Lakera Guard, NeMo Guardrails | Input/output content filtering | They filter text. Janus enforces at tool execution. A blocked prompt does not stop a tool call; a blocked tool call does. |
| IronCurtain | Natural-language policies via MCP | Janus uses machine-readable JSON Schema; deterministic evaluation; no reliance on NL interpretation. |
| Protect AI, Robust Intelligence | Pre-deployment scanning, red-teaming | Janus is runtime enforcement. It does not scan; it intercepts every call. |
| Progent | Academic prototype; privilege control | Janus is a production implementation with LangChain/ADK adapters, LLM policy generation, and optional SpiceDB/taint. |
Enforcement Flow
- Agent produces a tool call (name + arguments).
- `ToolRegistry` receives the call and invokes `enforcer.enforce(tool_name, arguments)`.
- Enforcer evaluates rules in priority order. First matching rule wins.
- If allowed: tool executes; result returns to the agent.
- If blocked: `PolicyViolation` is raised (or fallback: exit / ask user). The exception can be fed back to the LLM as text so it can adjust.
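The evaluation step in this flow can be sketched as follows. The `Rule` fields, the "lower number runs first" priority convention, and the callable `condition` are assumptions made for illustration; only the first-match-wins ordering, the default-deny fallthrough, and the `PolicyViolation` name come from the description above.

```python
from dataclasses import dataclass
from typing import Callable


class PolicyViolation(Exception):
    """Raised when a tool call is blocked."""


@dataclass
class Rule:
    tool: str
    effect: str    # "allow" or "deny"
    priority: int  # assumption: lower number is evaluated first
    condition: Callable[[dict], bool] = lambda args: True


def enforce(rules, tool_name, arguments):
    """Evaluate rules in priority order; the first matching rule decides."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.tool == tool_name and rule.condition(arguments):
            if rule.effect == "allow":
                return None
            raise PolicyViolation(
                f"{tool_name} denied by rule with priority {rule.priority}"
            )
    # No rule matched at all: default-deny.
    raise PolicyViolation(f"{tool_name} is not covered by any rule")


# Hypothetical policy: deny reads under /etc, allow all other reads.
RULES = [
    Rule("read_file", "deny", 1, lambda a: a.get("path", "").startswith("/etc")),
    Rule("read_file", "allow", 2),
]
```

Here `enforce(RULES, "read_file", {"path": "notes.txt"})` returns normally, a read of `/etc/passwd` hits the higher-priority deny rule, and a call to an unlisted tool such as `delete_file` falls through to the default-deny branch.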
Components
| Component | Role |
|---|---|
| `JanusAgent` | Entry point. Wires enforcer, tool registry, LLM provider, and runner. |
| `PolicyEnforcer` | Evaluates tool calls against loaded policy. Used by `ToolRegistry`. |
| `ToolRegistry` | Registers tools, dispatches calls, and invokes enforcer before execution. |
| `LLMRunner` | Conversation loop: messages → LLM → tool calls → results → LLM until done. |
| Providers | janus/llm/providers/ — OpenAI, Anthropic, Google, Azure, Bedrock, Ollama, vLLM, Together, OpenRouter. |
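As a rough illustration of how the registry and enforcer relate, the sketch below uses stand-in classes. The method names `register` and `dispatch`, the `AllowlistEnforcer` class, and the `run_tool_call` helper are hypothetical and are not taken from the Janus codebase; only the `enforce(tool_name, arguments)` call shape and the `PolicyViolation` name appear in this document.

```python
class PolicyViolation(Exception):
    """Raised when a tool call is blocked."""


class AllowlistEnforcer:
    """Stand-in enforcer: permits only the tool names it was given."""

    def __init__(self, allowed):
        self.allowed = set(allowed)

    def enforce(self, tool_name, arguments):
        if tool_name not in self.allowed:
            raise PolicyViolation(f"tool '{tool_name}' is not permitted")


class ToolRegistry:
    """Stand-in registry: checks the enforcer before running any tool."""

    def __init__(self, enforcer):
        self._enforcer = enforcer
        self._tools = {}

    def register(self, name, fn):
        self._tools[name] = fn

    def dispatch(self, tool_name, arguments):
        # Enforcement runs first, so a blocked tool body never executes.
        self._enforcer.enforce(tool_name, arguments)
        return self._tools[tool_name](**arguments)


def run_tool_call(registry, tool_name, arguments):
    """Runner-side handling: convert a violation into text for the LLM."""
    try:
        return str(registry.dispatch(tool_name, arguments))
    except PolicyViolation as exc:
        return f"Blocked by policy: {exc}"
```

The point of the layout is that the only path to a tool function goes through `dispatch`, so the enforcer cannot be bypassed by the conversation loop.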
Threat Model and Guarantees
In Scope
- Over-privileged tool calls: Agent attempts tools or arguments beyond what the task requires.
- Indirect prompt injection (IPI): Malicious instructions in files, web pages, or retrieved data that cause dangerous tool calls.
- Memory/knowledge poisoning: Poisoned RAG or memory leading to harmful tool invocations.
- Malicious tools: Attacker-introduced tools; policies restrict to an allowlist.
- Path traversal: Built-in file tools sandbox to `workspace`; out-of-workspace paths are rejected.
- Runaway commands: `run_command` can be constrained via policy (enum, pattern) to an allowlist.
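A workspace sandbox of the kind described for the built-in file tools is commonly implemented by resolving the requested path and checking that it stays under the workspace root. The function below is a generic sketch of that technique with a hypothetical name; it is not the Janus implementation.

```python
from pathlib import Path


def resolve_in_workspace(workspace, user_path):
    """Return the resolved path, or raise if it escapes the workspace."""
    root = Path(workspace).resolve()
    # Joining with an absolute user_path discards `root`, and resolve()
    # collapses `..` segments and symlinks, so both escape routes end up
    # outside `root` and are caught by the check below.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{user_path!r} is outside the workspace")
    return candidate
```

Under this check, `notes/a.txt` resolves inside the workspace, while `../outside.txt` and `/etc/passwd` are both rejected.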
Out of Scope
- Preference manipulation: Attacker tricks the agent into favoring a malicious option among valid choices without exceeding least-privilege tool access. Janus cannot distinguish "valid but attacker-preferred" from "valid and user-intended."
- Text-output attacks: Exfiltration or harm via the model's text response only (no tool calls). Janus enforces at the tool boundary; it does not filter or constrain text output.
- Attacks within least privilege: If the user task legitimately requires a dangerous tool and the policy allows it, Janus will not block it. Policy correctness is the user's responsibility.
Guarantees
- Deterministic enforcement: Policy evaluation is symbolic and reproducible.
- Default-deny: When a policy is loaded, any tool call not covered by a matching rule is blocked, including calls to tools the policy does not list.
- No reliance on LLM trust: Enforcement happens at the tool-call boundary; it does not depend on the model being resistant to prompt injection.
Caveats
- Policy correctness: Security guarantees hold only if policies are correct. Incorrect or incomplete policies can allow harmful calls.
- SpiceDB engine: Taint tracking depends on `update_taint()` being called when the agent reads from risky sources. If integration omits this, taint stays low and high-risk tools may remain accessible.
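The taint caveat is easier to see with a toy model. Everything below is an assumption made for illustration: the integer taint scale, the `update_taint(level)` signature, the `TaintSession` class, and the reading of `TOOL_TAINT_LIMIT` as "highest session taint at which a tool may still run" are not confirmed by this document, and the real engine stores its state in SpiceDB rather than in memory.

```python
# Hypothetical limits: the highest session taint at which each tool may run.
TOOL_TAINT_LIMIT = {"read_file": 3, "send_email": 1, "run_command": 0}


class TaintSession:
    """In-memory stand-in for per-session taint state."""

    def __init__(self):
        self.taint = 0

    def update_taint(self, level):
        # Taint only ever increases within a session.
        self.taint = max(self.taint, level)


def tool_allowed(session, tool_name):
    """Allow a tool only while session taint is within its limit."""
    limit = TOOL_TAINT_LIMIT.get(tool_name)
    if limit is None:
        return False  # unknown tool: deny
    return session.taint <= limit
```

In this model, a session that reads a risky source and calls `update_taint(2)` loses access to `send_email` and `run_command` but keeps `read_file`. If the integration never makes that call, the session stays at taint 0 and every listed tool remains available, which is exactly the gap the caveat describes.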
Known Limitations
- Missing arguments: The JSON enforcer validates only arguments that are present. If the LLM omits an argument that a policy condition restricts, the condition is skipped and the call may be allowed. Fix planned: raise `ArgumentValidationError` when a restricted argument is missing.
- Taint session reset: Taint only increases during a session. Long-running services need a way to reset (e.g. per-request sessions). No `reset_session()` yet.
- SpiceDB unreachable: If SpiceDB is down, the engine raises a gRPC exception. No timeout, retry, or fail-closed toggle.
- Schema divergence: The SpiceDB schema in `main.py` and the one in `schema.zed` have diverged; `schema.zed` is not used at runtime.
- LLM-generated policies: Effective but not provably complete. Manual policies can be crafted for provable coverage; LLM-generated ones reduce attack surface but may miss edge cases.
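The missing-arguments limitation and its planned fix can be contrasted in a few lines. This is a simplified sketch: conditions are reduced to one regex per argument, and the function names are hypothetical. Only the `ArgumentValidationError` name and the described behavior come from the list above.

```python
import re


class ArgumentValidationError(Exception):
    """Raised when a restricted argument is absent from the call."""


# Hypothetical condition: `path` must start with "docs/".
CONDITIONS = {"path": r"^docs/"}


def lenient_check(conditions, arguments):
    """Behavior as described today: absent arguments are simply skipped."""
    return all(
        re.search(pattern, str(arguments[name])) is not None
        for name, pattern in conditions.items()
        if name in arguments
    )


def strict_check(conditions, arguments):
    """Planned behavior: a missing restricted argument is an error."""
    missing = [name for name in conditions if name not in arguments]
    if missing:
        raise ArgumentValidationError(
            f"missing restricted argument(s): {', '.join(missing)}"
        )
    return lenient_check(conditions, arguments)
```

With an empty argument dict, `lenient_check` returns `True` because there is nothing to validate, whereas `strict_check` raises. When `path` is supplied, both functions agree.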
Project Structure
janus/
├── agent.py # JanusAgent
├── exceptions.py
├── llm/
│ ├── base.py
│ ├── runner.py
│ └── providers/
├── policy/
│ ├── enforcer.py # PolicyEnforcer (JSON Schema)
│ ├── pde_enforcer.py # SpiceDB engine adapter
│ ├── generator.py # LLM policy generation
│ ├── loader.py
│ └── validator.py
├── tools/
│ ├── base.py
│ ├── registry.py
│ └── builtin/
├── adapters/
│ ├── langchain.py
│ └── adk.py
examples/ # Demo scenario framework
├── shared/ # Events, mock tools, scripted LLM, scenario runner
├── scenarios/ # demo1_poisoned_readme, demo5_taint_cascade, ...
├── config.yaml
└── run.py # CLI: python -m examples.run <scenario> [--protected|--unprotected]
demos/ # Web app
├── app.py # FastAPI + WebSocket
├── static/index.html
├── docker-compose.yml # SpiceDB for Demo 5
└── README.md
Policy-Discovery-Engine/ # SpiceDB-backed enforcement (optional)
├── policy_engine/
│ ├── enforcement.py # GraphInterceptor, _SyncClient
│ └── main.py # Schema, TOOL_TAINT_LIMIT, bootstrap
└── ...