Daevix Docs

LLM Content Policy

Content security pipeline for the LLM proxy with model restrictions, content inspection, sidecar integration, and 3-tier policy hierarchy.

The LLM proxy includes a content security pipeline that inspects all agent LLM traffic for policy violations. Content inspection is a platform capability that Daevix operates: it runs inside the managed enclave and LLM proxy, and policy is layered across Platform > Org > Agent scopes. Operators configure policies with the dvx policy and dvx config CLI using this 3-tier hierarchy.

The pipeline runs entirely on the enclave host - data never leaves the customer’s infrastructure.

Request flow:
  Agent → JWT auth → Read body → [Content Pipeline] → Resolve API key → Forward upstream
                                   ├─ Model restriction (block)
                                   ├─ Secrets filter (block)
                                   ├─ PII/API key/regex patterns (block or warn)
                                   └─ Sidecar inspectors (block or async)

Response flow:
  Upstream → Buffer response → [Tool policy] → [Content Pipeline] → Audit → Replay
                                                 ├─ Secrets filter (block)
                                                 ├─ PII/API key/regex patterns (block or warn)
                                                 └─ Sidecar inspectors (async)

Policy Hierarchy

Policies are resolved across three scopes. Platform and org rows live in service_config with service = "llmproxy"; agent-scope rows live in agent_config with override_service = "llmproxy" (set with dvx agent config set <agent> … --override-service llmproxy).

| Scope | Set via | Description |
|---|---|---|
| Platform | PUT /api/v1/platform/config/llmproxy/{key} | Applies to all orgs and agents |
| Org | PUT /api/v1/config/llmproxy/{key} | Applies to all agents in the org |
| Agent | POST /api/v1/agents/{id}/config with override_service: "llmproxy" | Applies to a single agent |

Resolution

For each policy key, the resolver checks the following in order and returns the first value it finds:

  1. Higher-scope lock check. If a service_config row at org or platform scope has locked = true, that value is returned - agent overrides are ignored.
  2. Agent override. If an agent_config row exists for (override_service = "llmproxy", name = key, agent_id), it is returned.
  3. Org fallback. Otherwise, the org-level service_config row (if any) is returned.
  4. Platform fallback. Otherwise, the platform-level service_config row (if any) is returned.
  5. Code default. Otherwise, the built-in default registered by the llmproxy process.
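
The lookup order above can be sketched as a small function. This is an illustration only, not the resolver's actual code: the ConfigRow type and its field names are hypothetical, and checking the platform lock before the org lock is an assumption, since the order between two locked rows is not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigRow:
    """One stored policy row (hypothetical shape for this sketch)."""
    value: str
    locked: bool = False


def resolve(agent: Optional[ConfigRow],
            org: Optional[ConfigRow],
            platform: Optional[ConfigRow],
            code_default: str) -> str:
    # 1. Higher-scope lock check: a locked platform or org row wins outright.
    #    Checking platform before org is an assumption of this sketch.
    for row in (platform, org):
        if row is not None and row.locked:
            return row.value
    # 2-4. Otherwise the most specific row that exists wins.
    for row in (agent, org, platform):
        if row is not None:
            return row.value
    # 5. Nothing stored at any scope: use the built-in default.
    return code_default
```

With an agent override present and a locked org row, resolve returns the org value; with no rows at all it returns the code default.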

Locked Floors

Setting locked: true on a platform or org service_config row prevents the agent scope from overriding that key. Use it when you need a value enforced across every agent in the scope and below.

For policy values whose own JSON shape supports merging (for example, pattern lists), the resolver currently returns the winning row’s value as-is - there is no automatic append/merge across scopes. If you need platform baselines plus org additions, keep the baselines at platform scope and avoid setting the same key at org scope, or express the union explicitly in the higher-scope value.

Caching

Resolved policies are cached for 30 seconds per (orgID, agentID) pair. Changes to policy take effect within 30 seconds without proxy restarts.
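
A time-bounded cache keyed by (orgID, agentID) might look like the sketch below. The PolicyCache class, its load callback, and the injectable clock are hypothetical names chosen for illustration; only the 30-second lifetime and the cache key come from the text above.

```python
import time
from typing import Any, Callable, Dict, Tuple

TTL_SECONDS = 30.0  # cache lifetime stated in this section


class PolicyCache:
    """Caches resolved policy per (org_id, agent_id) for a fixed TTL."""

    def __init__(self,
                 load: Callable[[int, int], Any],
                 ttl: float = TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load      # hypothetical loader that resolves policy from the DB
        self._ttl = ttl
        self._clock = clock    # injectable so the expiry logic can be tested
        self._entries: Dict[Tuple[int, int], Tuple[float, Any]] = {}

    def get(self, org_id: int, agent_id: int) -> Any:
        key = (org_id, agent_id)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]      # still fresh: no database round trip
        value = self._load(org_id, agent_id)
        self._entries[key] = (now, value)
        return value
```

An entry written at time t is served until t + 30s, which is why a policy change can take up to 30 seconds to become visible.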

Inert by Default

The pipeline is constructed unconditionally whenever the proxy has a database connection, but it is inert until policy is written. With no model_policy, content_inspection, or sidecar_inspection key set at any tier for an org/agent, the proxy builds an empty inspector list and request/response inspection is a no-op - zero behavioral change. This makes enabling inspection safe-by-default: nothing happens until an operator authors policy via the keys below.

Global Kill-Switch (inspect:llm)

A per-organization feature flag named inspect:llm gates the entire content-inspection pipeline. It is default-enabled: if no flag row exists for the org, inspection runs as configured.

Disabling it skips all inspection for that org - model restriction, content inspectors, and sidecars alike - even when policy is configured. The check is evaluated per request (the flag is org-scoped, while interceptor registration is process-global), so toggling it takes effect without a proxy restart. Use it as an org-wide off-switch without having to delete every policy key.

# Disable all LLM content inspection for an org
curl -X PUT https://controlplane:8443/api/v1/feature-flags/inspect:llm \
  -H "Authorization: Bearer $OPERATOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'

This mirrors the audit:llm flag that gates LLM audit logging - both run through the same default-enabled, DB-backed feature checker.

Failure Posture

The inspection pipeline runs inline on every LLM call, so its failure behavior is deliberately fail-open by default: a broken inspector, a transient database error, or a slow sidecar produces no findings and the request proceeds. This matches the sidecar fail-open behavior (see Error Handling below) and prevents a single misbehaving inspector from taking down all LLM traffic for an org.

Two proxy flags make the posture explicit and bounded:

  • --inspect-timeout (env DVX_INSPECT_TIMEOUT, default 2s) - a deadline around the entire request-phase inspection run (the whole inspector pipeline, not just sidecar calls). If inspection exceeds it, the in-flight run is abandoned and the configured failure posture applies.

  • --inspect-fail-closed (env DVX_INSPECT_FAIL_CLOSED, default false) - for high-assurance deployments. Selects what happens when request-path inspection times out or errors:

    • Fail-open (default, false) - the request proceeds with no findings, logged with the stable marker inspection failopen.
    • Fail-closed (true) - the request is rejected with a 503 carrying:
      {
        "type": "error",
        "error": {
          "type": "content_inspection_unavailable",
          "message": "Request rejected: content security inspection is unavailable."
        }
      }
      

    Response-path inspection remains observe-only regardless of this setting.

The key safety signal under fail-open is the rate of silently-skipped inspections; operators running fail-open should monitor for the inspection failopen log marker (and inspector/sidecar error logs) and alert on a non-trivial rate.
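
The interaction between the deadline and the failure posture can be sketched as follows. This is a simplified model, not the proxy's implementation: run_inspection and its return values are hypothetical, and print stands in for the proxy's logger.

```python
import concurrent.futures
from typing import Callable, List, Tuple


def run_inspection(inspect: Callable[[str], List[dict]],
                   body: str,
                   timeout_s: float = 2.0,
                   fail_closed: bool = False) -> Tuple[str, List[dict]]:
    """Run request-phase inspection under a deadline.

    Returns ("proceed", findings) or ("reject_503", []).
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(inspect, body)
    try:
        return "proceed", future.result(timeout=timeout_s)
    except Exception:
        # Covers both a missed deadline and an inspector that raised.
        if fail_closed:
            return "reject_503", []
        print("inspection failopen")  # stable marker operators alert on
        return "proceed", []
    finally:
        # Abandon the in-flight run instead of waiting for it to finish.
        pool.shutdown(wait=False)
```

Under fail-open a slow inspector yields ("proceed", []), which is indistinguishable from a clean request unless the log marker is monitored.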

Policy Keys

Three policy keys are available under service = "llmproxy":

Model Policy

Restricts which models an agent can request. Request-phase only - the proxy extracts the model field from the request JSON body and checks it against the policy.

{
  "immutable": false,
  "mode": "allowlist",
  "models": ["claude-sonnet-4-*", "claude-haiku-*"]
}

| Field | Type | Description |
|---|---|---|
| immutable | bool | If true, lower tiers cannot override this policy |
| mode | string | "allowlist" (only listed models allowed) or "blocklist" (listed models denied) |
| models | string[] | Glob patterns matched against model identifiers (uses path.Match syntax) |
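
To make the allowlist/blocklist semantics concrete, here is a rough Python approximation. It is not the proxy's code: glob_to_regex and model_allowed are hypothetical helpers, and the translation covers only the * and ? wildcards as Go's path.Match treats them (neither matches a slash). Character classes and backslash escapes are not handled.

```python
import re
from typing import Dict


def glob_to_regex(pattern: str):
    """Translate * and ? the way path.Match treats them (no '/' crossing)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")   # any run of non-slash characters
        elif ch == "?":
            parts.append("[^/]")    # exactly one non-slash character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def model_allowed(policy: Dict, model: str) -> bool:
    """Apply an allowlist or blocklist policy to one model identifier."""
    listed = any(glob_to_regex(p).fullmatch(model) for p in policy["models"])
    if policy["mode"] == "allowlist":
        return listed        # only listed models pass
    return not listed        # blocklist: listed models are denied
```

One consequence of path.Match semantics worth noting: a pattern such as claude-* would not match an identifier that contains a slash.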

Examples

Allow only Claude models:

curl -X PUT https://controlplane:8443/api/v1/service-config/llmproxy/model_policy \
  -H "Authorization: Bearer $OPERATOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"value": "{\"mode\":\"allowlist\",\"models\":[\"claude-*\"]}"}'

Block GPT models at the platform level (immutable):

curl -X PUT https://controlplane:8443/api/v1/platform-config/llmproxy/model_policy \
  -H "Authorization: Bearer $OPERATOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"value": "{\"immutable\":true,\"mode\":\"blocklist\",\"models\":[\"gpt-*\",\"o1-*\",\"o3-*\"]}"}'
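
In both examples the policy travels as a JSON document serialized into a string under the value field, which is why the inner quotes are escaped. When building the body programmatically, encoding twice avoids hand-escaping. A minimal sketch, where config_body is a hypothetical helper name:

```python
import json


def config_body(policy: dict) -> str:
    """Build the PUT body: the policy JSON, serialized as a string under "value"."""
    inner = json.dumps(policy, separators=(",", ":"))  # compact inner document
    return json.dumps({"value": inner})                # outer envelope escapes the quotes


body = config_body({"mode": "allowlist", "models": ["claude-*"]})
```

The resulting body is byte-for-byte the payload passed to -d in the first example.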

model_policy vs Execution-Policy model_match

The platform has two ways to constrain which models an agent may use. They serve different use cases, so pick one per constraint rather than configuring both:

  • model_policy (this key) - the pack-composable allow/blocklist enforced by the content pipeline. Model identifiers are matched as path.Match glob patterns. Choose this when you want policy-pack composition: a higher tier’s base policy and every installed pack’s model contribution compose by logical AND, so a pack can only ever narrow the permitted set, never relax it.
  • model_match in an execution policy - a condition in the broader DB-driven execution-policy rule engine. Choose this when the model constraint is part of a larger execution rule (combined with time windows or other conditions) and pack composition is not needed.

Express a model limit through model_policy when pack composition is desired, and through an execution policy otherwise. Configuring both for the same constraint produces overlapping, redundant enforcement.

When a request uses a disallowed model, the proxy returns a 403 with:

{
  "type": "error",
  "error": {
    "type": "content_policy_violation",
    "message": "Request blocked by content security policy."
  }
}

Content Inspection

Scans request and response bodies for sensitive content. All built-in inspectors run in parallel.

{
  "immutable": false,
  "secrets_filter": { "enabled": true, "severity": "block" },
  "pii_detection": { "enabled": true, "severity": "block", "types": ["email", "credit_card", "ssn"] },
  "api_key_detection": { "enabled": true, "severity": "block" },
  "patterns": [
    { "pattern": "INTERNAL_PROJECT_.*", "description": "Internal codename", "severity": "block" }
  ]
}

Secrets Filter

Detects leakage of the agent’s own secrets (managed with dvx secret). The proxy decrypts the agent’s secrets and performs substring scanning against the request/response body.

  • Only secrets with values >= 8 characters are scanned (shorter values produce too many false positives)
  • Matches are redacted in findings (first 4 characters + ****)
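
The two rules above (minimum length and redaction) can be sketched like this. The function names and the shape of the returned finding are hypothetical; decryption of the agent's secrets is assumed to have happened already.

```python
from typing import Dict, List

MIN_SECRET_LEN = 8  # shorter values are skipped to limit false positives


def redact(match: str) -> str:
    """Keep the first 4 characters and mask the rest."""
    return match[:4] + "****"


def scan_for_secrets(body: str, secrets: Dict[str, str]) -> List[dict]:
    """Substring-scan a request or response body for known secret values."""
    findings = []
    for name, value in secrets.items():
        if len(value) < MIN_SECRET_LEN:
            continue
        if value in body:
            # "secret_name" is an illustrative field, not a documented one.
            findings.append({"secret_name": name, "match": redact(value)})
    return findings
```

A 4-character secret such as a PIN is never reported, even when it appears verbatim in the body.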

PII Detection

Detects personally identifiable information using regex patterns:

| Type | What it detects |
|---|---|
| email | Email addresses |
| credit_card | Credit card numbers (with Luhn checksum validation) |
| ssn | US Social Security Numbers (XXX-XX-XXXX format) |

Use the types array to limit which PII types are detected. If omitted, all types are checked.
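
As an illustration of regex detection combined with Luhn validation, consider the sketch below. The regular expressions are simplified stand-ins chosen for this example, not the patterns the proxy ships, and detect_pii is a hypothetical name.

```python
import re
from typing import List, Optional, Tuple

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13-19 digits, optional separators


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_pii(text: str, types: Optional[List[str]] = None) -> List[Tuple[str, str]]:
    """Return (type, match) pairs; types=None means every type is checked."""
    enabled = {"email", "credit_card", "ssn"} if types is None else set(types)
    found: List[Tuple[str, str]] = []
    if "email" in enabled:
        found += [("email", m) for m in EMAIL_RE.findall(text)]
    if "credit_card" in enabled:
        for m in CARD_RE.findall(text):
            # The checksum filters out digit runs that merely look like cards.
            if luhn_valid(re.sub(r"\D", "", m)):
                found.append(("credit_card", m))
    if "ssn" in enabled:
        found += [("ssn", m) for m in SSN_RE.findall(text)]
    return found
```

The checksum step is what keeps arbitrary 16-digit identifiers from being reported as card numbers.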

API Key Detection

Detects common API key formats:

| Provider | Pattern prefix |
|---|---|
| AWS | AKIA |
| GitHub | ghp_, ghs_, github_pat_ |
| Anthropic | sk-ant- |
| Google Cloud | AIza |
| OpenAI | sk- |
| Stripe | sk_test_, sk_live_, pk_test_, pk_live_ |
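
Because the Anthropic prefix sk-ant- also begins with the OpenAI prefix sk-, any prefix-based classifier has to try the more specific prefix first. The sketch below shows one way to do that. It is an assumption about how attribution could work, not a description of the built-in detector, which presumably also validates length and character set.

```python
from typing import Optional

PREFIXES = {
    "AWS": ["AKIA"],
    "GitHub": ["ghp_", "ghs_", "github_pat_"],
    "Anthropic": ["sk-ant-"],
    "Google Cloud": ["AIza"],
    "OpenAI": ["sk-"],
    "Stripe": ["sk_test_", "sk_live_", "pk_test_", "pk_live_"],
}

# Longest prefix first, so "sk-ant-..." is attributed to Anthropic, not OpenAI.
_ORDERED = sorted(
    ((prefix, provider) for provider, ps in PREFIXES.items() for prefix in ps),
    key=lambda pair: -len(pair[0]),
)


def classify_token(token: str) -> Optional[str]:
    """Return the provider whose prefix the token starts with, if any."""
    for prefix, provider in _ORDERED:
        if token.startswith(prefix):
            return provider
    return None
```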

Custom Regex Patterns

Add organization-specific patterns for detecting proprietary terms, internal identifiers, or any text matching a regular expression. Patterns from all tiers are additive - lower tiers can add patterns but cannot remove patterns set by higher tiers.

{
  "patterns": [
    { "pattern": "PROJECT_(ALPHA|BETA)_\\d+", "description": "Internal project code", "severity": "block" },
    { "pattern": "\\b\\d{3}-\\d{2}-\\d{4}\\b", "description": "SSN-like number", "severity": "warn" }
  ]
}

Invalid regex patterns are logged and skipped (they do not cause errors).
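
The skip-on-invalid behavior can be modeled as below. This sketch uses Python's re module, whose syntax may differ from the proxy's regex engine, so a pattern that compiles here is not guaranteed to compile there. compile_patterns and scan are hypothetical names, and print stands in for the proxy's logger.

```python
import re
from typing import List, Tuple


def compile_patterns(specs: List[dict]) -> List[Tuple["re.Pattern", dict]]:
    """Compile each pattern; log and skip the ones that do not parse."""
    compiled = []
    for spec in specs:
        try:
            compiled.append((re.compile(spec["pattern"]), spec))
        except re.error as exc:
            print(f"skipping invalid pattern {spec['pattern']!r}: {exc}")
    return compiled


def scan(body: str, compiled: List[Tuple["re.Pattern", dict]]) -> List[dict]:
    """Report every match of every valid pattern in the body."""
    return [
        {"description": spec["description"],
         "severity": spec["severity"],
         "match": m.group(0)}
        for rx, spec in compiled
        for m in rx.finditer(body)
    ]
```

A single malformed entry therefore disables only itself; the remaining patterns in the list keep working.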

Caveat - inspectors see the post-injection request body. Request inspection runs after the proxy’s policy-context interceptor, which injects a platform-generated security-context prefix into the request’s system prompt. The intent is to scan exactly what leaves the proxy, so the inspector sees the final outbound body including that injected text. The injected prefix is platform-generated and trusted, so it is unlikely to trip the built-in secrets/PII/API-key inspectors - but a custom regex pattern can unintentionally match the injected prefix and produce false positives. Test custom patterns against bodies that include the injected context, not just the agent’s raw prompt.

Severity Levels

Each inspector has a configurable severity that determines what happens when a match is found:

| Severity | Effect |
|---|---|
| log | Record the finding in the audit log. Request proceeds normally. |
| warn | Record the finding prominently. Request proceeds normally. |
| block | On the request path, reject with a 403 response. On the response path, observe-only (see below). Finding recorded in audit log either way. |

The redact severity is reserved for future use and currently behaves like warn.

Request vs Response Enforcement

Enforcement differs by phase:

  • Request phase - a block-severity finding short-circuits the request with a 403 carrying the generic content_policy_violation envelope (shown under Model Policy). The redacted match and inspector type appear only in the audit log and proxy logs - they are never leaked to the agent.
  • Response phase - inspection is observe-only for all severities, including block. A block finding on a completion is recorded (and alertable) but does not reject the response: the model has already produced (and billed) it, and the response is replayed to the agent. This is a deliberate divergence from the network proxy, which can block on the response path.

Separately, when --inspect-fail-closed is enabled, an inspection error or timeout on the request path returns a 503 (inspection unavailable) rather than a 403 - a 403 means a policy match, a 503 means inspection could not run.
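
The rules in this section reduce to a small decision function. The sketch below is a summary for readers, with a hypothetical enforce function; it returns the HTTP status the agent would see, or None when traffic passes through.

```python
from typing import List, Optional


def enforce(phase: str,
            findings: List[dict],
            inspection_failed: bool = False,
            fail_closed: bool = False) -> Optional[int]:
    """Map an inspection outcome to an HTTP status, or None to pass through."""
    if phase == "response":
        return None                      # observe-only for every severity
    if inspection_failed:
        return 503 if fail_closed else None   # inspection could not run
    if any(f["severity"] == "block" for f in findings):
        return 403                       # a policy matched
    return None                          # log and warn findings never reject
```

Findings are recorded in the audit log in every branch; only the status returned to the agent differs.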

Sidecar Inspection

Route LLM traffic to external HTTP services for additional inspection. Sidecars can implement custom logic like NER-based PII detection (Presidio), toxicity classification, or LLM-as-judge evaluation.

{
  "immutable": false,
  "sidecars": [
    {
      "name": "presidio-ner",
      "url": "http://presidio.daevix.svc:8080",
      "timeout_ms": 5000,
      "on_request": true,
      "on_response": true,
      "async": false,
      "severity": "block"
    },
    {
      "name": "llm-judge",
      "url": "http://llm-judge.daevix.svc:8081",
      "timeout_ms": 30000,
      "on_request": true,
      "on_response": false,
      "async": true,
      "severity": "warn",
      "include_context": true
    }
  ]
}

| Field | Type | Default | Description |
|---|---|---|---|
| name | string | (required) | Identifier for the sidecar (used in finding reports and logs) |
| url | string | (required) | HTTP endpoint the proxy POSTs to |
| timeout_ms | int | 5000 | Request timeout in milliseconds |
| on_request | bool | false | Run on outgoing LLM requests |
| on_response | bool | false | Run on incoming LLM responses |
| async | bool | false | Fire-and-forget mode (findings logged but don’t block) |
| severity | string | "warn" | Severity applied to returned findings |
| include_context | bool | false | Include recent conversation messages in the sidecar request |

Sidecars from all tiers are additive - lower tiers can add sidecars but cannot remove sidecars defined by higher tiers.

Sidecar Protocol

The proxy POSTs a JSON request to the sidecar URL:

{
  "agent_id": 42,
  "organization_id": 1,
  "agent_name": "coding-agent",
  "phase": "request",
  "body": "<raw LLM API request/response JSON>",
  "context": {
    "recent_messages": [{"role": "user", "content": "..."}]
  }
}

The context field is only populated when include_context is true. It contains the messages array extracted from the LLM request body, intended for LLM-as-judge sidecars that need conversation history.

The sidecar responds with:

{
  "findings": [
    {
      "description": "Toxic content detected",
      "match": "offensive phrase",
      "severity": "block"
    }
  ]
}
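
A sidecar that speaks this protocol can be very small. The sketch below uses only the Python standard library. The BANNED term list, the inspect function, and the serve helper are hypothetical. A real sidecar would replace inspect with its own detection logic.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

BANNED = ("offensive phrase",)  # hypothetical term list for this example


def inspect(payload: dict) -> dict:
    """Build the response document from the proxy's request payload."""
    body = payload.get("body", "")
    findings = [
        {"description": "Toxic content detected", "match": term, "severity": "block"}
        for term in BANNED
        if term in body
    ]
    return {"findings": findings}


class SidecarHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "invalid JSON")
            return
        out = json.dumps(inspect(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(out)))
        self.end_headers()
        self.wfile.write(out)


def serve(port: int = 8080) -> None:
    """Blocks forever; call this from the sidecar's entry point."""
    HTTPServer(("0.0.0.0", port), SidecarHandler).serve_forever()
```

Returning an empty findings list with a 200 status is the normal "nothing detected" answer.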

Error Handling

All sidecars fail open:

  • Network errors or timeouts produce no findings (request proceeds)
  • Non-200 HTTP responses produce no findings
  • Malformed JSON responses produce no findings
  • Response bodies are limited to 1 MB

Errors are logged for operator monitoring but never block LLM traffic. Operators who need fail-closed behavior should monitor sidecar error logs and alert accordingly.
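
On the proxy side, these rules amount to treating anything other than a well-formed 200 response as "no findings". A sketch under stated assumptions: findings_from_sidecar is a hypothetical name, and the 1 MB limit is modeled as reading at most that many bytes, so an oversized body is truncated and then fails to parse.

```python
import json
from typing import List

MAX_BODY = 1 << 20  # 1 MB response limit


def findings_from_sidecar(status: int, raw: bytes) -> List[dict]:
    """Fail open: any problem with the sidecar reply yields no findings."""
    if status != 200:
        return []
    try:
        data = json.loads(raw[:MAX_BODY])  # truncation is an assumption of this sketch
        findings = data["findings"]
    except (ValueError, KeyError, TypeError):
        # ValueError covers malformed JSON and undecodable bytes.
        return []
    return findings if isinstance(findings, list) else []
```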

Async Mode

When async is true, the sidecar request fires in the background. The LLM request proceeds immediately without waiting for the sidecar response. Any findings are logged to the audit log but cannot block the request.

Use async mode for high-latency inspectors (e.g., LLM-as-judge) where blocking would add unacceptable latency.
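
The difference between the two modes can be illustrated with a background thread. This is a conceptual sketch only: run_sidecar, the call parameter, and the list used as an audit log are hypothetical, and the worker thread is returned solely so the example can be observed.

```python
import threading
from typing import Callable, List, Optional, Tuple


def run_sidecar(cfg: dict,
                call: Callable[[dict], List[dict]],
                payload: dict,
                audit_log: List[dict]) -> Tuple[List[dict], Optional[threading.Thread]]:
    """Return the findings that may affect the request, plus any background worker."""
    if not cfg.get("async", False):
        findings = call(payload)       # caller waits, so findings can block
        audit_log.extend(findings)
        return findings, None

    def background() -> None:
        # Findings arrive after the request has moved on: audit only.
        audit_log.extend(call(payload))

    worker = threading.Thread(target=background, daemon=True)
    worker.start()
    return [], worker
```

In async mode the caller always sees an empty findings list, which is why a block severity on an async sidecar cannot reject the request.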

Policy Pack Composition

When policy packs are installed, the resolver composes their content_inspection and model contributions on top of the operator-authored base at read time, behind the same 30s cache. Composition is monotonic toward stricter:

  • Content inspection contributions fold in as additional layers - a pack can enable a toggle, add PII types, or add regex patterns, but can never disable an inspector the base enabled.
  • Model contributions compose by logical AND: a model is permitted only if the resolved base policy and every pack contribution permits it. A pack can only narrow the permitted model set, never relax the operator base.

No separate enablement step is required - installed packs take effect automatically alongside the service_config policy.
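
Monotonic composition can be sketched as follows. Both functions are hypothetical simplifications: model_permitted shows the logical AND over model contributions, and compose_inspection shows that a pack can turn a toggle on or append patterns but cannot turn off what the base enabled. PII type merging and severity handling are left out.

```python
from typing import Callable, Iterable

TOGGLES = ("secrets_filter", "pii_detection", "api_key_detection")


def model_permitted(model: str,
                    base_allows: Callable[[str], bool],
                    pack_allows: Iterable[Callable[[str], bool]]) -> bool:
    """A model passes only if the base and every pack contribution permit it."""
    return base_allows(model) and all(allows(model) for allows in pack_allows)


def compose_inspection(base: dict, pack: dict) -> dict:
    """Fold one pack contribution onto the base, never loosening it."""
    out = {"patterns": list(base.get("patterns", [])) + list(pack.get("patterns", []))}
    for key in TOGGLES:
        base_cfg = base.get(key, {})
        enabled = base_cfg.get("enabled", False) or pack.get(key, {}).get("enabled", False)
        out[key] = {**base_cfg, "enabled": enabled}  # base settings kept, toggle OR-ed
    return out
```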

Audit Integration

All findings from content inspectors are recorded in the findings JSONB column of the llm_audit_log table. Each finding includes:

| Field | Description |
|---|---|
| inspector_type | Which inspector produced the finding (e.g., pii, api_key, model_restriction, sidecar:presidio-ner) |
| severity | log, warn, or block |
| description | Human-readable description of what was detected |
| match | The matched content (redacted: first 4 characters + ****) |
| location | Where the match was found: request_body, response_body, model, or sidecar |

Query audit logs with findings via GET /api/v1/agents/{id}/audit-log. Each entry includes the findings array when content policy violations were detected.
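
Once audit entries have been fetched, a per-inspector summary is a common first step for alerting. The sketch below assumes each entry is a dict with an optional findings array whose items carry the fields listed above. The overall response envelope is not specified here, so count_findings operates on an already-extracted list of entries.

```python
from collections import Counter
from typing import Dict, List, Optional


def count_findings(entries: List[dict], severity: Optional[str] = None) -> Dict[str, int]:
    """Count findings per inspector_type, optionally for one severity only."""
    counts: Counter = Counter()
    for entry in entries:
        # Entries without violations may have no findings key, or a null one.
        for finding in entry.get("findings") or []:
            if severity is None or finding["severity"] == severity:
                counts[finding["inspector_type"]] += 1
    return dict(counts)
```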

Configuration Examples

Minimal: Model Restriction Only

Restrict an org to Claude models:

curl -X PUT https://controlplane:8443/api/v1/service-config/llmproxy/model_policy \
  -H "Authorization: Bearer $OPERATOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"value": "{\"mode\":\"allowlist\",\"models\":[\"claude-*\"]}"}'

Full: Platform Security Baseline

Set an immutable platform baseline with PII and API key detection:

# Platform-level content inspection (immutable)
curl -X PUT https://controlplane:8443/api/v1/platform-config/llmproxy/content_inspection \
  -H "Authorization: Bearer $OPERATOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"value": "{\"immutable\":true,\"pii_detection\":{\"enabled\":true,\"severity\":\"block\"},\"api_key_detection\":{\"enabled\":true,\"severity\":\"block\"}}"}'

Orgs can add custom patterns on top of this baseline but cannot disable PII or API key detection:

# Org adds custom patterns (additive)
curl -X PUT https://controlplane:8443/api/v1/service-config/llmproxy/content_inspection \
  -H "Authorization: Bearer $OPERATOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"value": "{\"patterns\":[{\"pattern\":\"CONFIDENTIAL_.*\",\"description\":\"Confidential marker\",\"severity\":\"block\"}]}"}'

Sidecar: Presidio NER + LLM Judge

curl -X PUT https://controlplane:8443/api/v1/service-config/llmproxy/sidecar_inspection \
  -H "Authorization: Bearer $OPERATOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"value": "{\"sidecars\":[{\"name\":\"presidio\",\"url\":\"http://presidio.daevix.svc:8080\",\"timeout_ms\":5000,\"on_request\":true,\"on_response\":true,\"severity\":\"block\"},{\"name\":\"llm-judge\",\"url\":\"http://llm-judge.daevix.svc:8081\",\"timeout_ms\":30000,\"on_request\":true,\"async\":true,\"severity\":\"warn\",\"include_context\":true}]}"}'