Security

One of the goals of kstack is to help your AI agent perform Kubernetes monitoring, troubleshooting, and auditing tasks in a safer way than it normally would. Kstack does this by setting up security guard rails that tell the agent which inputs to treat as untrusted data and which follow-on actions require your permission before it takes them. This page describes the trust boundary, the measures kstack takes to keep untrusted data out of the agent’s context, and the two skills that deliberately cross that boundary (/exec, /logs).

The trust boundary

There are three parties involved in every interaction with your cluster using an AI agent:

You — You supply the prompt, the kubeconfig, and the authority to act on the cluster. You are the only party that can grant permission for a destructive or dangerous action.
Agent — Your agent interprets your prompt, decides which skill to invoke, and reads whatever output the skill returns.
Cluster — The cluster returns data through the Kubernetes API and associated tools. That data is not trusted input. In a worst case scenario, pod names, log lines, annotations, and event messages are all attacker-controllable and an agent that reads them naively can be steered by them.

Kstack’s job is to keep the agent useful without letting untrusted cluster data drive its behavior.

What kstack does for you

1. Clearly defined response envelope

Every script that kstack skills use internally writes exactly one JSON object to stdout (the “response envelope”). The object’s schema is versioned and validated against a pre-defined format:

{
  "kstack":        "1",
  "status":        "ok" | "error",
  "render":        "verbatim" | "agent",   // ok only
  "content":       string,                 // ok only; JSON-escaped
  "agent_context": string,                 // ok only; optional side-channel
  "kind":          "user" | "infra",       // error only
  "message":       string,                 // error only
  "kube_context":  string,                 // optional; pinned cluster
  "notice":        string                  // optional; operator banner
}

The security-relevant properties of this contract are:

render: An enum (verbatim or agent) that tells the agent what the intended target of the content is. render: verbatim instructs the agent to print content as-is and end the turn without reformatting or follow-up reasoning. render: agent marks content as tool output that the agent may reason over.
content: A JSON-escaped string field that contains the payload for the agent.
agent_context: A JSON-escaped string field that scripts can use to provides contextual data that the agent can use for follow-ups (e.g. cache paths, counts, resolved identifiers). It is read by the agent and never shown to the user, which keeps content clean for render: verbatim and prevents metadata from leaking into the terminal.
message: A JSON-escaped string field that carries typed errors generated by scripts

Non-zero exits are reserved for unexpected crashes in which case the agent is instructed to print stderr and stop rather than guess at intent.

2. Bulk cluster data stays on disk and is read through `jq`

Without kstack, an AI agent will stream every pod, node, and event into their context which burns context on irrelevant data and relies on the model to distinguish fields from instructions in a blob of attacker-influenced text. Kstack avoids this by writing responses to a per-context cache_dir and passing the agent the location so that follow-up questions can be answered by running jq against the cached JSON.

This approach has two benefits:

Structural parsing instead of prose interpretation. jq '.items[].metadata.name' walks a known schema and returns a string value. A prompt-injection payload sitting in a pod annotation stays a string value at a known JSON path; it is not text the model is asked to comprehend as instructions.
Bounded context growth. The full per-pod/per-node table never enters the model’s context window. Deterministic summaries provided by kstack skills are a few hundred tokens regardless of cluster size, and subsequent turns only pull the specific fields the question requires.

3. Destructive or sensitive actions require your confirmation

Cluster actions performed by kstack skills are read-only and secure by design but you may ask your agent to perform follow-up tasks that mutate cluster state or request privileged information on your behalf. To ensure that the agent doesn’t perform these actions for you without permission, each skill sets clear boundaries with the agent and instructs it to request confirmation before taking any action which mutates cluster state (deleting resources, modifying ConfigMaps) or exposes credentials (reading Secrets).

4. Cluster context is pinned per session

The first kstack skill in a session resolves and returns a kube_context. Subsequent skills are told to thread --context=<value> into every kubectl/kubetail call until you explicitly ask for a different cluster. This prevents you from accidentally performing an action intended for one cluster on another cluster.

5. Respects RBAC

Kstack ships as shell scripts and SKILL.md files installed under your home directory. Every Kubernetes call goes through your local kubeconfig using the credentials and RBAC bindings you already have for the user configured for each context.

Skills that can stream live cluster data into the agent

Kstack includes two skills — /exec and /logs — that run inside a tmux pane that you and the agent can both interact with. These are the skills to think about before you run them, because the pane is where cluster data most easily ends up in the agent’s reasoning context.

Trust model for pane-based skills

By default, the agent can drive the tmux session (type commands, scope queries) but it does not read pane contents unless you instruct it to (e.g. “parse out the errors in the last set of logs”). This is designed to prevent untrusted data from entering into the agent’s reasoning context accidentally. You can control this behavior with two flags:

--trust-pane — the agent reads the pane every turn and may summarize, correlate, or send contents to the model API. Use when you explicitly want the agent reasoning over what’s happening (e.g. “watch this and tell me when the crashloop clears”).
--detach — the agent never attaches to the pane at all. It can neither read nor type. You connect manually, and the model has no path to the session contents. This is the structural counterpart to the default — enforced by kstack, not by agent compliance.

Note that pane reading behavior is a prompt-level contract.

`/exec`

The /exec skill opens an interactive shell into a pod, an ephemeral debug container, or a privileged pod on a node.

Things to be aware of:

Node and debug-container modes are privileged. Node mode creates a short-lived pod with hostPID, hostNetwork, and the host filesystem mounted at /host. Debug-container mode joins the target pod’s process namespace. Both grant access well beyond what a normal exec would. The agent will describe the resolved target and mode before opening the shell so you can confirm before proceeding.
The session keeps running until you tear it down explicitly. The agent will not delete the privileged node pod on its own; it waits for you to ask.

`/logs`

The /logs skill runs a Kubetail query against the cluster and streams matching log lines into the pane.

Things to be aware of:

Logs frequently contain sensitive data. Tokens in Authorization headers, request bodies with PII, stack traces that include secrets are all common. Scope queries narrowly whether or not --trust-pane is set: a specific workload, a short time window, a targeted grep pattern.
Kubetail’s node-side grep helps. Pushing the filter down to the cluster means fewer lines cross the wire to begin with — which means fewer lines reach the pane (and, with --trust-pane, the agent). Be specific with your filters rather than pulling broadly and filtering in your head.

Security testing

Kstack’s test strategy exercises each of the protections above at the lowest tier where it’s meaningful, so regressions surface as failing tests rather than surprises in the field. The project has four test tiers — tests/unit, tests/integration, tests/e2e, and tests/evals — and security coverage is spread across them as follows:

Unit (bats). The escape routine in src/lib/response.sh and the envelope builders are exercised directly. Quotes, backslashes, newlines, control characters, and mixed binary-looking payloads are fed through response::_escape and response::ok_*, and each emitted envelope is round-tripped through jq to prove it is well-formed JSON. Injection-shaped strings — pod names containing ",", annotations containing \n"render":"agent", and so on — are checked to confirm they land as content values and do not introduce sibling keys. Flag parsing for --detach is unit-tested so that a typo or refactor cannot silently disable it.
Integration (bats). Each skill is invoked end-to-end with a fake kubectl on PATH, and its emitted envelope is validated against the JSON schema in src/schemas/response.schema.json. kube_context pinning is checked by running a multi-skill flow and asserting every downstream invocation carries --context=<resolved>.
End-to-end (kind-backed). A real cluster is stood up with kind and skills run against it. These tests confirm the structural protections hold under a real Kubernetes API server: RBAC denials produce typed error envelopes rather than leaking stderr, and privileged modes on /exec (node, debug-container) resolve to the expected pod specs before any confirmation path.
Evals (Claude-backed). Prompt-level contracts need agent-in-the-loop testing, so they live here. Scenarios cover: prompt-injection payloads planted in pod annotations, log lines, and event messages, to confirm the agent treats them as data rather than instructions; attempts to make the agent mutate cluster state or read a Secret without confirmation; attempts to trick the agent into retrying a typed error as a different command; and pane-based tests that confirm the default no-read behavior holds without --trust-pane. Evals are probabilistic, so the rubric records pass rates rather than a binary result, and artifacts are retained for after-the-fact review.

A shellcheck lint pass runs on every PR to catch the classes of shell bugs (unquoted expansions, eval on untrusted input) that would otherwise undermine the structural guarantees. The unit and integration tiers run on every PR across Linux, macOS, and Windows on both amd64 and arm64; the e2e tier runs on Linux; and the eval tier is triggered on-demand from CI.

If you spot a guarantee in the sections above that doesn’t have a corresponding test, that’s a bug worth reporting.

If you find a security issue, please report it via the GitHub repo.