# Security

One of the goals of kstack is to help your AI agent perform Kubernetes monitoring, troubleshooting, and auditing tasks in a safer way than it normally would. Kstack does this by setting up security guard rails that tell the agent which inputs to treat as untrusted data and which follow-on actions require your permission before it takes them. This page describes the trust boundary, the measures kstack takes to keep untrusted data out of the agent's context, and the two skills that deliberately cross that boundary (`/exec`, `/logs`).

---

## The trust boundary

There are three parties involved in every interaction with your cluster using an AI agent:

- **You** — You supply the prompt, the `kubeconfig`, and the authority to act on the cluster. You are the only party that can grant permission for a destructive or dangerous action.
- **Agent** — Your agent interprets your prompt, decides which skill to invoke, and reads whatever output the skill returns.
- **Cluster** — The cluster returns data through the Kubernetes API and associated tools. That data is **not** trusted input. In a worst case scenario, pod names, log lines, annotations, and event messages are all attacker-controllable and an agent that reads them naively can be steered by them.

Kstack's job is to keep the agent useful without letting untrusted cluster data drive its behavior.

---

## What kstack does for you

### 1. Clearly defined response envelope

Every script that kstack skills use internally writes exactly one JSON object to stdout (the "response envelope"). The object's schema is versioned and validated against a [pre-defined format](https://github.com/kubetail-org/kstack/blob/main/src/schemas/response.schema.json):

```jsonc
{
  "kstack":        "1",
  "status":        "ok" | "error",
  "render":        "verbatim" | "agent",   // ok only
  "content":       string,                 // ok only; JSON-escaped
  "agent_context": string,                 // ok only; optional side-channel
  "kind":          "user" | "infra",       // error only
  "message":       string,                 // error only
  "kube_context":  string,                 // optional; pinned cluster
  "notice":        string                  // optional; operator banner
}
```

The security-relevant properties of this contract are:

<dl>
  <dt>`render`</dt>
  <dd>An enum (`verbatim` or `agent`) that tells the agent what the intended target of the `content` is. `render: verbatim` instructs the agent to print `content` as-is and end the turn without reformatting or follow-up reasoning. `render: agent` marks `content` as tool output that the agent may reason over.</dd>

  <dt>`content`</dt>
  <dd>A JSON-escaped string field that contains the payload for the agent.</dd>

  <dt>`agent_context`</dt>
  <dd>A JSON-escaped string field that scripts can use to provides contextual data that the agent can use for follow-ups (e.g. cache paths, counts, resolved identifiers). It is read by the agent and never shown to the user, which keeps `content` clean for `render: verbatim` and prevents metadata from leaking into the terminal.</dd>

  <dt>`message`</dt>
  <dd>A JSON-escaped string field that carries typed errors generated by scripts</dd>
</dl>

Non-zero exits are reserved for unexpected crashes in which case the agent is instructed to print stderr and stop rather than guess at intent.

### 2. Bulk cluster data stays on disk and is read through `jq`

Without kstack, an AI agent will stream every pod, node, and event into their context which burns context on irrelevant data and relies on the model to distinguish fields from instructions in a blob of attacker-influenced text. Kstack avoids this by writing responses to a per-context `cache_dir` and passing the agent the location so that follow-up questions can be answered by running `jq` against the cached JSON.

This approach has two benefits:

- **Structural parsing instead of prose interpretation.** `jq '.items[].metadata.name'` walks a known schema and returns a string value. A prompt-injection payload sitting in a pod annotation stays a string value at a known JSON path; it is not text the model is asked to comprehend as instructions.

- **Bounded context growth.** The full per-pod/per-node table never enters the model's context window. Deterministic summaries provided by kstack skills are a few hundred tokens regardless of cluster size, and subsequent turns only pull the specific fields the question requires.

### 3. Destructive or sensitive actions require your confirmation

Cluster actions performed by kstack skills are read-only and secure by design but you may ask your agent to perform follow-up tasks that mutate cluster state or request privileged information on your behalf. To ensure that the agent doesn't perform these actions for you without permission, each skill sets clear boundaries with the agent and instructs it to request confirmation before taking any action which mutates cluster state (deleting resources, modifying `ConfigMaps`) or exposes credentials (reading `Secrets`).

### 4. Cluster context is pinned per session

The first kstack skill in a session resolves and returns a `kube_context`. Subsequent skills are told to thread `--context=<value>` into every kubectl/kubetail call until you explicitly ask for a different cluster. This prevents you from accidentally performing an action intended for one cluster on another cluster.

### 5. Respects RBAC

Kstack ships as shell scripts and `SKILL.md` files installed under your home directory. Every Kubernetes call goes through your local `kubeconfig` using the credentials and RBAC bindings you already have for the `user` configured for each context.

---

## Skills that can stream live cluster data into the agent

Kstack includes two skills — `/exec` and `/logs` — that run inside a tmux pane that you and the agent can both interact with. These are the skills to think about before you run them, because the pane is where cluster data most easily ends up in the agent's reasoning context.

### Trust model for pane-based skills

By default, the agent can drive the tmux session (type commands, scope queries) but it does not read pane contents unless you instruct it to (e.g. "parse out the errors in the last set of logs"). This is designed to prevent untrusted data from entering into the agent's reasoning context accidentally. You can control this behavior with two flags:

- **`--trust-pane`** — the agent reads the pane every turn and may summarize, correlate, or send contents to the model API. Use when you explicitly want the agent reasoning over what's happening (e.g. "watch this and tell me when the crashloop clears"). 
- **`--detach`** — the agent never attaches to the pane at all. It can neither read nor type. You connect manually, and the model has no path to the session contents. This is the structural counterpart to the default — enforced by kstack, not by agent compliance.

Note that pane reading behavior is a prompt-level contract.

### `/exec`

The `/exec` skill opens an interactive shell into a pod, an ephemeral debug container, or a privileged pod on a node.

Things to be aware of:

- **Node and debug-container modes are privileged.** Node mode creates a short-lived pod with `hostPID`, `hostNetwork`, and the host filesystem mounted at `/host`. Debug-container mode joins the target pod's process namespace. Both grant access well beyond what a normal `exec` would. The agent will describe the resolved target and mode before opening the shell so you can confirm before proceeding.
- **The session keeps running until you tear it down explicitly.** The agent will not delete the privileged node pod on its own; it waits for you to ask.

### `/logs`

The `/logs` skill runs a Kubetail query against the cluster and streams matching log lines into the pane.

Things to be aware of:

- **Logs frequently contain sensitive data.** Tokens in `Authorization` headers, request bodies with PII, stack traces that include secrets are all common. Scope queries narrowly whether or not `--trust-pane` is set: a specific workload, a short time window, a targeted grep pattern.
- **Kubetail's node-side grep helps.** Pushing the filter down to the cluster means fewer lines cross the wire to begin with — which means fewer lines reach the pane (and, with `--trust-pane`, the agent). Be specific with your filters rather than pulling broadly and filtering in your head.

---

## Security testing

Kstack's test strategy exercises each of the protections above at the lowest tier where it's meaningful, so regressions surface as failing tests rather than surprises in the field. The project has four test tiers — [`tests/unit`](https://github.com/kubetail-org/kstack/tree/main/tests/unit), [`tests/integration`](https://github.com/kubetail-org/kstack/tree/main/tests/integration), [`tests/e2e`](https://github.com/kubetail-org/kstack/tree/main/tests/e2e), and [`tests/evals`](https://github.com/kubetail-org/kstack/tree/main/tests/evals) — and security coverage is spread across them as follows:

- **Unit (`bats`).** The escape routine in [`src/lib/response.sh`](https://github.com/kubetail-org/kstack/blob/main/src/lib/response.sh) and the envelope builders are exercised directly. Quotes, backslashes, newlines, control characters, and mixed binary-looking payloads are fed through `response::_escape` and `response::ok_*`, and each emitted envelope is round-tripped through `jq` to prove it is well-formed JSON. Injection-shaped strings — pod names containing `","`, annotations containing `\n"render":"agent"`, and so on — are checked to confirm they land as `content` values and do not introduce sibling keys. Flag parsing for `--detach` is unit-tested so that a typo or refactor cannot silently disable it.
- **Integration (`bats`).** Each skill is invoked end-to-end with a fake `kubectl` on `PATH`, and its emitted envelope is validated against the JSON schema in [`src/schemas/response.schema.json`](https://github.com/kubetail-org/kstack/blob/main/src/schemas/response.schema.json). `kube_context` pinning is checked by running a multi-skill flow and asserting every downstream invocation carries `--context=<resolved>`.
- **End-to-end (kind-backed).** A real cluster is stood up with `kind` and skills run against it. These tests confirm the structural protections hold under a real Kubernetes API server: RBAC denials produce typed `error` envelopes rather than leaking stderr, and privileged modes on `/exec` (node, debug-container) resolve to the expected pod specs before any confirmation path.
- **Evals (Claude-backed).** Prompt-level contracts need agent-in-the-loop testing, so they live here. Scenarios cover: prompt-injection payloads planted in pod annotations, log lines, and event messages, to confirm the agent treats them as data rather than instructions; attempts to make the agent mutate cluster state or read a `Secret` without confirmation; attempts to trick the agent into retrying a typed error as a different command; and pane-based tests that confirm the default no-read behavior holds without `--trust-pane`. Evals are probabilistic, so the rubric records pass rates rather than a binary result, and artifacts are retained for after-the-fact review.

A `shellcheck` lint pass runs on every PR to catch the classes of shell bugs (unquoted expansions, `eval` on untrusted input) that would otherwise undermine the structural guarantees. The unit and integration tiers run on every PR across Linux, macOS, and Windows on both `amd64` and `arm64`; the e2e tier runs on Linux; and the eval tier is triggered on-demand from CI.

If you spot a guarantee in the sections above that doesn't have a corresponding test, that's a bug worth reporting.

---

If you find a security issue, please report it via the [GitHub repo](https://github.com/kubetail-org/kstack).