/events

The /events skill pulls recent Kubernetes events and collapses them into a short, ranked list — Warning events first, then anything notable in Normal. It is read-only and never mutates cluster state.

The output is deliberately bounded so the first response is cheap to read and cheap to re-emit through the model. The full event list is written to a local JSON cache that the agent reads from on follow-up questions, without re-hitting the API.

/events # snapshot (uses cache if fresh)
/events --refresh # force a fresh fetch
/events --ttl 5m # only re-fetch if older than 5m

This skill takes no positional arguments. Follow-up questions (“only payments”, “events on pod/checkout-7c9”, “show me the suppressed Normal events”) are answered from the cache — see Follow-ups below.


Sources: Kubernetes API only: kubectl get events --all-namespaces (or the equivalent against events.k8s.io/v1), sorted by lastTimestamp. The list API does not order results itself; kubectl applies any sort on the client after the call returns.


The skill fetches events for the cluster in a single call and writes the full list to a per-context cache directory as events.json. Aggregation and severity ranking happen client-side on that JSON, so repeat runs within the TTL window skip the API entirely.
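The cache-then-fetch behaviour described above can be sketched roughly as follows. This is an illustration only, not the skill's actual code: the function name `load_events`, the `fetch` callable, and the use of the file's modification time as the snapshot age are all assumptions.

```python
import json
import time
from pathlib import Path
from typing import Callable


def load_events(cache_dir: Path, ttl_seconds: float,
                fetch: Callable[[], list], refresh: bool = False) -> list:
    """Return events from cache_dir/events.json, re-fetching only when stale."""
    cache_file = cache_dir / "events.json"
    if not refresh and cache_file.exists():
        # Assumption: snapshot age is taken from the file's mtime.
        age = time.time() - cache_file.stat().st_mtime
        if age <= ttl_seconds:
            # Within the TTL window: answer from disk, no API call.
            return json.loads(cache_file.read_text())
    # `fetch` is a hypothetical callable wrapping the single API call.
    events = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(events))
    return events
```

Passing a per-context directory (for example one named after the kube context) keeps snapshots from different clusters apart.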

The summary block looks like this:

Events: prod-us-east · last 1h · 2 warning groups, 1 notable
WARN payments/Pod BackOff 14× 2m ago "Back-off restarting failed container server in pod checkout-7c9"
WARN ingress/Pod FailedScheduling 1× 38m ago "0/12 nodes are available: 3 node(s) had untolerated taint…"
NOTE kube-system/Node NodeNotReady 1× 52m ago "Node ip-10-0-3-14 status is now: NodeNotReady"
…and 412 Normal events (Pulled, Created, Started, Scheduled) suppressed.
Snapshot cached (TTL 5m). Ask to drill in — e.g. "only payments", "events on pod/checkout-7c9", "show suppressed".

When the window is clean, the skill prints a single line confirming there’s nothing to report and exits.
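As a rough sketch of how such a summary could be derived from the cached list: the code below groups events by namespace, object kind, and reason, ranks Warning groups ahead of notable Normal ones, and tallies the chatty Normal reasons separately. The field names (`type`, `reason`, `count`, `involvedObject`, `metadata.namespace`) are those of the core/v1 Event object; the grouping key, the tie-breaking by frequency, and counting suppressed events via the `count` field are assumptions, not documented behaviour.

```python
from collections import defaultdict

# Chatty Normal reasons the page lists as collapsible.
CHATTY = {"Pulled", "Created", "Started", "Scheduled", "SuccessfulCreate"}


def summarize(events: list) -> tuple:
    """Return (ranked rows, suppressed count) for a list of event dicts."""
    groups = defaultdict(int)
    suppressed = 0
    for ev in events:
        count = ev.get("count") or 1
        if ev.get("type") == "Normal" and ev.get("reason") in CHATTY:
            # Collapsed into the "...and N Normal events suppressed" line.
            suppressed += count
            continue
        level = "WARN" if ev.get("type") == "Warning" else "NOTE"
        key = (
            level,
            ev.get("metadata", {}).get("namespace", ""),
            ev.get("involvedObject", {}).get("kind", ""),
            ev.get("reason", ""),
        )
        groups[key] += count
    # WARN rows sort before NOTE rows; within a level, most frequent first.
    ranked = sorted(groups.items(), key=lambda kv: (kv[0][0] != "WARN", -kv[1]))
    return [(*key, n) for key, n in ranked], suppressed
```

An empty `ranked` list corresponds to the clean-window case, where only the single confirmation line is printed.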


The summary deliberately collapses chatty Normal reasons and truncates per-object detail so the initial response stays small. When you ask for more — or anything else that can be answered from the cached event list — the agent reads the cache with jq instead of re-running the skill:

❯ /events
[ summary... ]
❯ only payments
[ events filtered to namespace payments, from events.json ]
❯ show the suppressed Normal events
[ full Normal-event list, from events.json ]
❯ events on pod/checkout-7c9
[ filtered by involvedObject, walking owners one level up, from events.json ]
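The agent performs these filters with jq; the Python below is only an equivalent sketch meant to make the selection logic explicit. The helper names are hypothetical, and because the page does not say whether events.json holds a bare list or a kubectl-style `{"items": [...]}` document, the loader accepts either.

```python
import json


def load_items(raw: str) -> list:
    """Accept a bare JSON list or a kubectl-style {"items": [...]} document."""
    data = json.loads(raw)
    return data["items"] if isinstance(data, dict) else data


def only_namespace(events: list, namespace: str) -> list:
    """Follow-up such as "only payments"."""
    return [e for e in events
            if e.get("metadata", {}).get("namespace") == namespace]


def on_object(events: list, ref: str) -> list:
    """Follow-up such as "events on pod/checkout-7c9"; ref is 'kind/name'."""
    kind, _, name = ref.partition("/")
    return [e for e in events
            if e.get("involvedObject", {}).get("kind", "").lower() == kind.lower()
            and e.get("involvedObject", {}).get("name") == name]
```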

For data that isn’t in the cache (logs, a specific resource’s YAML, root-cause across multiple sources), the agent routes to the right skill — /logs or /investigate — rather than widening /events.

Say “refresh” / “fetch again” / “re-check” and the agent re-invokes the skill with --refresh.


Beyond fetching the event list, the skill briefs the agent on how to behave on follow-ups:

  • Prefer answering from the cached events.json with jq over re-invoking the skill.
  • Treat the chatty reason set (Pulled, Created, Started, Scheduled, SuccessfulCreate) as collapsible — surface them only when the user asks for the suppressed set.
  • When the user asks about “events on pod/X”, walk owners one level up (Pod → ReplicaSet → Deployment, Pod → Job → CronJob) so events fired against the controller aren’t missed.
  • Hand off to /logs for the container output behind a BackOff or CrashLoopBackOff, and to /investigate when a single resource becomes the focus.
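The owner walk in the third point might look something like the sketch below. Events do not carry owner references themselves, so the `owners` mapping (child to parent, keyed by kind and name) is assumed to come from elsewhere, for instance from each resource's `metadata.ownerReferences`; that mapping, the function names, and the `levels` parameter are all hypothetical.

```python
def owner_chain(target: tuple, owners: dict, levels: int = 1) -> list:
    """Return target plus up to `levels` owners above it."""
    chain = [target]
    current = target
    for _ in range(levels):
        current = owners.get(current)
        if current is None:
            break  # reached a top-level object
        chain.append(current)
    return chain


def events_for(events: list, target: tuple, owners: dict, levels: int = 1) -> list:
    """Events whose involvedObject is the target or one of its owners."""
    wanted = set(owner_chain(target, owners, levels))
    return [
        e for e in events
        if (e.get("involvedObject", {}).get("kind"),
            e.get("involvedObject", {}).get("name")) in wanted
    ]
```

With `levels=1`, a query for a Pod also picks up events on its ReplicaSet or Job; a larger value would reach the Deployment or CronJob as well.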

--refresh
Bypass the cache and fetch fresh data from the API.
--ttl <duration>
Only re-fetch if the cached snapshot is older than this (kubectl-style: 1m, 5m, 1h). Default: 5m. Ignored when --refresh is set.
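How the two flags interact can be sketched as below. This is a minimal illustration under stated assumptions: `parse_ttl` and `should_fetch` are hypothetical names, and the parser only handles the single-unit forms shown above (`1m`, `5m`, `1h`, plus seconds), not compound durations.

```python
import re
from typing import Optional

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_ttl(text: str) -> int:
    """Convert a single-unit duration such as '5m' into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def should_fetch(cache_age_seconds: Optional[float],
                 ttl: str = "5m", refresh: bool = False) -> bool:
    """Decide whether to hit the API; --refresh overrides --ttl."""
    if refresh:
        return True  # TTL is ignored when a refresh is forced
    if cache_age_seconds is None:
        return True  # no snapshot yet
    return cache_age_seconds > parse_ttl(ttl)
```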

Global flags from Overview also apply.