Skip to content
KubeIntellect

KUBEINTELLECT

AI DEVOPS ENGINEER FOR KUBERNETES

Diagnose CrashLoopBackOff, pending pods, and RBAC issues in plain English. Parallel specialist agents investigate your cluster — a human-approval gate waits before any write operation runs.

10 Failure playbooks
4 Parallel subagents
4 Role tiers
HITL All write ops gated

What it does

  • :material-kubernetes: Kubernetes Intelligence


    Runs kubectl across get, describe, logs, top, events, apply, scale, and delete. Routes complex diagnostics to four parallel specialist subagents (pod, metrics, logs, events) and synthesises findings into a single root-cause report.

  • :material-chart-line: Metrics + Logs


    Native Prometheus PromQL and Loki LogQL integration. The coordinator automatically delegates to the right data source — you ask in plain English, it picks the tool.

  • :material-shield-check: Safety Gates


    Every destructive operation pauses for human approval before kubectl is called. Four role tiers (superadmin / admin / operator / readonly) limit what each API key can request. Shell injection, secret/serviceaccount access, and writes to infrastructure namespaces are blocked before any subprocess runs.

  • :material-brain: Stateful Conversations


    Sessions are checkpointed in PostgreSQL or SQLite. Ask follow-up questions, approve a pending action hours later, or replay a session post-mortem — all in the same thread.


See it in action

kq — KubeIntellect CLI
kq
Connected to KubeIntellect · cluster: prod-aks · namespace: default

You why is the payments pod crashing?

Dispatching to DiagnosticsOrchestrator …
├─ pod-agent → OOMKilled (exit 137, last 3 restarts)
├─ metrics-agent → memory.working_set spiking to 512 Mi (limit: 256 Mi)
└─ log-agent → "java.lang.OutOfMemoryError: Java heap space"

Root cause: payments-api is hitting its memory limit.
Heap dump shows unbounded cache growth in PaymentSessionCache.

Proposed fix: patch memory limit → 512 Mi + add cache eviction policy.

⚠ Requires approval before kubectl patch runs.
You approve
✓ patched deployment/payments-api — rollout in progress

How it works

You (kq CLI or any OpenAI-compatible client)
   │  POST /v1/chat/completions  (SSE streaming)
┌──────────────────────────────────────────────────────────────────┐
│  LangGraph workflow                                              │
│                                                                  │
│  memory_loader  →  context_fetcher  →  coordinator               │
│   (DB context)    (live snapshot +      (LLM + tools)            │
│                    playbook match)                               │
│                                              │                   │
│           ┌──────────────────────────────────┴──────────┐        │
│           ▼ TARGETED      ▼ RCA_REQUIRED               ▼ direct │
│   targeted_investigator   subagent_executor × 4     answer →END │
│   (3 parallel reads)      (pod | metrics |                       │
│           │                logs | events,                        │
│           │                parallel fan-out)                     │
│           ▼                       │                              │
│       coordinator ← findings — coordinator (synthesis) → END     │
│                                                                  │
│  Tools: run_kubectl │ query_prometheus (PromQL) │ query_loki     │
│  HITL interrupt fires on every destructive kubectl verb.         │
└──────────────────────────────────────────────────────────────────┘
LangGraph checkpoint store (Postgres / SQLite)
+ rca_outcomes / failure_patterns (reflexion subsystem)

Each turn passes through five additive agent behaviors — error interpretation, snapshot bias, parallel discipline, playbook injection, and a visible investigation plan. Verified outcomes feed the reflexion subsystem, which promotes recurring fixes back into future prompts (cluster-scoped, with cooldown and decay).

Responses stream back as Server-Sent Events. The API is OpenAI-compatible — point any SSE client or your own tooling at /v1/chat/completions.


Pick your path

  • :material-lightning-bolt: Quickest — no cluster


    pip install kubeintellect + SQLite. Explore the API and CLI in minutes, no Kubernetes needed.

    → Install guide

  • :material-server-network: Existing cluster


    Connect KubeIntellect to any cluster you already have — AKS, EKS, GKE, or any kubeconfig.

    → Install guide

  • :material-docker: Docker Compose


    Full local stack with PostgreSQL, optional Prometheus + Grafana + Loki, and optional Langfuse LLM tracing.

    → Deploy guide

  • :material-cloud-upload: Production (Helm)


    Helm chart for AKS / EKS / GKE. Includes RBAC, secrets management, ingress, and resource limits.

    → Deploy guide


Quick install

pip install kubeintellect
kubeintellect init   # setup wizard — configures LLM key,
                     # optionally creates Kind cluster,
                     # installs systemd service
kq                   # open a new terminal — that's it
git clone https://github.com/MSKazemi/kubeintellect
cd kubeintellect
cp .env.example .env        # set LLM key + KUBEINTELLECT_ADMIN_KEYS
docker compose up -d
pip install kube-q
KUBE_Q_API_KEY=<your-admin-key> kq --url http://localhost:8000
git clone https://github.com/MSKazemi/kubeintellect
cd kubeintellect
make kind-cluster-create
cp .env.example .env        # add your LLM key
make kind-deploy-kubeintellect
make cli                    # opens kq REPL
helm repo add kubeintellect https://mskazemi.github.io/kubeintellect
helm install kubeintellect kubeintellect/kubeintellect \
  --set llm.provider=openai \
  --set llm.apiKey=<YOUR_KEY>

Supported LLM providers

  • :material-brain: OpenAI

    GPT-4o (coordinator) + GPT-4o-mini (subagents). Set LLM_PROVIDER=openai.

  • :material-microsoft-azure: Azure OpenAI

    Any Azure-hosted deployment. Default AZURE_OPENAI_API_VERSION=2024-10-01-preview enables automatic prefix caching. Set LLM_PROVIDER=azure.

See Configuration → LLM provider for the full list of model and deployment variables.


Open-source · AGPL-3.0 · GitHub · KubeIntellect v1 (legacy)