Introduction
Lovenet Home Operations Repository
Production-grade Kubernetes for a household. GitOps with Flux · Automated dependency updates with Renovate · Self-hosted by design
Overview
This is the live configuration for a multi-node Kubernetes cluster that runs a household: home automation, security cameras, media, document management, AI workloads, and the operational tooling required to keep it all up. Every change lands in Git first; Flux reconciles the cluster from there, and Renovate keeps dependencies current via PRs.
The repo is GitOps-strict: applications are declared as HelmRelease resources, secrets are pulled from 1Password through External Secrets Operator, and clusters are mostly identical except for app selection and sizing. Operational quirks, durability tiers, and security defaults live alongside the manifests in .agents/instructions/ so the conventions are enforceable, not folklore.
Architecture
flowchart LR
Dev[Operator] -->|git push| Repo[(GitHub<br/>home-ops)]
Renovate[Renovate] -.->|automated PRs| Repo
Repo -->|reconciles| Flux[Flux]
Flux -->|deploys| Cluster[Kubernetes<br/>10 nodes · 168 apps]
Cluster --> Ceph[(Ceph<br/>block · default durable)]
Cluster --> LH[(Longhorn<br/>+ recurring backups)]
Cluster --> Garage[(Garage<br/>S3-compatible)]
Cluster --> NFS[(NFS<br/>beast / brain · bulk media)]
LH -->|weekly + monthly| NFS
Garage -->|rclone CronJobs| AWS[AWS S3<br/>Glacier Deep Archive<br/>offsite DR]
classDef store fill:#1e293b,stroke:#475569,color:#e2e8f0
class Ceph,LH,Garage,NFS,AWS store
Storage tiers are picked deliberately per workload; see storage-class.instructions.md for the decision tree.
Stack at a glance
| Layer | Tool | Role |
|---|---|---|
| OS | CentOS Stream 9 / 10 | Node operating system |
| Runtime | cri-o + crun | CRI runtime + OCI runtime (C implementation) |
| Kubernetes | v1.35.4 | Control-plane and node version |
| GPU | NVIDIA GPU Operator + Container Toolkit | P40 driver/runtime management on worker8 |
| GitOps | Flux2 | Declarative cluster reconciliation |
| Automation | Renovate + GitHub Actions | Dependency PRs, link checks, self-hosted runners |
| CNI | Cilium (eBPF) | Networking, BGP peering, LoadBalancer pool |
| Ingress | Envoy Gateway | L7 gateway / HTTPRoute |
| Service mesh | Istio | mTLS + traffic mgmt for mcp-system |
| DNS | external-dns | Cloudflare + bind9 split-horizon |
| TLS | cert-manager | Let's Encrypt + internal CA |
| Tunnel | cloudflared | Public ingress without exposing home WAN |
| AuthN/Z | Authelia + oauth2-proxy | OIDC SSO; 24 oauth2-proxy instances gate apps |
| Secrets | External Secrets Operator + 1Password | 109 ExternalSecrets, zero plain-text in Git |
| VPN | wg-easy | Operator OOB WireGuard access |
| Storage | Rook-Ceph, Longhorn, Garage, direct NFS | Tiered by durability requirement |
| Databases | CloudNative-PG, Dragonfly, Qdrant | 24 Postgres clusters, KV, vector |
| Observability | kube-prometheus-stack, Loki, Grafana, HolmesGPT | Metrics, logs, dashboards, AI alert triage |
| Images | ZOT | Pull-through registry / local cache |
Hardware
| Role | Hostname | Device | CPU | RAM | OS | Storage / Accelerators | Notes |
|---|---|---|---|---|---|---|---|
| Control plane | master1 | bare-metal | 4 | 32 GB | CentOS 10 | NVMe (Longhorn) | Intel iGPU · RTL-SDR · control plane |
| Control plane | master2 | VM on beast | 3 | 12 GB | CentOS 9 | virtualized control plane | |
| Control plane | master3 | VM on beast | 3 | 10 GB | CentOS 9 | virtualized control plane | |
| Worker | worker2 | ThinkCentre M910x | 8 | 32 GB | CentOS 9 | NVMe (Longhorn + Ceph OSD) | ZWA-2 Z-Wave dongle |
| Worker | worker3 | ThinkCentre M910x | 8 | 64 GB | CentOS 9 | NVMe (Longhorn + Ceph OSD) | Sonoff Zigbee dongle |
| Worker | worker4 | ThinkCentre M910x | 8 | 32 GB | CentOS 9 | NVMe (Longhorn + Ceph OSD) | Coral USB TPU |
| Worker | worker5 | VM on beast | 10 | 24 GB | CentOS 9 | NVMe (Longhorn + Ceph OSD) | |
| Worker | worker6 | VM on beast | 10 | 30 GB | CentOS 9 | NVMe (Longhorn + Ceph OSD) | |
| Worker | worker7 | VM on beast | 10 | 30 GB | CentOS 9 | NVMe (Longhorn + Ceph OSD) | |
| GPU worker | worker8 | VM on beast | 10 | 55 GB | CentOS 9 | NVMe (Longhorn + Ceph OSD) | NVIDIA P40 (24 GB VRAM) |
Off-cluster infrastructure
| Host | Role |
|---|---|
| beast | Dell R730xd · iDRAC 8 · RAID6 bulk storage · primary NFS · Longhorn backup target · Garage substrate · VM host |
| brain | Router/gateway · RAID6 mass_storage · NFS for downloads & TV · OOB SSH on :3231 |
Network
Physical topology
| Network | CIDR | VLAN |
|---|---|---|
| Default | 192.168.0.0/16 | 0 |
| IoT | 10.10.20.0/24 | 20 |
| Guest | 10.10.30.0/24 | 30 |
| Security (cameras) | 10.10.40.0/24 | 40 |
| Kubernetes pod subnet (Cilium) | 10.42.0.0/16 | n/a |
| Kubernetes services subnet (Cilium) | 10.43.0.0/16 | n/a |
| Kubernetes LB pool (CiliumLoadBalancerIPPool) | 10.45.0.0/24 | n/a |
Worker nodes attach to iot and sec VLANs via Multus for direct camera and IoT-device reachability. Cilium peers BGP with the upstream router to advertise the LB pool; external ingress flows through Envoy Gateway behind cloudflared.
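As a quick way to reason about the address plan above, the following sketch maps an IP address to the network it falls in. It uses only the CIDRs from the table; the `NETWORKS` dictionary and the `network_for` helper are illustrative names, not part of the repository.

```python
import ipaddress
from typing import Optional

# CIDRs copied from the network table; the dictionary itself is illustrative.
NETWORKS = {
    "Default": "192.168.0.0/16",
    "IoT": "10.10.20.0/24",
    "Guest": "10.10.30.0/24",
    "Security (cameras)": "10.10.40.0/24",
    "Kubernetes pod subnet": "10.42.0.0/16",
    "Kubernetes services subnet": "10.43.0.0/16",
    "Kubernetes LB pool": "10.45.0.0/24",
}


def network_for(address: str) -> Optional[str]:
    """Return the name of the network containing `address`, or None."""
    ip = ipaddress.ip_address(address)
    for name, cidr in NETWORKS.items():
        if ip in ipaddress.ip_network(cidr):
            return name
    return None
```

For example, `network_for("10.45.0.17")` resolves to the LoadBalancer pool that Cilium advertises over BGP, while an address outside every listed range returns `None`.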
What's running
Home Automation: Home Assistant ecosystem, 400+ devices
| App | Purpose |
|---|---|
| Home Assistant | Primary orchestrator; 400+ Z-Wave / Zigbee / Matter / ESPHome devices |
| ESPHome | Build & deploy firmware for DIY sensors |
| EMQX | MQTT broker |
| Node-RED | Visual automation flows |
| Zigbee2MQTT | Zigbee bridge (Sonoff stick on worker3) |
| Z-Wave JS UI | Z-Wave bridge (ZWA-2 stick on worker2) |
| Matter Server | Matter protocol bridge |
| Frigate | NVR + ML camera analysis (7+ cameras, Frigate+ trained model) |
| n8n | Workflow automation (AlertManager → HolmesGPT, etc.) |
| NetBox | IPAM / DCIM |
| wyoming-services | Piper TTS + Whisper STT for voice |
| smtp-relay | Maddy → Mailgun outbound mail |
Media & Entertainment: Jellyfin, Immich, Music Assistant, RomM
| App | Purpose |
|---|---|
| Jellyfin | Primary media server (read-only metadata) |
| Immich + immich-pet-tagger + immichkiosk + immich-power-tools | Photo library with ML face/pet recognition, offsite-backed |
| Music Assistant + Gonic | Multi-room music control + Subsonic API |
| RomM | Retro game library (~10k ROMs) |
| Beets | Music library tagging |
| cutVideo / av1corrector / videodupfinder / medialyze | Custom video tooling |
| Theme Park | Consistent UI theming across apps |
| Batocera Webdashboard Pro | Retro-gaming console dashboard |
| kodi-playback-watcher | Bridge for Kodi playback state |
AI & ML: Local inference, agents, image generation
| App | Purpose |
|---|---|
| Ollama | Local LLM serving on the P40 (Qwen 2.5 7b/14b, DeepSeek-R1, etc.) |
| ComfyUI | Image generation workflows |
| Khoj | Personal AI assistant over notes + docs |
| LangGraph Agents | Custom multi-agent runtime (rwlove/langgraph-agents); Postgres-checkpointed; MCP-gateway client. See AI agent pipeline section below. |
| KubeClaw | Workflow agent platform w/ browser automation (upstream chart); being phased out in favor of LangGraph |
| MCP Inspector | Model Context Protocol debugger UI |
| Paperless-AI | Auto-tagging for paperless-ngx |
| sync-receiver | Cross-host AI state sync endpoint |
Observability: Prom/Loki/Grafana with AI triage on top
| App | Purpose |
|---|---|
| kube-prometheus-stack | Prometheus + AlertManager + node-exporter |
| Loki | Log aggregation |
| Grafana | Dashboards + alerting UI |
| HolmesGPT | LLM-backed alert investigation |
| kube-state-metrics / kube-ops-view | Cluster state & visualization |
| Goldilocks | VPA-driven resource right-sizing recommendations |
| Kromgo | Prometheus → Glance dashboard bridge |
| Netdata | Per-node real-time metrics |
| network-ups-tools (NUT) | UPS monitoring & graceful shutdown |
| exporters | Custom Prometheus exporters |
Data & Storage: Databases, object storage, vector search
| App | Purpose |
|---|---|
| CloudNative-PG | 24 Postgres clusters with WAL archiving to Garage |
| Dragonfly | Redis-compatible in-memory store |
| Qdrant | Vector DB for embeddings / RAG |
| pgAdmin | Postgres admin UI |
| Rook-Ceph | Distributed block storage (default durable tier) |
| Longhorn | Block storage with NFS-backed recurring backups |
| Garage | S3-compatible object storage (DB backups, app S3 workloads) |
Network, Auth & Platform: Ingress, SSO, GitOps machinery
| App | Purpose |
|---|---|
| Cilium | CNI, BGP, LoadBalancer pool |
| Envoy Gateway | Ingress / HTTPRoute (30 routes) |
| cert-manager | TLS certificate lifecycle |
| external-dns | Cloudflare + bind9 record sync |
| cloudflared | Public tunnel without exposed WAN |
| Authelia | OIDC identity provider |
| LLDAP | Lightweight LDAP directory backing Authelia |
| oauth2-proxy | 24 instances gating per-app SSO |
| wg-easy | Primary OOB WireGuard access |
| External Secrets Operator | 1Password-backed secret materialization |
| Flux2 | GitOps reconciler |
| Renovate | Image & Helm chart update PRs |
| Kuadrant | MCP server gateway (Authelia-gated JWT) |
| actions-runner-controller | Self-hosted GitHub Actions runners |
| ZOT | Pull-through registry cache |
Documents & Collaboration: Personal knowledge stack + self-hosted tools
| App | Purpose |
|---|---|
| Paperless-ngx | Document scanning, OCR, tagging (CNPG-backed, offsite-backed) |
| Obsidian + obsidian-couchdb | Notes sync (CouchDB w/ Cloudflare rate-limiting) |
| Zulip | Self-hosted team chat (also wired into agent pipeline approvals) |
| Kitchenowl | Shopping lists + recipe / meal management |
| Open WebUI | Self-hosted LLM frontend (Ollama + MCP servers as tool servers) |
| SearXNG | Privacy-respecting metasearch engine |
| Glance | Personal dashboard / start page |
| Atuin | Encrypted shell-history sync across machines |
| IT-Tools | Self-hosted developer toolbox |
| MediKeep | Personal medical records |
| Nametag | Name tag / badge generator |
| Pump + Pump-cv | Custom personal apps (rwlove-built) |
MCP Servers: 14 Model Context Protocol servers behind an Authelia-gated gateway
| Server | Exposes |
|---|---|
| mcp-gateway | Aggregating gateway; Envoy SecurityPolicy validates Authelia-issued JWTs (daily-rotated key) |
| ha-mcp | Home Assistant entities + service calls |
| immich-mcp | Immich library search + asset metadata |
| kubectl-mcp | Cluster introspection + safe kubectl ops |
| grafana-mcp | Grafana dashboards + Loki/Prom queries |
| prometheus-mcp | Direct PromQL access |
| paperless-mcp | Paperless-ngx document search |
| netbox-mcp | NetBox IPAM / DCIM |
| github-mcp | GitHub repo + PR ops |
| n8n-mcp | n8n workflow control |
| omada-mcp | TP-Link Omada controller |
| searxng-mcp | Privacy search through SearXNG |
| arr-mcp | Library-search interface to *arr apps |
| time-mcp | Time / timezone utilities (rwlove/time-mcp native-SHTTP build) |
AI agent pipeline
How local AI agents run, get work, ask for human approval, and produce reports, all without putting data in someone else's cloud unless a task genuinely needs it.
flowchart TB
subgraph Inputs[Inputs]
AM[AlertManager alerts]
Op[Operator chat / voice]
Cron[n8n cron + webhooks]
end
subgraph Frontends[Frontends]
OWUI[Open WebUI]
HA[Home Assistant<br/>voice + conversation]
N8N[n8n workflows]
end
subgraph Orchestration[Orchestration]
Holmes[HolmesGPT<br/>alert RCA]
LG[LangGraph Agents<br/>agent fleet]
KC[KubeClaw<br/>retiring]
end
subgraph Inference[Inference]
Ollama[Ollama on P40<br/>qwen2.5:7b/14b]
Claude[Claude API<br/>escalation only]
end
subgraph Tools[Tools]
Gw[MCP Gateway<br/>Authelia-gated JWT]
Servers[14× MCP servers<br/>HA · Immich · k8s · Grafana · …]
end
subgraph Outputs[Outputs]
Z[Zulip<br/>approvals + chat]
P[Pushover<br/>high-priority alerts]
V[(langgraph-vault<br/>drafts + reports)]
DB[(Postgres CNPG<br/>checkpoints + memory)]
end
AM --> Holmes
Op --> OWUI
Op --> HA
Cron --> N8N
OWUI --> Ollama
HA --> Ollama
N8N --> LG
Holmes --> Ollama
LG --> Ollama
LG -.-> Claude
KC --> Ollama
Holmes --> N8N
LG --> Gw
KC --> Gw
OWUI --> Gw
Gw --> Servers
N8N --> Z
N8N --> P
LG --> V
LG --> DB
Agent fleet (LangGraph)
A single rwlove/langgraph-agents FastAPI service runs the fleet. Each agent is a LangGraph graph with its own persona, tool set, and cost cap. Postgres-checkpointed state lets long-running plans survive restarts.
| Agent | Role |
|---|---|
| supervisor | Routes work to specialist agents; opens approvals |
| researcher | Web + repo + vault research |
| coder | Code reading, drafting, PR descriptions |
| reviewer | Reviews drafts before they reach the operator |
| triager | Classifies inbound items, assigns owner agent |
| reporter | Daily digests, summaries, status rollups |
| note-maker | Captures decisions + facts back into the vault |
| homelab-engineer | Cluster ops, HelmRelease drafting, PR-shaped output |
| smart-home-engineer | Home Assistant entities, automations, ESPHome configs |
| ml-tuner | Frigate, Immich CLIP, model tuning |
| errand-runner | One-shot real-world tasks (purchases, lookups, scheduling) |
| property-coordinator | 3532 Foxhall workstreams (contractors, deck, pool) |
| health-tracker | Local-only; never escalated to Claude API |
| doc-writer (Scribner) | Sweeps repos for stale docs; drafts README + docs/ patches as diffs when commits land |
Pipeline stages
- Inbox: the langgraph-inbox.json workflow ingests requests from chat, AlertManager, or scheduled triggers.
- Triage: triager classifies and assigns to a specialist agent.
- Plan: agent drafts an action plan (goals, steps, tool calls, expected cost) into Postgres state.
- Approval (HITL): for anything non-trivial, langgraph-approval-post sends a signed Zulip message + Pushover ping with the plan summary; langgraph-approval-receive waits on the reply; langgraph-awaiting-user-sweep chases stuck tasks.
- Execute: agent runs tool calls through the MCP gateway. Cost caps enforced by langgraph-cost-cap-watcher ($5/task, $10/agent/day, $30/global/day).
- Report: output written to langgraph-vault (drafts / finals), summarized into the reporter agent's daily Zulip digest (langgraph-daily-digest).
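The three cost caps in the Execute stage can be read as a layered check run before each paid call. The sketch below is one plausible way to express that check, not the actual langgraph-cost-cap-watcher logic: the `Spend` record and the `cap_exceeded` function are hypothetical, and only the dollar limits come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Caps quoted in the Execute stage, in US dollars.
TASK_CAP = 5.00
AGENT_DAILY_CAP = 10.00
GLOBAL_DAILY_CAP = 30.00


@dataclass
class Spend:
    """Hypothetical running totals, in dollars."""
    task: float          # spent so far on the current task
    agent_today: float   # spent today by the agent that owns the task
    global_today: float  # spent today across the whole fleet


def cap_exceeded(spend: Spend, next_call_cost: float) -> Optional[str]:
    """Return the first cap the next call would break, or None if all hold."""
    if spend.task + next_call_cost > TASK_CAP:
        return "task"
    if spend.agent_today + next_call_cost > AGENT_DAILY_CAP:
        return "agent/day"
    if spend.global_today + next_call_cost > GLOBAL_DAILY_CAP:
        return "global/day"
    return None
```

Checking the narrowest cap first means a runaway task is reported as a task-level problem before it shows up in the agent or global budgets.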
Local-first by design
| Tier | Backend | When used |
|---|---|---|
| 1 | qwen2.5:7b on Ollama (P40) | Fast / simple agents (triager, note-maker drafts) |
| 2 | qwen2.5:14b on Ollama (P40) | Default for everything else |
| 3 | Claude API (escalation) | Only on explicit uncertainty markers, repeated local-retry failure, novel/long-context work, or requires_cloud tag |
health-tracker and errand-runner are pinned local-only: they never escalate, even if quality suffers, because the data class isn't suitable for off-site inference.
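The tier table and the local-only pin can be summarized as a small routing function. This is a sketch under stated assumptions: `pick_backend` and the `wants_escalation` flag are hypothetical, with the flag standing in for the escalation conditions listed for tier 3, and the set of "fast" agents is limited to the two examples the table names.

```python
# Agents the text pins to local inference, regardless of escalation signals.
LOCAL_ONLY_AGENTS = {"health-tracker", "errand-runner"}

# Agents the tier table names as fast/simple (tier 1).
FAST_AGENTS = {"triager", "note-maker"}


def pick_backend(agent: str, wants_escalation: bool = False) -> str:
    """Choose an inference backend for `agent`.

    `wants_escalation` is a stand-in for the tier 3 conditions
    (uncertainty markers, repeated local failure, long context, requires_cloud).
    """
    if wants_escalation and agent not in LOCAL_ONLY_AGENTS:
        return "claude-api"
    if agent in FAST_AGENTS:
        return "qwen2.5:7b"
    return "qwen2.5:14b"
```

Note that a local-only agent asking for escalation simply stays on its local tier, which matches the "even if quality suffers" rule above.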
Voice-to-action: power button → HA Assist → agents → Obsidian
The most common way work enters the fleet: hold the phone's power button, say "inbox <whatever I'm thinking>", and the cluster takes it from there.
flowchart LR
Btn[Hold power button<br/>Pixel: 'Hold for Assistant'] --> Assist[HA Companion app<br/>set as default assistant]
Assist -->|audio stream| HA[Home Assistant<br/>Assist pipeline]
HA --> Whisper[Whisper STT<br/>wyoming-services on P40]
Whisper --> Sentence[Sentence trigger:<br/>'inbox {content}']
Sentence --> Ollama[conversation.ollama_voice<br/>qwen3:8b]
Ollama --> Rest[HA rest_command<br/>POST + Authelia JWT]
Rest --> Hook[n8n: langgraph-inbox]
Hook --> LG[langgraph-agents /inbox]
LG --> Triage[triager classifies]
Triage -->|capture only| Note[note-maker]
Triage -->|plan + act| Spec[specialist agent<br/>drafts plan]
Spec -->|needs input| Zulip[Zulip approval<br/>+ Pushover ping]
Zulip -->|reply| Receive[approval-receive]
Receive --> Spec
Spec --> Done[outcome to vault]
Note --> Inbox[/vault/inbox/YYYY-MM-DD-…md/]
Done --> Outputs[/vault/outputs/{drafts,finals}//]
Inbox --> Couch[(obsidian-couchdb)]
Outputs --> Couch
Couch -->|LiveSync| Phone[Obsidian on phone<br/>same vault]
The path
1. Hold power button. Pixel's "Hold for Assistant" gesture is bound to the HA Companion app as the default digital assistant. The Assist UI opens with the mic hot.
2. Speak. Audio streams to the cluster, with no on-phone STT. The trigger phrase is inbox <body>; everything after inbox is the note.
3. STT in cluster. The Assist pipeline routes the audio to Whisper (wyoming-services, GPU-accelerated on the P40).
4. Intent + LLM. A sentence trigger matches inbox {content} and hands {content} to conversation.ollama_voice (qwen3:8b on Ollama, tool-calling enabled). The conversation agent's only job here is to confirm the intent and call the rest_command; it does not interpret the content.
5. Auth'd POST. An HA rest_command POSTs to https://langgraph-inbox.${SECRET_DOMAIN}/webhook with { source:"voice", user:"rob", content:"<transcript>" }. The request carries an Authelia client_credentials JWT issued to a dedicated ha-voice-inbox OIDC client, using the same daily-rotated signing-key machinery the MCP gateway already uses. Envoy's SecurityPolicy validates the JWT against Authelia's JWKS at the gateway.
6. n8n langgraph-inbox. Normalizes the payload and POSTs to /inbox on langgraph-agents.
7. Triager classifies. It labels the item as a research question, household errand, homelab change, property task, or note-to-self, and picks the specialist agent.
8. Capture path: note-maker writes the file to /vault/inbox/YYYY-MM-DD-HHMM-<slug>.md on the langgraph-vault-rw PVC. Single writer, no race with the phone.
9. Plan-and-act path: specialist drafts a plan into Postgres + a draft under /vault/outputs/drafts/. HITL approval via the existing Zulip + Pushover loop when needed (see triggers above).
10. Round-trip to the phone. obsidian-couchdb watches the vault PVC and replicates new files through Self-hosted LiveSync: the note from step 8, plus any drafts/finals from step 9, appear in the Obsidian app on the phone within a sync cycle. Same surface the dictation started on.
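To make the data flow in the steps above concrete, here is a minimal Python stand-in for two pieces of it: matching the inbox trigger phrase to build the webhook payload, and deriving the capture-path filename. In the real pipeline the matching is done by a Home Assistant sentence trigger and the file is written by note-maker; the function names and the slug rule (first six lowercase words) are assumptions made for illustration only.

```python
import re
from datetime import datetime
from typing import Optional

# Matches "inbox <body>"; a Python approximation of the HA sentence trigger.
TRIGGER = re.compile(r"^\s*inbox\s+(?P<content>.+?)\s*$", re.IGNORECASE | re.DOTALL)


def parse_inbox(transcript: str) -> Optional[dict]:
    """Build the webhook payload for a transcript, or None if it is not an inbox phrase."""
    match = TRIGGER.match(transcript)
    if match is None:
        return None
    # Field names and values follow the payload shown in the Auth'd POST step.
    return {"source": "voice", "user": "rob", "content": match.group("content")}


def slugify(text: str, max_words: int = 6) -> str:
    """Hypothetical slug rule: first few lowercase alphanumeric words, hyphenated."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words[:max_words]) or "note"


def inbox_path(content: str, now: datetime) -> str:
    """Return a path shaped like /vault/inbox/YYYY-MM-DD-HHMM-<slug>.md."""
    return f"/vault/inbox/{now:%Y-%m-%d-%H%M}-{slugify(content)}.md"
```

Because the trigger requires whitespace after "inbox", a transcript such as "inboxes are full" is not captured, which keeps the phrase from firing on ordinary speech.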
The loop closes locally and on one surface: power-button → speak → outcome appears in the vault. Whisper, Ollama, n8n, and the agents all run in the cluster; the only off-site dependency is claude.com if the local fleet escalates a task.
Alert triage (production today)
HolmesGPT is the one agent already running in production:
- AlertManager → HolmesGPT webhook (via alertmanager-holmesgpt-pushover.json) on every firing alert
- HolmesGPT queries Prometheus, Loki, and the cluster directly to build a root-cause hypothesis
- Result posted back as a Pushover message + Zulip thread; n8n sanitizes raw tool-call descriptors out of the agent text before delivery
Current state (2026-05-16)
- HolmesGPT: live, handling cluster alerts daily.
- LangGraph fleet: plumbed but cold (ENABLE_CLAUDE_API: false, no production triggers). Gated on NVIDIA Spark / Ascent GX10 arrival (~2026-05-20), which becomes the primary Ollama backend before the fleet goes hot.
- KubeClaw: running in parallel during the LangGraph transition; scheduled for retirement once LangGraph is validated.
Cloud dependencies
| Service | Use | Cost |
|---|---|---|
| 1Password | Secret backend for External Secrets | ~$65 / yr |
| Cloudflare | Domain, DNS, tunnel, WAF rate-limiting | Free |
| GitHub | Repo hosting + CI | Free |
| Mailgun | Outbound mail relay (via Maddy) | Free (Flex) |
| Pushover | Push notifications for AlertManager + apps | $10 one-time |
| Frigate+ | Trained ML model for Frigate NVR | $50 / yr |
| AWS S3 Glacier Deep Archive | Offsite DR for Immich + Paperless (objects + DB backups) | ~$1–5 / mo (varies) |
| Total | | ~$10–15 / mo |
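The final figure in the table appears to be the recurring monthly total. As a sanity check under that assumption, the arithmetic below converts the two annual subscriptions to a monthly amount and adds the quoted S3 range; the one-time Pushover fee and the free tiers are left out. The variable and function names are illustrative.

```python
# Recurring annual costs from the table, in US dollars.
ANNUAL = {"1Password": 65.0, "Frigate+": 50.0}

# Quoted monthly range for AWS S3 Glacier Deep Archive.
S3_MONTHLY_RANGE = (1.0, 5.0)


def monthly_total_range() -> tuple:
    """Return (low, high) recurring monthly cost, rounded to cents."""
    fixed = sum(ANNUAL.values()) / 12  # 115 / 12, roughly 9.58 per month
    low, high = S3_MONTHLY_RANGE
    return round(fixed + low, 2), round(fixed + high, 2)
```

This gives roughly $10.58 to $14.58 per month, consistent with the ~$10–15 range in the table.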
Operational pillars
Tiered storage durability
Four tiers, picked by what the data has to survive: node loss, Ceph loss, cluster loss, or full site loss. Databases get ceph-block + Barman→Garage; irreplaceable state goes to Longhorn with NFS-shipped weekly + monthly backups; S3-shaped workloads use Garage; bulk media rides direct NFS. Full decision tree: .agents/instructions/storage-class.instructions.md.
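The one-sentence summary above can be written down as a lookup. This sketch covers only that summary, not the full decision tree in storage-class.instructions.md, and the workload labels, tier descriptions, and `storage_tier` function are hypothetical. The fallback to Ceph block reflects its description elsewhere in this document as the default durable tier.

```python
# Workload kind -> storage tier, following the summary sentence only.
TIERS = {
    "database": "ceph-block + Barman backups to Garage",
    "irreplaceable-state": "Longhorn + weekly/monthly backups to NFS",
    "s3-shaped": "Garage",
    "bulk-media": "direct NFS",
}

# Ceph block is described as the default durable tier.
DEFAULT_TIER = "ceph-block"


def storage_tier(workload_kind: str) -> str:
    """Return the storage tier for a workload kind, falling back to the default."""
    return TIERS.get(workload_kind, DEFAULT_TIER)
```

A lookup like this is only a starting point; the instructions file remains the source of truth for edge cases.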
Secrets: zero plain-text in Git
All 109 ExternalSecrets resolve through External Secrets Operator from 1Password. Application credentials are templated into ExternalSecret resources and never live in YAML. Cross-namespace mirrors use the reflector pattern when consumer charts hard-code secret names.
Authentication: single sign-on everywhere
Authelia (with LLDAP) is the OIDC identity provider; per-app oauth2-proxy instances enforce auth at Envoy Gateway. 24 apps sit behind SSO today. The mcp-gateway validates Authelia-issued JWTs with a daily-rotated signing key for MCP tooling.
Observability: metrics, logs, AI triage
kube-prometheus-stack scrapes everything; Loki ingests pod logs; Grafana stitches the dashboards. AlertManager fans alerts to Pushover and to HolmesGPT, which runs LLM-driven root-cause investigation against the cluster and posts findings back via n8n.
GPU workloads
A single NVIDIA P40 (24 GB VRAM) on worker8 drives Ollama (local LLM), ComfyUI (image gen), Whisper STT, Immich's CLIP face/pet recognition, and the immich-pet-tagger fork pinned to a P40-compatible PyTorch build. Driver lifecycle is handled by the NVIDIA GPU Operator.
Disaster recovery
Per-app rclone CronJobs ship Immich originals and Paperless documents, plus their Garage-stored Postgres backups, to encrypted AWS S3 with a 1-day Glacier Deep Archive transition. Recovery procedure is documented at Offsite recovery and was last validated 2026-05-05.
Strict GitOps
Every change reaches the cluster through Git. Flux suspends are a deliberate manual signal: paused Kustomizations are not "broken"; they are intentional pauses for in-flight maintenance and are documented in conventions, not reverted on sight.
Documentation
The full operator handbook lives in the mdBook site: https://rwlove.github.io/home-ops/.
Frequently referenced pages:
- Cluster rebuild
- Initialization & teardown
- Cluster upgrade
- Power outage recovery
- Limits & requests philosophy
- Debugging playbook
- Offsite recovery
- Immich restore to new CNPG database
- NVIDIA P40 GPU setup
- master1 etcd disk swap
- GitHub webhook
Repo-local conventions (auto-loaded by AI agents from .agents/instructions/):
- Storage class selection · HelmRelease security defaults · ConfigMap layout · Sorting rules · Schema correction · Persona
Acknowledgements
Inspired by the k8s-at-home community. @whazor maintains the excellent k8s-at-home search, a great way to discover how others configure the same Helm releases.