Egress Restriction Design Proposal
Status: Phases 1–3 implemented 2026-05-18. Detection-first hybrid (approach E) live; per-app narrowing (approach F) is opportunistic and ongoing. Tier 3 of the network security roadmap. Owner: home-ops Last updated: 2026-05-18
Implementation notes (2026-05-18)
The Hubble metrics in this cluster don't expose what the original design assumed:
hubble_dns_queries_totalhas aquerylabel but no source-pod attribution out of the box — only the cilium-agent pod observing the query. Fixed by addinglabelsContext=source_namespace,source_podto thedns:metric inkubernetes/apps/kube-system/cilium/app/values.yaml.hubble_flows_processed_totalhas nodestination_fqdnlabel at all in Cilium 1.19 — that pairing exists in the flow logs but not in the Prometheus export. For Phase 3 (egress-volume anomaly) we usecontainer_network_transmit_bytes_totalfrom cAdvisor instead, since byte counts aren't in any Hubble metric.
The shipped recording rules + alerts live in
kubernetes/apps/observability/kube-prometheus-stack/app/prometheusrule-egress-detection.yaml.
The Phase-0 "baseline JSON exported to git" step is deferred — the
detection alert fires on novel destinations relative to a live 24h
window rather than a frozen baseline file, which trades some signal
quality for not needing to commit and refresh a per-app FQDN
manifest.
Problem statement
The 2026-05 NetworkPolicy rollout (networkpolicy_rollout_plan.md)
locked down ingress and intra-cluster egress across 19 namespaces, but
the outbound-to-internet posture remains coarse. A grep of
kubernetes/apps/ shows:
- 51 CNP files use
toEntities. - 50 of them include
world(almost always paired with:443). - Only 12 files use
toFQDNs, and most of those still fall back totoEntities: [world]for any canonical (non-CNAME) FQDN per theproject_cilium_matchpattern_fqdn_limits.mdworkaround.
In effect, every workload that needs any outbound HTTPS today has unrestricted outbound HTTPS. A compromised pod inside the cluster can exfiltrate to any address on the public internet on 443 (and 465 for smtp-relay, etc.).
The proximate cause is not laziness — it is matchPattern's silent
failure on K8s-DNS-search-augmented canonical FQDNs. The repo's CNP
authors consistently chose "broad allow that actually works" over
"narrow allow that silently breaks." That tradeoff was correct under
the rollout's time pressure but leaves real exposure.
This document evaluates approaches for closing that gap without re-introducing the silent-failure modes that drove the original choice.
Goal
Meaningful egress restriction with these constraints:
- No regression of the user's verification step. The 2026-05-18
full rollback (see
networkpolicy_rollout_plan.md§ "Full rollback 2026-05-18") established that real browser-path verification is the only acceptable gate for policy changes. Any egress design must be testable from the user's actual workflow, not a simulated one. - No new piece of infrastructure to operate. Single operator, home-lab scale. A new always-on proxy pod or sidecar that the operator has to monitor, upgrade, and triage is a non-starter absent strong justification.
- Stays within Cilium primitives where possible. Cilium 1.19 is the cluster's CNI; standing up a parallel egress data plane (e.g. a separate proxy mesh) duplicates that investment.
- Compatible with GitOps + per-app PR rhythm. Whatever ships
should be expressible in
kubernetes/components/or per-app overlays, not require out-of-band cluster state. - The L7 DNS proxy in
network-policy/baseline/allow-dnsstays intact. It is load-bearing for every FQDN egress rule and for Cilium's FQDN cache; removal silently breaks all toFQDNs.
Current egress inventory
Grouped from the 51 CNPs by destination shape:
| Category | Example apps | Today's allow shape | Tractability |
|---|---|---|---|
| Package registries — CNAME-fronted | actions-runner-system (github.com, *.githubusercontent.com), media/recyclarr | matchPattern: <fqdn> works | High (already narrow) |
| Provider APIs — canonical | external-secrets→1P Connect cloud, smtp-relay→mailgun, langgraph→Anthropic/Pushover, cert-manager→ACME, ai/khoj→HuggingFace | matchPattern + .* workaround OR world:443 | Low (matchPattern silently fails on canonical) |
| Container registries — pull-through via ZOT | n/a (in-cluster ZOT) | covered by toEndpoints | Already restricted |
| S3 to AWS (offsite backups) | immich, paperless rclone CronJobs | world:443 | Medium (AWS S3 IP ranges are published but rotate) |
| LAN-internal hosts (host identity) | longhorn→beast NFS, snmp/omada→LAN devices | toEntities: [host, remote-node, world] mix | Already non-internet but world is the catch-all |
| Truly broad / user-driven | home-assistant, esphome, n8n, node-red, searxng, glance, glance-user, open-webui, runners | world:443 (sometimes + 80) | Inherently broad |
| ESPHome compile / runners arbitrary code | esphome/code, actions-runner-system/runners | world + matchPattern: "*" | Inherently broad |
Distinct legitimate external destinations (deduped across all apps):
roughly 25-30 FQDN families. The bulk of world:443 rules collapse to
~6 categories: GitHub/GHCR, HuggingFace + R2, 1Password (cloud + Connect),
Cloudflare API, ACME endpoints, Mailgun, Pushover, Anthropic. The
rest — home-assistant, esphome, searxng, n8n, runners, ESPHome
compile, recipe scrapers, etc. — are inherently broad because the
operator-installable surface is user-driven.
So the cluster has two qualitatively different egress populations:
- Bounded apps (a known FQDN list, even if Cilium can't narrow it today) — ~70% of the CNPs by file count.
- Inherently unbounded apps — home, automation, dashboards, user-driven aggregators, code-execution sandboxes. ~30%.
A meaningful design must address the bounded population while leaving the unbounded population workable.
Approaches considered
A. toCIDR lists per service
Maintain toCIDR allowlists per app: api.cloudflare.com → maintained
list of CF anycast prefixes, acme-v02.api.letsencrypt.org → ACME's
published ranges, etc.
Pro: Bypasses matchPattern entirely. Cilium handles CIDR matching
reliably (when not against host-net identities — see
project_cilium_ipblock_apiserver.md, but kube-apiserver isn't the
target here). Static, GitOps-able, auditable.
Con: Maintenance burden is the killer. Cloudflare's IP ranges rotate. Anthropic uses AWS/GCP. HuggingFace fronts via CloudFront. 1Password Connect's cloud endpoint is on AWS. The provider lists either get out of date (silent breakage on rotation) or require a Renovate-style update bot pointed at upstream IP-range manifests. None of the providers we'd target publish those manifests in a machine-readable form we already consume.
Verdict: Discard. The maintenance signal-to-noise ratio is worse
than today's world:443 — instead of "broad allow that works," we get
"narrow allow that silently breaks on provider IP rotation," which is
exactly the failure mode that drove the original world:443 choice.
B. Cluster-wide HTTP CONNECT proxy
Stand up a forwarding HTTPS proxy (e.g. Squid, Envoy, or a
purpose-built one) in a dedicated namespace. Every app's egress CNP
allows only toEndpoints: <proxy>. The proxy enforces an FQDN
allowlist at L7, and pods configure HTTPS_PROXY=http://proxy:3128.
Pro: Centralizes the allowlist. L7 FQDN matching at the proxy is not subject to Cilium's K8s-DNS-search-augmentation quirk — Squid sees the literal CONNECT host header. Cleanly auditable: one log stream shows every external request the cluster makes.
Con: Violates constraint #2 (no new infrastructure). It's a new
SPOF in the egress path; if the proxy pod crashloops or its allowlist
ConfigMap has a typo, every internet-needing workload in the
cluster breaks simultaneously. Also: HTTPS_PROXY env var support is
inconsistent — many apps respect it (curl, Python requests, npm); many
don't (Go's net/http only if explicitly configured, some Java HTTP
clients, anything statically built against a non-proxy-aware library).
Apps that ignore the proxy bypass it entirely, recreating today's
posture silently. Some apps (esphome compile, runners) intentionally
fan out to arbitrary destinations and would need bypass rules
anyway.
Verdict: Discard. Operating a new always-on proxy and the allowlist-as-config story for a single-operator cluster is a step function in operational complexity, and the bypass-via-non-proxy-aware clients gap reproduces the same silent-failure mode we're trying to escape.
C. DNS-based with longer cache TTL + accept matchPattern caveats
Keep toFQDNs as the primary mechanism. Configure Cilium's DNS proxy
with longer minimum TTLs (so the FQDN→IP cache survives provider DNS
TTL jitter). Accept that matchPattern will continue to fail on
canonical FQDNs and audit-fix per app when it surfaces.
Pro: No new infrastructure. Leverages existing primitives.
Con: This is the status quo we're trying to escape. The
project_cilium_matchpattern_fqdn_limits.md failure mode is not about
cache lifetime — it's about the pattern matcher not seeing the
augmented FQDN. Longer TTL doesn't fix it. Continuing the per-app
audit-and-widen-to-world cycle is what got us to "50 of 51 use world."
Verdict: Discard. Doesn't move the posture forward.
D. Cilium 1.20+ FQDN improvements
Cilium 1.20 (released 2025-Q4) shipped improvements to the FQDN proxy,
including better handling of DNS search domain expansion and the new
--dns-proxy-enable-transparent-mode option that intercepts DNS at
the eBPF layer rather than via iptables redirect. The 1.21 line
(in-development, target 2026-Q2) adds explicit support for
matchName against canonical FQDNs (release-notes commit
cilium/cilium#36418, the upstream fix for the exact failure mode
documented in project_cilium_matchpattern_fqdn_limits.md).
Pro: Fixes the root cause of why we fell back to world:443. Same
data plane, no new infrastructure. CNPs stay in the same shape; we
just stop hitting the silent-failure trap.
Con: Requires a Cilium upgrade (we're on 1.19.4). 1.19 → 1.20 is not a config-only bump — it touches every node's CNI agent and a few agent-config defaults changed. Risk is non-trivial in a one-operator cluster. 1.21 isn't GA yet, so the real matchName-canonical fix is ~6+ months out. In the interim, 1.20 alone doesn't close the gap; it just makes the matchPattern hack more reliable for the apps that were already on it.
Verdict: Hold. Track upstream; revisit when 1.21 reaches stable. The Cilium upgrade is the right cluster work to plan, but as a separate workstream — it shouldn't gate this design.
E. Hybrid: keep world:443, layer Hubble alerting for unusual destinations
Leave the CNP posture as-is. Use Hubble's flow data — already exported to the metrics pipeline — to detect unusual egress destinations per app. Define a baseline (the FQDNs each app legitimately reaches over ~7 days) and alert when a flow's destination falls outside that baseline.
Pro: Detection-not-prevention is the right primitive when prevention has been demonstrated to silently fail. No new infrastructure (Hubble + kube-prometheus-stack + Alertmanager are already running; Alertmanager → ntfy/Pushover routing exists). Catches the exfiltration scenario the threat model actually cares about (compromised pod talks to attacker-controlled C2) without breaking legitimate broad-egress apps. Reusable across the bounded and unbounded populations. Failure mode is "alert noise" rather than "app silently down" — the operationally-safer direction.
Con: Doesn't prevent exfiltration to an unknown destination; the attacker gets one trip out before the alert fires. The "baseline" has to be built and maintained (per-app FQDN sets drift as configs change). Alert-fatigue risk if the baseline isn't tight — searxng, home-assistant, n8n will routinely surface novel destinations.
Verdict: Adopt as primary. Detection plus per-app narrowing (approach F below) where genuinely tractable.
F. Narrow what's narrow-able; leave the rest broad
A focused, targeted second pass on the existing CNPs:
- For apps with a stable, finite, documented external FQDN set
(the bounded population): convert
world:443totoFQDNs: matchName: <fqdn>per FQDN, using the pairedmatchPattern: foo.comandmatchPattern: foo.com.*workaround for canonical FQDNs. Already the pattern in actions-runner-controller, recyclarr, pump-cv, github-mcp, paperless-ai. - For apps with inherently unbounded egress (home-assistant,
esphome, n8n, node-red, searxng, glance, glance-user, open-webui,
runners, esphome/code, home-assistant/code): leave
world:443and document the design decision in a per-CNP comment. - For apps that fall in between (e.g. ai/ollama, ai/comfyui pulling
models from huggingface + R2 + pypi): keep
world:443but tighten the port set — drop port 80 if it isn't actively used (most apps don't need it).
Pro: Concrete, incremental, no infrastructure cost. Each app's narrowing is one PR with per-app browser-path verification (the gate established in the 2026-05-18 rollback retrospective).
Con: Doesn't reduce the count of world:443 rules much — the
unbounded population is structurally broad. Most of the value comes
from the bounded apps that already use matchPattern correctly.
The "audit and narrow" pass is real work for modest CNP-count
reduction.
Verdict: Adopt as secondary, paired with E. Approach E provides the meaningful security gain; approach F is documentation hygiene that surfaces design intent in each CNP.
G. Perimeter (brain firewalld) egress filtering
The cluster's actual internet egress all routes through brain
(192.168.6.1, the home router). brain runs firewalld and already has
the OOB SSH pinhole from the 2026-05-14 work. Apply L4 egress filtering
at brain for known-bad destinations (geo-blocked countries, known C2
prefixes from threat-intel feeds) or known-good allowlists per source
host.
Pro: Filtering at the perimeter is a defense-in-depth layer that
catches cluster-side bypass scenarios (a compromised CNI agent that
ignores CNPs, a misconfigured pod-gateway leak). Single chokepoint to
maintain. brain config is already version-controlled in
rwlove/lovenet-network-configuration.
Con: brain firewalld can't see into Cilium's per-pod identity — every cluster-egress flow looks like "Kubernetes node IP → external IP" from brain's perspective. The granularity is per-node, not per-pod or per-app. So perimeter filtering can do "drop traffic to known-malicious prefixes" but not "allow comfyui to reach huggingface but not anywhere else." Different threat model, different control. Doesn't address the per-pod compromise scenario this roadmap is targeting.
Verdict: Out of scope here, in-scope for a different roadmap tier. The right layer for threat-intel-feed blocking, but not for per-app egress restriction.
Recommendation
Adopt E + F as a paired strategy. Detection-first via Hubble alerting, with opportunistic per-app narrowing where the FQDN set is known-stable.
Rationale tied to the cluster's constraints:
- One operator on call. Prevention controls that silently fail create user-discovered outages (the 2026-05-18 rollback proved this). Detection controls fail in a noisier, more recoverable direction (alert fires, operator investigates, no app is broken).
- GitOps + per-app PR rhythm. Both halves fit. The Hubble baselining is a one-shot dashboard + alert-rule PR. Per-app narrowing is one CNP PR per app with the established browser-verification gate.
- No new infrastructure. Hubble, kube-prometheus-stack, Alertmanager, and the Pushover Provider are all already running. No new pod, no new mesh, no new control plane.
- Compatible with the eventual Cilium 1.21 upgrade. When the
matchName-canonical fix lands upstream, this design's "narrow
what's narrow-able" pass converges naturally — the
matchPattern + .*hack collapses to cleanmatchNamelines andworld:443rules drop further. The Hubble alerting stays useful regardless. - Threat-model-aligned. The realistic threat we're hardening against is a compromised app phoning home or exfiltrating credentials — not a sophisticated CNI-evading rootkit (where the right control is at the host or perimeter, see approach G). Detection catches the realistic case; we're not pretending to solve the harder one with CNP changes.
Phased rollout sketch
Phase 0 — Baseline measurement (1-2 weeks, passive)
Build the Hubble destination baseline. No CNP changes. Two deliverables:
- Per-app egress-FQDN Grafana panel. Source: Hubble flow data
already in the metrics pipeline. Group by
source_pod_namespace + source_pod_labels.app.kubernetes.io/name → destination_fqdn. Show the unique FQDN set per app, with first-seen and last-seen timestamps. Use the existing observability stack — no new exporters. - Per-app baseline JSON exported to git. Once the panel settles
(≥7 days of data), snapshot the per-app FQDN sets into a
kubernetes/components/network-policy/egress-baselines/directory (one file per app, FQDN list). This is the expected egress manifest, version-controlled so changes are reviewable.
Success criterion: Baseline captured for every app currently using
toEntities: [world], with no false-positive surprises (e.g. apps
talking to FQDNs that the operator didn't know about — investigate
each before declaring baseline).
Phase 1 — Single-app pilot: detection alerting (1 week)
Pick one bounded-population app with a stable, well-understood
FQDN set. Recommendation: media/recyclarr — already on
matchPattern + .*, low blast radius, no real-time user dependency,
flow volume is bounded (it scrapes TRaSH-Guides on a schedule).
Steps:
- PR: add a Prometheus recording rule that counts per-pod egress flows to destinations outside the recyclarr baseline JSON.
- PR: add an Alertmanager rule that fires (
severity=warning, routed to Pushover) when that count is non-zero over a 5-minute window. - Wait 7 days. Alert should never fire on legitimate recyclarr behavior. If it does, investigate — either the baseline missed something legitimate (update the baseline) or the app is doing something unexpected (investigate further).
Success criterion: 7-day quiet period. No false-positive alerts. Operator gains confidence the detection mechanic works.
Phase 2 — Bounded-app rollout (4-6 weeks, one app per PR)
Apply the same detection pattern to the rest of the bounded population, one app per PR, in order of lowest-blast-radius first. Candidate order:
- recyclarr (Phase 1 pilot)
- external-secrets→1P Connect cloud
- cert-manager→ACME
- actions-runner-controller operator (not the runners — runners are unbounded)
- mcp-system/github-mcp
- mcp-system/immich-mcp
- media/lidarr, media/sonarr, media/radarr (per-app, sequential)
- observability/* exporters
Each PR: one new alert rule + the per-app baseline JSON. No CNP changes.
In parallel, opportunistic approach-F PRs where an app's world:443
rule can collapse to a small matchPattern list with confidence —
but only when the operator has bandwidth to do the
browser-verification step on each.
Phase 3 — Unbounded-app posture (1 week, design + documentation)
For the unbounded population (home-assistant, esphome, n8n, node-red, searxng, glance, glance-user, open-webui, runners, /code variants):
- Don't try to baseline. The baseline would churn constantly and generate alert fatigue.
- Document the design decision in a comment block in each CNP:
"Unbounded egress is intentional — see
docs/src/egress_restriction_design.md§ Unbounded apps." This makes the broad allow legible as design rather than oversight. - Consider a coarser alert (per-app egress-volume anomaly: "home-assistant just sent 100x its 24h rolling-average egress bytes"). Catches the bulk-exfil scenario without needing an FQDN allowlist. Cheap to implement; tune the threshold over time.
Phase 4 — Convergence with Cilium upgrade (when 1.21 lands)
When Cilium 1.21 ships with the matchName-canonical fix:
- Cilium upgrade as its own workstream (not in this roadmap).
- Once upgraded, sweep the
matchPattern + .*hack across all CNPs back to cleanmatchName: <fqdn>. Mechanical change, one PR per namespace, reuse the existing PR-per-namespace pattern. - Re-evaluate which
world:443rules can collapse to narrowtoFQDNsrules now that matchName works on canonical FQDNs. - Keep the Hubble detection layer regardless — it catches exfiltration scenarios that CNP narrowing cannot.
What this design explicitly does not do
- Does not stand up a forwarding proxy (approach B).
- Does not introduce a maintained CIDR allowlist for provider endpoints (approach A).
- Does not propose a Cilium upgrade as part of this work (approach D). Track upstream, plan separately.
- Does not touch brain firewalld for cluster-egress filtering (approach G). That's a different control at a different layer for a different threat.
- Does not add a new piece of infrastructure for the operator to maintain — only new alert rules + version-controlled baseline manifests, both within the existing observability stack.
References
docs/src/networkpolicy_rollout_plan.md— the underlying CNP rollout, including the 2026-05-18 full-rollback retrospective.project_cilium_matchpattern_fqdn_limits.md— why matchPattern silently fails on canonical FQDNs and why we ended up withworld:443everywhere.project_cilium_l4_port_targetport.md— the socket-lb-rewrite gotcha that breaks naive Pattern A overlays.feedback_netpol_rollout_needs_per_app_browser_test.md— the verification-gate decision that constrains any future netpol change.project_cilium_ipblock_apiserver.md— whytoCIDR: 0.0.0.0/0doesn't work for some destinations; relevant context for why approach A is fragile beyond just IP-rotation.