
externalIPs DNAT rules are not installed when clusterIP is None #131497


Description

@ttc0419

Taking over this issue to emphasize the validation gap that caused it:

Apparently, validation does not flag an error when:

  • type=ClusterIP, clusterIP=None, and loadBalancerIP is set
  • type=ClusterIP, clusterIP=None, and externalIPs is set

Note: we DO flag an error when trying to set loadBalancerSourceRanges.

Unfortunately, tightening validation is always risky: there is very likely somebody using that configuration today, and it isn't failing for them, even if it doesn't do what they intended.

What we CAN do is add API warnings for these cases.
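
Purely as illustration, a minimal sketch of what such API warnings could look like. The types, the headlessWarnings helper, and the messages are hypothetical simplifications, not the actual Kubernetes validation code:

// Hypothetical sketch of the proposed warnings; simplified types,
// not the real k8s.io/kubernetes validation code.
package main

import "fmt"

// ServiceSpec holds only the fields relevant to this issue.
type ServiceSpec struct {
	Type           string // e.g. "ClusterIP"
	ClusterIP      string // "None" for headless services
	ExternalIPs    []string
	LoadBalancerIP string
}

// headlessWarnings returns warnings for combinations that validation
// currently accepts but that kube-proxy ignores.
func headlessWarnings(spec ServiceSpec) []string {
	if spec.ClusterIP != "None" {
		return nil
	}
	var warnings []string
	if len(spec.ExternalIPs) > 0 {
		warnings = append(warnings, "spec.externalIPs is ignored for headless services (spec.clusterIP: None)")
	}
	if spec.LoadBalancerIP != "" {
		warnings = append(warnings, "spec.loadBalancerIP is ignored for headless services (spec.clusterIP: None)")
	}
	return warnings
}

func main() {
	spec := ServiceSpec{
		Type:        "ClusterIP",
		ClusterIP:   "None",
		ExternalIPs: []string{"192.168.64.253"},
	}
	for _, w := range headlessWarnings(spec) {
		fmt.Println("Warning:", w)
	}
}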

Original report: "externalIPs DNAT rules are not installed when clusterIP is None"

What happened?

Consider the following service:

apiVersion: v1
kind: Service
metadata:
  name: kube-bench-dev
spec:
  clusterIP: None
  selector:
    instance: kube-bench-dev
  ports:
  - name: tcp-80
    port: 80
    protocol: TCP
    targetPort: 80
  externalIPs:
  - 192.168.64.253

% kubectl get service kube-bench-dev
NAME             TYPE        CLUSTER-IP   EXTERNAL-IP      PORT(S)   AGE
kube-bench-dev   ClusterIP   None         192.168.64.253   80/TCP    6s

But kube-proxy installs no DNAT rules for the external IP: the service-ips map has no entry for 192.168.64.253, and the kube-bench-dev service and endpoint chains are empty:

table ip kube-proxy {
	comment "rules for kube-proxy"
	set cluster-ips {
		type ipv4_addr
		comment "Active ClusterIPs"
		elements = { 172.16.0.1, 172.16.0.173,
			     172.16.0.220 }
	}

	set nodeport-ips {
		type ipv4_addr
		comment "IPs that accept NodePort traffic"
		elements = { 192.168.64.2 }
	}

	map no-endpoint-services {
		type ipv4_addr . inet_proto . inet_service : verdict
		comment "vmap to drop or reject packets to services with no endpoints"
	}

	map no-endpoint-nodeports {
		type inet_proto . inet_service : verdict
		comment "vmap to drop or reject packets to service nodeports with no endpoints"
	}

	map firewall-ips {
		type ipv4_addr . inet_proto . inet_service : verdict
		comment "destinations that are subject to LoadBalancerSourceRanges"
	}

	map service-ips {
		type ipv4_addr . inet_proto . inet_service : verdict
		comment "ClusterIP, ExternalIP and LoadBalancer IP traffic"
		elements = { 172.16.0.173 . tcp . 80 : goto service-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http,
			     192.168.64.254 . tcp . 80 : goto external-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http,
			     172.16.0.1 . tcp . 443 : goto service-2QRHZV4L-default/kubernetes/tcp/https,
			     172.16.0.173 . tcp . 443 : goto service-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https,
			     172.16.0.220 . tcp . 443 : goto service-FMTKUH45-kube-system/ingress-nginx-controller-admission/tcp/https-webhook,
			     192.168.64.254 . tcp . 443 : goto external-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https }
	}

	map service-nodeports {
		type inet_proto . inet_service : verdict
		comment "NodePort traffic"
	}

	chain filter-prerouting {
		type filter hook prerouting priority dstnat - 10; policy accept;
		ct state new jump firewall-check
	}

	chain filter-input {
		type filter hook input priority -110; policy accept;
		ct state new jump nodeport-endpoints-check
		ct state new jump service-endpoints-check
	}

	chain filter-forward {
		type filter hook forward priority -110; policy accept;
		ct state new jump service-endpoints-check
		ct state new jump cluster-ips-check
	}

	chain filter-output {
		type filter hook output priority dstnat - 10; policy accept;
		ct state new jump service-endpoints-check
		ct state new jump firewall-check
	}

	chain filter-output-post-dnat {
		type filter hook output priority dstnat + 10; policy accept;
		ct state new jump cluster-ips-check
	}

	chain nat-prerouting {
		type nat hook prerouting priority dstnat; policy accept;
		jump services
	}

	chain nat-output {
		type nat hook output priority dstnat; policy accept;
		jump services
	}

	chain nat-postrouting {
		type nat hook postrouting priority srcnat; policy accept;
		jump masquerading
	}

	chain nodeport-endpoints-check {
		ip daddr @nodeport-ips meta l4proto . th dport vmap @no-endpoint-nodeports
	}

	chain service-endpoints-check {
		ip daddr . meta l4proto . th dport vmap @no-endpoint-services
	}

	chain firewall-check {
		ip daddr . meta l4proto . th dport vmap @firewall-ips
	}

	chain services {
		ip daddr . meta l4proto . th dport vmap @service-ips
		ip daddr @nodeport-ips meta l4proto . th dport vmap @service-nodeports
	}

	chain masquerading {
		meta mark & 0x00004000 == 0x00000000 return
		meta mark set meta mark ^ 0x00004000
		masquerade fully-random
	}

	chain cluster-ips-check {
		ip daddr @cluster-ips reject comment "Reject traffic to invalid ports of ClusterIPs"
	}

	chain mark-for-masquerade {
		meta mark set meta mark | 0x00004000
	}

	chain reject-chain {
		comment "helper for @no-endpoint-services / @no-endpoint-nodeports"
		reject
	}

	chain endpoint-KUBDMD37-default/kubernetes/tcp/https__192.168.64.2/6443 {
		ip saddr 192.168.64.2 jump mark-for-masquerade
		meta l4proto tcp dnat to 192.168.64.2:6443
	}

	chain service-2QRHZV4L-default/kubernetes/tcp/https {
		ip daddr 172.16.0.1 tcp dport 443 ip saddr != 192.168.64.64/26 jump mark-for-masquerade
		numgen random mod 1 vmap { 0 : goto endpoint-KUBDMD37-default/kubernetes/tcp/https__192.168.64.2/6443 }
	}

	chain endpoint-5UYISHKM-kube-system/ingress-nginx-controller/tcp/http__192.168.64.68/80 {
		ip saddr 192.168.64.68 jump mark-for-masquerade
		meta l4proto tcp dnat to 192.168.64.68:80
	}

	chain service-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http {
		ip daddr 172.16.0.173 tcp dport 80 ip saddr != 192.168.64.64/26 jump mark-for-masquerade
		numgen random mod 1 vmap { 0 : goto endpoint-5UYISHKM-kube-system/ingress-nginx-controller/tcp/http__192.168.64.68/80 }
	}

	chain external-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http {
		jump mark-for-masquerade
		goto service-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http
	}

	chain endpoint-VRCVTPLF-kube-system/ingress-nginx-controller/tcp/https__192.168.64.68/443 {
		ip saddr 192.168.64.68 jump mark-for-masquerade
		meta l4proto tcp dnat to 192.168.64.68:443
	}

	chain service-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https {
		ip daddr 172.16.0.173 tcp dport 443 ip saddr != 192.168.64.64/26 jump mark-for-masquerade
		numgen random mod 1 vmap { 0 : goto endpoint-VRCVTPLF-kube-system/ingress-nginx-controller/tcp/https__192.168.64.68/443 }
	}

	chain external-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https {
		jump mark-for-masquerade
		goto service-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https
	}

	chain endpoint-XIULVOT6-kube-system/ingress-nginx-controller-admission/tcp/https-webhook__192.168.64.68/8443 {
		ip saddr 192.168.64.68 jump mark-for-masquerade
		meta l4proto tcp dnat to 192.168.64.68:8443
	}

	chain service-FMTKUH45-kube-system/ingress-nginx-controller-admission/tcp/https-webhook {
		ip daddr 172.16.0.220 tcp dport 443 ip saddr != 192.168.64.64/26 jump mark-for-masquerade
		numgen random mod 1 vmap { 0 : goto endpoint-XIULVOT6-kube-system/ingress-nginx-controller-admission/tcp/https-webhook__192.168.64.68/8443 }
	}

	chain endpoint-R3GEKHA3-default/kube-bench-dev/tcp/tcp-80__192.168.64.66/80 {
	}

	chain service-KQA2VLMF-default/kube-bench-dev/tcp/tcp-80 {
	}
}
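
This is consistent with kube-proxy skipping headless services (clusterIP: None) entirely when it builds its rules, so the externalIPs entry never reaches the service-ips map. A minimal sketch of that check, with simplified types; this is an assumption about the cause for illustration, not the actual kube-proxy source:

// Simplified sketch of kube-proxy's headless-service skip; an
// assumption for illustration, not the real kube-proxy code.
package main

import "fmt"

type Service struct {
	Name        string
	Type        string
	ClusterIP   string
	ExternalIPs []string
}

// shouldSkipService mirrors kube-proxy's documented behavior of
// ignoring headless (ClusterIP "None") and ExternalName services.
func shouldSkipService(svc Service) bool {
	return svc.ClusterIP == "None" || svc.Type == "ExternalName"
}

func main() {
	svc := Service{
		Name:        "kube-bench-dev",
		Type:        "ClusterIP",
		ClusterIP:   "None",
		ExternalIPs: []string{"192.168.64.253"},
	}
	if shouldSkipService(svc) {
		// No service-ips entries or DNAT chains are programmed for
		// this service, even though externalIPs is set.
		fmt.Printf("skipping %s: headless; externalIPs %v ignored\n", svc.Name, svc.ExternalIPs)
	}
}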

What did you expect to happen?

DNAT rules for the external IP should be installed, as they are when clusterIP is not None. Below is the same service with an allocated cluster IP (172.16.0.242): the service-ips map gains entries for both the cluster IP and 192.168.64.253, and the kube-bench-dev service, endpoint, and external chains are populated:

table ip kube-proxy {
	comment "rules for kube-proxy"
	set cluster-ips {
		type ipv4_addr
		comment "Active ClusterIPs"
		elements = { 172.16.0.1, 172.16.0.173,
			     172.16.0.220, 172.16.0.242 }
	}

	set nodeport-ips {
		type ipv4_addr
		comment "IPs that accept NodePort traffic"
		elements = { 192.168.64.2 }
	}

	map no-endpoint-services {
		type ipv4_addr . inet_proto . inet_service : verdict
		comment "vmap to drop or reject packets to services with no endpoints"
	}

	map no-endpoint-nodeports {
		type inet_proto . inet_service : verdict
		comment "vmap to drop or reject packets to service nodeports with no endpoints"
	}

	map firewall-ips {
		type ipv4_addr . inet_proto . inet_service : verdict
		comment "destinations that are subject to LoadBalancerSourceRanges"
	}

	map service-ips {
		type ipv4_addr . inet_proto . inet_service : verdict
		comment "ClusterIP, ExternalIP and LoadBalancer IP traffic"
		elements = { 172.16.0.173 . tcp . 80 : goto service-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http,
			     172.16.0.242 . tcp . 80 : goto service-KQA2VLMF-default/kube-bench-dev/tcp/tcp-80,
			     192.168.64.253 . tcp . 80 : goto external-KQA2VLMF-default/kube-bench-dev/tcp/tcp-80,
			     192.168.64.254 . tcp . 80 : goto external-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http,
			     172.16.0.1 . tcp . 443 : goto service-2QRHZV4L-default/kubernetes/tcp/https,
			     172.16.0.173 . tcp . 443 : goto service-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https,
			     172.16.0.220 . tcp . 443 : goto service-FMTKUH45-kube-system/ingress-nginx-controller-admission/tcp/https-webhook,
			     192.168.64.254 . tcp . 443 : goto external-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https }
	}

	map service-nodeports {
		type inet_proto . inet_service : verdict
		comment "NodePort traffic"
	}

	chain filter-prerouting {
		type filter hook prerouting priority dstnat - 10; policy accept;
		ct state new jump firewall-check
	}

	chain filter-input {
		type filter hook input priority -110; policy accept;
		ct state new jump nodeport-endpoints-check
		ct state new jump service-endpoints-check
	}

	chain filter-forward {
		type filter hook forward priority -110; policy accept;
		ct state new jump service-endpoints-check
		ct state new jump cluster-ips-check
	}

	chain filter-output {
		type filter hook output priority dstnat - 10; policy accept;
		ct state new jump service-endpoints-check
		ct state new jump firewall-check
	}

	chain filter-output-post-dnat {
		type filter hook output priority dstnat + 10; policy accept;
		ct state new jump cluster-ips-check
	}

	chain nat-prerouting {
		type nat hook prerouting priority dstnat; policy accept;
		jump services
	}

	chain nat-output {
		type nat hook output priority dstnat; policy accept;
		jump services
	}

	chain nat-postrouting {
		type nat hook postrouting priority srcnat; policy accept;
		jump masquerading
	}

	chain nodeport-endpoints-check {
		ip daddr @nodeport-ips meta l4proto . th dport vmap @no-endpoint-nodeports
	}

	chain service-endpoints-check {
		ip daddr . meta l4proto . th dport vmap @no-endpoint-services
	}

	chain firewall-check {
		ip daddr . meta l4proto . th dport vmap @firewall-ips
	}

	chain services {
		ip daddr . meta l4proto . th dport vmap @service-ips
		ip daddr @nodeport-ips meta l4proto . th dport vmap @service-nodeports
	}

	chain masquerading {
		meta mark & 0x00004000 == 0x00000000 return
		meta mark set meta mark ^ 0x00004000
		masquerade fully-random
	}

	chain cluster-ips-check {
		ip daddr @cluster-ips reject comment "Reject traffic to invalid ports of ClusterIPs"
	}

	chain mark-for-masquerade {
		meta mark set meta mark | 0x00004000
	}

	chain reject-chain {
		comment "helper for @no-endpoint-services / @no-endpoint-nodeports"
		reject
	}

	chain endpoint-KUBDMD37-default/kubernetes/tcp/https__192.168.64.2/6443 {
		ip saddr 192.168.64.2 jump mark-for-masquerade
		meta l4proto tcp dnat to 192.168.64.2:6443
	}

	chain service-2QRHZV4L-default/kubernetes/tcp/https {
		ip daddr 172.16.0.1 tcp dport 443 ip saddr != 192.168.64.64/26 jump mark-for-masquerade
		numgen random mod 1 vmap { 0 : goto endpoint-KUBDMD37-default/kubernetes/tcp/https__192.168.64.2/6443 }
	}

	chain endpoint-5UYISHKM-kube-system/ingress-nginx-controller/tcp/http__192.168.64.68/80 {
		ip saddr 192.168.64.68 jump mark-for-masquerade
		meta l4proto tcp dnat to 192.168.64.68:80
	}

	chain service-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http {
		ip daddr 172.16.0.173 tcp dport 80 ip saddr != 192.168.64.64/26 jump mark-for-masquerade
		numgen random mod 1 vmap { 0 : goto endpoint-5UYISHKM-kube-system/ingress-nginx-controller/tcp/http__192.168.64.68/80 }
	}

	chain external-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http {
		jump mark-for-masquerade
		goto service-TPLZMVKW-kube-system/ingress-nginx-controller/tcp/http
	}

	chain endpoint-VRCVTPLF-kube-system/ingress-nginx-controller/tcp/https__192.168.64.68/443 {
		ip saddr 192.168.64.68 jump mark-for-masquerade
		meta l4proto tcp dnat to 192.168.64.68:443
	}

	chain service-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https {
		ip daddr 172.16.0.173 tcp dport 443 ip saddr != 192.168.64.64/26 jump mark-for-masquerade
		numgen random mod 1 vmap { 0 : goto endpoint-VRCVTPLF-kube-system/ingress-nginx-controller/tcp/https__192.168.64.68/443 }
	}

	chain external-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https {
		jump mark-for-masquerade
		goto service-HNB4FGVK-kube-system/ingress-nginx-controller/tcp/https
	}

	chain endpoint-XIULVOT6-kube-system/ingress-nginx-controller-admission/tcp/https-webhook__192.168.64.68/8443 {
		ip saddr 192.168.64.68 jump mark-for-masquerade
		meta l4proto tcp dnat to 192.168.64.68:8443
	}

	chain service-FMTKUH45-kube-system/ingress-nginx-controller-admission/tcp/https-webhook {
		ip daddr 172.16.0.220 tcp dport 443 ip saddr != 192.168.64.64/26 jump mark-for-masquerade
		numgen random mod 1 vmap { 0 : goto endpoint-XIULVOT6-kube-system/ingress-nginx-controller-admission/tcp/https-webhook__192.168.64.68/8443 }
	}

	chain endpoint-R3GEKHA3-default/kube-bench-dev/tcp/tcp-80__192.168.64.66/80 {
		ip saddr 192.168.64.66 jump mark-for-masquerade
		meta l4proto tcp dnat to 192.168.64.66:80
	}

	chain service-KQA2VLMF-default/kube-bench-dev/tcp/tcp-80 {
		ip daddr 172.16.0.242 tcp dport 80 ip saddr != 192.168.64.64/26 jump mark-for-masquerade
		numgen random mod 1 vmap { 0 : goto endpoint-R3GEKHA3-default/kube-bench-dev/tcp/tcp-80__192.168.64.66/80 }
	}

	chain external-KQA2VLMF-default/kube-bench-dev/tcp/tcp-80 {
		jump mark-for-masquerade
		goto service-KQA2VLMF-default/kube-bench-dev/tcp/tcp-80
	}
}

How can we reproduce it (as minimally and precisely as possible)?

Apply the YAML above and inspect kube-proxy's nftables rules, e.g. with nft list table ip kube-proxy.

Anything else we need to know?

No response

Kubernetes version

1.33.2

Cloud provider

N/A

OS version

No response

Metadata

Labels

help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/bug: Categorizes issue or PR as related to a bug.
sig/network: Categorizes an issue or PR as relevant to SIG Network.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
