Commit Graph

5 Commits

Author SHA1 Message Date
Phil Sphicas 354deab382 kube-proxy: use HTTP probes instead of exec
The existing liveness and readiness probes for kube-proxy need
adjustment. The current implementation is exec-based, which can be a
resource concern, and is tied heavily to iptables, so it is incompatible
with ipvs.

This change removes the exec-based liveness and readiness probes from
the kube-proxy daemonset, and replaces them with HTTP probes of the
healthz endpoint, following the direction that Kubernetes seems to be
taking.[0][1]
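
As a rough illustration of what an HTTP probe of the healthz endpoint
amounts to, the sketch below issues a GET and treats any 2xx or 3xx
response as healthy, which mirrors how the kubelet evaluates HTTP probes.
The function name `probe_healthz` is hypothetical, and the address in the
usage line assumes kube-proxy's default healthz port (10256); neither is
taken from this change.

```python
import http.client
import urllib.request


def probe_healthz(url: str, timeout: float = 1.0) -> bool:
    """Return True if an HTTP GET on `url` answers with a 2xx or 3xx status."""
    # Ignore proxy environment variables so the request goes straight to the node.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP error statuses all count as failure.
        return False


if __name__ == "__main__":
    # Assumed address: 10256 is kube-proxy's default healthz port.
    print(probe_healthz("http://127.0.0.1:10256/healthz"))
```

Unlike the exec-based probes, a check of this shape needs no process
spawned inside the container and does not inspect iptables at all.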

The values.yaml interface to enable and disable the probes and set various
parameters is also modified to use the helm-toolkit standard snippet.[2]
Notably, the settings previously configurable under livenessProbe.config
are now under pod.probes.proxy.proxy.liveness.params.

0: https://github.com/kubernetes/kubernetes/issues/81630
1: https://github.com/kubernetes/kubernetes/pull/75323
2: https://opendev.org/openstack/openstack-helm-infra/src/branch/master/helm-toolkit/templates/snippets/_kubernetes_probes.tpl

Change-Id: I99ccbc2270a1f8a204417aa410868d04788dc60f
2020-05-24 07:38:55 +00:00
Chris Wedgwood ec41efcb4b [proxy] robustness tweak for liveness probe
"wc -l foo" prints two columns (the line count and the file name), which
causes subtle breakage that shows up as sporadic, cryptic errors
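
To make the two-column issue concrete, here is a minimal sketch of
parsing the count robustly. The probe itself is a shell script; the
helper name `parse_wc_line_count` is hypothetical and only illustrates
taking the first field.

```python
def parse_wc_line_count(wc_output: str) -> int:
    """Extract the line count from `wc -l` output such as '12 foo'."""
    fields = wc_output.split()
    if not fields:
        raise ValueError("empty wc output")
    # The first column is the count; the second, if present, is the file name.
    return int(fields[0])


print(parse_wc_line_count("12 foo\n"))  # shape of `wc -l foo` output
print(parse_wc_line_count("12\n"))      # shape of `wc -l < foo` output
```

Both calls print 12; treating the whole "12 foo" string as a number is
what breaks.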

Change-Id: I1f708ed011a48a2fbca6af8f4d021005d2296bfd
2020-02-17 23:47:52 +00:00
Mark Burnett eaeb3ae250 Make kube-proxy liveness probe more cautious
With this update, the list of services without endpoints detected on
the host must remain static before it causes a probe failure.

This avoids race conditions in large deployments where new services are
being added over several minutes and would otherwise trigger probe
failures.
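
The "must be static" rule can be sketched as a comparison between two
consecutive observations. This is an illustration under assumptions, not
the probe's actual code: the function name `should_fail` and the service
names are hypothetical.

```python
def should_fail(previous: set, current: set) -> bool:
    """Fail only when the non-empty set of endpoint-less services is unchanged."""
    return bool(current) and current == previous


first = {"default/a", "default/b"}
second = {"default/a", "default/b", "default/c"}  # a rollout is still in progress

print(should_fail(first, second))   # False: the list is still changing
print(should_fail(second, second))  # True: same list on two consecutive checks
print(should_fail(set(), set()))    # False: no service is missing endpoints
```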

Change-Id: Ie65c8613cb85bfdf61d41099540d3499ea1de817
2018-10-10 10:02:45 -05:00
Mark Burnett 83b65b358d Fix: Workaround kube-proxy keeping stale IPs
This updates the liveness probe to fail when there are iptables rules
from kube-proxy that don't appear in existing endpoints.
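
Conceptually, the check is a set difference between the targets that
kube-proxy has programmed and the endpoints that currently exist. The
sketch below assumes both sides have already been extracted as
"ip:port" strings; the function name `stale_targets` and the sample
addresses are hypothetical.

```python
def stale_targets(rule_targets, live_endpoints):
    """Return targets present in iptables rules but absent from live endpoints."""
    return sorted(set(rule_targets) - set(live_endpoints))


rules = ["10.0.0.5:8080", "10.0.0.6:8080", "10.0.0.9:8080"]
endpoints = ["10.0.0.5:8080", "10.0.0.6:8080"]

stale = stale_targets(rules, endpoints)
print(stale)        # ['10.0.0.9:8080']
print(bool(stale))  # True: the probe would report failure
```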

Change-Id: I376be24566809a653417acfb84cac8f1c4e1a36e
2018-10-09 08:47:40 -05:00
Mark Burnett 69cb269230 Make K8S proxy health check more aggressive
In K8S version 1.10, the proxy can sometimes get stuck believing that
some services do not have any endpoints.  This seems to be triggered by
network instability, though the proxy doesn't seem to recover on its
own, while bouncing the pod fixes the issue.

This change adds a naive means of detecting and recovering from this
(`iptables-save | grep 'has no endpoints'` in the liveness probe) that
may occasionally have false positives.  As such, the liveness probe is
configured very conservatively to avoid triggering CrashLoopBackoff in
the event of a false positive.

Finally, there is a whitelist feature to help avoid false positives for
services that are known to legitimately have empty endpoints during the
course of normal operation (e.g. Patroni might manage such an endpoint
list).
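
The detection plus whitelist can be sketched as follows. This is not the
probe's code (which is the shell pipeline quoted above); the function
name, the whitelist format, and the sample `iptables-save` lines are
assumptions made for illustration.

```python
import re

# Matches the comment kube-proxy attaches to rules for services with no endpoints.
NO_ENDPOINTS = re.compile(r'--comment "(?P<service>\S+) has no endpoints"')


def services_without_endpoints(iptables_save_output, whitelist=frozenset()):
    """Return services flagged 'has no endpoints', excluding whitelisted ones."""
    found = set()
    for line in iptables_save_output.splitlines():
        match = NO_ENDPOINTS.search(line)
        if match and match.group("service") not in whitelist:
            found.add(match.group("service"))
    return found


# Illustrative input only; real rules carry more match options.
sample = """\
-A KUBE-SERVICES -d 10.96.0.20/32 -p tcp -m comment --comment "default/web:http has no endpoints" -j REJECT
-A KUBE-SERVICES -d 10.96.0.21/32 -p tcp -m comment --comment "db/patroni:postgres has no endpoints" -j REJECT
-A KUBE-SERVICES -d 10.96.0.22/32 -p tcp -m comment --comment "default/api:http cluster IP" -j KUBE-SVC-EXAMPLE
"""

flagged = services_without_endpoints(sample, whitelist={"db/patroni:postgres"})
print(sorted(flagged))  # ['default/web:http']
```

A non-empty result would count as a probe failure; the whitelisted
Patroni-style service is ignored even though its endpoint list is empty.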

Change-Id: I29a770fab70b1fb79db59ef5408f40b2af1c01f9
2018-09-05 13:46:03 -05:00