promenade/charts
Mark Burnett dcac36c8cf Fix: Avoid etcd bootstrap race
This adds a sleep to avoid a tight restart loop for etcd when running in
bootstrap mode (e.g. to spin up etcd for calico).

I had not seen this before, but it showed up while I was
troubleshooting an environment yesterday, and I'm surprised it hasn't
been reported earlier.

The issue manifests as repeated teardown and replacement of the
bootstrapping <svc>-etcd-<hostname> pod put in place by the anchor.  The
log messages in the etcd container of the pod will say that etcd is
terminating because it got SIGTERM, and a large number of pause
containers will be left behind and visible in `docker ps -a`.  The
constant pod replacement was racing against Kubernetes noticing the
healthy (non-anchor) etcd pod, which is what lets the anchor reach etcd
over the Kubernetes service to check its health.  A successful health
check by the anchor ends the bootstrapping phase, exiting the race.
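
As a rough illustration of the idea (not the chart's actual anchor
script, which is not shown here), the loop below pauses between
iterations so that the bootstrap pod is not replaced faster than the
health check can succeed.  All names (`run_bootstrap_loop`,
`etcd_is_healthy`, `ensure_bootstrap_pod`) and the 10 second default
delay are hypothetical and chosen only for this sketch.

```python
import time
from typing import Callable, Optional


def run_bootstrap_loop(
    etcd_is_healthy: Callable[[], bool],
    ensure_bootstrap_pod: Callable[[], None],
    delay_seconds: float = 10.0,  # arbitrary value, assumed for this sketch
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: Optional[int] = None,
) -> int:
    """Keep the bootstrap pod in place until etcd reports healthy.

    Returns the number of times the bootstrap pod was (re)placed.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        # A successful health check ends the bootstrapping phase.
        if etcd_is_healthy():
            return attempts
        # Hypothetical stand-in for putting the bootstrap pod in place.
        ensure_bootstrap_pod()
        attempts += 1
        # Without this pause the loop spins and replaces the pod
        # faster than the health check can ever succeed.
        sleep(delay_seconds)
    return attempts
```

Passing `sleep` and the two callbacks as arguments keeps the sketch easy
to exercise without a cluster; a real script would call out to the
cluster instead.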

I'm confident there's a better way to clean up this section of code;
however, the goal of this PS is only to fix the problematic tight loop,
leaving a more rigorous improvement for later.

Change-Id: I0e3181194cfcd376967672b47a5e126103b4dfe4
2018-09-07 07:52:44 -05:00
apiserver Opening apiserver Via Ingress 2018-08-10 08:16:50 -05:00
controller_manager Remove unused image references 2018-07-23 11:17:41 -05:00
coredns Fix incorrect use of wget in CoreDNS health 2018-06-14 10:34:42 -05:00
etcd Fix: Avoid etcd bootstrap race 2018-09-07 07:52:44 -05:00
haproxy Add test pods labels. 2018-07-11 08:04:29 -05:00
promenade Update Keystone API ports in Promenade chart 2018-08-23 22:40:09 +00:00
proxy Add liveness probe to kube-proxy 2018-07-23 11:17:41 -05:00
scheduler Enable etcd helm test to run on non-ready nodes 2018-07-19 13:29:18 -05:00
.gitignore Add initial Makefile 2017-10-31 12:46:23 -05:00