From dcac36c8cf3a7862cb7236cb837f8ce9a8d04f9d Mon Sep 17 00:00:00 2001
From: Mark Burnett <mark.m.burnett@gmail.com>
Date: Fri, 7 Sep 2018 07:52:44 -0500
Subject: [PATCH] Fix: Avoid etcd bootstrap race

This adds a sleep to avoid a tight restart loop for etcd when running in
bootstrap mode (e.g. to spin up etcd for calico).

I saw this while troubleshooting an environment yesterday, and I'm
surprised it hasn't been reported before.

The issue manifests as repeated teardown and replacement of the
bootstrapping <svc>-etcd-<hostname> pod put in place by the anchor.  The
log messages in the etcd container of the pod will say that etcd is
terminating because it got SIGTERM, and a large number of pause
containers will be left behind and visible in `docker ps -a`.  The
constant pod replacement races against how quickly Kubernetes sees the
healthy (non-anchor) etcd pod, which is what allows the anchor to reach
etcd over the Kubernetes service and check its health.  A successful
health check by the anchor ends the bootstrapping phase, exiting the
race.

I'm confident there's a better approach to cleaning up this section of
code; however, this PS only aims to fix the problematic tight loop,
leaving a more rigorous improvement for later.

Change-Id: I0e3181194cfcd376967672b47a5e126103b4dfe4
---
 charts/etcd/templates/bin/_etcdctl_anchor.tpl | 1 +
 1 file changed, 1 insertion(+)
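
Reviewer note: the tight loop described in the commit message can be
illustrated with a toy model.  This is only a sketch, not the real
anchor script.  It assumes that every manifest rewrite restarts the
bootstrap pod and that the pod must stay up for a fixed number of
simulated ticks before a health check passes.  The names
`simulate_bootstrap` and `STARTUP_TICKS` are hypothetical and exist only
for this illustration.

```shell
#!/bin/sh
# Toy model of the anchor's bootstrap loop; NOT the real _etcdctl_anchor.tpl.
# Assumptions (hypothetical): each manifest rewrite restarts the pod, and the
# pod needs STARTUP_TICKS simulated ticks of uptime before it reports healthy.

STARTUP_TICKS=5

# simulate_bootstrap PERIOD MAX_ITERATIONS
# Prints the number of manifest rewrites made before the health check passed,
# or "stuck" if it never passed within MAX_ITERATIONS loop iterations.
simulate_bootstrap() {
    period=$1
    max_iterations=$2
    uptime=0
    rewrites=0
    iteration=0
    while [ "$iteration" -lt "$max_iterations" ]; do
        iteration=$((iteration + 1))
        # Health check: passes only once the pod has been up long enough.
        if [ "$uptime" -ge "$STARTUP_TICKS" ]; then
            echo "$rewrites"
            return 0
        fi
        # Stand-in for create_manifest: replaces the pod, resetting its uptime.
        rewrites=$((rewrites + 1))
        uptime=0
        # Stand-in for the added sleep: the pod runs undisturbed for one tick
        # of loop overhead plus PERIOD ticks before the next iteration.
        uptime=$((uptime + 1 + period))
    done
    echo "stuck"
    return 1
}
```

In this model, `simulate_bootstrap 0 20` prints `stuck`, because the pod
is replaced before it can ever become healthy, while
`simulate_bootstrap 10 20` prints `1`.  The real behavior is a race
rather than a guaranteed hang, so the model exaggerates the failure, but
it shows why pacing the loop lets the bootstrap pod survive long enough
to pass the health check.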
diff --git a/charts/etcd/templates/bin/_etcdctl_anchor.tpl b/charts/etcd/templates/bin/_etcdctl_anchor.tpl
index b0d8b762..c17fca3e 100644
--- a/charts/etcd/templates/bin/_etcdctl_anchor.tpl
+++ b/charts/etcd/templates/bin/_etcdctl_anchor.tpl
@@ -76,6 +76,7 @@ while true; do
         ETCD_INITIAL_CLUSTER=${ETCD_NAME}=https://\$\(POD_IP\):{{ .Values.network.service_peer.target_port }}
         ETCD_INITIAL_CLUSTER_STATE=new
         create_manifest "$ETCD_INITIAL_CLUSTER" "$ETCD_INITIAL_CLUSTER_STATE" "$MANIFEST_PATH"
+        sleep {{ .Values.anchor.period }}
         continue
     fi
     {{- end }}