* remove healthcheck sidecar and perform probes in the etcd
container itself, since failing liveness probes in the sidecar
do not restart a problematic etcd container;
* verify that etcdctl member list cmd in anchor is
always successful;
* adjust ETCDCTL_ENDPOINTS env in etcd container to
POD_IP variable instead of localhost (127.0.0.1);
* add liveness/readiness probes to auxiliary etcd as
well as properly passing etcd configuration variables
as strings;
* monitor current leader in initial etcd cluster; if an
aux member is the current leader, pass leadership to a permanent
member (the same check applies to the aux suicide process);
* etcd aux pod stays alive until all permanent nodes
come up and join the cluster and apiserver no longer
relies on aux members;
* add a 5-second sleep between aux member removals for a
smoother transition.
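
The leader hand-off check described above can be sketched as follows. This is a minimal illustration, not the chart's actual script: the `Member` type, its field names, and `pick_leader_target` are hypothetical, and the step that would act on the result (for example `etcdctl move-leader`) is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Member:
    """Hypothetical view of one etcd member; field names are illustrative."""
    member_id: str
    name: str
    is_aux: bool       # True for auxiliary (temporary) members
    is_healthy: bool


def pick_leader_target(members: List[Member], leader_id: str) -> Optional[Member]:
    """Return the permanent member that should take over leadership.

    Returns None when no hand-off is needed (the leader is already a
    permanent member or is unknown) or when no healthy permanent
    member is available yet.
    """
    leader = next((m for m in members if m.member_id == leader_id), None)
    if leader is None or not leader.is_aux:
        return None
    candidates = [m for m in members if not m.is_aux and m.is_healthy]
    return candidates[0] if candidates else None


members = [
    Member("a1", "auxiliary-0", is_aux=True, is_healthy=True),
    Member("b2", "node-1", is_aux=False, is_healthy=True),
]
target = pick_leader_target(members, leader_id="a1")
print(target.name if target else "no hand-off needed")
```

Keeping the decision separate from the command that performs the transfer lets the same check run both in the monitoring loop and before an aux member removes itself.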
Signed-off-by: Ruslan Aliev <raliev@mirantis.com>
Change-Id: I7918072a6ba5a6b22b359d1616def8c31425462d
As of v3.5.6, etcd-io has switched to a
distroless base image. Etcd anchor pods
now use etcd-utility, and etcd
runs a sidecar for health checks.
Change-Id: I198dca1209097de4d60a53a7568f0c4790679599
This PS adds the ability to limit (throttle) the number of
simultaneously uploaded backups while keeping the logic on the client
side, using flag files on the remote side.
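
A rough sketch of client-side throttling with flag files, under stated assumptions: a local temporary directory stands in for the remote side, and `try_acquire_slot`, `release_slot`, and the `.flag` naming are hypothetical. The real implementation lives in helm-toolkit and may differ.

```python
import os
import tempfile
import uuid
from typing import Optional


def try_acquire_slot(flag_dir: str, max_parallel: int) -> Optional[str]:
    """Create a flag file, then back off if too many uploads are active.

    Returns the flag path on success, or None when the limit is reached.
    """
    flag_path = os.path.join(flag_dir, f"upload-{uuid.uuid4().hex}.flag")
    with open(flag_path, "w"):
        pass  # an empty file is enough to mark an upload in progress
    active = [name for name in os.listdir(flag_dir) if name.endswith(".flag")]
    if len(active) > max_parallel:
        os.remove(flag_path)  # over the limit: withdraw and let the caller retry
        return None
    return flag_path


def release_slot(flag_path: str) -> None:
    """Remove the flag file once the upload has finished."""
    os.remove(flag_path)


# A temporary local directory stands in for the remote side (assumption).
with tempfile.TemporaryDirectory() as flag_dir:
    first = try_acquire_slot(flag_dir, max_parallel=2)
    second = try_acquire_slot(flag_dir, max_parallel=2)
    third = try_acquire_slot(flag_dir, max_parallel=2)   # limit reached
    release_slot(first)
    fourth = try_acquire_slot(flag_dir, max_parallel=2)  # a slot is free again
    print(third is None, fourth is not None)
```

Creating the flag before counting means two clients racing for the last slot may both back off, but the limit itself is never exceeded; a caller would simply wait and try again.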
Change-Id: I753faab8f3d934346d54e38bfc94cec3a8f79385
This PS updates python modules and code to match Airflow 2.6.2:
- bionic py36 gates were removed
- python code corrected to match new modules versions
- selection of python modules versions was performed based on
airflow-2.6.2 constraints
Change-Id: I9c3e139b3437414a61af7e7c0b7d7e533fadefda
Address changes and deprecations in Kubernetes v1.21=>v1.23
controller-manager:
* --authorization-kubeconfig and --authentication-kubeconfig must be set
* liveness/readiness probes must use HTTPS
* the default port has been changed to 10257
kubelet:
* --dynamic-config-dir has been deprecated, will not move to GA
* --cni-bin-dir has been deprecated, will be removed with dockershim
* --cni-conf-dir has been deprecated, will be removed with dockershim
* --network-plugin has been deprecated, will be removed with dockershim
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.23.md#deprecation
https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/
https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/281-dynamic-kubelet-configuration
Change-Id: Ia996d7c14d81d1d8b8067f11c02ffb4ce90eb49a
Pick up the helm-toolkit DB backup enhancement in etcd
to add capability to retry uploading backup to remote server.
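
The retry behavior can be pictured with a small sketch. It assumes a fixed delay between attempts; `upload_with_retries` and its parameters are hypothetical names, not the helm-toolkit interface.

```python
import time
from typing import Callable


def upload_with_retries(
    upload: Callable[[], bool],
    max_attempts: int,
    delay_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call upload() until it succeeds or max_attempts is exhausted.

    upload is any callable that returns True on success. The sleep
    argument is injectable so the wait can be skipped in tests.
    """
    for attempt in range(1, max_attempts + 1):
        if upload():
            return True
        if attempt < max_attempts:
            sleep(delay_seconds)  # wait before the next attempt
    return False


# Simulated remote that fails twice and then accepts the backup.
outcomes = iter([False, False, True])
result = upload_with_retries(lambda: next(outcomes), max_attempts=5, delay_seconds=0.0)
print(result)
```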
Change-Id: If6ea347a4c2c55f14f35d95681aaf482d0a6103c
1) Include framework for remote etcd backups.
2) Use porthole etcdctl utility image for backups.
3) Move helm-toolkit pin to latest commit.
4) Add a keystone user for RGW.
5) Add a secret for Swift API access.
6) Add a secret for backup/restore configuration.
Change-Id: Ica549c3b6bc00ca55540b8ffedd4c46af0d8d25e
This updates the coredns, haproxy and etcd chart to include the pod
security context on the pod template.
This also adds the container security context to set
readOnlyRootFilesystem flag
Change-Id: I9b5b0ea83acd4c5656577d8cbc684a5031ca0111
Allows extra environment variables to be applied to the etcd pods. Can
be used to apply tuning parameters, enable experimental flags, etc.
Change-Id: I9d82514b6e3a292edc472d885c0a61d5c81199f5
- Rewrite some anchor scripting to support dash
- 'function' not supported, refactor POSIX function declarations
- Rewrite aux monitor to support dash
- Same
Change-Id: If44c59be2f30fd30c1a668bc27e58b37575610b5
This commit enables configuration of probes
for etcd pod by manipulating/overriding values in
values.yaml or through manifests.
Change-Id: I69eabd13f8ea8b97a33281ad993ec2e88b9280bc
- If an etcd member has corrupted data or has somehow
been removed from a cluster, the anchor does not currently
recover. This change adds a threshold of X monitoring loops
after which the anchor will remove the member from the cluster
and recreate it.
Note: This is safe due to etcd's strict quorum checking on
runtime reconfiguration, see [0].
[0] https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/configuration.md#--strict-reconfig-check
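
The threshold logic can be sketched as a simple counter. This assumes the loops must fail consecutively, which the commit message does not state; `MemberWatch` and `observe` are hypothetical names, and the actual removal and re-creation of the member is not shown.

```python
class MemberWatch:
    """Count failed monitoring loops for one etcd member."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.failures = 0

    def observe(self, healthy: bool) -> bool:
        """Record one monitoring loop.

        Returns True when the member has been unhealthy for `threshold`
        consecutive loops and should be removed and recreated.
        """
        if healthy:
            self.failures = 0  # any healthy loop resets the count
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.failures = 0  # start counting afresh after recovery
            return True
        return False


watch = MemberWatch(threshold=3)
loops = [False, False, True, False, False, False]
decisions = [watch.observe(healthy) for healthy in loops]
print(decisions)
```

Only the last loop triggers recovery here, because the healthy third loop resets the count.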
Change-Id: Id2ceea7393c46bed9fa5e3ead37014e52c91eac3
This updates the etcd chart to include the pod
security context on the pod template.
This also adds the container security context to set
readOnlyRootFilesystem flag to false
Change-Id: I34a8ab3e850779192491b9b127a82b82f05fa00b
By design, the anchor pods clean up after their static pods
(and associated secrets/configs) via a hook when the anchor
pods are stopped, to make sure that cruft is not left lying around
(or running) when an anchor pod is no longer scheduled to a host.
However, it's been observed that on a host under high load, e.g.
if one or two other control plane hosts are down, then the anchor
pods may be stopped in an unplanned manner. This results in
service unavailability for the anchored static manifest pods.
This change makes that cleanup behavior configurable (following the
pattern already implemented in the haproxy chart) but leaves it on
by default.
Change-Id: Iab14510ef8ea5b9e400e0f744231811117029887
The probe script is not being mounted into this pod, causing failures at runtime.
This reverts commit a2e452ae42.
Change-Id: If005ff4244159262c88bfcd85bf2c48caf4b279b
This commit adds a liveness probe to the calico-etcd-anchor pod
and both liveness and readiness probes to the calico-etcd pod.
Change-Id: I2f856fa9d73152073accd753e715558457ff59e2
- Changed backup path to /var/backups/etcd
- Changed backup filename to service name to support multiple releases
- Removed additional etcd from cronjob name
Change-Id: I1fabdfe1dccd8e170090eec0a69b2598e1e3e422
Signed-off-by: Sreejith Punnapuzha <Sreejith.Punnapuzha@outlook.com>
This is an effort to implement etcd backup.
This will create a k8s cron job to take a regular backup.
Change-Id: If2c89ac01540c0f13f9b57a6833a8ea770379717
Signed-off-by: Sreejith Punnapuzha <Sreejith.Punnapuzha@outlook.com>
This change updates the following components in the Promenade charts,
docs, and example bootstrap configuration:
Kubernetes 1.10.11 -> 1.11.6
CoreDNS 1.1.2 -> 1.1.3 (per k8s 1.11 recommendations)
Etcd 3.2.14 -> 3.2.18 (per k8s 1.11 recommendations)
Tiller 2.10.0 -> 2.12.1 (per Helm k8s support)
This change has been tested by the Promenade resiliency gate.
Change-Id: Ia70de212dd2d50c6638578b92c750a4d5c791229
- Update Makefile to more closely match UCP standards
- Add resource limits to any Pods missing them
Change-Id: Ia791a6b207c2baca7dd3141be71aef513c916661
- During genesis there was a race condition between the genesis node
leaving and other nodes joining.
- Updated etcd anchor to update the config when a host is not healthy.
fixes #54
Change-Id: I0ba2c831c73cc3136ee635e7d0c0efcc8b009858
* etcd - bump to 3.2.14 (latest stable)
* calico - bump to 2.6.5 (latest 2.6 series)
* replace :master with :latest in tests (master is no longer a published
tag by CICD)
Change-Id: I82df5038a139658aed015bc2f53eab6e79a15c40
This change includes several interconnected features:
* Migration to Deckhand-based configuration. This is integrated here,
because new configuration data were needed, so it would have been
wasted effort to either implement it in the old format or to update
the old configuration data to Deckhand format.
* Failing faster with stronger validation. Migration to Deckhand
configuration was a good opportunity to add schema validation, which
is a requirement in the near term anyway. Additionally, rendering
all templates up front adds an additional layer of "fail-fast".
* Separation of certificate generation and configuration assembly into
different commands. Combined with Deckhand substitution, this creates
a much clearer distinction between Promenade configuration and
deployable secrets.
* Migration of components to charts. This is a key step that will
enable support for dynamic node management. Additionally, this paves
the way for significant configurability in component deployment.
* Version of kubelet is configurable & controlled via download url.
* Restructuring templates to be more intuitive. Many of the templates
require changes or deletion due to the migration to charts.
* Installation of pre-configured useful tools on hosts, including calicoctl.
* DNS is now provided by coredns, which is highly configurable.
Change-Id: I9f2d8da6346f4308be5083a54764ce6035a2e10c