- armada-airskiff-deploy is a voting gate again
- fixed falcon.API deprecation: falcon.API -> falcon.App
- fixed the collections.abc.defaultdict not-found error
- fixed tox4 requirements
- implemented the requirements-frozen.txt approach to align with other
  Airship projects
- uplifted the docker version in the image building and publishing gate
Change-Id: I337ec07cd6d082acabd9ad65dd9eefb728a43b12
Bumping k8s client to v25.3.0
CronJob batch/v1beta1 is no longer available in k8s 1.25
Update the tox.ini file to be compatible with tox v4
Change-Id: Iac79c52c97c9ef1223ae8d502da1572ef8d068fa
Helm 3 breaking changes (likely non-exhaustive):
- the crd-install hook was removed and replaced with a crds directory
  in the chart; all CRDs defined there are installed before any
  rendering of the chart
- the test-failure hook annotation value was removed, and test-success
  is deprecated. Use test instead
- `--force` no longer handles recreating resources which
cannot be updated due to e.g. immutability [0]
- `--recreate-pods` removed, use declarative approach instead [1]
[0]: https://github.com/helm/helm/issues/7082
[1]: https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
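The declarative replacement for `--recreate-pods` described in [1] is a
checksum annotation on the pod template, so that a config change rolls the
pods. A minimal sketch, with hypothetical file and template names:

```yaml
# templates/deployment.yaml (illustrative fragment, per [1])
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        # The hash changes whenever the ConfigMap changes, which
        # forces the Deployment to roll its pods on upgrade.
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
```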
Signed-off-by: Sean Eagan <seaneagan1@gmail.com>
Change-Id: I20ff40ba55197de3d37e5fd647e7d2524a53248f
This reverts commit c75898cd6a.
Airship 2 ended up using the Flux helm-controller instead:
https://github.com/fluxcd/helm-controller
So this is no longer needed. Remove it to shed tech debt and to ease
the introduction of Helm 3 support.
This retains the part of the commit which extracts the
chart download logic to its own handler as this is still useful.
Change-Id: Icb468be2d4916620fd78df250fd038ab58840182
When a pod is evicted due to low resource availability, Armada keeps
waiting for the Evicted pod to become ready. This commit removes that
behavior, since Kubernetes will spin up a new pod.
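The check can be sketched as follows. This is a minimal illustration, not
Armada's actual code; `pod` is a dict shaped like the k8s API object:

```python
def pod_is_evicted(pod):
    """Detect an Evicted pod so wait logic can skip it.

    An evicted pod reports phase "Failed" with reason "Evicted";
    Kubernetes will spin up a replacement, so there is no point in
    continuing to wait for this pod to become ready.
    """
    status = pod.get('status') or {}
    return (status.get('phase') == 'Failed'
            and status.get('reason') == 'Evicted')
```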
Story: 2008645
Task: 41906
Signed-off-by: Thiago Brito <thiago.brito@windriver.com>
Change-Id: I7263eebe357b0952375d538555536dc9f7cceff4
Armada uses a Kubernetes watch to implement its chart wait logic. This
can be a fairly long-lived connection to the Kubernetes API server, and
is vulnerable to disruption (if, for example, the kubernetes apiserver
chart is being upgraded).
This change allows Armada to retry the wait for some specific errors,
including the establishment of a new watch, until the overall chart
timeout is reached.
https://github.com/kubernetes-client/python/issues/972
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
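The retry loop can be sketched as below. In practice the retriable error is
the urllib3 ProtocolError shown above; a builtin ConnectionError stands in
here so the sketch is dependency-free, and `watch_once` is a hypothetical
helper that runs a single watch attempt:

```python
import time

# In practice the retriable error is urllib3.exceptions.ProtocolError
# ("Connection broken: IncompleteRead(0 bytes read)"); the builtin
# ConnectionError stands in so this sketch has no dependencies.
RETRIABLE = (ConnectionError,)

def wait_with_retry(watch_once, timeout):
    """Re-establish a broken watch until the overall chart timeout.

    `watch_once(remaining)` runs one watch attempt and returns True once
    the wait condition is met; it may raise a retriable error when the
    connection to the API server is disrupted.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError('chart wait timed out')
        try:
            if watch_once(remaining):
                return
        except RETRIABLE:
            # Connection dropped (e.g. apiserver chart upgrade): start
            # a fresh watch with the time still remaining.
            continue
```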
Change-Id: I3e68a54becadd5b2a2343960a120bdc3de8e8515
The mechanism to disable waits is to explicitly set
wait.resources: []
After a recent update [https://review.opendev.org/#/c/697728/], this
configuration results in the default waits (pods and jobs).
This change restores the original functionality.
Change-Id: If5d09f649ca037855c88f928aee6b4dc35ca8b48
Airship 2 is using Argo for workflow management, rather
than the builtin Armada workflow functionality. Hence, this
adds an apply_chart CLI command to apply a single chart at
a time, so that Argo can manage the higher level orchestration.
Airship 2 is also using kubernetes as opposed to Deckhand as the
document store. Hence this adds an ArmadaChart kubernetes CRD,
which can be consumed by the apply_chart CLI command. The chart
`dependencies` feature is intentionally not supported by the CRD, as
there are additional complexities in making that work; ideally this
feature should be deprecated anyway, as charts should build in their
dependencies before consumption by Armada.
Functional tests are included to exercise these features
against a minikube cluster.
Change-Id: I2bbed83d6d80091322a7e60b918a534188467239
This patch set adds in a missing space in a log message so it can be more
easily parsed, as the previous fields were separated by space.
Change-Id: I96cceb644c8193909a91fb42e13c43db0f83ba8d
Signed-off-by: Tin Lam <tin@irrational.io>
From recently merged document updates in [0] there is a desire to
standardize the Airship project python codebase. This is the effort
to do so for the Armada project.
[0] https://review.opendev.org/#/c/671291/
Change-Id: I4fe916d6e330618ea3a1fccfa4bdfdfabb9ffcb2
Under some wait conditions, Armada returns a tuple that contains an
extraneous element, causing a Python error. This patch set fixes that so
the method always returns a 2-tuple instead of sometimes a 3-tuple.
Change-Id: I4c4dfcf03e63f03ad2adc083d39909cf4b47a27f
Signed-off-by: Tin Lam <tin@irrational.io>
If a progress deadline is exceeded while waiting on deployments, Armada
returns three values when two are expected, resulting in a ValueError
exception. This change properly formats the return value.
Change-Id: I49e6c2a022b3bb9bf8d6a01cd2ef261f52eaa426
This excludes the following generated objects from wait logic:
1. cronjob-generated jobs: these are not directly part of the release,
   so it is better not to wait on them. If there is a desire to wait on
   initial cronjob success, we can add a separate "type: cronjob" wait
   for that purpose.
2. job-generated pods: for the purposes of waiting on jobs, one should
ensure their configuration includes a "type: job" wait. Once
controller-based waits are included by default we can also consider
excluding controller-owned pods from the "type: pod" wait, as those
will be handled by the controller-based waits then.
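The exclusions above can be implemented via k8s ownerReferences. A minimal
sketch with hypothetical helper names; each resource is a dict shaped like
the k8s API object:

```python
def owned_by(resource, kind):
    """True if `resource` has an ownerReference of the given kind."""
    owners = resource.get('metadata', {}).get('ownerReferences') or []
    return any(o.get('kind') == kind for o in owners)

# A Job generated by a CronJob, and a Pod generated by a Job, are both
# excluded from the default waits:
def exclude_from_job_wait(job):
    return owned_by(job, 'CronJob')

def exclude_from_pod_wait(pod):
    return owned_by(pod, 'Job')
```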
Change-Id: Ibf56c6fef9ef72b62da0b066c92c5f29ee4ecb5f
This removes a fail-safe that allowed releases which (intentionally)
did not contain pods to still succeed after a best-effort wait for them
until the timeout.
Now that we have the ability to disable waiting on resource types
`wait.resources` [0], this fail-safe is no longer needed.
Now when resources are not found, armada will fail with a message
for the user to check their `wait.resources` and labels and
configure as needed. This way we can prompt the user to remove
unnecessary waiting from their deployments.
There is also a longer term plan to make these configurations less
often needed [1].
[0]: https://review.openstack.org/#/c/603901/
[1]: https://review.openstack.org/#/c/636440/
Change-Id: I859326470ecba49f2301705409c51312a601e653
Many of these may be unnecessary, but this code was adapted from Go
code, which handles uninitialized values better via "zero values";
also, the k8s python client docs show most of these fields as
"optional".
Hence, initializing leaf values in these model objects to avoid
further surprises.
Change-Id: Ib646b56dfe1ff83f0ecbedaf73fcde8ffa2be0cf
Currently, Armada checks whether the observed generation of a resource
is zero for resource wait operations; however, the value can be None in
some cases. This change verifies that the value exists and is not zero.
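The corrected check can be sketched as follows (illustrative helper, not
Armada's actual code; `status` is a dict shaped like the k8s API object):

```python
def observed_generation_ok(status):
    """True when the controller has reported a real observedGeneration.

    The field can be absent (None) as well as zero before the controller
    observes the resource, so both cases must be treated as "not ready";
    a plain `if not gen` truthiness check happens to cover both, but the
    explicit form makes the intent clear.
    """
    gen = (status or {}).get('observedGeneration')
    return gen is not None and gen != 0
```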
Change-Id: Ib81be3468e73c72b4f20c11e18120d8a5b845e59
When running helm tests for a chart release multiple times in a site,
if the previous test pod is not deleted, then the test pod creation
can fail due to a name conflict. Armada/helm support immediate test pod
cleanup, but using this means that upon test failure, the test pod logs will
not be available for debugging purposes. Due to this, the recommended approach
for deleting test pods in Armada has been using `upgrade.pre.delete` actions.
Chart authors can accomplish test pod deletion using this feature;
however, it often takes a while, usually not until they test upgrading
the chart, for chart authors to realize that this is necessary and to
implement it.
This patchset automates deletion of test pods directly before running tests by
using the `wait.labels` field in the chart doc when they exist to find all pods
in the release and then using their annotations to determine if they are test
pods and deleting them if so.
A later patchset is planned to implement defaulting of the wait labels when
they are not defined.
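Identifying a test pod from its annotations can be sketched as below
(illustrative helper, not Armada's actual code; `pod` is a dict shaped like
the k8s API object):

```python
def is_test_pod(pod):
    """Identify a Helm test pod from its hook annotation.

    Helm marks test pods with the "helm.sh/hook" annotation, whose value
    is a comma-separated hook list containing "test" (or, historically,
    "test-success"/"test-failure").
    """
    annotations = pod.get('metadata', {}).get('annotations') or {}
    hooks = annotations.get('helm.sh/hook', '')
    return any(h.strip() in ('test', 'test-success', 'test-failure')
               for h in hooks.split(','))
```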
Change-Id: I2092f448acb88b5ade3b31b397f9c874c0061668
When waiting on resources that share labels with existing test pods,
an upgrade can fail due to a wait operation on the existing test pods.
This change skips wait operations on test resources by filtering them
using Helm hooks.
Change-Id: I465d3429216457ea8d088064cafa74b2b0d9b8cb
Flake8 3.6.0 now warns about both line break after and *before* binary
operator, you have to choose whether you use W503 or W504. Disable the
newer W504.
Fix "F841 local variable 'e' is assigned to but never used".
Handle warnings about invalid escape sequence in regex.
Handle invalid escape sequence in string.
Change-Id: I68efbde4e9dd2e6e9455d91313eb45c9c79d35ce
This adds a `wait.resources` key to chart documents which allows
waiting on a list of k8s type+labels configurations to wait on.
Initially supported types are pods, jobs, deployments, daemonsets, and
statefulsets. The behavior for controller types is similar to that of
`kubectl rollout status`.
If `wait.resources` is omitted, it waits on pods and jobs (if any exist)
as before.
The existing `wait.labels` key still has the same behavior, but if
`wait.resources` is also included, the labels are added to each resource
wait in that array. Thus they serve to specify base labels that apply
to all resources in the release, so as to not have to duplicate them.
This may also be useful later for example to use them as labels to wait
for when deleting a chart.
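A chart document using these keys might look like the following fragment
(all values are hypothetical):

```yaml
wait:
  timeout: 600
  labels:
    release_group: my-release   # base labels, merged into each entry below
  resources:
    - type: deployment
      min_ready: "80%"
    - type: job
    - type: pod
      labels:
        component: api          # combined with the base labels above
```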
Controller types additionally have a `min_ready` field which
represents the minimum number of pods of the controller which must
be ready in order for the controller to be considered ready. The value
can either be an integer or a percent string e.g. "80%", similar to e.g.
`maxUnavailable` in k8s. Default is "100%".
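Resolving `min_ready` to a concrete count can be sketched as below (a
hypothetical helper mirroring the behavior described above; percentages
round down here):

```python
def min_ready_count(min_ready, pod_count):
    """Resolve `min_ready` (an int or a percent string like "80%")
    to a concrete number of pods out of `pod_count`."""
    if isinstance(min_ready, str) and min_ready.endswith('%'):
        return pod_count * int(min_ready.rstrip('%')) // 100
    return int(min_ready)
```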
This also wraps up moving the rest of the wait code into its own module.
Change-Id: If72881af0c74e8f765bbb57ac5ffc8d709cd3c16
This patchset changes the wait logic as follows:
- Move wait logic to own module
- Add framework for waiting on arbitrary resource types
- Unify pod and job wait logic using above framework
- Pass resource_version to k8s watch API for cleaner event tracking
- Only sleep for `k8s_wait_attempt_sleep` when successes not met
- Update to use k8s apps_v1 API where applicable
- Allow passing kwargs to k8s APIs
- Logging cleanups
This is in preparation for adding wait logic for other types of resources
and new wait configurations.
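The framework for waiting on arbitrary resource types can be sketched as a
small class hierarchy (illustrative only, not Armada's actual classes;
readiness here mirrors the spirit of `kubectl rollout status` for a
Deployment):

```python
class ResourceWait:
    """One wait implementation per k8s resource type."""
    def is_ready(self, resource):
        raise NotImplementedError

class DeploymentWait(ResourceWait):
    def is_ready(self, deployment):
        # Ready when all desired replicas are updated and available.
        desired = deployment['spec'].get('replicas', 1)
        status = deployment.get('status') or {}
        return (status.get('updatedReplicas') == desired
                and status.get('availableReplicas') == desired)
```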
Change-Id: I92e12fe5e0dc8e79c5dd5379799623cf3f471082