Commit Graph

27 Commits

Ruslan Aliev d9e2248172 Add configurable support of armada-operator for armada-api
Signed-off-by: Ruslan Aliev <raliev@mirantis.com>
Change-Id: I76fb41062d152bf360a85d781c19ab5b204769b8
2024-02-12 11:09:18 -06:00
Sergiy Markin 386a686e69 [focal] Python modules sync with Airship project
- armada-airskiff-deploy is a voting gate again
- fixed falcon.API deprecation (falcon.API -> falcon.App)
- fixed collections.abc.defaultdict not found error
- fixed tox4 requirements
- implemented the requirements-frozen.txt approach to align with other
  Airship projects
- uplifted docker version in the image building and publishing gate

Change-Id: I337ec07cd6d082acabd9ad65dd9eefb728a43b12
2023-04-21 23:49:14 +00:00
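
For illustration, a minimal sketch of the falcon rename mentioned above (the middleware value is an assumption; this is not the commit's actual code):

```python
import falcon

# Before (falcon < 3.0): app = falcon.API(middleware=[])
# After (falcon >= 3.0), the class is renamed; the constructor
# arguments used here are unchanged.
app = falcon.App(middleware=[])
```
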
SPEARS, DUSTIN (ds443n) 099de8aaf4 Bump k8s client to v25.3.0
Bumping k8s client to v25.3.0
CronJob batch/v1beta1 is no longer available in k8s 1.25
Update tox.ini file to be compatible with v4

Change-Id: Iac79c52c97c9ef1223ae8d502da1572ef8d068fa
2023-01-18 11:25:05 -05:00
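
For illustration, a hedged sketch of the API group move (CronJob graduated from batch/v1beta1 to batch/v1 in Kubernetes 1.21, and batch/v1beta1 was removed in 1.25); this is not the commit's actual code:

```python
from kubernetes import client, config

config.load_kube_config()

# Before: client.BatchV1beta1Api().list_namespaced_cron_job("default")
batch = client.BatchV1Api()
cron_jobs = batch.list_namespaced_cron_job("default")
for cj in cron_jobs.items:
    print(cj.metadata.name, cj.spec.schedule)
```
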
Sean Eagan 68747d0815 Use helm 3 CLI as backend
Helm 3 breaking changes (likely non-exhaustive):

- crd-install hook removed and replaced with a crds directory in
  the chart; all CRDs defined in it are installed before any
  rendering of the chart
- test-failure hook annotation value removed, and test-success
  deprecated. Use test instead
- `--force` no longer handles recreating resources which
  cannot be updated due to e.g. immutability [0]
- `--recreate-pods` removed, use declarative approach instead [1]

[0]: https://github.com/helm/helm/issues/7082
[1]: https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments

Signed-off-by: Sean Eagan <seaneagan1@gmail.com>
Change-Id: I20ff40ba55197de3d37e5fd647e7d2524a53248f
2021-10-04 21:40:26 -05:00
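
For illustration, a minimal sketch of what shelling out to the helm 3 CLI can look like; the flags are standard helm 3, but the wrapper itself is hypothetical, not Armada's implementation:

```python
import subprocess

def helm_upgrade(release, chart_path, namespace, values_file=None):
    """Illustrative wrapper around the helm 3 CLI (not Armada's code)."""
    cmd = [
        "helm", "upgrade", release, chart_path,
        "--install",              # create the release if it does not exist
        "--namespace", namespace,
        "--wait",                 # block until resources are ready
    ]
    if values_file:
        cmd += ["--values", values_file]
    # Raises CalledProcessError on a non-zero exit code.
    subprocess.run(cmd, check=True)
```
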
Sean Eagan 5f1ffbbbbe Revert "Airship 2 support features"
This reverts commit c75898cd6a.

Airship 2 ended up using the Flux helm-controller instead:
https://github.com/fluxcd/helm-controller

So this is no longer needed. Removing it to get rid of tech
debt to ease introduction of Helm 3 support.

This retains the part of the commit which extracts the
chart download logic to its own handler as this is still useful.

Change-Id: Icb468be2d4916620fd78df250fd038ab58840182
2021-09-30 17:22:16 -05:00
Thiago Brito 1e5d781fe9 Fix Armada waiting indefinitely for Evicted pods
When a pod is evicted due to low resource availability,
Armada keeps waiting for the Evicted pod to become ready. This commit
removes that behavior, since Kubernetes will spin up a new pod.

Story: 2008645
Task: 41906
Signed-off-by: Thiago Brito <thiago.brito@windriver.com>
Change-Id: I7263eebe357b0952375d538555536dc9f7cceff4
2021-03-22 16:45:08 -03:00
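
For illustration, a minimal sketch of the eviction check described above (hypothetical helper, not the commit's actual code):

```python
def is_evicted(pod):
    """Evicted pods land in phase Failed with reason Evicted and will
    never become ready, so wait logic should skip rather than block."""
    return pod.status.phase == "Failed" and pod.status.reason == "Evicted"

# e.g.: pods_to_wait_on = [p for p in pods if not is_evicted(p)]
```
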
Phil Sphicas 6b2c7245de Reestablish watch and retry wait for some errors
Armada uses a Kubernetes watch to implement its chart wait logic. This
can be a fairly long-lived connection to the Kubernetes API server, and
is vulnerable to disruption (if, for example, the kubernetes apiserver
chart is being upgraded).

This change allows Armada to retry the wait for some specific errors,
including the establishment of a new watch, until the overall chart
timeout is reached.

https://github.com/kubernetes-client/python/issues/972
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))

Change-Id: I3e68a54becadd5b2a2343960a120bdc3de8e8515
2020-04-19 18:54:03 +00:00
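
For illustration, a hedged sketch of the retry pattern described above, using the kubernetes Python client; `deadline_reached` is an assumed callable standing in for the overall chart timeout:

```python
import urllib3
from kubernetes import client, watch

def wait_with_retry(namespace, deadline_reached):
    """Illustrative sketch (assumed names; not Armada's actual code):
    re-establish the watch on connection errors until the overall
    chart timeout is reached."""
    v1 = client.CoreV1Api()
    while not deadline_reached():
        w = watch.Watch()
        try:
            for event in w.stream(v1.list_namespaced_pod, namespace,
                                  timeout_seconds=30):
                # ... evaluate readiness of event["object"] here ...
                pass
        except urllib3.exceptions.ProtocolError:
            # e.g. "Connection broken: IncompleteRead(0 bytes read)"
            # when the apiserver drops the connection; start a new
            # watch and keep waiting.
            continue
```
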
Phil Sphicas ae1281d874 Fix: wait.resources: [] should disable waits
The mechanism to disable waits is to explicitly set
    wait.resources: []

After a recent update [https://review.opendev.org/#/c/697728/], this
configuration results in the default waits (pods and jobs).

This change restores the original functionality.

Change-Id: If5d09f649ca037855c88f928aee6b4dc35ca8b48
2020-04-19 18:54:03 +00:00
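
For illustration, a minimal sketch of the distinction being restored; the names are hypothetical:

```python
DEFAULT_RESOURCES = [{"type": "pod"}, {"type": "job"}]

def get_wait_resources(wait_config):
    resources = wait_config.get("resources")
    if resources is None:      # key absent: fall back to default waits
        return DEFAULT_RESOURCES
    return resources           # [] explicitly disables waits

# Buggy variant: wait_config.get("resources") or DEFAULT_RESOURCES
# silently turns an explicit [] back into the default waits,
# because an empty list is falsy.
```
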
Sean Eagan c75898cd6a Airship 2 support features
Airship 2 is using Argo for workflow management, rather
than the builtin Armada workflow functionality. Hence, this
adds an apply_chart CLI command to apply a single chart at
a time, so that Argo can manage the higher level orchestration.

Airship 2 is also using kubernetes as opposed to Deckhand as the
document store. Hence this adds an ArmadaChart kubernetes CRD,
which can be consumed by the apply_chart CLI command. The chart
`dependencies` feature is intentionally not supported by the CRD,
as there are additional complexities to make that work; ideally this
feature should be deprecated, as charts should be building in their
dependencies before consumption by Armada.

Functional tests are included to exercise these features
against a minikube cluster.

Change-Id: I2bbed83d6d80091322a7e60b918a534188467239
2020-03-25 13:56:32 -05:00
Tin Lam 253da9331f [LOG] Fix log message
This patch set adds a missing space to a log message so it can be more
easily parsed, as the preceding fields are space-separated.

Change-Id: I96cceb644c8193909a91fb42e13c43db0f83ba8d
Signed-off-by: Tin Lam <tin@irrational.io>
2020-01-31 21:50:30 -06:00
HUGHES, ALEXANDER (ah8742) b787c418e3 Standardize Armada code with YAPF
From recently merged document updates in [0] there is a desire to
standardize the Airship project python codebase.  This is the effort
to do so for the Armada project.

[0] https://review.opendev.org/#/c/671291/

Change-Id: I4fe916d6e330618ea3a1fccfa4bdfdfabb9ffcb2
2019-07-31 10:16:15 -05:00
Sean Eagan b5432ea394 Fix statefulset wait null pointer scenario
The `rollingUpdate` key is not always defined, hence need to guard
against this.

Change-Id: Ieaae680b724621fe5e5e46533c293427ecd697bc
2019-06-26 13:20:50 -05:00
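
For illustration, a minimal sketch of the null guard (hypothetical helper; attribute names are from the k8s Python client):

```python
def get_partition(statefulset):
    """spec.update_strategy.rolling_update may be None, so chained
    attribute access needs a guard before reading partition."""
    strategy = statefulset.spec.update_strategy
    rolling_update = strategy.rolling_update if strategy else None
    if rolling_update is None or rolling_update.partition is None:
        return 0
    return rolling_update.partition
```
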
Sean Eagan 5ffa12fabe [v2 docs] Overhaul wait API
See the v1-v2 migration guide updates in this commit for details.

Change-Id: I6a8a69f8392e8065eda039597278c7dfe593a4fd
2019-05-13 16:52:44 +00:00
Sean Eagan 9a43213198 Fix log message
Change-Id: I552b031e4f23a394853b46ff73dc53a0fecbdb39
2019-04-09 15:23:09 -05:00
Tin Lam 353b52f92b Fix an Armada wait error
Under some wait conditions, Armada returns a tuple that contains an
extraneous element, causing a Python error. This patch set fixes that so
the method always returns a 2-tuple instead of sometimes a 3-tuple.

Change-Id: I4c4dfcf03e63f03ad2adc083d39909cf4b47a27f
Signed-off-by: Tin Lam <tin@irrational.io>
2019-04-09 01:16:13 +00:00
Drew Walters e75bb2d90e wait: Fix deployment progress deadline message
If a progress deadline is exceeded while waiting on deployments, Armada
returns three values when two are expected, resulting in a ValueError
exception. This change properly formats the return value.

Change-Id: I49e6c2a022b3bb9bf8d6a01cd2ef261f52eaa426
2019-04-03 15:46:12 +00:00
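
For illustration, a minimal sketch of the bug class fixed by this and the previous commit; the function is hypothetical:

```python
def deployment_progress(done):
    """Hypothetical stand-in for the wait check: callers unpack exactly
    two values, so a third element raises
    'ValueError: too many values to unpack (expected 2)'."""
    if done:
        return True, "deployment complete"
    # Buggy path returned three values:
    #   return False, "progress deadline exceeded", deadline
    return False, "progress deadline exceeded"

ready, message = deployment_progress(False)
```
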
Zuul 20e270a58b Merge "Exclude generated objects from wait logic" 2019-03-08 20:51:26 +00:00
Sean Eagan c838b2def0 Exclude generated objects from wait logic
This excludes the following generated objects from wait logic:

1. cronjob-generated jobs: these are not directly part of the release,
   so it is better not to wait on them. If there is a desire to wait on
   initial cronjob success, we can add a separate "type: cronjob" wait
   for that purpose.
2. job-generated pods: for the purposes of waiting on jobs, one should
   ensure their configuration includes a "type: job" wait. Once
   controller-based waits are included by default we can also consider
   excluding controller-owned pods from the "type: pod" wait, as those
   will be handled by the controller-based waits then.

Change-Id: Ibf56c6fef9ef72b62da0b066c92c5f29ee4ecb5f
2019-03-04 16:04:33 -06:00
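
For illustration, a hedged sketch of filtering generated objects via ownerReferences (hypothetical helper, not the commit's actual code):

```python
def is_generated(resource):
    """Skip objects created by controllers rather than by the release
    itself: jobs owned by a CronJob, pods owned by a Job."""
    for ref in resource.metadata.owner_references or []:
        if ref.kind in ("CronJob", "Job"):
            return True
    return False
```
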
Sean Eagan 3807db1b6e Fail wait when no resources found
This removes a fail-safe that allowed releases which intentionally did
not contain pods to still succeed after a best-effort wait for them
until timeout.

Now that we have the ability to disable waiting on resource types via
`wait.resources` [0], this fail-safe is no longer needed.

Now when resources are not found, Armada will fail with a message
for the user to check their `wait.resources` and labels and
configure as needed. This way we can prompt the user to remove
unnecessary waiting from their deployments.

There is also a longer term plan to make these configurations less
often needed [1].

[0]: https://review.openstack.org/#/c/603901/
[1]: https://review.openstack.org/#/c/636440/

Change-Id: I859326470ecba49f2301705409c51312a601e653
2019-03-01 17:41:50 -06:00
Sean Eagan 623c5056d8 Defend against uninitialized fields in k8s objects
Many of these may be unnecessary, but this code was adapted from
go code which handles uninitialized values better via "zero values",
also the k8s python client docs show most of these fields as
"optional".

Hence, initializing leaf values in these model objects to avoid
further surprises.

Change-Id: Ib646b56dfe1ff83f0ecbedaf73fcde8ffa2be0cf
2019-02-08 11:24:57 -06:00
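
For illustration, a minimal sketch of the defensive defaulting described above (hypothetical helper; the field names are from the k8s Python client, where these status fields are optional):

```python
def safe_replica_counts(deployment):
    """Treat None as the 'zero value' before comparing counts, since
    the client may leave optional status fields uninitialized."""
    status = deployment.status
    return (status.ready_replicas or 0, status.updated_replicas or 0)
```
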
Drew Walters 56ee364b75 wait: Verify observed_generation exists
Currently, Armada checks if the observed generation of a resource is
zero for resource wait operations; however, the value can be None in
some cases. This change verifies that the value exists and is not zero.

Change-Id: Ib81be3468e73c72b4f20c11e18120d8a5b845e59
2019-02-07 20:07:20 -06:00
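
For illustration, a minimal sketch of the tightened check (hypothetical helper):

```python
def generation_observed(resource):
    """observed_generation can be None (never populated) as well as 0;
    both mean the controller has not yet observed the resource."""
    observed = resource.status.observed_generation
    return observed is not None and observed != 0
```
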
Sean Eagan c31a961bf1 Automate deletion of test pods
When running helm tests for a chart release multiple times in a site,
if the previous test pod is not deleted, then the test pod creation
can fail due to a name conflict. Armada/Helm support immediate test pod
cleanup, but using this means that upon test failure the test pod logs
will not be available for debugging purposes. Due to this, the
recommended approach for deleting test pods in Armada has been to use
`upgrade.pre.delete` actions. Chart authors can accomplish test pod
deletion using this feature; however, it often takes a while, usually
not until they test upgrading the chart, for chart authors to realize
that this is necessary and to get it implemented.

This patchset automates deletion of test pods directly before running tests by
using the `wait.labels` field in the chart doc when they exist to find all pods
in the release and then using their annotations to determine if they are test
pods and deleting them if so.

A later patchset is planned to implement defaulting of the wait labels when
they are not defined.

Change-Id: I2092f448acb88b5ade3b31b397f9c874c0061668
2019-01-28 13:19:09 -06:00
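
For illustration, a hedged sketch of the approach described above, using the kubernetes Python client; the helper and its use of the helm.sh/hook annotation are illustrative, not Armada's actual code:

```python
from kubernetes import client

def delete_test_pods(namespace, wait_labels):
    """Find release pods via the chart's wait.labels, then delete the
    ones whose annotations mark them as Helm test hooks."""
    v1 = client.CoreV1Api()
    selector = ",".join(f"{k}={v}" for k, v in wait_labels.items())
    pods = v1.list_namespaced_pod(namespace, label_selector=selector)
    for pod in pods.items:
        annotations = pod.metadata.annotations or {}
        # Helm marks test pods with the helm.sh/hook annotation,
        # e.g. "test-success" (and, in Helm 3, "test").
        if "test" in annotations.get("helm.sh/hook", ""):
            v1.delete_namespaced_pod(pod.metadata.name, namespace)
```
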
Drew Walters 1a28e6b72f wait: Remove test pods from wait
When waiting on resources that share labels with existing test pods,
an upgrade can fail due to a wait operation on the existing test pods.
This change skips wait operations on test resources by filtering them
using Helm hooks.

Change-Id: I465d3429216457ea8d088064cafa74b2b0d9b8cb
2018-11-06 21:58:54 +00:00
Andreas Jaeger cb737354f0 Fix Flake8 3.6.0 errors
Flake8 3.6.0 now warns about line breaks both after and *before* a
binary operator; you have to choose between W503 and W504. Disable the
newer W504.

Fix "F841 local variable 'e' is assigned to but never used".

Handle warnings about invalid escape sequence in regex.

Handle invalid escape sequence in string.

Change-Id: I68efbde4e9dd2e6e9455d91313eb45c9c79d35ce
2018-10-26 09:23:42 -04:00
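
For illustration, a minimal example of the invalid-escape-sequence fix class:

```python
import re

# '\d' in a normal string is an invalid escape sequence, which newer
# tooling flags; a raw string makes the regex explicit.
# Before: pattern = re.compile('\d+')
pattern = re.compile(r'\d+')
```
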
Sean Eagan 28f919d60f Fix log message formatting error
Logging does not support new style format strings.

Change-Id: I8fcdb8a1034066a41a46ba3a6fc45dd3b0257c99
2018-10-15 08:55:25 -05:00
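
For illustration, a minimal example of the formatting pitfall: stdlib logging applies %-style formatting to its message arguments:

```python
import logging

LOG = logging.getLogger(__name__)
release = "my-release"

# Broken: LOG.info('Deployed release {}', release) never substitutes
# '{}' and produces a logging error when arguments are supplied.
LOG.info('Deployed release %s', release)
```
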
Sean Eagan 9fad5cff0a Add chart API to wait on k8s resource types/labels
This adds a `wait.resources` key to chart documents, which allows
waiting on a list of k8s type+labels configurations.
Initially supported types are pods, jobs, deployments, daemonsets, and
statefulsets. The behavior for controller types is similar to that of
`kubectl rollout status`.

If `wait.resources` is omitted, it waits on pods and jobs (if any exist)
as before.

The existing `wait.labels` key still has the same behavior, but if
`wait.resources` is also included, the labels are added to each resource
wait in that array. Thus they serve to specify base labels that apply
to all resources in the release, so as to not have to duplicate them.
This may also be useful later for example to use them as labels to wait
for when deleting a chart.

Controller types additionally have a `min_ready` field which
represents the minimum number of the controller's pods which must
be ready in order for the controller to be considered ready. The value
can either be an integer or a percent string e.g. "80%", similar to e.g.
`maxUnavailable` in k8s. Default is "100%".

This also wraps up moving the rest of the wait code into its own module.

Change-Id: If72881af0c74e8f765bbb57ac5ffc8d709cd3c16
2018-10-05 16:48:32 -05:00
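
For illustration, a hedged sketch of how a `min_ready` value like "80%" could be resolved to a pod count; the helper is hypothetical:

```python
import math

def min_ready_count(min_ready, total_pods):
    """Resolve an int or percent string (e.g. "80%") to the required
    number of ready pods."""
    if isinstance(min_ready, str) and min_ready.endswith("%"):
        return math.ceil(total_pods * int(min_ready[:-1]) / 100)
    return int(min_ready)

# min_ready_count("80%", 5) -> 4; min_ready_count(3, 5) -> 3
```
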
Sean Eagan a9d55ab052 Clean up and refactor wait logic
This patchset changes the wait logic as follows:

- Move wait logic to own module
- Add framework for waiting on arbitrary resource types
- Unify pod and job wait logic using above framework
- Pass resource_version to k8s watch API for cleaner event tracking
- Only sleep for `k8s_wait_attempt_sleep` when successes not met
- Update to use k8s apps_v1 API where applicable
- Allow passing kwargs to k8s APIs
- Logging cleanups

This is in preparation for adding wait logic for other types of resources
and new wait configurations.

Change-Id: I92e12fe5e0dc8e79c5dd5379799623cf3f471082
2018-09-25 12:48:25 -05:00
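
For illustration, a minimal sketch of the resource_version pattern mentioned above, using the kubernetes Python client:

```python
from kubernetes import client, config, watch

config.load_kube_config()
v1 = client.CoreV1Api()

# Start the watch from the resourceVersion of an initial list, so only
# events newer than that snapshot are delivered.
pods = v1.list_namespaced_pod("default")
w = watch.Watch()
for event in w.stream(v1.list_namespaced_pod, "default",
                      resource_version=pods.metadata.resource_version):
    print(event["type"], event["object"].metadata.name)
```
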