Commit Graph

45 Commits

Author SHA1 Message Date
SPEARS, DUSTIN (ds443n) 099de8aaf4 Bump k8s client to v25.3.0
Bumping k8s client to v25.3.0
Cronjob batch v1beta1 no longer available in k8s 1.25
Update tox.ini file to be compatible with v4

Change-Id: Iac79c52c97c9ef1223ae8d502da1572ef8d068fa
2023-01-18 11:25:05 -05:00
Phil Sphicas c5d39f27ca Create lock CRD as apiextensions.k8s.io/v1 object
Kubernetes v1.22 stopped serving the apiextensions.k8s.io/v1beta1 API
version of CustomResourceDefinition.

This change ensures that the locks.armada.process CRD is created using
the apiextensions.k8s.io/v1 API.

The kubernetes client package is also updated to take advantage of the
dynamic client.

Change-Id: Icd518ab5cbb78e8b15f63d19c51b5f5b9a67e995
2022-03-09 16:36:40 -08:00
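
As a rough illustration of the change above, a CRD manifest for the apiextensions.k8s.io/v1 API could be built as below. Only the CRD name locks.armada.process and the API version come from the commit message; the kind name, version name, scope, and permissive schema are assumptions made for this sketch, and the function name build_lock_crd is hypothetical.

```python
def build_lock_crd(group="armada.process", plural="locks",
                   kind="Lock", version="v1"):
    """Return a CustomResourceDefinition manifest for the v1 CRD API.

    The kind, version name, scope, and schema are illustrative
    assumptions, not values taken from Armada.
    """
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # CRD names take the form <plural>.<group>.
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"plural": plural, "kind": kind},
            "versions": [
                {
                    "name": version,
                    "served": True,
                    "storage": True,
                    # The v1 API requires a schema for each version.
                    "schema": {
                        "openAPIV3Schema": {
                            "type": "object",
                            "x-kubernetes-preserve-unknown-fields": True,
                        }
                    },
                }
            ],
        },
    }
```

Unlike v1beta1, the v1 API expects a schema per version, which is why the sketch includes a minimal one.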
Sean Eagan 68747d0815 Use helm 3 CLI as backend
Helm 3 breaking changes (likely non-exhaustive):

- crd-install hook removed and replaced with crds directory in
  chart where all CRDs defined in it will be installed before
  any rendering of the chart
- test-failure hook annotation value removed, and test-success
  deprecated. Use test instead
- `--force` no longer handles recreating resources which
  cannot be updated due to e.g. immutability [0]
- `--recreate-pods` removed, use declarative approach instead [1]

[0]: https://github.com/helm/helm/issues/7082
[1]: https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments

Signed-off-by: Sean Eagan <seaneagan1@gmail.com>
Change-Id: I20ff40ba55197de3d37e5fd647e7d2524a53248f
2021-10-04 21:40:26 -05:00
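
The hook changes listed above could be applied to an existing "helm.sh/hook" annotation value roughly as follows. This is a hypothetical helper, not part of Armada or Helm: the function name and the policy of silently dropping removed hooks are assumptions for illustration.

```python
# Hook values that Helm 3 no longer supports, per the list above.
REMOVED_HOOKS = {"crd-install", "test-failure"}
# Deprecated hook values and their replacements.
RENAMED_HOOKS = {"test-success": "test"}


def migrate_hooks(annotation_value):
    """Rewrite a comma-separated hook annotation value for Helm 3."""
    hooks = []
    for hook in annotation_value.split(","):
        hook = hook.strip()
        if not hook or hook in REMOVED_HOOKS:
            continue
        hook = RENAMED_HOOKS.get(hook, hook)
        # Avoid duplicates, e.g. when both "test" and "test-success" appear.
        if hook not in hooks:
            hooks.append(hook)
    return ",".join(hooks)
```

A chart that relied on crd-install would still need its CRDs moved into the crds directory by hand; the helper only cleans up the annotation.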
DeJaeger, Darren (dd118r) 9aadc14777 Armada improved logging, uplift dependency
This PS:

1) Looks to improve specific logging in Armada, so that
it's easier to debug deployment related issues
2) Uplifts the k8s Python dependency to 12.0.0
3) Enforces 'watch' timeouts more strictly, as the call to
the Kubernetes Python watch function seemed unreliable.
4) Adds a field selector to the 'watch' stream to look for
the DELETE action to have been completed on the specific
pod/job/cronjob, rather than looking across the whole
namespace or via labels. This will narrow what the watch
is looking at, making the logs less busy.

Change-Id: I1952b0db32fb0b56ffffcddeae0532beb5a27b67
2021-06-24 10:53:06 -04:00
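
Items 3 and 4 above can be sketched as follows. The keyword names field_selector and timeout_seconds match the list-call parameters of the Kubernetes Python client, but the helper functions are hypothetical, and the events are plain dicts standing in for real watch events.

```python
def build_watch_kwargs(name, timeout_seconds):
    """Build keyword arguments that narrow a watch to one named object."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return {
        # Match a single object by name instead of the whole namespace.
        "field_selector": f"metadata.name={name}",
        "timeout_seconds": int(timeout_seconds),
    }


def saw_deletion(events, name):
    """Return True once a DELETED event for the named object is seen."""
    for event in events:
        obj_name = event["object"]["metadata"]["name"]
        if event["type"] == "DELETED" and obj_name == name:
            return True
    return False
```

Because the selector already limits the stream to one object, far fewer events reach the log than with a namespace-wide or label-based watch.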
Sean Eagan 268d7a3958 Move kubernetes client to >=11.0.0
This version had a breaking api change [0], which this
aligns with. This version also adds support for kubernetes
1.14 and 1.15 apis.

[0]: https://github.com/kubernetes-client/python/blob/master/CHANGELOG.md#v1100

Change-Id: I01866bd5739e4eebb3166cb583d07efb046360aa
2020-03-20 08:49:45 -05:00
Phil Sphicas 6a9b3bf9c9 Fix: Use Foreground deletion
Armada intends to use `propagationPolicy: Foreground` when deleting
resources. However, the empty V1DeleteOptions object in the body of the
delete API call takes priority over the propagation_policy specified as
a query param, resulting in the per-resource default. For Job and CronJob
resources, the default is Orphan, and for others it is Background.

This change includes the desired propagation_policy in V1DeleteOptions.

Reference: https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/#setting-the-cascading-deletion-policy

Change-Id: Iffee12b426ba1e7741eb5bd687ca1b2c11cb071d
2019-11-14 16:12:43 +00:00
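
The idea behind this fix can be sketched without the client library by building the DeleteOptions request body directly. The function name is hypothetical; the field names follow the Kubernetes API, where the policy travels in the body as propagationPolicy.

```python
ALLOWED_POLICIES = {"Foreground", "Background", "Orphan"}


def build_delete_options(propagation_policy="Foreground",
                         grace_period_seconds=None):
    """Return a DeleteOptions body that carries the policy explicitly."""
    if propagation_policy not in ALLOWED_POLICIES:
        raise ValueError(f"unknown policy: {propagation_policy!r}")
    body = {
        "apiVersion": "v1",
        "kind": "DeleteOptions",
        # Setting the policy in the body avoids the per-resource default.
        "propagationPolicy": propagation_policy,
    }
    if grace_period_seconds is not None:
        body["gracePeriodSeconds"] = grace_period_seconds
    return body
```

An empty body, by contrast, leaves the choice to the server, which is how Job and CronJob deletions ended up orphaning their pods.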
HUGHES, ALEXANDER (ah8742) b787c418e3 Standardize Armada code with YAPF
From recently merged document updates in [0] there is a desire to
standardize the Airship project python codebase.  This is the effort
to do so for the Armada project.

[0] https://review.opendev.org/#/c/671291/

Change-Id: I4fe916d6e330618ea3a1fccfa4bdfdfabb9ffcb2
2019-07-31 10:16:15 -05:00
Zuul 573b3885e0 Merge "Move to kubernetes python client 9.0.0" 2019-04-23 14:01:26 +00:00
Sean Eagan 0a87c5a16a Move to kubernetes python client 9.0.0
This version [0] makes the adal package optional [1], which also
removes its dependency on the cryptography package.

It also fixes a thread pool issue [2] allowing us to remove the
workaround we had in place.

[0]: https://github.com/kubernetes-client/python/blob/master/CHANGELOG.md#v900
[1]: https://github.com/kubernetes-client/python-base/pull/108
[2]: https://github.com/kubernetes-client/gen/pull/91

Change-Id: I55aa8b97483b118fbde7e11df817ad8330da9bf1
2019-04-05 15:01:57 -05:00
Michael Beaver 48920224cc Support in Armada for locking Tiller
This creates a new mechanism in Armada to enable functions to only be
run once across multiple instances of Armada working with the same
Kubernetes cluster. This is accomplished by utilizing custom resources
via the Kubernetes API.

This also introduces new config defaults that can be used to configure
the lock timeout, expiration, and update interval.

Some notes on how the lock works:
    * Functions to be locked can add the new decorator
        * The optional name parameter can be used to create multiple
          types of locks which can coexist
    * If the lock is unable to be acquired before the timeout a new
      exception is raised
    * The lock is updated regularly while the decorated function is
      still running
    * If a lock already exists it will only be overwritten if the
      duration since its last update is longer than the expiration time

For now this locking method is being used for components that require
write access to Tiller so that simultaneous write operations are
avoided.

Change-Id: Iee07da9a233ee2e2a54c6bc4881185388b377c05
2019-03-22 13:56:50 -05:00
Shoaib Nasir 7fb3b8d9ca Add support in Armada CLI to pass user bearer tokens to tiller
Added a new option --bearer-token TEXT in the Armada CLI to allow
users or applications to pass kubernetes-api bearer tokens via
tiller to the kubernetes cluster. This allows armada to interact
with a kubernetes cluster that has been configured with an external
Auth-Backend like Openstack-keystone or OpenId Connect.

Bearer tokens are auth tokens issued by identity backends
such as keystone, which represent a user's authorized access.
For a better understanding of bearer tokens, examples of how
they work can be found here:
https://kubernetes.io/docs/reference/access-authn-authz/authentication/#putting-a-bearer-token-in-a-request
https://docs.docker.com/registry/spec/auth/token/

Change-Id: I03623c7d3b58eda421a0660da8ec3ac2e86915f0
Signed-off-by: Shoaib Nasir <shoaib.nasir@windriver.com>
2019-02-01 15:33:18 -05:00
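
For reference, a bearer token is placed in an HTTP request as an Authorization header, as the first link above describes. A minimal sketch, with a hypothetical helper name:

```python
def bearer_auth_header(token):
    """Return the HTTP header that carries a bearer token."""
    token = token.strip()
    if not token:
        raise ValueError("bearer token must not be empty")
    return {"Authorization": f"Bearer {token}"}
```

How Armada forwards the value given to --bearer-token is not shown here; the sketch only illustrates the header format.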
Sean Eagan 673b1ed4bc Workaround kubernetes python client deadlock issue
The kubernetes python client has a bug [1] which results in frequent
deadlocks while being cleaned up, which causes armada to hang at the
end of execution.

This patchset works around that issue by mocking out the associated
thread pools, since they are only needed for async kubernetes api
calls, which armada does not use.

[1]: https://github.com/kubernetes-client/python/issues/411

Change-Id: I71fbfbe355347ae2ddd02ffd26d881368320246b
2018-11-26 15:58:39 -06:00
Sean Eagan a9d55ab052 Clean up and refactor wait logic
This patchset changes the wait logic as follows:

- Move wait logic to own module
- Add framework for waiting on arbitrary resource types
- Unify pod and job wait logic using above framework
- Pass resource_version to k8s watch API for cleaner event tracking
- Only sleep for `k8s_wait_attempt_sleep` when successes not met
- Update to use k8s apps_v1 API where applicable
- Allow passing kwargs to k8s APIs
- Logging cleanups

This is in preparation for adding wait logic for other types of resources
and new wait configurations.

Change-Id: I92e12fe5e0dc8e79c5dd5379799623cf3f471082
2018-09-25 12:48:25 -05:00
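
A generic wait loop in the spirit of the framework described above might look like the following. Only the idea of sleeping solely when the required successes are not yet met comes from the commit message; the function name, its parameters, and the reset-on-failure policy are assumptions for this sketch.

```python
import time


def wait_until_ready(check, timeout, required_successes=1,
                     attempt_sleep=1.0, sleep=time.sleep,
                     clock=time.monotonic):
    """Poll check() until it succeeds required_successes times in a row."""
    deadline = clock() + timeout
    successes = 0
    while True:
        if check():
            successes += 1
            if successes >= required_successes:
                # Return immediately: no sleep once successes are met.
                return True
        else:
            successes = 0
        if clock() >= deadline:
            raise TimeoutError("resource did not become ready in time")
        sleep(attempt_sleep)
```

Because check is an arbitrary callable, the same loop can serve pods, jobs, or any other resource type.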
Zuul 2bd301efaa Merge "Wait for jobs to complete" 2018-07-23 14:55:38 +00:00
Marshall Margenau ad790b98d7 Wait for jobs to complete
- Wait for jobs to show as completed, instead of relying on pods
  associated with the job to show healthy, as the pods can go
  healthy or be removed while the job is still processing. Armada
  would continue forward as soon as all pods in current scope
  show as healthy.
- Refactor delete pod action a bit, including removing unused code.
- Fixed bug in waiting for pods to delete (in tiller handler L274).
  Bug caused a hung state while deleting pods as a pre-update hook,
  by passing timeout value in the incorrect position.

Change-Id: I2a942f0a6290e8337fd7a43c3e8c9b4c9e350a10
2018-07-20 19:29:33 +00:00
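
The first point above amounts to checking the job's own status rather than the health of its pods. A minimal sketch, using a plain dict in place of the Job status object and a hypothetical function name:

```python
def job_is_complete(job_status, expected_completions=1):
    """Return True when the job reports enough successful completions.

    job_status is a dict stand-in for a Job's status; the "succeeded"
    field may be missing or None while the job is still running.
    """
    succeeded = job_status.get("succeeded") or 0
    return succeeded >= expected_completions
```

A job whose pods are healthy but unfinished reports no successes yet, so this check keeps waiting where a pod-based check would move on.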
Marshall Margenau c7c7dc671c Removing dead code.
Change-Id: I7121a6d29691cf8d3e779f2afe9ada7d263c6c9d
2018-07-18 16:04:26 -05:00
Felipe Monteiro 9dad7c17c9 chore(docstring): Fix up improper sphinx syntax in docstrings
This fixes up improper sphinx syntax in docstrings by making
the following corrections:

  * params => param
  * :param - => :param

Change-Id: I1ff457d609128ae7c5fac2c7190f5ff1a88315b3
2018-06-22 21:35:29 +00:00
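
For illustration, a docstring that uses the corrected field syntax looks like this. The function itself is a hypothetical placeholder.

```python
def describe_wait(namespace, timeout=300):
    """Describe a wait operation.

    :param namespace: Namespace to wait in.
    :param timeout: Maximum time to wait, in seconds.
    :returns: A human-readable description of the wait.
    """
    return f"waiting up to {timeout}s in namespace {namespace}"
```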
Marshall Margenau f235512d57 Adding yapf config, plus formatted code.
- Adding yapf diff to pep8 target
- Adding yapf tox target to do actual format

** The rest of this PS contains formatted code only, no other changes

Change-Id: Idfef60f53565add2d0cf65bb8e5b91072cf0aded
2018-06-22 14:56:04 -05:00
Marshall Margenau a8dff22e57 Log on k8s watch if no events occur
- will WARN when no watch events occur, when they're expected

Change-Id: I1bee7b52bf86b7688fb320a1e105fd077712009c
2018-05-31 16:53:28 -05:00
Marshall Margenau d770640b95 Revise wait timeouts plus dry-run.
- revise wait on namespace+label, only wait on ns+label for
  charts we've touched in the current apply loop
- skipping any actions that would change system during dry-run
- skip 'test' and 'wait' during dry-run
- tweaking some logs for insight and readability

Change-Id: I1223f01690832c26ce2faa96e7e64620cf413ac9
2018-05-30 16:19:35 -05:00
Tin Lam 8d1521e96c style(pep8): remove E722 exclusion
This patch set removes the E722 pep8 exclusion, which allowed "bare"
except: statements.

Change-Id: Icdce885366541b88aabbef35166cf196a588676b
Signed-off-by: Tin Lam <tin@irrational.io>
2018-05-08 23:11:18 -05:00
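
With E722 enforced, handlers must name the exceptions they catch. A small hypothetical example of the compliant form:

```python
def parse_timeout(value, default=300):
    """Parse a timeout value, falling back to a default on bad input."""
    try:
        return int(value)
    # A bare "except:" here would also swallow KeyboardInterrupt and
    # SystemExit; naming the expected exceptions avoids that.
    except (TypeError, ValueError):
        return default
```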
Sean Eagan 2b714888c4 Delete cron jobs too on pre-upgrade job delete actions
Adds a 'cronjob' key for pre-upgrade delete actions to delete cron jobs.
The 'job' key now also deletes cron jobs, since existing clients
were expecting that behavior.

Change-Id: Id320710a935976c9c1320c25049b7f22ee4136ba
2018-05-07 16:37:01 +00:00
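
The mapping described above can be sketched as a lookup from delete-action key to the resource kinds it removes. The function name and return format are assumptions for illustration.

```python
# Resource kinds removed for each pre-upgrade delete key, per the
# commit message: 'job' also covers cron jobs for compatibility.
DELETE_KINDS = {
    "job": ("Job", "CronJob"),
    "cronjob": ("CronJob",),
}


def kinds_for_delete_action(key):
    """Return the resource kinds deleted for a pre-upgrade delete key."""
    try:
        return DELETE_KINDS[key]
    except KeyError:
        raise ValueError(f"unsupported delete key: {key!r}") from None
```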
Marshall Margenau 331b443e58 fix(timeout): timeout logic error during apply
Change-Id: If317829d83e285a8e87bb8ba9878a0eee27b5c34
2018-05-03 11:23:49 -05:00
Marshall Margenau dc508d5012 fix(timeouts): Address timeout handling issues
- fixing wait handling in multiple areas
  -- wait for deleted pods before continuing Apply update
  -- cleaning up and delineating wait for charts vs chartgroups
  -- timeout exceptions to stop execution
- api/cli 'timeout' param now applies to all Charts
- api/cli 'wait' param now applies to all Charts
- update some docs
- several TODOs to be addressed in future PS

Closes #199

Change-Id: I5a697508ce6027e9182f3f1f61757319a3ed3593
2018-05-01 08:45:56 -05:00
Marshall Margenau 60b8a37f47 bug(deleted jobs) Armada deleting jobs during upgrade
- Additional logging to try to expose bug around deleted jobs
  during an upgrade.
- Cleaner chart diff logging.

Change-Id: I5edfa1857aec417203e73565a39082328e3b677b
2018-04-13 00:30:21 -04:00
Marshall Margenau 3430283865 feat(logging): Enhance logging and update grpcio
Enhance request logging (and scrub sensitive headers)
Enhance Tiller logging
Update grpcio, unpin from 1.6.0rc1

Plus a couple typo fixes
Plus a couple unused vars

Change-Id: I8afd679f6716c6e1af234a59ac44ba1fdc73cdc8
2018-03-09 11:36:57 -05:00
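
Scrubbing sensitive headers before logging a request can be sketched as below. The set of header names treated as sensitive and the function name are assumptions; the commit message does not say which headers Armada scrubs.

```python
# Header names treated as sensitive in this sketch (an assumption).
SENSITIVE_HEADERS = {"authorization", "x-auth-token"}


def scrub_headers(headers, mask="***"):
    """Return a copy of headers that is safe to write to a log."""
    return {
        # HTTP header names are case-insensitive, so compare lowercased.
        key: (mask if key.lower() in SENSITIVE_HEADERS else value)
        for key, value in headers.items()
    }
```

The original mapping is left untouched, so the request itself still carries the real credentials.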
Felipe Monteiro 4fc77ddb27 Wait for pods to become ready in a specific namespace.
Also tweaking wait retry/time logic.

Change-Id: I1f92de896e2d4b80ffe744b936bc47816be8b404
2018-03-01 14:42:50 -06:00
Mark Burnett e0b04e829b Fix: Issue where Armada hangs waiting for pods
This also hopefully provides better logging when waiting for pods.

Closes #194

Change-Id: I3704ff004c35c8ecf90555d16e42f15d24284492
2018-02-15 16:02:21 -06:00
Scott Hussey 9c73661c8b Change job deletion logic
Update handler for chart pre-upgrade Jobs deletion
to rely on Kubernetes propagationPolicy for deleting
child Pods so that more generic labels can exist in
an Armada manifest without impacting job-unrelated
pods

- Update K8s API integration to use propagationPolicy for job delete
- Make default propagationPolicy 'Foreground'
- Update documents to clarify structure of specifying pre-upgrade hooks
- Fix tox file to support running unit tests behind a HTTP proxy

Change-Id: I650543cfe05cc6a9661ab375e831bb425b7eeeab
2017-11-22 15:38:22 -06:00
gardlt 7b26e59422 feat(cli): using-click-framework
- using click framework
- added api client
- allow interactions between code and service endpoints
- documentation on the command line
- updated gitignore

Change-Id: Ibe359025f5b35606d876c29fa88e04048f276cc8
2017-11-02 20:59:57 +00:00
gardlt a99fc4ad6c bug(wait): fixing how we wait on chart and group
- added ability to choose labels to wait on
- created wait object

Change-Id: Ia3b6f7bd7b6ef15779b087c613d69f4f6a7b41e9
2017-11-02 19:51:18 +00:00
gardlt 0663a308d9 feat(tiller): updating-helm-version-2.6.0
- updated hapi lib
- implemented wait-resource-is-ready

Change-Id: Ia547bec0c83e5dca19c87a99dd2cdbe413d78c06
2017-10-27 16:08:47 +00:00
Pete Birley 746cbd0bd8 Fix(linting): Make Armada pep8 compliant
This patch set makes Armada pep8 compliant. Note the hapi/** is
autogenerated and therefore should be excluded from linting.

Change-Id: I123eefb543f9bd9cf0bc6bd98ed95646d8d72cc3
2017-09-29 11:46:58 -04:00
Alexis Rivera De La Torre d5f4378731 feat(armada): adding helm testing framework
- added helm test framework to armada
- added helm test status

Closes #151

Change-Id: I417cae04b4595ad0d4fd05889d90c83907607c47
2017-09-20 21:54:39 +00:00
Omar Rivera 498cf6c98f Fix oslo_config and oslo_log configurations
* Ensure that configurations are done via the global `cfg` object
* Ensure that the logger is configured through the global object
* Upload a configuration sample file with DEFAULT section having
  the armada.conf and oslo_log namespace
2017-08-07 21:43:32 -05:00
gardlt d175e5ef92 [feat] upgrading daemonsets
- rolling update for daemonsets
- search pods by labels
2017-08-01 11:30:54 -05:00
drewwalters96 96661239cb [Feat] Add common error/exception handling
- Add main exception handler
- Add more detailed, individual exceptions for common Armada failure points
- Add exception documentation
2017-07-27 16:22:16 -05:00
Tim Heyer 684f533eab Modify upgrade deletion to delete by labels
-Pods are now deleted directly by label after deleting job
-Solves the issue of Armada leaving dangling pods
-Wait for new pod after deletion
2017-07-25 16:31:41 -05:00
Alexis Rivera DeLa Torre e845c57370 Revert "Modify upgrade deletion to delete by labels"
This reverts commit 1663cc02c9.
2017-07-25 16:12:54 -05:00
Alexis Rivera DeLa Torre 84946764fb Revert "[WIP] Wait for new pod after deletion"
This reverts commit 03f8813ff8.
2017-07-25 16:12:54 -05:00
Tim Heyer 03f8813ff8 [WIP] Wait for new pod after deletion
- Add functionality to wait for new pod to deploy
 after deleting the old pod
2017-07-25 16:04:51 -05:00
Tim Heyer 1663cc02c9 Modify upgrade deletion to delete by labels
-Pods are now deleted directly by label after deleting job
-Solves the issue of Armada leaving dangling pods
2017-07-25 16:04:51 -05:00
drewwalters96 928c5f81da [Feat] Add support for Oslo logging configuration
- Adds Oslo logging libraries
- Enables logging configuration with a config file
- Enables debug logging with --trace flag
- Supports Docker logs
- Adds logging for tiller
2017-06-27 16:04:30 -05:00
Tim Heyer 9ed13f388e Implement wait for timeout feature and unit test
-Wait for all charts to deploy before exiting
-Add wait flag and custom timeout flag
2017-06-22 16:02:25 -05:00
gardlt 68d95bdcc5 [feature] restructure-clean-up-project
* updating file-structure
* update docker file
* update develop docs
* update api and cmd
2017-06-12 09:06:17 -05:00