Bumping k8s client to v25.3.0
CronJob batch/v1beta1 is no longer available in k8s 1.25
Update tox.ini file to be compatible with v4
Change-Id: Iac79c52c97c9ef1223ae8d502da1572ef8d068fa
Kubernetes v1.22 stopped serving the apiextensions.k8s.io/v1beta1 API
version of CustomResourceDefinition.
This change ensures that the locks.armada.process CRD is created using
the apiextensions.k8s.io/v1 API.
The kubernetes client package is also updated to take advantage of the
dynamic client.
Change-Id: Icd518ab5cbb78e8b15f63d19c51b5f5b9a67e995
Helm 3 breaking changes (likely non-exhaustive):
- crd-install hook removed and replaced with crds directory in
chart where all CRDs defined in it will be installed before
any rendering of the chart
- test-failure hook annotation value removed, and test-success
deprecated. Use test instead
- `--force` no longer handles recreating resources which
cannot be updated due to e.g. immutability [0]
- `--recreate-pods` removed, use declarative approach instead [1]
[0]: https://github.com/helm/helm/issues/7082
[1]: https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
Signed-off-by: Sean Eagan <seaneagan1@gmail.com>
Change-Id: I20ff40ba55197de3d37e5fd647e7d2524a53248f
This PS:
1) Looks to improve specific logging in Armada, so that
it's easier to debug deployment related issues
2) Uplifts the k8s Python dependency to 12.0.0
3) Enforces 'watch' timeouts more strictly, as the call to
the Kubernetes Python watch function seemed unreliable.
4) Adds a field selector to the 'watch' stream to look for
the DELETE action to have been completed on the specific
pod/job/cronjob, rather than looking across the whole
namespace or via labels. This will narrow what the watch
is looking at, making the logs less busy.
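As a rough illustration of item 4, the sketch below builds a field selector that pins a watch to one named resource and treats only a DELETED event for that name as completion. The helper names and the sample events are assumptions made for illustration. The events are plain dicts that loosely mimic the type/object shape of watch events, and this is not Armada's actual code.

```python
def name_field_selector(name):
    """Build a field selector that matches a single resource by name."""
    return "metadata.name={}".format(name)


def is_delete_complete(event, name):
    """Return True once the watched resource reports a DELETED event."""
    obj_name = event["object"]["metadata"]["name"]
    return event["type"] == "DELETED" and obj_name == name


# Hypothetical events, shaped loosely like those from a watch stream.
events = [
    {"type": "MODIFIED", "object": {"metadata": {"name": "db-init"}}},
    {"type": "DELETED", "object": {"metadata": {"name": "db-init"}}},
]

selector = name_field_selector("db-init")
deleted = any(is_delete_complete(e, "db-init") for e in events)
```

Because the selector names one object, events for unrelated resources in the namespace never reach the loop, which is what keeps the logs quieter.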
Change-Id: I1952b0db32fb0b56ffffcddeae0532beb5a27b67
Armada intends to use `propagationPolicy: Foreground` when deleting
resources. However, the empty V1DeleteOptions object in the body of the
delete API call takes priority over the propagation_policy specified as
a query param, resulting in the per-resource default. For Job and CronJob
resources, the default is Orphan, and for others it is Background.
This change includes the desired propagation_policy in V1DeleteOptions.
Reference: https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/#setting-the-cascading-deletion-policy
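A minimal sketch of the idea, using a plain dict in place of the client's V1DeleteOptions so that it runs without the kubernetes package. The helper name build_delete_options is hypothetical. The point is only that the policy travels inside the request body rather than alongside an empty one.

```python
ALLOWED_POLICIES = ("Foreground", "Background", "Orphan")


def build_delete_options(propagation_policy="Foreground"):
    """Return a DeleteOptions body that carries the propagation policy.

    Hypothetical helper: a dict stands in for V1DeleteOptions here.
    """
    if propagation_policy not in ALLOWED_POLICIES:
        raise ValueError(
            "unknown propagation policy: {!r}".format(propagation_policy))
    return {
        "apiVersion": "v1",
        "kind": "DeleteOptions",
        "propagationPolicy": propagation_policy,
    }


body = build_delete_options()
```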
Change-Id: Iffee12b426ba1e7741eb5bd687ca1b2c11cb071d
From recently merged document updates in [0] there is a desire to
standardize the Airship project python codebase. This is the effort
to do so for the Armada project.
[0] https://review.opendev.org/#/c/671291/
Change-Id: I4fe916d6e330618ea3a1fccfa4bdfdfabb9ffcb2
This creates a new mechanism in Armada to enable functions to only be
run once across multiple instances of Armada working with the same
Kubernetes cluster. This is accomplished by utilizing custom resources
via the Kubernetes API.
This also introduces new config defaults that can be used to configure
the lock timeout, expiration, and update interval.
Some notes on how the lock works:
* Functions to be locked can add the new decorator
* The optional name parameter can be used to create multiple
types of locks which can coexist
* If the lock is unable to be acquired before the timeout a new
exception is raised
* The lock is updated regularly while the decorated function is
still running
* If a lock already exists it will only be overwritten if the
duration since its last update is longer than the expiration time
For now this locking method is being used for components that require
write access to Tiller so that simultaneous write operations are
avoided.
Change-Id: Iee07da9a233ee2e2a54c6bc4881185388b377c05
Added a new option --bearer-token TEXT in the Armada CLI to allow
users or applications to pass Kubernetes API bearer tokens via
Tiller to the Kubernetes cluster. This allows Armada to interact
with a Kubernetes cluster that has been configured with an external
auth backend such as OpenStack Keystone or OpenID Connect.
Bearer tokens are auth tokens issued by identity backends
such as Keystone, and they represent a user's authorized access.
For a better understanding of bearer tokens, examples of how they
work can be found here:
https://kubernetes.io/docs/reference/access-authn-authz/authentication/#putting-a-bearer-token-in-a-request
https://docs.docker.com/registry/spec/auth/token/
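For orientation, a bearer token is presented to the Kubernetes API in the HTTP Authorization header. The sketch below shows only that header shape. The helper name and the sample token are hypothetical, and this does not show how Armada or Tiller forwards the token.

```python
def bearer_auth_header(token):
    """Return the HTTP header that carries a bearer token."""
    if not token:
        raise ValueError("a bearer token is required")
    return {"Authorization": "Bearer {}".format(token)}


# Placeholder value, not a real token.
headers = bearer_auth_header("example-token")
```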
Change-Id: I03623c7d3b58eda421a0660da8ec3ac2e86915f0
Signed-off-by: Shoaib Nasir <shoaib.nasir@windriver.com>
The kubernetes python client has a bug [1] which results in frequent
deadlocks while being cleaned up, which causes armada to hang at the
end of execution.
This patchset works around that issue by mocking out the associated
thread pools, since they are only needed for async kubernetes api
calls, which armada does not use.
[1]: https://github.com/kubernetes-client/python/issues/411
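The general shape of such a workaround can be sketched as below. FakeApiClient is a hypothetical stand-in for a client that builds a thread pool when constructed; it is not the kubernetes package, and the exact patch point used in Armada is not shown here.

```python
from unittest import mock


class FakeApiClient:
    """Hypothetical client that creates a pool at construction time."""

    def __init__(self, pool_factory):
        self.pool = pool_factory()


# A MagicMock accepts any call, so no worker threads are started and
# cleanup has nothing real to join.
dummy_pool = mock.MagicMock()
client = FakeApiClient(pool_factory=lambda: dummy_pool)

# Cleanup calls become no-ops on the mock.
client.pool.close()
client.pool.join()
```

This is only safe when no asynchronous calls are made, since a mocked pool never runs submitted work.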
Change-Id: I71fbfbe355347ae2ddd02ffd26d881368320246b
This patchset changes the wait logic as follows:
- Move wait logic to own module
- Add framework for waiting on arbitrary resource types
- Unify pod and job wait logic using above framework
- Pass resource_version to k8s watch API for cleaner event tracking
- Only sleep for `k8s_wait_attempt_sleep` when successes not met
- Update to use k8s apps_v1 API where applicable
- Allow passing kwargs to k8s APIs
- Logging cleanups
This is in preparation for adding wait logic for other types of resources
and new wait configurations.
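A minimal sketch of a resource-agnostic wait loop in this spirit, assuming a caller-supplied readiness check. The names wait_for, WaitTimeout, and attempt_sleep are illustrative (attempt_sleep plays the role of `k8s_wait_attempt_sleep`), and the real module watches the k8s API rather than polling a callable.

```python
import time


class WaitTimeout(Exception):
    """Raised when the readiness check does not pass before the timeout."""


def wait_for(is_ready, timeout=5.0, attempt_sleep=0.01, sleep=time.sleep):
    """Poll is_ready until it returns True or timeout seconds pass.

    Sleeps between attempts only when the check has not succeeded yet.
    Returns the number of attempts made.
    """
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        if is_ready():
            return attempts
        if time.monotonic() >= deadline:
            raise WaitTimeout("resource not ready after %s attempts" % attempts)
        sleep(attempt_sleep)


# Hypothetical check that succeeds on the third poll.
results = iter([False, False, True])
attempts = wait_for(lambda: next(results))
```

Any resource type can plug in by supplying its own is_ready callable, which is the property the framework bullet describes.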
Change-Id: I92e12fe5e0dc8e79c5dd5379799623cf3f471082
- Wait for jobs to show as completed, instead of relying on pods
associated with the job to show healthy, as the pods can go
healthy or be removed while the job is still processing. Armada
would continue forward as soon as all pods in current scope
show as healthy.
- Refactor delete pod action a bit, including removing unused code.
- Fixed bug in waiting for pods to delete (in tiller handler L274).
Bug caused a hung state while deleting pods as a pre-update hook,
by passing timeout value in the incorrect position.
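To illustrate the first point, the sketch below checks a Job's own status for a Complete condition instead of inspecting its pods. The helper name is hypothetical and the job is a plain dict shaped like the Kubernetes Job resource, not a client model object.

```python
def job_is_complete(job):
    """Return True when the Job's status reports a Complete condition."""
    conditions = job.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "Complete" and c.get("status") == "True"
        for c in conditions
    )


# Hypothetical Job objects: one still running, one finished.
running_job = {"status": {"active": 1}}
finished_job = {
    "status": {
        "succeeded": 1,
        "conditions": [{"type": "Complete", "status": "True"}],
    }
}
```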
Change-Id: I2a942f0a6290e8337fd7a43c3e8c9b4c9e350a10
This fixes up improper sphinx syntax in docstrings by making
the following corrections:
* params => param
* :param - => :param
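For reference, a docstring in the corrected form looks like the sketch below. The function itself is hypothetical and exists only to carry the docstring.

```python
def delete_job(name, namespace="default"):
    """Describe the deletion of a job.

    :param name: name of the job to delete
    :param namespace: namespace that holds the job
    :returns: a short description of the action
    """
    return "delete {}/{}".format(namespace, name)
```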
Change-Id: I1ff457d609128ae7c5fac2c7190f5ff1a88315b3
- Adding yapf diff to pep8 target
- Adding yapf tox target to do actual format
** The rest of this PS contains formatted code only, no other changes
Change-Id: Idfef60f53565add2d0cf65bb8e5b91072cf0aded
- revise wait on namespace+label, only wait on ns+label for
charts we've touched in the current apply loop
- skipping any actions that would change system during dry-run
- skip 'test' and 'wait' during dry-run
- tweaking some logs for insight and readability
Change-Id: I1223f01690832c26ce2faa96e7e64620cf413ac9
This patch set removes the E722 pep8 exclusion that allowed bare
`except:` statements.
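As an illustration of the kind of change E722 asks for, the hypothetical helper below names the exceptions it expects instead of using a bare except clause. It is not taken from the Armada codebase.

```python
def parse_timeout(value, default=300):
    """Parse a timeout value, falling back to a default on bad input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # Named exceptions satisfy E722 and leave KeyboardInterrupt
        # and SystemExit free to propagate.
        return default
```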
Change-Id: Icdce885366541b88aabbef35166cf196a588676b
Signed-off-by: Tin Lam <tin@irrational.io>
Adds a 'cronjob' key for pre-upgrade delete actions to delete cron jobs.
The 'job' key also deletes cron jobs, since existing clients
were expecting that behavior.
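The resulting behavior can be sketched as a mapping from the delete action key to the resource kinds it removes. The function name and the kind strings are assumptions for illustration only.

```python
def kinds_for_delete_action(action_type):
    """Return the resource kinds removed by a pre-upgrade delete action."""
    mapping = {
        # 'job' keeps deleting cron jobs for backward compatibility.
        "job": ("job", "cronjob"),
        "cronjob": ("cronjob",),
    }
    try:
        return mapping[action_type]
    except KeyError:
        raise ValueError("unsupported delete type: {!r}".format(action_type))
```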
Change-Id: Id320710a935976c9c1320c25049b7f22ee4136ba
- fixing wait handling in multiple areas
-- wait for deleted pods before continuing Apply update
-- cleaning up and delineating wait for charts vs chartgroups
-- timeout exceptions to stop execution
- api/cli 'timeout' param now applies to all Charts
- api/cli 'wait' param now applies to all Charts
- update some docs
- several TODOs to be addressed in future PS
Closes #199
Change-Id: I5a697508ce6027e9182f3f1f61757319a3ed3593
- Additional logging to try to expose bug around deleted jobs
during an upgrade.
- Cleaner chart diff logging.
Change-Id: I5edfa1857aec417203e73565a39082328e3b677b
Enhance request logging (and scrub sensitive headers)
Enhance Tiller logging
Update grpcio, unpin from 1.6.0rc1
Plus a couple typo fixes
Plus a couple unused vars
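Header scrubbing of this kind can be sketched as below. The set of sensitive header names and the mask value are assumptions, and the helper is illustrative rather than the code in this change.

```python
# Assumed set of headers to hide; compared case-insensitively.
SENSITIVE_HEADERS = frozenset({"authorization", "x-auth-token"})


def scrub_headers(headers, sensitive=SENSITIVE_HEADERS, mask="***"):
    """Return a copy of headers that is safe to write to a log."""
    return {
        key: (mask if key.lower() in sensitive else value)
        for key, value in headers.items()
    }


# Hypothetical request headers.
request_headers = {
    "Authorization": "Bearer example-token",
    "Content-Type": "application/json",
}
safe_headers = scrub_headers(request_headers)
```

Returning a copy leaves the original headers untouched, so the request itself is unaffected by logging.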
Change-Id: I8afd679f6716c6e1af234a59ac44ba1fdc73cdc8
Update handler for chart pre-upgrade Jobs deletion
to rely on Kubernetes propagationPolicy for deleting
child Pods so that more generic labels can exist in
an Armada manifest without impacting job-unrelated
pods
- Update K8s API integration to use propagationPolicy for job delete
- Make default propagationPolicy 'Foreground'
- Update documents to clarify structure of specifying pre-upgrade hooks
- Fix tox file to support running unit tests behind an HTTP proxy
Change-Id: I650543cfe05cc6a9661ab375e831bb425b7eeeab
- using click framework
- added api client
- allow interactions between code and service endpoints
- documentation on the command line
- updated gitignore
Change-Id: Ibe359025f5b35606d876c29fa88e04048f276cc8
This patch set makes Armada pep8 compliant. Note that hapi/** is
autogenerated and therefore should be excluded from linting.
Change-Id: I123eefb543f9bd9cf0bc6bd98ed95646d8d72cc3
* Ensure that configurations are done via the global `cfg` object
* Ensure that the logger is configured through the global object
* Upload a configuration sample file with DEFAULT section having
the armada.conf and oslo_log namespace
- Adds Oslo logging libraries
- Enables logging configuration with a config file
- Enables debug logging with --trace flag
- Supports Docker logs
- Adds logging for tiller