armada

Commit Graph

Author	SHA1	Message	Date
Sean Eagan	68747d0815	Use helm 3 CLI as backend Helm 3 breaking changes (likely non-exhaustive): - crd-install hook removed and replaced with crds directory in chart where all CRDs defined in it will be installed before any rendering of the chart - test-failure hook annotation value removed, and test-success deprecated. Use test instead - `--force` no longer handles recreating resources which cannot be updated due to e.g. immutability [0] - `--recreate-pods` removed, use declarative approach instead [1] [0]: https://github.com/helm/helm/issues/7082 [1]: https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments Signed-off-by: Sean Eagan <seaneagan1@gmail.com> Change-Id: I20ff40ba55197de3d37e5fd647e7d2524a53248f	2021-10-04 21:40:26 -05:00
Sean Eagan	58c0df5201	Extract pre-update actions out of tiller handler This is a pre-requisite for Helm 3 integration, so that these actions run regardless of whether we are going through the tiller handler. Change-Id: I97d7bcc823d11b527fcdaa7967fcab62af1c8161	2021-09-30 17:22:16 -05:00
Sean Eagan	c75898cd6a	Airship 2 support features Airship 2 is using Argo for workflow management, rather than the builtin Armada workflow functionality. Hence, this adds an apply_chart CLI command to apply a single chart at a time, so that Argo can manage the higher level orchestration. Airship 2 is also using kubernetes as opposed to Deckhand as the document store. Hence this adds an ArmadaChart kubernetes CRD, which can be consumed by the apply_chart CLI command. The chart `dependencies` feature is intentionally not supported by the CRD, as there are additional complexities to make that work, and ideally this feature should be deprecated as charts should be building in their dependencies before consumption by Armada. Functional tests are included to exercise these features against a minikube cluster. Change-Id: I2bbed83d6d80091322a7e60b918a534188467239	2020-03-25 13:56:32 -05:00
Sean Eagan	1d9d645a5e	Remove makeMockThreadSafe() It doesn't appear to be compatible with newer versions of python and the mock library, and wasn't working correctly anyways. Change-Id: I117d01bed40849587b2d0337aad56fccdf77e192	2020-02-06 08:55:25 -06:00
Sean Eagan	5d2447560b	Support builtin chart dependencies This adds support for using the same builtin chart dependencies [0] as `helm install|upgrade ...` would use. [0]: https://helm.sh/docs/developing_charts/#chart-dependencies Change-Id: Ifc541dc273fa2a5c5b4e43125f468ea3fdb0f379	2019-08-22 08:13:03 -05:00
Sean Eagan	0721ed43aa	Implement Prometheus metric integration This implements Prometheus metric integration, including metric definition, collection, and exportation. End user documentation for supported metric data and exportation interface is included. Change-Id: Ia0837f28073d6cd8e0220ac84cdd261b32704ae4	2019-08-15 16:12:17 +00:00
HUGHES, ALEXANDER (ah8742)	b787c418e3	Standardize Armada code with YAPF From recently merged document updates in [0] there is a desire to standardize the Airship project python codebase. This is the effort to do so for the Armada project. [0] https://review.opendev.org/#/c/671291/ Change-Id: I4fe916d6e330618ea3a1fccfa4bdfdfabb9ffcb2	2019-07-31 10:16:15 -05:00
Roman Gorshunov	d404e3c034	Change various URLs for the OpenDev migration Change-Id: I3d345cfe1b3cf6134f5aad69ce639ddd21dc101f	2019-07-26 16:32:02 +02:00
Sean Eagan	5ffa12fabe	[v2 docs] Overhaul wait API See the v1-v2 migration guide updates in this commit for details. Change-Id: I6a8a69f8392e8065eda039597278c7dfe593a4fd	2019-05-13 16:52:44 +00:00
Sean Eagan	8a50591dbf	Introduce v2 docs This introduces v2 docs in order to allow users to opt in to breaking changes, while still supporting v1 docs for a time so folks can migrate. At some point v1 doc support will be removed. This initial version of v2 docs is experimental. Further breaking changes will be made before v2 docs are finalized. A v1-v2 migration guide is included in the documentation. This also refactors the internal data model to include the full document structure, such as `metadata` and `schema`, so that different behavior can be achieved for v1, v2, etc. Change-Id: Ia0d44ff4276ef4c27f78706ab02c88aa421a307f	2019-04-16 10:15:21 -05:00
Michael Beaver	3625acc1aa	Fix gate failures for Docker and Pep8 - Zuul updated ansible to 2.7, no longer uses missing variables. - Using an if to try and address. - Fixes a few formatting problems that are causing the gates to fail Docker fix based on Aaron Sheffield's PS for Pegleg: https://review.openstack.org/#/c/645631/ Change-Id: I14e8f3aac0af7a3abc4e2b6c4ece292a24bc4c6a	2019-03-22 13:46:32 -05:00
Michael Beaver	7f26bbcd59	Fix pep8 errors This addresses the pep8 errors that are causing gate failures Change-Id: Id92dbbf527af1953026f17ddb3f2d79f0a635284	2019-02-14 20:52:38 -06:00
Sean Eagan	47ebd27cad	Add configurability of delete timeout Previously the timeout for deleting chart releases was 300s and not configurable, this patchset makes it so via a new `delete.timeout` property in the `armada/Chart/v1` schema. Helm releases deleted which do not correspond to documents in this schema still do not use a configurable timeout. Those will be considered separately. This also includes a minor logging fix. Change-Id: Ia588faaafd18a3ac00eed3cda2f0556ffcec82c9	2019-01-29 16:49:01 -06:00
Sean Eagan	c31a961bf1	Automate deletion of test pods When running helm tests for a chart release multiple times in a site, if the previous test pod is not deleted, then the test pod creation can fail due to a name conflict. Armada/helm support immediate test pod cleanup, but using this means that upon test failure, the test pod logs will not be available for debugging purposes. Due to this, the recommended approach for deleting test pods in Armada has been using `upgrade.pre.delete` actions. So chart authors can accomplish test pod deletion using this feature, however, it often takes a while, usually not until they test upgrading the chart, for chart authors to realize that this is necessary and to get it implemented. This patchset automates deletion of test pods directly before running tests by using the `wait.labels` field in the chart doc when they exist to find all pods in the release and then using their annotations to determine if they are test pods and deleting them if so. A later patchset is planned to implement defaulting of the wait labels when they are not defined. Change-Id: I2092f448acb88b5ade3b31b397f9c874c0061668	2019-01-28 13:19:09 -06:00
Sean Eagan	6f76f8bec7	bugfix: Looking in wrong place for upgrade options Fixes a bug where Armada was looking for upgrade options (force, recreate_pods currently) underneath `upgrade` directly rather than `upgrade.options` where they are defined in the schema. Change-Id: Ia95129a19c87f5d59eaefccd04a7ac9e2acb0b3b	2019-01-18 15:57:52 -06:00
Drew Walters	adfe3ae505	test: Refactor test handler While authoring [0], it was discovered that Armada has duplicate logic for deciding if Helm test cleanup should be enabled as well as the tests themselves. Because of this, changes to test logic (e.g. adding pre-test actions), requires changing all traces of the repeated logic, which can lead to inconsistent behavior if not properly addressed. This change moves all test decision logic to a singular Test handler, implemented by the `Test` class. This change does NOT change the expected behavior of testing during upgrades; however, tests initiated from the API and CLI will not execute when testing a manifest if they are disabled in a chart, unless using the `--enable-all` flag. [0] https://review.openstack.org/617834 Change-Id: I1530d7637b0eb6a83f048895053a5db80d033046	2018-11-29 17:30:57 +00:00
Sean Eagan	7af22df7dc	Implement tiller gRPC channel clean up We have seen issues with dangling threads in Armada. This is likely due to a bug [0] in the version of gRPC that we were pinned to. This patchset: - moves us to the latest versions of the gRPC python libraries which add a new `channel.close()` method to cleanup channels. - implements the python context manager api in the tiller handler - uses the context manager api to explicitly scope tiller channel creation and cleanup to each Armada API and CLI call. This also fixes a couple of issues with error handling introduced in [1]. [0]: https://github.com/grpc/grpc/issues/14338 [1]: https://review.openstack.org/#/c/610384 Change-Id: I2577a20fc76c397aa33157dc12a0e1d36f49733e	2018-11-12 13:32:52 -06:00
Zuul	d35896537b	Merge "Add caching and cleanup of chart tarballs"	2018-11-05 19:08:25 +00:00
Sean Eagan	69b43983e9	Run wait/test even if chart not updated Previously if a chart is not updated, it would simply be skipped over. Now, the wait/tests are run in this case to ensure the chart success criteria is/was actually satisfied. It does still skip tests if there is a last test result recorded as successful already, as an optimization. Change-Id: I5dc95fe0f16fe0989761e771c77d2c4fa8f6e7ea	2018-10-31 09:53:12 -05:00
Sean Eagan	cbb8ed33e1	Add caching and cleanup of chart tarballs Caching and cleanup of git repository chart sources was previously implemented. This adds these features for tarball sources as well. This also implements transitive chart dependency sourcing. Previously only a single level of dependencies were being downloaded, which would lead to an error when multiple dependency levels exist. Change-Id: I988e473a6ea29331e036d26c3ec7269374e0188f	2018-10-29 16:02:44 -05:00
Sean Eagan	6b96bbf28d	Correctly identify latest release This fixes the following issues with listing releases from tiller, which could cause Armada to be confused about the state of the latest release, and do the wrong thing. - Was not filtering out old releases, so we could find both a FAILED and DEPLOYED release for the same chart. When this is the case it likely means the FAILED release is the latest, since otherwise armada would have purged the release (and all its history) upon seeing the FAILED release in a previous run. The issue is that after the purge it would try to upgrade rather than re-install, since it also sees the old DEPLOYED release. Also if a release gets manually fixed (DEPLOYED) outside of armada, armada still sees the old FAILED release, and will purge the fixed release. - Was only fetching DEPLOYED and FAILED releases from tiller, so if the latest release has another status Armada won't see it at all. This changes to: - Fetch releases with all statuses. - Filter out old releases. - Raise an error if latest release has status other than DEPLOYED or FAILED, since it's not clear what other action to take in this scenario. Change-Id: I84712c1486c19d2bba302bf3420df916265ba70c	2018-10-19 09:14:15 -05:00
Sean Eagan	d229d52292	Parallelize unsequenced chart group deployments This changes unsequenced chart group deployments, such that each chart in the group is deployed in parallel, including the install/upgrade, wait, and tests. Previously, whether and when to wait was entangled with whether or not the chart group was sequenced, since running helm install/upgrade's native wait (which cannot be run later) and armada's labels based wait, delayed (or even prevented in the case of failure) the next chart from being deployed, which is the intention for sequenced, but not for unsequenced. With this patchset, sequencing and waiting are now orthogonal. Hence we can now allow the user to explicitly specify whether to wait, which this patchset does for the case of helm's native wait via a new `wait.native.enabled` flag, which defaults to true. Previously, armada's labels-based wait sometimes occurred both between charts and at the end of the chart group. It now occurs once directly after chart deployment. Previously, passing armada's --wait was documented to be equivalent to forcing sequencing of chart groups, however helm tests did not run in sequence as they normally would with sequenced chart groups, they now do. Since chart deploys can now occur in parallel, log messages for each become interleaved, and thus when armada is deploying a chart, log messages are updated to contain identifying information about which chart deployment they are for. Change-Id: I9d13245c40887712333aaccfb044dcdc4b83988e	2018-10-03 10:27:49 -05:00
anthony.bellino	63e63a3c30	Fix for yapf v0.24.0 This PS fixes yapf v0.24.0 errors. Also updates tox.ini and test-requirements.txt accordingly. Change-Id: If1dd0e51d328ad976bf0a7bfd512425c4da4ac0a	2018-09-11 20:39:25 +00:00
Zuul	e4a270b06d	Merge "Move to semantic diffing of charts"	2018-08-21 18:47:08 +00:00
Sean Eagan	5c5ddf8e8c	Move to semantic diffing of charts We were seeing false positives when diffing charts to determine whether an upgrade was necessary. Previously we were serializing the charts and values and diffing those, but these serializations often output things in different and non-deterministic order, hence the false positives. This removes the ordering concerns by putting things in maps instead of lists, and comparing those semantically rather than via serialization. This also improves the diff output to be easier to read. It also stops caring about diffs in Chart.yaml. Change-Id: I4c92c2e7c814178c374623ea52d717bdb9f72b11	2018-08-20 16:04:11 -05:00
anthony.bellino	fbbfdef4ed	Validation refactor This PS refines the logic in override.update_manifests when validating documents to properly deduce the correct exception that needs to be thrown. Added appropriate logic in armada.py to handle the exceptions thrown by override.update_manifests. Also validate_armada_documents now logs appropriate error/debug messages. Introduced ArmadaNegativeHandlerTestCase class in test_armada.py, along with updating/adding unit tests in test_override.py. Change-Id: I84051ae4901011093f987479861df5f89561bb2c	2018-08-20 17:27:34 +00:00
Marshall Margenau	cd0242780e	Run helm tests by default - Armada will now run helm tests by default, and the charts must disable tests if they choose. A helm test that does not exist is still a happy-path resolution. - Documentation and schema updates to signify new default behavior. - Preparing to deprecate `test_charts` in ChartGroup processing. Change-Id: I8e51e33a5d9559b11c2b75e122ecfc97af084ca4	2018-07-17 09:18:39 -05:00
Sean Eagan	2a1a94828d	Change chart `test` key to object and support cleanup flag Previously the chart `test` key was a boolean. This changes it to an object which initially supports an `enabled` flag (covering the previous use case) and adds support for helm's test cleanup option (underneath an `options` key which mirrors what we have for `upgrade`). Existing charts will continue to function the same, with cleanup always turned on, and ability to use the old boolean `test` key for now. When using the new `test` object however, cleanup defaults to false to match helm's interface and allow for test pod debugging. Test pods can be deleted on the next armada apply as well, to allow for debugging in the meantime, by adding `pre`-`upgrade`-`delete` actions for the test pod. The `test` commands in the API and CLI now support `cleanup` options as well. Change-Id: I92f8822aeaedb0767cb07515d42d8e4f3e088150	2018-06-27 10:47:02 -05:00
Marshall Margenau	f235512d57	Adding yapf config, plus formatted code. - Adding yapf diff to pep8 target - Adding yapf tox target to do actual format ** The rest of this PS contains formatted code only, no other changes Change-Id: Idfef60f53565add2d0cf65bb8e5b91072cf0aded	2018-06-22 14:56:04 -05:00
Sean Eagan	d91dd8ad70	Fix and overhaul helm test integration The helm test integration was severely broken, this fixes it by: * correctly handle tiller test call response * removes unnecessary call to tiller to get release content * removes unnecessary call to k8s to check for test pod completion * moves common logic into a test handler * adds test coverage for the above * adds logging for test results streamed from tiller Change-Id: I09062387a1abc2fc3f6960f987c97248d9e1cb69	2018-06-21 14:41:52 -05:00
Marshall Margenau	6546139155	Implement `protected` parameter The `protected` parameter will be used to signify that we should never purge a release in FAILED status. You can specify the `continue_processing` param to either skip the failed release and continue on, or to halt armada execution immediately. - Add protected param to Chart schema and documentation. - Implement protection logic. - Moved purging of FAILED releases out of pre-flight and into sync for finer control over protected params. This means failed releases are now purged one at a time instead of all up front. - Added purge and protected charts to final client `msg` return. - Fix: Added missing dry-run protection in tiller delete resources. Change-Id: Ia893a486d22cc1022b542ab7c22f58af12025523	2018-06-17 20:04:53 -05:00
Marshall Margenau	52bf21989f	Fix release name bug The release name was being treated as multiple different values to mean the same thing, when paired with the 'release_prefix'. This commit addresses the bug, changing all instances to use the 'release' value instead of 'chart_name' or others. Note: This is an impacting change, in the sense that it will cause more reliable behavior in Armada's Apply processing which could have actual impact while upgrading components installed with a previous version of Armada. Previously undeleted FAILED releases may now be deleted, and armada test and delete actions may now run as expected where they didn't run before. Change-Id: I9893e506274e974cdc8826b1812becf9b89a0ab6	2018-06-15 11:21:38 -05:00
Sean Eagan	ae690ef828	Expose helm's upgrade/rollback force and recreate pods flags This exposes helm's force and recreate pods flags for upgrade and rollback. It exposes in the chart manifest an options field underneath the upgrade field to hold options to pass through to helm, and initializes it with these two flags. Since rollback is currently a standalone operation which does not consume manifests, these flags are directly exposed as api and cli arguments there. Change-Id: If65c1e97d437d9cf9d5838111fd485c80c76aa1d	2018-06-13 11:28:20 -05:00
Felipe Monteiro	f27ab29db7	[test] Increase armada.handlers.armada test coverage This is a multi-part PS because these patches may include small fix-ups to the code base itself, so the intention is to keep the patches small and easily reversible. This patchset introduces the following: * html coverage report (execute tox -e cover then open index.html under htmlcov folder which is created by py.test) * adds additional unit tests for pre_flight_ops * adds more robust assertions for those tests Change-Id: Ib29d7d8d0c3b686a36c5a87fc46d4594bb1838a6	2018-06-11 10:15:28 -04:00
Marshall Margenau	d770640b95	Revise wait timeouts plus dry-run. - revise wait on namespace+label, only wait on ns+label for charts we've touched in the current apply loop - skipping any actions that would change system during dry-run - skip 'test' and 'wait' during dry-run - tweaking some logs for insight and readability Change-Id: I1223f01690832c26ce2faa96e7e64620cf413ac9	2018-05-30 16:19:35 -05:00
Marshall Margenau	dc508d5012	fix(timeouts): Address timeout handling issues - fixing wait handling in multiple areas -- wait for deleted pods before continuing Apply update -- cleaning up and delineating wait for charts vs chartgroups -- timeout exceptions to stop execution - api/cli 'timeout' param now applies to all Charts - api/cli 'wait' param now applies to all Charts - update some docs - several TODOs to be addressed in future PS Closes #199 Change-Id: I5a697508ce6027e9182f3f1f61757319a3ed3593	2018-05-01 08:45:56 -05:00
Marshall Margenau (mm8789)	efd42dfab2	bug(chartbuilder): uncaught exceptions on bad manifests Armada was able to throw exceptions all the way up to invocation. To address: - remove 'supermutes dotify', which was throwing exceptions - refactor chartbuilder after removing dotify - rework some helm wait/timeout logic, exposed during bug squash - rename some variables to make their function more clear Note: This has potentially breaking changes to existing charts, in the sense that documents previously validated (improperly) may now give errors. Change-Id: I9a6c99aa8ba3d666405e0ccf49a847fd01807b69	2018-03-29 15:15:00 -04:00
Marshall Margenau	964aed2973	feat(validation) Validation messaging - Validation messaging to match UCP convention - Adding some missing fields to Chart validation schema - Minor update: Adding debug logging to each CLI call - Fixing some typos and exception messages Change-Id: I7dc1165432c8b3d138cabe6fd5f3a6e1878810ae	2018-03-15 21:19:43 -04:00
gardlt	3b879fc846	Improved document validation BREAKING CHANGE: Armada will no longer support recursive monolithic documents such that a Manifest fully defines ChartGroups inline and ChartGroups fully define Charts inline. Only name-based references to other documents is supported. - Author document schemas in standalone JSON schema files - Update validation to return all failures available - Removed unit tests for support of recursive monolithic documents Change-Id: Idb91fa552d3d7a3d7d525609d505fe7380443238	2018-02-23 11:11:09 -05:00
drewwalters96	5b75f0a9b4	feat(source): Add support for SSH key authentication - Add support for SSH key auth using existing config file value - Add authentication exceptions - Remove redundant git error handling from Armada handler Closes #169 Change-Id: Ia0f61e0b74893289bb90560a743a243393d89c56	2018-02-13 16:10:34 -05:00
Felipe Monteiro	732af63051	Add unit tests for main Armada handler This PS adds unit tests for the armada handler in armada.handlers.armada. These unit tests are among the most important as the Armada handler itself interacts with most every other part of the application. Note that the current unit tests are not only incorrect (they don't pass) but they skip unconditionally as well. So this PS is needed to correctly implement the intent behind the original unit tests herein. Change-Id: Iecb3e540e1d52eb5a25d9f6825b3d0f5339ede2a	2018-02-02 22:42:09 +00:00
Omar Rivera	498cf6c98f	Fix oslo_config and oslo_log configurations * Ensure that configurations are done via the global `cfg` object * Ensure that the logger is configured through the global object * Upload a configuration sample file with DEFAULT section having the armada.conf and oslo_log namespace	2017-08-07 21:43:32 -05:00
gardlt	d175e5ef92	[feat] upgrading daemonsets - rolling update for daemonsets - search pods by labels	2017-08-01 11:30:54 -05:00
gardlt	afb7fe83ab	[feat] adding standard armada manifest - adding template armada manifest - create Armada Manifest - updated validation for new documents - updated testing - updated docs	2017-07-26 14:44:05 -05:00
Tim Heyer	701ac2fa8f	Allow Armada to clone OpenStack infra repos -No longer attempts to ping repo -Move all source locating (git or local) to preflight -Only unique repos are cloned now -Catches any source/url errors in preflight and raises exception	2017-07-14 09:19:42 -05:00
drewwalters96	62c05f569a	[Bug] Update default timeout and logic - Change default timeout to 3600 - Timeout specified with CLI has priority - Timeout specified with yaml has second priority	2017-07-14 09:11:40 -05:00
gardlt	55c3a0fa8f	[feat] adding sequenced deployments * added new structure to the armada yaml	2017-06-30 07:40:10 -05:00
drewwalters96	f4303037cc	[Bug] Adds required Oslo configuration to API and unit tests - Register Oslo logging configurations in server.py, test_armada.py, test_chartbuilder.py, and test_tiller.py - Add mock to test-requirements.txt - Update .travis.yml - Add mock case for tiller IP	2017-06-28 18:03:27 -05:00
Tim Heyer	a22d2d0bb0	Implement native tiller timeout support -Update tiller.py and armada.py to support native tiller timeout -Update documentation with the new yaml timeout keyword -Update tiller version to 2.4.2 -Create tests for timeout ability, as well as structure for further test development -Fix gRPC message size bug	2017-06-27 15:46:30 -05:00

49 Commits