shipyard/shipyard_airflow/plugins
Anthony Lin 017faba69f Update K8s Preflight Check Operator
It seems that pods can go into 'MatchNodeSelector' status after a hard
reboot of the node, for the reasons described in [0]. The current
preflight check operator fails the health check even after the entire
cluster has returned to normal following the hard reboot, because it
flags any pod that is not in the 'Succeeded' or 'Running' state. As a
result, our workflow stops and goes into a failed state.

This patch set handles that scenario by relaxing the health check
requirements for the k8s cluster: it logs information about such pods
instead of failing the workflow (note that the status of such pods will
resemble [1]).

[0] https://github.com/kubernetes/kubernetes/issues/52902

[1]

 'status': {'conditions': None,
            'container_statuses': None,
            'host_ip': None,
            'init_container_statuses': None,
            'message': 'Pod Predicate MatchNodeSelector failed',
            'phase': 'Failed',
            'pod_ip': None,
            'qos_class': None,
            'reason': 'MatchNodeSelector',
            'start_time': datetime.datetime(2018, 3, 30, 15, 49, 39, tzinfo=tzlocal())}}
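
The relaxed check described above might look roughly like the sketch below. It is not the operator's actual code: the function name `classify_pods`, the constants, and the plain-dict pod layout (modelled on the status shown in [1]) are assumptions made for illustration only.

```python
import logging

LOG = logging.getLogger(__name__)

# Phases treated as healthy, as stated in the commit message.
HEALTHY_PHASES = ("Succeeded", "Running")
# Failure reasons that are logged rather than treated as fatal (assumed list).
TOLERATED_REASONS = ("MatchNodeSelector",)


def classify_pods(pods):
    """Split unhealthy pods into (tolerated, failing) lists of pod names.

    `pods` is a list of dicts with 'metadata' and 'status' keys; this
    layout is a hypothetical stand-in for whatever the operator receives.
    """
    tolerated, failing = [], []
    for pod in pods:
        name = pod["metadata"]["name"]
        status = pod["status"]
        if status["phase"] in HEALTHY_PHASES:
            continue
        if status["phase"] == "Failed" and status.get("reason") in TOLERATED_REASONS:
            # Log the pod instead of failing the workflow.
            LOG.warning("Pod %s is in a tolerated failed state: %s",
                        name, status.get("message"))
            tolerated.append(name)
        else:
            failing.append(name)
    return tolerated, failing
```

With this split, a caller would fail the workflow only when the second list is non-empty; pods in the first list are merely reported in the logs.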

Change-Id: Idb1208d93cddc01cd0375a5ac2e6e73dd3dfad61
2018-04-04 17:36:30 -04:00
__init__.py Add deploy site DAG skeleton 2017-08-15 16:23:42 -05:00
airflow_task_state_operators.py Add bandit, coverage and docs to tox 2017-10-25 22:36:39 -05:00
armada_base_operator.py [fix] Upgrade airflow worker aggressively 2018-03-30 19:59:30 -05:00
armada_get_releases.py Refactor Armada Operator 2018-03-26 09:30:21 +00:00
armada_get_status.py Refactor Armada Operator 2018-03-26 09:30:21 +00:00
armada_post_apply.py [fix] Upgrade airflow worker aggressively 2018-03-30 19:59:30 -05:00
armada_validate_design.py Refactor Armada Operator 2018-03-26 09:30:21 +00:00
check_k8s_node_status.py Add Backoff time before checking cluster join 2017-12-08 08:38:53 +00:00
check_k8s_pod_status.py Add Operator to Check Node Status 2017-12-07 12:21:35 -05:00
concurrency_check_operator.py Shipyard deployment configuration 2018-03-12 13:31:11 -05:00
deckhand_base_operator.py Shipyard deployment configuration 2018-03-12 13:31:11 -05:00
deckhand_client_factory.py Shipyard deployment configuration 2018-03-12 13:31:11 -05:00
deckhand_get_design.py Fix finding the correct configdocs version 2018-02-26 21:50:35 -06:00
deckhand_retrieve_rendered_doc.py Refactor Deckhand Operator 2018-02-21 22:09:26 -05:00
deckhand_validate_site.py [bug] Make validation status handling more lenient 2018-03-23 10:21:58 -05:00
deployment_configuration_operator.py [Fix] Armada Operator/Dag - Task Id and Xcom Key 2018-03-27 05:47:26 +00:00
drydock_base_operator.py Enhance Error Logging for Drydock Operator 2018-03-23 08:36:01 -04:00
drydock_deploy_nodes.py Refactor Drydock Operator 2018-03-20 01:53:19 +00:00
drydock_destroy_nodes.py Refactor Drydock Operator 2018-03-20 01:53:19 +00:00
drydock_prepare_nodes.py Bug Fix - Drydock Operator Prepare Nodes 2018-03-24 02:35:17 +00:00
drydock_prepare_site.py Refactor Drydock Operator 2018-03-20 01:53:19 +00:00
drydock_validate_design.py [bug] Make validation status handling more lenient 2018-03-23 10:21:58 -05:00
drydock_verify_site.py Refactor Drydock Operator 2018-03-20 01:53:19 +00:00
get_k8s_pod_port_ip.py Refactor Armada Operator 2018-03-26 09:30:21 +00:00
k8s_preflight_checks_operator.py Update K8s Preflight Check Operator 2018-04-04 17:36:30 -04:00
openstack_operators.py Add bandit, coverage and docs to tox 2017-10-25 22:36:39 -05:00
placeholder_operator.py Add deploy site DAG skeleton 2017-08-15 16:23:42 -05:00
promenade_base_operator.py Shipyard deployment configuration 2018-03-12 13:31:11 -05:00
promenade_check_etcd.py Shipyard deployment configuration 2018-03-12 13:31:11 -05:00
promenade_clear_labels.py Shipyard deployment configuration 2018-03-12 13:31:11 -05:00
promenade_decommission_node.py Refactor Promenade Operator 2018-02-15 09:39:39 -05:00
promenade_drain_node.py Shipyard deployment configuration 2018-03-12 13:31:11 -05:00
promenade_shutdown_kubelet.py Refactor Promenade Operator 2018-02-15 09:39:39 -05:00
rest_api_plugin.py Linting: Make Shipyard mostly pep8 compliant 2017-09-29 10:58:58 -05:00
service_endpoint.py [W.I.P] Refactor Promenade Operator 2018-02-12 07:07:08 +00:00
service_session.py [W.I.P] Refactor Promenade Operator 2018-02-12 07:07:08 +00:00
service_token.py Refactor Deckhand Operator 2018-02-21 22:09:26 -05:00
ucp_preflight_check_operator.py Update Armada Operator 2018-01-08 15:09:54 -05:00
xcom_puller.py [fix] Upgrade airflow worker aggressively 2018-03-30 19:59:30 -05:00
xcom_pusher.py Add Airflow Worker Upgrade Workflow 2018-03-16 10:18:43 -04:00