Merge "Doc updates for install and troubleshooting"
commit 184f1e2ea8
|
@@ -1,21 +1,37 @@
|
|||
Getting Started
|
||||
===============
|
||||
|
||||
Note: This document is meant to give a general understanding of how Promenade
|
||||
could be exercised in a development environment or for general learning and
|
||||
understanding. For holistic UCP deployment procedures, refer to `Treasuremap <https://github.com/att-comdev/treasuremap>`_.
|
||||
|
||||
Basic Deployment
|
||||
----------------
|
||||
|
||||
This approach is quick to get started, but generates the scripts used for
|
||||
joining up front rather than generating them in the API as needed.
|
||||
|
||||
Setup
|
||||
^^^^^
|
||||
Setup Build Machine
|
||||
^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
On the machine you wish to use to generate deployment files, install Docker:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
sudo apt -y install docker.io
|
||||
|
||||
This can be the same machine you intend to be the Genesis host, or it may be
|
||||
a separate build machine.
|
||||
|
||||
Generate Build Files
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
To create the certificates and scripts needed to perform a basic deployment,
|
||||
you can use the following helper script:
|
||||
you can use the following helper script on your build machine:
|
||||
|
||||
.. code-block:: bash
|
||||
.. code-block:: console
|
||||
|
||||
./tools/simple-deployment.sh examples/basic build
|
||||
sudo ./tools/simple-deployment.sh examples/basic build
|
||||
|
||||
This will copy the configuration provided in the ``examples/basic`` directory
|
||||
into the ``build`` directory. Then, it will generate self-signed certificates
|
||||
|
@@ -23,18 +39,31 @@ for all the needed components in Deckhand-compatible format. Finally, it will
|
|||
render the provided configuration into directly-usable ``genesis.sh`` and
|
||||
``join-<NODE>.sh`` scripts.
|
||||
|
||||
Genesis Host Provision
|
||||
^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Install Ubuntu 16.04 on the machine intended to be the genesis host. Ensure
|
||||
the host has outbound internet access and DNS resolution.
|
||||
Ensure that the hostname matches the hostname specified in the Genesis.yaml
|
||||
file used to build the above configurations.
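The hostname check above can be sketched as a small shell helper. This is only an illustration: the function name ``check_hostname`` is hypothetical, and the expected name is passed in by hand rather than parsed from Genesis.yaml, since that file's layout is not shown here.

.. code-block:: bash

   #!/usr/bin/env bash
   # Hypothetical helper: compare this machine's hostname with the value
   # copied by hand from Genesis.yaml.
   check_hostname() {
       local expected="$1"
       local actual
       actual="$(uname -n)"   # POSIX equivalent of `hostname`
       if [ "$actual" = "$expected" ]; then
           echo "OK: hostname '$actual' matches"
       else
           echo "MISMATCH: host is '$actual', expected '$expected'" >&2
           return 1
       fi
   }

Running ``check_hostname <GENESIS>`` on the genesis host before ``genesis.sh`` returns non-zero when the names differ, so it could gate a wrapper script.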
|
||||
|
||||
Execution
|
||||
^^^^^^^^^
|
||||
|
||||
Perform the following steps to execute the deployment:
|
||||
|
||||
1. Copy the ``genesis.sh`` script to the genesis node and run it.
|
||||
1. Copy the ``genesis.sh`` script to the genesis node and run it with ``sudo``. In the
|
||||
event of runtime errors, refer to :doc:`troubleshooting/genesis`.
|
||||
2. Validate the genesis node by running ``validate-genesis.sh`` on it.
|
||||
3. Join master nodes by copying their respective ``join-<NODE>.sh`` scripts to
|
||||
3. Nodes for which ``join-<NODE>.sh`` scripts have been generated should be
|
||||
provisioned at this point, and need to have network connectivity to the
|
||||
genesis node. (This could be a manual Ubuntu provision or a
|
||||
Drydock-initiated PXE boot in the case of a full-fledged UCP deployment.)
|
||||
4. Join master nodes by copying their respective ``join-<NODE>.sh`` scripts to
|
||||
them and running them.
|
||||
4. Validate the master nodes by copying and running their respective
|
||||
5. Validate the master nodes by copying and running their respective
|
||||
``validate-<NODE>.sh`` scripts on each of them.
|
||||
5. Re-provision the Genesis node
|
||||
6. Re-provision the Genesis node
|
||||
|
||||
a) Run the ``/usr/local/bin/promenade-teardown`` script on the Genesis node.
|
||||
b) Delete the node from the cluster by running ``kubectl delete node <GENESIS>`` on one of the other nodes.
|
||||
|
@@ -42,7 +71,7 @@ Perform the following steps to execute the deployment:
|
|||
d) Join the genesis node as a normal node using its ``join-<GENESIS>.sh`` script.
|
||||
e) Validate the node using ``validate-<GENESIS>.sh``.
|
||||
|
||||
6. Join and validate all remaining nodes using the ``join-<NODE>.sh`` and
|
||||
7. Join and validate all remaining nodes using the ``join-<NODE>.sh`` and
|
||||
``validate-<NODE>.sh`` scripts described above.
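The copy-and-run pattern used in the steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not part of Promenade: the helper names ``run`` and ``copy_and_run``, the SSH target ``ubuntu@n0``, the ``/tmp`` staging path, and the location of the generated scripts under ``build/`` are all hypothetical, and running the validation script with ``sudo`` is likewise an assumption. By default it only prints the commands.

.. code-block:: bash

   #!/usr/bin/env bash
   # DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0
   # to actually execute them.
   run() {
       if [ "${DRY_RUN:-1}" = "1" ]; then
           echo "+ $*"
       else
           "$@"
       fi
   }

   # Copy a generated script to a node and execute it there with sudo.
   # $1 = SSH target (hypothetical), $2 = local path of the script.
   copy_and_run() {
       local target="$1" script="$2"
       run scp "$script" "$target:/tmp/" &&
       run ssh "$target" "sudo bash /tmp/$(basename "$script")"
   }

   # Steps 1 and 2 for a genesis node reachable as ubuntu@n0 (assumed name).
   copy_and_run ubuntu@n0 build/genesis.sh
   copy_and_run ubuntu@n0 build/validate-genesis.sh

The same helper would apply to the ``join-<NODE>.sh`` and ``validate-<NODE>.sh`` scripts by changing the target and the script path.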
|
||||
|
||||
|
||||
|
|
|
@@ -33,4 +33,5 @@ Promenade Configuration Guide
|
|||
design
|
||||
getting-started
|
||||
configuration/index
|
||||
troubleshooting/index
|
||||
api
|
||||
|
|
|
@@ -0,0 +1,78 @@
|
|||
Genesis Troubleshooting
|
||||
=======================
|
||||
|
||||
genesis.sh
|
||||
----------
|
||||
|
||||
Kubernetes service failures
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Before the Armada manifests are applied, the genesis.sh script will bring basic
|
||||
Kubernetes services online by starting Docker containers for these services.
|
||||
|
||||
One of the first services to be brought up is the Kubernetes API. If it fails to
|
||||
come up, you may see a repeated error as follows from the genesis.sh script:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
.The connection to the server apiserver.kubernetes.promenade:6443 was
|
||||
refused - did you specify the right host or port?
|
||||
|
||||
Check that the hostname in your Genesis.yaml matches the hostname of the
|
||||
machine you are trying to install onto. If they do not match, change one to
|
||||
match the other. If you change Genesis.yaml, then re-generate the Promenade
|
||||
payloads.
|
||||
|
||||
If the hostnames match, check the container logs under /var/log/pods to see the
|
||||
reason for the provisioning failure. (The ``kubectl logs`` command will not be
|
||||
available if the API container is not running.)
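One way to find the relevant files when ``kubectl logs`` is unavailable is to look at the most recently written logs first. The helper below is a sketch; the name ``recent_pod_logs`` is hypothetical, and it assumes the files under ``/var/log/pods`` end in ``.log`` (possibly as symlinks), which may differ between Kubernetes versions.

.. code-block:: bash

   #!/usr/bin/env bash
   # Print the tail of the most recently modified *.log files under a pod
   # log directory. $1 = directory (default /var/log/pods),
   # $2 = how many files to show (default 3).
   recent_pod_logs() {
       local dir="${1:-/var/log/pods}" count="${2:-3}"
       # -L follows symlinks, since pod logs are often links to runtime logs.
       # ls -1t sorts the files it is given by modification time, newest first.
       find -L "$dir" -type f -name '*.log' -exec ls -1t {} + 2>/dev/null |
           head -n "$count" |
           while IFS= read -r f; do
               echo "==> $f <=="
               tail -n 20 "$f"
           done
   }

On the genesis host this would typically be run as root, for example from a ``sudo -i`` shell, because the log directory is usually not world-readable.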
|
||||
|
||||
Armada failures
|
||||
^^^^^^^^^^^^^^^
|
||||
|
||||
When executing genesis.sh, you may encounter failures from Armada in the
|
||||
provisioning of other containers. For example:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
CRITICAL armada [-] Unhandled error: armada.exceptions.tiller_exceptions.ReleaseException: Failed to Install release: barbican
|
||||
|
||||
Use ``kubectl logs`` on the failed pod to determine the reason for the failure.
|
||||
E.g.:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
sudo kubectl logs barbican-api-5b8bccdf8f-x7sld --namespace=ucp
|
||||
|
||||
Other errors may point to configuration errors. For example:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
CRITICAL armada [-] Unhandled error: armada.exceptions.source_exceptions.GitLocationException: master is not a valid git repository.
|
||||
|
||||
In this case, the git branch name was inadvertently substituted for the git URL
|
||||
in one of the chart definitions in ``bootstrap-armada.yaml``.
|
||||
|
||||
Post-run failures
|
||||
^^^^^^^^^^^^^^^^^
|
||||
|
||||
At its conclusion, the genesis script will output the list of containers
|
||||
provisioned and their status, as reported by Kubernetes. It is possible that
|
||||
some containers may not be in a Running state. E.g.:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
ucp promenade-api-6696769cd-qwpzf 0/1 ImagePullBackOff 0 10h
|
||||
|
||||
For general failures, ``kubectl logs`` may be used as in the previous section.
|
||||
In this case, it was necessary to run ``kubectl describe`` on the pod to get the
|
||||
details of the image pull failure. E.g.:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
kubectl describe pod promenade-api-7dc54d47c-qw27m --namespace=ucp
|
||||
|
||||
In this particular incident report, the problem was a missing certificate on the
|
||||
bare metal node which caused the image download to fail. Installing the
|
||||
certificate, restarting the docker service, and then waiting for the container
|
||||
to retry resolved the issue.
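To spot pods like the one above in a long status listing, the output can be filtered by its STATUS column. The following is a minimal sketch: ``unhealthy_pods`` is a hypothetical name, the input is assumed to be the six-column format of ``kubectl get pods --all-namespaces`` including its header row, and the ``kube-system`` row in the sample is invented for illustration.

.. code-block:: bash

   #!/usr/bin/env bash
   # Read `kubectl get pods --all-namespaces` output on stdin and print
   # namespace, pod name and status for every pod whose STATUS is neither
   # Running nor Completed. The first line is skipped as the header.
   unhealthy_pods() {
       awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1, $2, $4 }'
   }

   # Sample input; only the ucp row is taken from the output shown above.
   unhealthy_pods <<'EOF'
   NAMESPACE     NAME                            READY   STATUS             RESTARTS   AGE
   kube-system   example-running-pod             1/1     Running            0          10h
   ucp           promenade-api-6696769cd-qwpzf   0/1     ImagePullBackOff   0          10h
   EOF

Against a live cluster the same filter could be fed with ``sudo kubectl get pods --all-namespaces | unhealthy_pods``, and each pod it prints can then be inspected with ``kubectl describe pod`` or ``kubectl logs`` as described above.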
|
|
@@ -0,0 +1,9 @@
|
|||
Troubleshooting
|
||||
===============
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:caption: Troubleshooting
|
||||
|
||||
genesis
|
||||
|