Recovering from Cluster Installation Failure
Errors can occur during cluster deployment that require the configuration step to be retried. For this purpose, a cleanup playbook is provided that stops all services and deletes the data-plane and control-plane data and configuration.
For each command below, two variations are provided. Choose the "Ansible Method" if you are installing via Ansible, or follow the "Docker Method" if you are using Docker for installation.
Additionally, each Docker example requires the correct Docker image URL, which has the form docker.lightbitslabs.com/lightos-3-(Minor Ver)-(Rev)-rhl-(8/9)/lb-ansible:v9.13.0. Note that the repo name requires substitution. Refer to the Lightbits Installation Customer Addendum for the correct repo name.
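To make the substitution explicit, the following is a minimal sketch of a shell helper that assembles the image URL from its parts. The function name lb_image_url and its argument order are hypothetical and are not part of the Lightbits tooling. The actual values must come from the Lightbits Installation Customer Addendum.

```shell
# Hypothetical helper (not part of the Lightbits tooling): print the
# lb-ansible image URL built from the fields that require substitution.
#   $1 = (Minor Ver)   $2 = (Rev)   $3 = RHEL major version (8 or 9)
#   $4 = image tag
lb_image_url() {
    if [ "$#" -ne 4 ]; then
        echo "usage: lb_image_url MINOR_VER REV RHL TAG" >&2
        return 1
    fi
    printf 'docker.lightbitslabs.com/lightos-3-%s-%s-rhl-%s/lb-ansible:%s\n' \
        "$1" "$2" "$3" "$4"
}
```

For example, LB_IMAGE=$(lb_image_url "$MINOR_VER" "$REV" 9 "$TAG") stores the URL in a variable that can then be passed to docker run, assuming those variables hold the values from the addendum.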
Cleanup command for a full cluster:
# If using Ansible Method, run this:
cd ~/light-app
ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/cleanup-lightos-playbook.yml -t cleanup
# If using Docker Method, run this:
cd ~/light-app
docker run -it --rm --net=host \
-e UID=`id -u` \
-e GID=`id -g` \
-e UNAME=$USER \
-v `pwd`:/ansible \
-w /ansible \
-e ANSIBLE_LOG_PATH=/ansible/ansible.log \
docker.lightbitslabs.com/lightos-3-(Minor Ver)-(Rev)-rhl-(8/9)/lb-ansible:v9.13.0 \
sh -c 'ansible-playbook -i ansible/inventories/cluster_example/hosts \
playbooks/cleanup-lightos-playbook.yml -t cleanup'
Cleanup for one Lightbits server:
# If using Ansible Method, run this:
cd ~/light-app
ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/cleanup-lightos-playbook.yml -t cleanup --limit <server_name>
# If using Docker Method, run this:
cd ~/light-app
docker run -it --rm --net=host \
-e UID=`id -u` \
-e GID=`id -g` \
-e UNAME=$USER \
-v `pwd`:/ansible \
-w /ansible \
-e ANSIBLE_LOG_PATH=/ansible/ansible.log \
docker.lightbitslabs.com/lightos-3-(Minor Ver)-(Rev)-rhl-(8/9)/lb-ansible:v9.13.0 \
sh -c 'ansible-playbook -i ansible/inventories/cluster_example/hosts \
playbooks/cleanup-lightos-playbook.yml -t cleanup --limit <server_name>'
Replace <server_name> with the name of the server to clean up, exactly as it is listed in the hosts file (for example, server00 or server01).
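Because --limit must match a name from the hosts file, it can help to check the name before running the cleanup. The following is a minimal sketch, assuming an INI-style inventory in which each host entry begins with the host name as the first whitespace-separated token. The function name host_in_inventory is hypothetical.

```shell
# Hypothetical pre-check: succeed (exit status 0) only if NAME appears as
# the first token of some line in the inventory FILE.
# Assumes an INI-style hosts file, e.g. "server00 ansible_host=...".
host_in_inventory() {
    name=$1
    file=$2
    awk -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$file"
}
```

A possible usage: host_in_inventory server00 ansible/inventories/cluster_example/hosts && echo "server00 is listed". If the check fails, the name passed to --limit is likely misspelled or missing from the inventory.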
Reconfigure command:
# If using Ansible Method, run this:
cd ~/light-app
ansible-playbook -i ansible/inventories/cluster_example/hosts playbooks/configure-lightos-playbook.yml
# If using Docker Method, run this:
cd ~/light-app
docker run -it --rm --net=host \
-e UID=`id -u` \
-e GID=`id -g` \
-e UNAME=$USER \
-v `pwd`:/ansible \
-w /ansible \
-e ANSIBLE_LOG_PATH=/ansible/ansible.log \
docker.lightbitslabs.com/lightos-3-(Minor Ver)-(Rev)-rhl-(8/9)/lb-ansible:v9.13.0 \
sh -c 'ansible-playbook -i ansible/inventories/cluster_example/hosts \
playbooks/configure-lightos-playbook.yml'
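The Docker Method commands in this section differ only in the arguments passed to ansible-playbook. As a convenience, they could be wrapped in a shell function such as the sketch below. The function name lb_ansible_docker and the variables LB_IMAGE and LB_DRY_RUN are hypothetical and are not part of the Lightbits tooling. LB_IMAGE must be set to the full image URL with the correct repo name.

```shell
# Hypothetical wrapper around the Docker Method commands shown above.
#   LB_IMAGE   - full lb-ansible image URL (required)
#   LB_DRY_RUN - set to 1 to print what would run instead of running it
lb_ansible_docker() {
    if [ -z "${LB_IMAGE:-}" ]; then
        echo "LB_IMAGE is not set" >&2
        return 1
    fi
    # All function arguments become the ansible-playbook arguments.
    playbook_cmd="ansible-playbook $*"
    if [ "${LB_DRY_RUN:-0}" = 1 ]; then
        echo "would run in $LB_IMAGE: $playbook_cmd"
        return 0
    fi
    docker run -it --rm --net=host \
        -e UID="$(id -u)" \
        -e GID="$(id -g)" \
        -e UNAME="$USER" \
        -v "$(pwd)":/ansible \
        -w /ansible \
        -e ANSIBLE_LOG_PATH=/ansible/ansible.log \
        "$LB_IMAGE" \
        sh -c "$playbook_cmd"
}
```

With this in place, and run from ~/light-app, the reconfigure step would be: lb_ansible_docker -i ansible/inventories/cluster_example/hosts playbooks/configure-lightos-playbook.yml. Setting LB_DRY_RUN=1 first prints the playbook command without starting a container.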
The following points explain what the cleanup and configure playbooks do:
- When Lightbits is installed via the deploy-lightos playbook (as described in Running the Ansible Installation Playbook), two playbooks run in order. First, the install playbook installs all of the Lightbits dependencies and packages and reboots the server. Then, the configure playbook sets up all of the Lightbits services.
- The cleanup playbook removes all of the Lightbits configurations. It does not uninstall any of the packages that were installed.
- The configure playbook does not install any Lightbits packages; it only reconfigures the packages that are already installed. Run it only if you are certain that the deploy-lightos playbook completed the install playbook on all servers; otherwise, use the deploy-lightos playbook as described in Running the Ansible Installation Playbook.
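Taken together, a retry with the Ansible Method is a cleanup followed by a reconfigure. The following is a minimal sketch of that sequence, assuming it is run from ~/light-app and that the install playbook has already completed on all servers. The function name retry_configuration and the ANSIBLE_PLAYBOOK and INVENTORY variables are hypothetical conveniences. The playbook paths and the cleanup tag are the ones used in the commands above.

```shell
# Hypothetical override hook: by default the real ansible-playbook CLI is used.
: "${ANSIBLE_PLAYBOOK:=ansible-playbook}"
INVENTORY=ansible/inventories/cluster_example/hosts

# Run cleanup, then configure. Stop if cleanup fails so that configure
# is never applied on top of a partially cleaned cluster.
retry_configuration() {
    "$ANSIBLE_PLAYBOOK" -i "$INVENTORY" \
        playbooks/cleanup-lightos-playbook.yml -t cleanup || {
        echo "cleanup failed; not reconfiguring" >&2
        return 1
    }
    "$ANSIBLE_PLAYBOOK" -i "$INVENTORY" \
        playbooks/configure-lightos-playbook.yml
}
```

Stopping after a failed cleanup is a design choice of this sketch, not a documented requirement. If the install playbook did not complete on every server, use the deploy-lightos playbook instead of this sequence.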