Recovering from Cluster Installation Failure

Errors sometimes occur during cluster deployment, and the configuration step must then be retried. For this purpose, a cleanup playbook is provided that stops all services and deletes the data-plane and control-plane data and configuration.

For each command below, two variations are provided. Choose the "Ansible Method" if you are installing via Ansible, or follow the "Docker Method" if you are using Docker for installation.

Additionally, each Docker example requires the correct Docker URL: docker.lightbitslabs.com/lightos-3-(Minor Ver)-(Rev)-rhl-(8/9)/lb-ansible:v9.1.0. Note that the repo name requires substitution. Refer to the Lightbits Installation Customer Addendum for the correct repo name.
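
As a minimal sketch of the substitution described above, the following shell function assembles the image URL from its three variable parts. It assumes that the parenthesized fields in the URL are the only parts that change; the function name and the sample values X and Y are placeholders for illustration, not real release numbers. The actual values must still come from the Lightbits Installation Customer Addendum.

```shell
# Assemble the Docker image URL from its placeholder parts.
# The function name and argument order are illustrative assumptions.
lb_ansible_image() {
    # $1 replaces (Minor Ver), $2 replaces (Rev), $3 replaces (8/9)
    case "$3" in
        8|9) ;;  # only 8 and 9 are listed as valid in the URL pattern
        *) echo "third argument must be 8 or 9" >&2; return 1 ;;
    esac
    printf 'docker.lightbitslabs.com/lightos-3-%s-%s-rhl-%s/lb-ansible:v9.1.0\n' "$1" "$2" "$3"
}

# X and Y are placeholder values, not real version numbers.
lb_ansible_image X Y 8
```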

Cleanup command for a full cluster:

Cleanup for one Lightbits server:

Replace <server_name> with the name of the server to be removed, exactly as it is listed in the hosts file (for example, server00 or server01).
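
Because the server name must match an entry in the hosts file, it can help to check the name before running the cleanup. The sketch below is one possible check, assuming the hosts file lists one host per line with the name as the first field. The function name, the sample inventory, its group name, and its addresses are all made up for illustration.

```shell
# Return success if the given name is the first field of some line in the file.
# Assumes one host per line; this layout is an assumption about the hosts file.
server_in_hosts() {
    awk -v n="$1" '$1 == n { found = 1 } END { exit !found }' "$2"
}

# Sample inventory for illustration only; group name and addresses are made up.
sample_hosts="$(mktemp)"
cat > "$sample_hosts" <<'EOF'
[example_group]
server00 ansible_host=192.0.2.10
server01 ansible_host=192.0.2.11
EOF

server_in_hosts server01 "$sample_hosts" && echo "server01 is listed"
server_in_hosts server09 "$sample_hosts" || echo "server09 is not listed"
```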

Reconfigure command:

Keep the following points in mind to better understand the cleanup and configure playbooks:

  • When a Lightbits installation is done via the deploy-lightos playbook (as described in Running the Ansible Installation Playbook), it runs two playbooks in order. First, it runs the install playbook, which installs all of the Lightbits dependencies and packages and reboots the server. Then it runs the configure playbook, which sets up all of the Lightbits services.
  • The cleanup playbook removes all of the Lightbits configurations. It does not uninstall any of the packages that were installed.
  • The configure playbook does not install any Lightbits packages; it only reconfigures the packages that are already installed. Run it only if you are certain that the deploy-lightos playbook completed the install playbook on all servers; otherwise, use the deploy-lightos playbook as described in Running the Ansible Installation Playbook.
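
The rule in the last point can be sketched as a small shell helper that picks which playbook to rerun after a cleanup. This is only an illustration of the decision: the function name and the "server=status" argument format are hypothetical, and it does not query the servers or run any playbook itself.

```shell
# Decide which playbook to rerun after a cleanup.
# Each argument is a hypothetical "server=status" pair; the status is
# "installed" only if the install playbook completed on that server.
next_playbook() {
    # With no information, we cannot be certain, so fall back to the full deploy.
    if [ "$#" -eq 0 ]; then
        echo "deploy-lightos"
        return 0
    fi
    for entry in "$@"; do
        case "$entry" in
            *=installed) ;;  # install playbook completed on this server
            *) echo "deploy-lightos"; return 0 ;;  # at least one server still needs install
        esac
    done
    echo "configure"
}

next_playbook server00=installed server01=installed
next_playbook server00=installed server01=failed
```

The first call prints configure and the second prints deploy-lightos, matching the guidance that configure is safe only when the install playbook completed on every server.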