Provisioning Grafana and Prometheus

Prometheus gathers statistics from the Lightbits cluster, and Grafana presents them as graphs on dashboards. A single monitoring deployment can be configured to monitor several clusters at once.

The following details how to provision a monitoring stack with Prometheus and Grafana. The instructions below are relevant for Lightbits versions 3.4.1 and above.

Ensure that SELinux is in permissive mode and that the firewall allows the required traffic before deploying.

Prerequisites

Designate a non-Lightbits server on which to install the Lightbits monitoring solution. Ensure that the following software is installed on the designated monitoring host.

docker-ce

Hardware specifications:

10 cores
32 GB RAM
128 GB of storage
Connectivity to Lightbits' access network (Lightbits exporter service and API service).
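
The prerequisites above can be checked before installing anything. The following is a minimal sketch, not a Lightbits tool: the function names meets_min and preflight are hypothetical, and it assumes a Linux host, that the 128 GB refers to free space, and that /var/lib is the filesystem that matters.

```shell
#!/bin/sh
# Preflight sketch: compare this host against the documented minimums.
# Helper names and the choice of /var/lib are assumptions for illustration.

# meets_min ACTUAL REQUIRED LABEL: print a result line, return 1 if below.
meets_min() {
    if [ "$1" -ge "$2" ]; then
        echo "OK   $3: $1 (minimum $2)"
    else
        echo "FAIL $3: $1 (minimum $2)"
        return 1
    fi
}

preflight() {
    preflight_status=0
    cores=$(nproc)
    # MemTotal is reported in kB; convert to whole GiB (rounds down).
    mem_gb=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
    # Available space on the filesystem that holds /var/lib, in GiB.
    disk_gb=$(df -Pk /var/lib | awk 'NR == 2 {print int($4 / 1024 / 1024)}')

    meets_min "$cores" 10 "CPU cores" || preflight_status=1
    meets_min "$mem_gb" 32 "RAM (GiB)" || preflight_status=1
    meets_min "$disk_gb" 128 "free storage (GiB)" || preflight_status=1
    if ! command -v docker >/dev/null 2>&1; then
        echo "FAIL docker not found in PATH"
        preflight_status=1
    fi
    return "$preflight_status"
}
```

Run preflight on the designated monitoring host. MemTotal excludes memory reserved by the kernel, so a nominal 32 GB host may report 31 GiB; treat a near miss as a prompt to check manually rather than as a hard failure.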

Installing Monitoring Packages

To install the monitoring packages, run the following:

sudo yum install lightos-monitoring-images lightos-monitoring-clustering

For Debian-package-based operating systems (for example, Ubuntu), see Connecting to the Cluster Client DEB Repository, and then run: sudo apt-get install lightos-monitoring-images lightos-monitoring-clustering

Monitoring Stack Deployment

To start running the Prometheus and Grafana containers, run the following (clustering):

/var/lib/monitoring-images/deploy.sh deploy

Configuring Prometheus

After deployment, Prometheus should already be configured with all of the scrape jobs, alerting rules, and recording rules. The only remaining step is to add the targets for each Lightbits cluster to monitor.

Since this information is deployment-specific, each deployment should follow the provided example and set the hosts accordingly.

Each Prometheus instance can monitor multiple clusters at the same time.

To add a cluster to monitor or to update an existing cluster, run the commands in the section below.

Adding Prometheus Targets

The following example illustrates how to generate the configuration for a cluster named cluster1, which has three servers:

rack01-server01
rack01-server02
rack01-server03

You will then need to configure Prometheus to scrape the services that run on all of the nodes.

The following command generates targets.yaml files that define Prometheus endpoints to scrape. See the <file_sd_config> section of the Prometheus Documentation for additional information.

/var/lib/monitoring-images/deploy.sh add_cluster \
  -c cluster1 \
  -i rack01-server01,rack01-server02,rack01-server03

This action creates the following files:

/var/lib/monitoring-clustering/file_sd_configs/api-service/cluster1-targets.yaml

- labels:
    job: cluster_1
  targets:
  - rack01-server01:443
  - rack01-server02:443
  - rack01-server03:443

/var/lib/monitoring-clustering/file_sd_configs/lightbox-exporter/cluster1-targets.yaml

- labels:
    job: cluster_1
  targets:
  - rack01-server01:8090
  - rack01-server02:8090
  - rack01-server03:8090
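
To make the shape of these files easier to see, the sketch below prints a target group in the same layout from a job label, a port, and a comma-separated host list. It is only an illustration of the file_sd format: write_targets is a hypothetical helper, not part of deploy.sh, and it says nothing about how deploy.sh produces its output.

```shell
#!/bin/sh
# write_targets JOB PORT HOST[,HOST...]
# Prints one file_sd target group in the layout of the files shown above.
# Hypothetical helper for illustration only.
write_targets() {
    job=$1
    port=$2
    printf '%s\n' "- labels:" "    job: $job" "  targets:"
    # Split the comma-separated host list and emit one list item per host.
    printf '%s\n' "$3" | tr ',' '\n' | while IFS= read -r host; do
        if [ -n "$host" ]; then
            printf '  - %s:%s\n' "$host" "$port"
        fi
    done
}
```

For example, write_targets cluster_1 443 rack01-server01,rack01-server02,rack01-server03 prints the same content as the api-service file above, which can be handy when comparing a generated file against what you expect.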

Ensure that the YAML configuration files are readable by Prometheus. If necessary, reset the file permissions in the Prometheus container to rw-r--r-- rather than rw-------.
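
One way to apply and confirm these permissions is sketched below. The helper names are hypothetical, and the sketch assumes the files are changed from the host side of the bind mount (the file_sd_configs folder), where mode changes are also visible inside the container; root privileges may be required.

```shell
#!/bin/sh
# fix_target_perms DIR: set rw-r--r-- (644) on every *-targets.yaml under DIR.
fix_target_perms() {
    find "$1" -type f -name '*-targets.yaml' -exec chmod 644 {} +
}

# list_target_perms DIR: show the current mode of each targets file.
list_target_perms() {
    find "$1" -type f -name '*-targets.yaml' -exec ls -l {} +
}
```

For example, run fix_target_perms /var/lib/monitoring-clustering/file_sd_configs followed by list_target_perms on the same path, and check that each line starts with -rw-r--r--.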

Because the /var/lib/monitoring-clustering/file_sd_configs folder is bind-mounted into the Prometheus container, the add_cluster command also issues a reload to Prometheus, which is configured to collect these endpoints.

Verify that the targets were configured correctly by viewing http://<prometheus_host>:9090/targets.
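
The same check can be scripted against the Prometheus HTTP API, whose /api/v1/targets endpoint reports a health field for each active target. The sketch below is a rough text-based count that avoids a JSON parser; count_up_targets is a hypothetical helper name.

```shell
#!/bin/sh
# count_up_targets: read a /api/v1/targets JSON response on stdin and
# print how many targets report health "up".
# Text matching only; a JSON-aware tool would be more robust.
count_up_targets() {
    grep -o '"health"[[:space:]]*:[[:space:]]*"up"' | wc -l | tr -d '[:space:]'
}
```

For example: curl -s "http://<prometheus_host>:9090/api/v1/targets" | count_up_targets. For the three-server example, the six Lightbits endpoints (three hosts, two services each) should be included in the count once they are all being scraped successfully; other scrape targets, if any, are counted as well.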

Removing Prometheus Targets

The following command reverses the add_cluster command:

/var/lib/monitoring-images/deploy.sh remove_cluster -c cluster1

This deletes the following files and then issues a reload to Prometheus:

/var/lib/monitoring-clustering/file_sd_configs/api-service/cluster1-targets.yaml
/var/lib/monitoring-clustering/file_sd_configs/lightbox-exporter/cluster1-targets.yaml

You can verify that the configuration works by:

Navigating to http://<prometheus_host>:9090/config and making sure that the expected configuration is used.
Navigating to http://<prometheus_host>:9090/targets?search= and making sure that the targets configured in the previous step are updated.

Configuring Grafana

Grafana should be configured automatically with the deployed Prometheus instance as its data source, and with all of the dashboards that Lightbits provides to monitor the cluster.

Cleaning Up Deployed Containers

To run the deployment again, or to remove the artifacts that the deployment created on the machine, run the following command:

/var/lib/monitoring-images/deploy.sh clean

Uninstalling Monitoring Packages

To uninstall monitoring packages, run the following command:

sudo yum remove lightos-monitoring-images lightos-monitoring-clustering

For Debian-package-based operating systems (for example, Ubuntu), run: sudo apt-get remove lightos-monitoring-images lightos-monitoring-clustering

Enabling Persistent Journaling

To aid with troubleshooting and support, Lightbits recommends setting the journal to keep logs even after reboots and shutdowns. To ensure that the OS disk does not fill up, Lightbits also recommends updating the log rotation.

Enable persistent journaling:

sudo sed -i 's/#Storage.*/Storage=persistent/' /etc/systemd/journald.conf
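
Because the sed expression only matches a commented-out #Storage line, it is worth confirming that the setting was actually written. The following is a minimal sketch; journald_storage is a hypothetical helper name.

```shell
#!/bin/sh
# journald_storage FILE: print the active (uncommented) Storage= value
# from a journald.conf-style file, or nothing if it is not set.
journald_storage() {
    sed -n 's/^Storage=//p' "$1" | tail -n 1
}
```

Running journald_storage /etc/systemd/journald.conf should print persistent. The setting takes effect once the journal service is restarted, for example with sudo systemctl restart systemd-journald, or after the next reboot.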

Enable log rotation:

sudo sed -i 's|    missingok$|    daily\n    rotate 30\n    compress\n    missingok\n    notifempty|g' /etc/logrotate.d/syslog
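
Since this command edits a system file in place, it can help to preview the result first. The sketch below applies the same expression without -i, so the file is left untouched; it assumes GNU sed (which expands \n in the replacement), and preview_logrotate_edit is a hypothetical helper name.

```shell
#!/bin/sh
# preview_logrotate_edit FILE: print FILE as it would look after the
# log rotation edit, without modifying it. Assumes GNU sed.
preview_logrotate_edit() {
    sed 's|    missingok$|    daily\n    rotate 30\n    compress\n    missingok\n    notifempty|g' "$1"
}
```

For example, preview_logrotate_edit /etc/logrotate.d/syslog shows the new directives around the missingok line. Note that the missingok line is still present after the edit, so running the in-place command a second time would insert the directives again; checking for an existing rotate 30 line first avoids that. The resulting file can then be validated with logrotate -d /etc/logrotate.d/syslog, which runs in debug mode without rotating anything.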
