Operational Best Practices and Maintenance

The following table summarizes Lightbits operational best practices and maintenance topics, with details and a reference for each.

Subject | Details
Cluster deployment
  • Leverage commodity servers to deliver flexible, scalable, and high-performance storage, with a focus on optimal hardware configurations, network setups, and performance enhancements.

For additional information, see Lightbits Cluster Deployment.

SSD replacement after SSD failure
  • How to replace SSDs, including detecting failures, the implications for capacity, and adding healthy SSDs.

For additional information, see SSD Failure Handling.

Server maintenance flows
  • Enabling/disabling servers.
  • When to stop/start services.
  • Fail in place, the per-server fail-in-place timeout, and eviction.

For additional information, see Lightbits Server Maintenance and Handling.

Adding a new server to a cluster
  • Scaling the cluster.

For additional information, see Adding a New Server.

Removing a server from a cluster
  • Disabling a server and removing it from the cluster.

For additional information, see Removing a Server from a Cluster.

Installation troubleshooting
  • Cluster installation failures, log artifact collection, and cleaning Lightbits from servers or a cluster.

For additional information, see Installation Troubleshooting.

General troubleshooting
  • Disk usage, inactive nodes, and NVMe device issues.

For additional information, see the Introduction.

x86 Servers
  • Lightbits memory utilization and Linux memory management on x86 servers.

For additional information, see Lightbits Memory Utilization and Linux Memory Management on x86 Servers.
