Lightbits Supported Events

Lightbits services generate events for the various operations, activities, status changes, errors, and warnings that can occur in a Lightbits cluster. These events can be collected via the official Lightbits API or, starting with Release 3.0.3, by scraping them from the Lightbits service log files.

An event's format and content are identical whichever way it is collected (API or logs). Events in logs are preceded by a dedicated "LBEVENT" tag, which simplifies filtering them out of the log stream.
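
Because every event line carries the LBEVENT tag, filtering events out of a service log only takes a substring match. The following is a minimal sketch in Python; the log path and the tail-style polling loop are illustrative assumptions, and only the LBEVENT tag itself comes from this document:

```python
# Minimal sketch: follow a Lightbits service log and yield LBEVENT lines.
# The log path and the polling approach are illustrative assumptions;
# only the "LBEVENT" tag is documented behavior.
import time

def follow_events(path="/var/log/node-manager.log"):
    """Yield log lines containing the LBEVENT tag as they are written."""
    with open(path, "r") as log:
        log.seek(0, 2)  # start at the end of the file, like `tail -f`
        while True:
            line = log.readline()
            if not line:
                time.sleep(0.5)  # no new output yet; wait and retry
                continue
            if "LBEVENT" in line:
                yield line.rstrip()

if __name__ == "__main__":
    for event_line in follow_events():
        print(event_line)
```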

The default location for all Lightbits service logs is /var/log/<service-name>.log; however, this can be customized per cluster installation by updating the service role template:

```yaml
logging:
  filename: /var/log/node-manager.log
```

| Event Name | Event Cause | Event Code | Cause Code | Severity | Event Type | Reporting Service | Description |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Node Inactive | Local RAID failure | 201 | 102 | High | Node | Node Manager | Node state has changed to inactive due to multiple NVMe device failures (EC enabled) or a single device failure (no EC). Note: This event was renamed. |
| Node Inactive | OS health check failed | 201 | 110 | High | Node | Node Manager | Node state has changed to inactive due to a kernel panic being detected in the OS. |
| Node Inactive | Duroslight health check failed | 201 | 105 | Medium | Node | Node Manager | Node state has changed to inactive due to a failure with Duroslight. |
| Node Inactive | Heartbeat to node failed | 201 | 107 | Medium | Node | Cluster Manager | Node state has changed to inactive due to a node being unreachable (network failure/server down). |
| Node Inactive | Connectivity issue | 201 | 108 | Medium | Node | Cluster Manager | Node state has changed to inactive due to connectivity errors in the dataplane. |
| Node Inactive | FailedToEnableEncryption | 201 | 1304 | High | Node | Cluster Manager | Node state has changed to inactive due to a failure to enable encryption. |
| Node Active | N/A | 200 | 1 | Info | Node | Cluster Manager | Node is active. |
| Node in Permanent Failure State | Node Inactive | 206 | 112 | High | Node | Cluster Manager | Node entered permanent failure state (the duration of the node's inactive state exceeded the configured inactivity time window). |
| Node Powerup | Started | 208 | 2 | Info | Node | Node Manager | Node powerup started. |
| Node Powerup | Completed | 208 | 3 | Info | Node | Node Manager | Node powerup completed. |
| Node Powerup | Failed | 208 | 4 | High | Node | Node Manager | Node powerup failed. |
| Node High Storage Usage | PerformanceDegredationHighCapacityUtilization | 204 | 500 | Info | Node | Node Manager | Node entered degraded performance state due to high storage utilization (> 70%). |
| Node High Storage Usage | PerformanceDegredationHighCapacityUtilizationMd | 204 | 501 | Medium | Node | Node Manager | Node entered degraded performance state due to high metadata utilization. |
| Node Read-Only State | Node entered read-only state | 209 | 504 | High | Node | Node Manager | Node entered read-only state. |
| Node Read-Only State | Node exited read-only state | 209 | 506 | Info | Node | Node Manager | Node exited the read-only state. |
| NodeStorageCapacity | ReadOnlyModeMD | 204 | 505 | High | Node | Node Manager | Node metadata is at high utilization (90%) and nearing read-only state. |
| NodeRaidRebuildStatus | Initiated | 205 | 2 | Medium | Node | Node Manager | Local RAID rebuild has started. |
| NodeRaidRebuildStatus | Completed | 205 | 3 | Info | Node | Node Manager | Local RAID rebuild has completed. |
| NodeRaidRebuildStatus | ReadOnlyMode (Halted) | 205 | 504 | High | Node | Node Manager | Local RAID rebuild was halted (the node entered read-only state, leaving no free storage to complete the rebuild). |
| NodeRaidRebuildStatus | ExitReadOnlyMode (Resume) | 205 | 506 | Medium | Node | Node Manager | Local RAID rebuild resumed (continued after it was halted). |
| Node Unattached | N/A | 210 | 1 | Info | Node | Cluster Manager | Node entered Unattached state. All volume and snapshot resources were migrated off this node. |
| Server Clock Drift | Clock drift detected | 1000 | 1000 | High | Server | Node Manager | Detected a clock drift between the reporting server and the rest of the servers in the cluster. |
| Server Linux VM Write Cache Configuration Error | Server Linux VM Write Cache Configuration Error | 1200 | 1200 | High | Server | Node Manager | For internal use. |
| Server Upgrade | Started | 301 | 2 | Info | Server | Upgrade Manager | Upgrade of the server started. |
| Server Upgrade | Finished | 301 | 3 | Info | Server | Upgrade Manager | Upgrade of the server completed. |
| Server Upgrade | Failed | 301 | 4 | High | Server | Upgrade Manager | Upgrade of the server has failed. Note: Additional failure-specific information will be returned here. |
| Server Upgrade Skipped | Server non-upgradeable | 302 | 1100 | Medium | Server | Upgrade Manager | The upgrade operation was skipped for this server because it is non-upgradeable. Note: Additional information specific to the skip cause will be returned here. |
| UpgradeManagerStartupFailed | Failed | 700 | 4 | High | Server | Upgrade Manager | Failed to complete (the upgrade manager failed to start up). |
| Cluster Upgrade | Started | 401 | 2 | Info | Server | Upgrade Manager | Started an upgrade of the cluster. |
| Cluster Upgrade | Finished | 401 | 3 | Info | Server | Upgrade Manager | Completed an upgrade of the cluster. |
| Cluster Upgrade | Failed | 401 | 4 | High | Server | Upgrade Manager | Upgrade of the cluster has failed. Note: Additional failure-specific information will be returned here. |
| NVMeSSDUnhealthy | DeviceHealthReachedMaxReadRetries | 1001 | 1003 | Medium | NVMe SSD | Node Manager | An NVMe SSD device is unhealthy. |
| NVMeSSDUnhealthy | DeviceHealthReachedMaxWriteRetries | 1001 | 1002 | Medium | NVMe SSD | Node Manager | An NVMe SSD device is unhealthy. |
| NVMeSSDUnhealthy | DeviceHealthAbortedWriteCmds | 1001 | 1004 | Medium | NVMe SSD | Node Manager | An NVMe SSD device is unhealthy. |
| NVMeSSDUnhealthy | DeviceHealthAbortedReadCmds | 1001 | 1005 | Medium | NVMe SSD | Node Manager | An NVMe SSD device is unhealthy. |
| NVMeSSDUnhealthy | DeviceHealthWriteErrors | 1001 | 1007 | Medium | NVMe SSD | Node Manager | An NVMe SSD device is unhealthy. |
| NVMeSSDUnhealthy | DeviceHealthReadErrors | 1001 | 1006 | Medium | NVMe SSD | Node Manager | An NVMe SSD device is unhealthy. |
| NVMeDeviceFailed | N/A | 501 | 1 | High | NVMe SSD | Node Manager | An NVMe SSD device has failed. |
| NVMeDeviceAdded | N/A | 502 | 1 | Info | NVMe SSD | Node Manager | Added a new NVMe SSD to a node. |
| AddNVMeDeviceOperationFailed | AddNVMeDeviceTaskFailure | 503 | 901 | High | NVMe SSD | Node Manager | Failed to add a new NVMe SSD to a node. |
| Volumes Fully Protected | N/A | 600 | 1 | Info | Volume | Cluster Manager | Volumes are in the fully protected protection state (the event is issued on entering the new state and returns a list of the volumes affected by the change). |
| Volumes in Degraded State | N/A | 601 | 1 | Medium | Volume | Cluster Manager | Volumes are in the degraded protection state (the event is issued on entering the new state and returns a list of the volumes affected by the change). |
| Volumes in Read-Only | N/A | 602 | 1 | High | Volume | Cluster Manager | Volumes are in the read-only protection state (the event is issued on entering the new state and returns a list of the volumes affected by the change). |
| Volumes are Unavailable | N/A | 603 | 1 | High | Volume | Cluster Manager | Volumes are in the unavailable protection state (the event is issued on entering the new state and returns a list of the volumes affected by the change). |
| ClusterCapacityFull | HighClusterStorageUtilization | 402 | 1001 | High | Cluster | Cluster Manager | Cluster utilization is high. |
| UnRecoverableDataIntegrity | DataIntegrityDuringRebuildDueToInheritanceOfCorruption | 1100 | 1113 | Critical | Volume/NVMe SSD | Node Manager | Unrecoverable data integrity error. |
| UnRecoverableDataIntegrity | DataIntegrityDuringUserReadsDueToInheritanceOfCorruption | 1100 | 1112 | Critical | Volume/NVMe SSD | Node Manager | Unrecoverable data integrity error. |
| UnRecoverableDataIntegrity | DataIntegrityDuringRebuildDueToMalfunctioningDevices | 1100 | 1111 | Critical | Volume/NVMe SSD | Node Manager | Unrecoverable data integrity error. |
| UnRecoverableDataIntegrity | DataIntegrityDuringUserReadsDueToMalfunctioningDevices | 1100 | 1110 | Critical | Volume/NVMe SSD | Node Manager | Unrecoverable data integrity error. |
| RecoverableDataIntegrity | DataIntegrityDuringRebuildDueToInheritanceOfCorruption | 1101 | 1113 | High | Volume/NVMe SSD | Node Manager | Recoverable data integrity error. |
| RecoverableDataIntegrity | DataIntegrityDuringUserReadsDueToInheritanceOfCorruption | 1101 | 1112 | High | Volume/NVMe SSD | Node Manager | Recoverable data integrity error. |
| RecoverableDataIntegrity | DataIntegrityDuringRebuildDueToMalfunctioningDevices | 1101 | 1111 | High | Volume/NVMe SSD | Node Manager | Recoverable data integrity error. |
| RecoverableDataIntegrity | DataIntegrityDuringUserReadsDueToMalfunctioningDevices | 1101 | 1110 | High | Volume/NVMe SSD | Node Manager | Recoverable data integrity error. |
| UnRecoverableDataIntegrity | DataIntegrityDuringNodeRebuild | 1100 | 1116 | Critical | Node/NVMe SSD | Node Manager | Detected data integrity error during node rebuild. |
| RecoverableDataIntegrity | DataIntegrityDuringRecoveryFromGracefulShutdown | 1101 | 1115 | High | Node/NVMe SSD | Node Manager | Detected data integrity error during node rebuild, while recovering from a graceful shutdown. |
| GarbageCollectionDataIntegrity | DataIntegrityDuringGarbageCollection | 1102 | 1118 | High | Node/NVMe SSD | Node Manager | Detected data integrity error during garbage collection processing. |
| InitializingServerEncryptionFailed | FailedToGetKEK | 1400 | 1500 | Medium | Server Encryption | Node Manager | Failed to get the encryption key from the cluster manager. |
| InitializingServerEncryptionFailed | FailedToInitializeTPM | 1400 | 1501 | Medium | Server Encryption | Node Manager | Failed to initialize the TPM key. |
| InitializingServerEncryptionFailed | FailedToReadKEK | 1400 | 1502 | Medium | Server Encryption | Node Manager | Failed to read the encryption key from the cache. |
| InitializingServerEncryptionFailed | FailedToWriteKEK | 1400 | 1503 | Medium | Server Encryption | Node Manager | Failed to save the encryption key. |
| VolumeEncryptionFailed | MissingDEK | 1301 | 1300 | Critical | Cluster Encryption | Node Manager | Failed to encrypt or update a volume on a node due to a missing DEK. |
| VolumeEncryptionFailed | CorruptedDEK | 1301 | 1301 | Critical | Cluster Encryption | Node Manager | Failed to encrypt or update a volume on a node due to a corrupted DEK. |
| EnableClusterEncryptionFailed | NotEnoughActiveNodes | 1302 | 1302 | Medium | Cluster Encryption | Cluster Manager | There are not enough servers with active nodes up to enable encryption. |
| EnableClusterEncryptionFailed | FailedToDistributeKekToNodes | 1302 | 1303 | Medium | Cluster Encryption | Cluster Manager | Cluster Manager failed to distribute the Cluster Encryption Key to all servers with active nodes. |
| EnableClusterEncryptionFailed | FailedToEnableEncryption | 1302 | 1304 | Medium | Cluster Encryption | Cluster Manager | Failed to enable encryption at the cluster level. |
| EnableClusterEncryption | Initiated | 1303 | 2 | Info | Cluster Encryption | Cluster Manager | Cluster encryption process was initiated. |
| EnableClusterEncryption | Completed | 1303 | 3 | Info | Cluster Encryption | Cluster Manager | The cluster encryption process completed successfully. Your cluster is encrypted. |
| ServerDisableOperation | Completed | 321 | 3 | Info | Server | Cluster Manager | Server disable operation completed successfully (moved the server into maintenance mode). |
| ServerDisableOperation | Failed to Complete | 321 | 4 | Medium | Server | Cluster Manager | Server disable operation failed to complete (a more specific error cause is returned in the event). |
| ServerEnableOperation | Completed | 322 | 3 | Info | Server | Cluster Manager | Server enable operation completed successfully (moved the server out of maintenance mode). |
| ServerCreated | Completed | 311 | 3 | Info | Server | Cluster Manager/Node Manager | New server created in the cluster (server added to the cluster). |
| ServerCreateFailed | Failed to Complete | 312 | 4 | High | Server | Cluster Manager | Failed to add a new server to the cluster. |
| ServerDeleted | Completed | 313 | 3 | Info | Server | Cluster Manager | Deleted a server from the cluster. |
| ServerEviction | Initiated | 351 | 2 | High | Server | Cluster Manager | Started an eviction of a server's resources. |
| ServerEviction | Completed | 351 | 3 | Info | Server | Cluster Manager | Completed an eviction of a server's resources. |
| ServerEviction | Failed to Complete | 351 | 4 | High | Server | Cluster Manager | Failed to complete an eviction of a server's resources. |
| ServerEvictionAborted | Initiated | 352 | 2 | Info | Server | Cluster Manager | Initiated an abort of an ongoing server eviction. |
| ServerEvictionAborted | Completed | 352 | 3 | Info | Server | Cluster Manager | Completed issuing an abort of an ongoing eviction. |
| Cluster root key rotation | Initiated | 1304 | 2 | Info | Cluster Encryption | Cluster Manager | Started the process to rotate the cluster root key. |
| Cluster root key rotation | Completed | 1304 | 3 | Info | Cluster Encryption | Cluster Manager | Completed the process of rotating the cluster root key. |
| Cluster root key rotation | Failed to Complete | 1304 | 4 | Medium | Cluster Encryption | Cluster Manager | Failed to complete the process of rotating the cluster root key. This can be a fatal error that fails the rotation entirely, or a non-fatal one that only causes the rotation to take longer or become stuck without failing. |
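
When consuming these events programmatically, the (Event Code, Cause Code) pair identifies the exact condition, and the Severity column suggests how urgently to react. The sketch below is a hedged illustration of severity-based triage in Python; the tier names and routing choices are assumptions, while the severity levels and the example codes come from the table above:

```python
# Minimal sketch: triage collected events by the severity listed in the
# table above. Extracting (event_code, cause_code, severity) from a raw
# event is omitted, since the payload format depends on whether the event
# came from the API or from a scraped LBEVENT log line.
PAGE = {"Critical", "High"}   # alert a human immediately
TICKET = {"Medium"}           # open a follow-up ticket

def triage(event_code: int, cause_code: int, severity: str) -> str:
    """Return a handling tier for one event, keyed on its severity."""
    if severity in PAGE:
        tier = "page"
    elif severity in TICKET:
        tier = "ticket"
    else:
        tier = "log-only"     # "Info" events are informational
    print(f"event {event_code}/{cause_code} [{severity}] -> {tier}")
    return tier

# Example: UnRecoverableDataIntegrity during node rebuild is event 1100,
# cause 1116, severity Critical in the table, so it should page.
triage(1100, 1116, "Critical")  # -> page
```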