Logs and Troubleshooting
Backup Failure Detection
To monitor backup process failures, see the Lightbits on AWS Event Logs section.
Backup State
The backup is either running or not. You can view the state of the backup in the Parameter Store.
- In the AWS console, go to the Parameter Store.
- Filter the parameters by backup-state. There should be one backup-state parameter for each cluster.
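The same lookup can be scripted. The following is a minimal sketch, not part of the product, that assumes the boto3 SSM client and assumes each backup-state value is a JSON string like the samples in this section. The function name find_backup_states is hypothetical; the client is passed in so the function can be tried with a stub.

```python
import json


def find_backup_states(ssm_client, name_filter="backup-state"):
    """Return {parameter name: parsed state} for parameters whose name contains name_filter.

    ssm_client is expected to behave like boto3.client("ssm").
    """
    states = {}
    kwargs = {
        "ParameterFilters": [
            {"Key": "Name", "Option": "Contains", "Values": [name_filter]}
        ]
    }
    while True:
        page = ssm_client.describe_parameters(**kwargs)
        for meta in page.get("Parameters", []):
            param = ssm_client.get_parameter(Name=meta["Name"], WithDecryption=True)
            # Assumption: the parameter value is a JSON document.
            states[meta["Name"]] = json.loads(param["Parameter"]["Value"])
        token = page.get("NextToken")
        if not token:
            return states
        kwargs["NextToken"] = token  # follow pagination until exhausted
```

With boto3 installed and AWS credentials configured, calling find_backup_states(boto3.client("ssm")) should return one entry per cluster.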

If no backup is running and the exporter servers are down (which is normal between backups), the backup-state looks like this:
{
  "allocator": {"lastReq": "0001-01-01T00:00:00Z", "instances": null},
  "Exporter": {"backups": null},
  "scheduler": {"backupNextTime": "2022-08-24T16:41:20.965508769Z"},
  "restore": {"ongoingCommands": null}
}
If the exporters are running, the backup-state lists their instance IDs and IP addresses. It also shows any backup or restore jobs that are active on the exporter instances.
{
  "allocator": {
    "lastReq": "2022-08-23T05:18:57.495155348Z",
    "Instances": [
      {"ip": "10.240.35.28", "Busy": false, "Id": "i-00132f72678c312fe"},
      {"ip": "10.240.35.197", "Busy": false, "Id": "i-059ad195cd8b7afe2"}
    ]
  },
  "Exporter": {
    "backups": [
      {
        "uuid": "35763758-f411-44cd-b728-f8f31ecc2970",
        "asgId": "i-059ad195cd8b7afe2",
        "backupStartTime": "2022-08-23T05:17:57.610697817Z",
        "retryCount": 0
      }
    ]
  },
  "Scheduler": {"backupNextTime": "2022-08-24T05:15:56.917696289Z"},
  "restore": {"ongoingCommands": null}
}
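To read these values without scanning the JSON by eye, a short script can summarize a backup-state value. This is an illustrative sketch only. The function names are hypothetical, and because key capitalization varies between the samples in this section, the sketch lowercases all keys before looking them up.

```python
import json


def lower_keys(value):
    """Recursively lowercase dict keys so "Instances" and "instances" match."""
    if isinstance(value, dict):
        return {k.lower(): lower_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [lower_keys(v) for v in value]
    return value


def summarize_state(raw):
    """Summarize exporters, active backups, and the next scheduled backup."""
    state = lower_keys(json.loads(raw))
    # Sections or lists may be null when nothing is running.
    instances = (state.get("allocator") or {}).get("instances") or []
    backups = (state.get("exporter") or {}).get("backups") or []
    return {
        "exporters_running": len(instances) > 0,
        "exporters": [(i.get("id"), i.get("ip"), i.get("busy")) for i in instances],
        "active_backups": [(b.get("uuid"), b.get("asgid")) for b in backups],
        "next_backup": (state.get("scheduler") or {}).get("backupnexttime"),
    }


# Idle sample from this section: no exporters, no backups.
IDLE = (
    '{"allocator":{"lastReq":"0001-01-01T00:00:00Z","instances":null},'
    '"Exporter":{"backups":null},'
    '"scheduler":{"backupNextTime":"2022-08-24T16:41:20.965508769Z"},'
    '"restore":{"ongoingCommands":null}}'
)
print(summarize_state(IDLE))
```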
To change the time of the next scheduled backup, click Edit on the backup-state parameter in the Parameter Store and change the backupNextTime value. Keep the timestamp format used in the existing value (UTC, 24-hour clock). Edit the Parameter Store with care, and do not change any other field.
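Because only backupNextTime may change, it can help to generate the edited value rather than type it by hand. The sketch below is an assumption-laden illustration: set_backup_next_time is a hypothetical helper, and the timestamp is written with nine fractional digits and a trailing Z purely to mirror the shape of the sample values. Whether other timestamp shapes are accepted has not been verified.

```python
import json
from datetime import datetime, timezone


def set_backup_next_time(raw_state, when):
    """Return the backup-state JSON with only backupNextTime replaced.

    when must be a timezone-aware datetime; it is converted to UTC.
    """
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    # Assumption: mimic the sample format, e.g. 2022-08-24T16:41:20.965508769Z
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f") + "000Z"
    state = json.loads(raw_state)
    for section_name, section in state.items():
        if section_name.lower() != "scheduler" or not isinstance(section, dict):
            continue
        for key in section:
            if key.lower() == "backupnexttime":
                section[key] = stamp  # the only field that is modified
                return json.dumps(state, separators=(",", ":"))
    raise KeyError("backupNextTime not found in backup-state")
```

The returned string can then be pasted into the parameter value in the console; every other field is carried over unchanged.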
If a restore is running, this can also be seen in the backup-state (the restore information is at the end):
{
  "allocator": {
    "lastReq": "2022-08-24T19:43:56.296285097Z",
    "Instances": [
      {"ip": "10.240.131.94", "Busy": false, "Id": "i-08f62405e94fae03a"},
      {"ip": "10.240.131.14", "Busy": false, "Id": "i-0c3f6215c38db2baa"}
    ]
  },
  "Exporter": {"backups": []},
  "Scheduler": {"backupNextTime": "2022-08-25T19:02:55.987330394Z", "backupOneshot": false},
  "Restore": {
    "ongoingCommands": [
      {
        "cmd": {
          "s3": {
            "clusterUUID": "0839bc48-1f33-4ecc-bbbc-e1a72007596b",
            "volumeUUID": "c520cb82-92e2-4171-a058-3aaeaa8dfb93",
            "backupUUID": "5df9c331-a7d0-4f3e-9f57-e01a4974fd40"
          },
          "lb": {"name": "v1_restore", "size": "1GiB", "proj": "default"}
        },
        "restoreUUID": "0bd5a0e8-7536-49a0-8df3-b4b602f525f9",
        "instanceId": "i-08f62405e94fae03a",
        "retreies": 0
      }
    ]
  }
}
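The restore entries can be pulled out of a backup-state value with a few lines of code. This is a minimal sketch, with a hypothetical function name, that assumes the field names shown in the sample above and looks up the restore section case-insensitively because its capitalization differs between samples.

```python
import json


def ongoing_restores(raw):
    """Return one flat dict per entry in the restore section's ongoingCommands."""
    state = json.loads(raw)
    # The section appears as "restore" or "Restore" in the samples.
    restore = next((v for k, v in state.items() if k.lower() == "restore"), None) or {}
    rows = []
    for entry in restore.get("ongoingCommands") or []:
        cmd = entry.get("cmd") or {}
        s3 = cmd.get("s3") or {}
        lb = cmd.get("lb") or {}
        rows.append({
            "restoreUUID": entry.get("restoreUUID"),
            "instanceId": entry.get("instanceId"),
            "backupUUID": s3.get("backupUUID"),
            "volumeUUID": s3.get("volumeUUID"),
            "lbName": lb.get("name"),
            "lbSize": lb.get("size"),
        })
    return rows


SAMPLE = (
    '{"Restore":{"ongoingCommands":[{"cmd":{"s3":{'
    '"clusterUUID":"0839bc48-1f33-4ecc-bbbc-e1a72007596b",'
    '"volumeUUID":"c520cb82-92e2-4171-a058-3aaeaa8dfb93",'
    '"backupUUID":"5df9c331-a7d0-4f3e-9f57-e01a4974fd40"},'
    '"lb":{"name":"v1_restore","size":"1GiB","proj":"default"}},'
    '"restoreUUID":"0bd5a0e8-7536-49a0-8df3-b4b602f525f9",'
    '"instanceId":"i-08f62405e94fae03a","retreies":0}]}}'
)
for row in ongoing_restores(SAMPLE):
    print(row)
```

An empty list means no restore is in progress.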
- The backup status can also be viewed in the Lambda log.

- In the AWS console, go to CloudWatch > Log Groups. Filter for backup and sort by the created time. The log group should be under /aws/lambda.

- Select one of the log files or click Search log group and filter by:
- backup_lambda: Shows a short “Backup Lambda Triggered” message for each trigger of the Lambda. This message on its own means that there was no work to do. Any other message means that the Lambda performed an action in the backup or restore process.
- asg: Shows the startup and shutdown of the exporter instances in preparation for a backup or restore process.

- Clear the filter to review any errors in the Backup Lambda log.
- Debug log level: To get a more detailed log, set the Debug environment variable of the Lambda to true (for additional information, see the section on how to update Lambda environment variables).
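Setting the Debug variable can also be scripted. The sketch below assumes the boto3 Lambda client; the function name set_debug and the Lambda function name you pass in are hypothetical, and the exact spelling of the variable name (Debug) is taken from the text above. Updating a function configuration replaces the whole set of environment variables, so the sketch reads the current variables first and merges the new value in.

```python
def set_debug(lambda_client, function_name, value="true"):
    """Set the Debug environment variable while keeping all other variables.

    lambda_client is expected to behave like boto3.client("lambda").
    """
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    # Copy the existing variables; the update call overwrites the full set.
    variables = dict((config.get("Environment") or {}).get("Variables") or {})
    variables["Debug"] = value
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": variables},
    )
    return variables
```

Passing value="false" through the same function turns the detailed log off again.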
Backup State
With the Debug log level enabled, the log shows the backup state, which is one of the following:
- Scheduling: The backup clone is created and waiting for the backup to start.
- Arm: The backup is allocated to a specific backup exporter instance. The log entry also includes the AWS instance ID of the allocated exporter.
- Ongoing: The backup has started and data is being copied to S3.
- Done: The backup finished and the volume is waiting for cleanup.
- Failed: The backup failed. This state is logged at the Error log level, together with the reason for the failure.
Ignore the volume ACL information returned by the list volumes API. Do not use it to determine the state of a backup or restore.