Cluster-Level Encryption

Cluster-level software encryption protects the data stored on drives (encryption at rest): if any drive is removed from the cluster, the data on it remains encrypted and cannot be read as plain text. The data is encrypted using AES-XTS-256. The keys are protected either by the servers’ Trusted Platform Modules (TPMs) or by software encryption.

This feature requires a cluster installed with version 3.12.1, and it can only be enabled while the newly installed cluster has no volumes or snapshots. Once activated, the feature cannot be disabled. For more information on how to enable and use the feature, see lbcli enable cluster-encryption and the REST API documentation.

The data is encrypted with a Data Encryption Key (DEK), which is in turn encrypted with a Key Encryption Key (KEK). This reduces the attack surface: each volume has a dedicated DEK, and a separate cluster-level KEK encrypts all of the DEKs. Because only the DEKs are encrypted with the KEK, the KEK can be rotated without re-encrypting the data itself.
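
The two-level key hierarchy can be illustrated with a toy shell sketch. It does not show how the cluster implements key wrapping: AES-256-CBC through the openssl command line is only a stand-in, and the names wrap, unwrap, kek_gen1, and kek_gen2 are hypothetical. The sketch only demonstrates that rotating the KEK re-wraps the small DEK, while the DEK itself (and therefore any data encrypted with it) stays unchanged.

```shell
#!/usr/bin/env bash
# Toy envelope-encryption sketch (assumption: the openssl CLI is installed).
# AES-256-CBC is only a stand-in for the cluster's real key-wrapping scheme.

# wrap KEK_HEX IV_HEX PLAINTEXT -> base64 ciphertext on stdout
wrap() {
  printf '%s' "$3" | openssl enc -aes-256-cbc -K "$1" -iv "$2" -base64 -A
}

# unwrap KEK_HEX IV_HEX BASE64_CIPHERTEXT -> plaintext on stdout
unwrap() {
  printf '%s' "$3" | openssl enc -d -aes-256-cbc -K "$1" -iv "$2" -base64 -A
}

iv=$(openssl rand -hex 16)        # 128-bit IV
kek_gen1=$(openssl rand -hex 32)  # 256-bit cluster-level KEK, generation 1
kek_gen2=$(openssl rand -hex 32)  # new KEK created by a rotation
dek=$(openssl rand -hex 32)       # per-volume DEK

# The DEK is stored only in wrapped (encrypted) form.
wrapped_gen1=$(wrap "$kek_gen1" "$iv" "$dek")

# Rotation: unwrap with the old KEK, re-wrap with the new one.
# Volume data encrypted with the DEK is never touched.
wrapped_gen2=$(wrap "$kek_gen2" "$iv" "$(unwrap "$kek_gen1" "$iv" "$wrapped_gen1")")

recovered=$(unwrap "$kek_gen2" "$iv" "$wrapped_gen2")
if [ "$recovered" = "$dek" ]; then
  echo "DEK unchanged after KEK rotation"
fi
```

The cost of a rotation in this model grows with the number of DEKs, not with the amount of stored data.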

Enabling Cluster-Level Encryption

The cluster-encryption (encryption at rest) feature flag is optional and is set to false by default. To use this feature, first enable the cluster-encryption feature flag. At this point, encryption is not yet active, and the cluster and all of the stored data remain unencrypted. To activate encryption, run the Enable Encryption API.

You can disable the feature flag as long as cluster encryption itself has not been enabled. Once cluster encryption is enabled, the feature flag can no longer be disabled.

Enable Feature Flag

The REST API documentation describes feature flags generically and does not list each flag individually. The generic entry is /api/v2/featureFlags/{name}/enable.

CLI

REST

Cluster-Level Encryption Preconditions

  • The cluster-encryption feature flag is enabled.
  • If you plan to use the TPM, TPM v2.0 is enabled on all servers of the cluster.
  • There are no volumes or snapshots on the cluster.
  • If iptables or any other firewall rules are in place between the servers in the cluster, port 4007 is allowed between the servers of the cluster. The encryption feature needs this port to work properly.
  • If you are using firewalld, add the port and then restart the firewalld service.
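
A minimal sketch of the firewalld change, assuming the port uses TCP (the preconditions above do not state the protocol) and that the commands are run with root privileges on every server. The function name open_encryption_port and its dry-run default are hypothetical; firewall-cmd --permanent --add-port and systemctl restart are standard firewalld and systemd commands.

```shell
#!/usr/bin/env bash
# open_encryption_port [PORT] [PROTO] [RUNNER]
# RUNNER defaults to "echo" (dry run: print the commands only).
# Pass "sudo" or "env" as RUNNER to actually execute them.
open_encryption_port() {
  local port="${1:-4007}" proto="${2:-tcp}" runner="${3:-echo}"
  # Persist the rule so it survives reboots.
  "$runner" firewall-cmd --permanent --add-port="${port}/${proto}"
  # Restart firewalld so the permanent rule takes effect.
  "$runner" systemctl restart firewalld
}

# Dry run: show what would be executed for the default port 4007/tcp.
open_encryption_port
```

Reviewing the dry-run output before passing sudo as the third argument keeps the change easy to audit.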

Enabling the Encryption API Definition

Cluster-level encryption can only be enabled on clusters that do not have data (volumes or snapshots). To activate encryption on a cluster that has volumes or snapshots, delete them first.

Once encryption is enabled, the cluster will generate the required encryption keys. Before creating new volumes, validate that the cluster encryption state is enabled. This can be done using the lbcli get clusterinfo (2.2 and above) API.

There are two methods of storing the KEK securely in the cluster: software encryption (the default keystore if no value is set in the API command) or encryption by TPM 2.0. If TPM is not supported on all of the servers in the cluster, enabling encryption with TPM will fail and the cluster will remain unencrypted.

Once cluster-level encryption with TPM is enabled, make sure that any servers added to the cluster have TPM 2.0 enabled. If any other third-party software on the servers uses the TPM, also make sure that it does not perform any reset or clear commands. These could cause the cluster to lose access to the TPM and to the encryption key, and could even result in data loss.

If you are using encryption with TPM, make sure that any server you add to the cluster, or any replacement server, has TPM 2.0 enabled before installation.

It is recommended to run the enable encryption API only when the cluster is stable and all nodes are active on the same Lightbits cluster version (e.g., all servers are on v3.14). If too many nodes are inactive, the enable encryption process will fail and will have to be triggered again. Running the enable encryption API in the middle of an upgrade process could cause unexpected behavior and is not supported.
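
The recommendation above can be turned into a simple pre-flight check. The sketch below is only an illustration: the "NAME STATE VERSION" input format, the state label Active, and the function name cluster_ready are all hypothetical, so adapt them to however you list the servers in your cluster.

```shell
#!/usr/bin/env bash
# cluster_ready: read one "NAME STATE VERSION" line per server on stdin
# (a hypothetical format; adapt it to however you list your servers).
# Succeeds only if there is at least one server, every server is Active,
# and all servers report the same version.
cluster_ready() {
  awk '
    NR == 1        { version = $3 }
    $2 != "Active" { bad = 1 }
    $3 != version  { bad = 1 }
    END            { exit (bad || NR == 0) }
  '
}

# Example with made-up server names and states.
if printf '%s\n' "server00 Active 3.14" "server01 Active 3.14" "server02 Inactive 3.14" | cluster_ready; then
  echo "safe to enable encryption"
else
  echo "wait: not all servers are active on the same version"
fi
```

The check is deliberately stricter than the failure threshold mentioned above: it refuses to proceed if even one server is inactive or on a different version.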

Enabling the Encryption API

CLI

Because this change is irreversible, the CLI prompts you to confirm that you want to enable encryption.

or

REST

Data Encryption Key (DEK) Retention Time

Each volume has its own DEK. Once a volume and its snapshots have been deleted, the cluster also removes the DEK associated with the volume. The DEK is not removed immediately; it is deleted after a user-configurable retention period. The default retention period is seven days. Setting the retention time lower than seven days or higher than 30 days is not recommended.

Change this configuration value with caution.

This can be configured using the cluster-configuration CLI. For additional information, see lbcli update cluster-config (2.2 and above).
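
Before changing the value, it can help to check it against the recommended range. The sketch below is a hypothetical helper (check_dek_retention_days is not part of lbcli) that validates a proposed retention period in days. It does not change any cluster setting.

```shell
#!/usr/bin/env bash
# check_dek_retention_days DAYS
# Returns 0 if DAYS is inside the recommended 7-30 day range,
# 1 if it is a number outside that range, 2 if it is not a whole number.
check_dek_retention_days() {
  local days="$1"
  case "$days" in
    ''|*[!0-9]*) echo "not a whole number of days: '$days'" >&2; return 2 ;;
  esac
  if [ "$days" -lt 7 ] || [ "$days" -gt 30 ]; then
    echo "$days days is outside the recommended 7-30 day range" >&2
    return 1
  fi
  echo "$days days is within the recommended range"
}

check_dek_retention_days 10
```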

Example

Set the DEK retention period to 10 days:

Viewing Cluster-Level Encryption Information

In order to view cluster-level encryption information, you can use the lbcli get clusterinfo (2.2 and above) API.

You can also see the same information using:

The output contains extensive information about the cluster. In the encryptionState section, you will see the following (the example below is with encryption enabled and after a few KEK rotations):

EncryptionState

Indicates the encryption state of the cluster-level encryption:

  • Disabled
  • Enabling (in the process of enabling encryption; this can take from several seconds up to a few minutes).
  • Enabled
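
Because enabling can take up to a few minutes, a script that creates volumes afterwards may want to wait for the Enabled state first. The following is a minimal polling sketch. The function wait_for_encryption_state and the stub fake_fetch are hypothetical; how the state value is extracted from the clusterinfo output is deliberately left to the fetch command you supply.

```shell
#!/usr/bin/env bash
# wait_for_encryption_state WANTED FETCH_CMD [TRIES] [DELAY_SECONDS]
# FETCH_CMD is any command or function that prints the current state
# (Disabled, Enabling, or Enabled) on stdout.
wait_for_encryption_state() {
  local wanted="$1" fetch="$2" tries="${3:-30}" delay="${4:-10}" state i
  i=1
  while [ "$i" -le "$tries" ]; do
    state="$("$fetch")"
    if [ "$state" = "$wanted" ]; then
      echo "state is $state after $i check(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "gave up: state is still ${state:-unknown}" >&2
  return 1
}

# Stub for illustration only: reports Enabling twice, then Enabled.
counter_file="$(mktemp)"
echo 0 > "$counter_file"
fake_fetch() {
  local n
  n=$(( $(cat "$counter_file") + 1 ))
  echo "$n" > "$counter_file"
  if [ "$n" -lt 3 ]; then echo "Enabling"; else echo "Enabled"; fi
}

wait_for_encryption_state Enabled fake_fetch 5 0
rm -f "$counter_file"
```

In real use, replace fake_fetch with a command that reads the state from the clusterinfo output, and choose a delay of several seconds between checks.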

previousKekGenerations

Note that a previous generation is not always consecutive with the current generation. For example, the current generation could be five and the previous generation could be three, although in most cases they are consecutive. In addition, during KEK rotation you might see two previous generations.

rotationState

Indicates the stage of the cluster root key rotation process:

  • NoRotation: No rotation is in progress. This is the idle state most of the time.
  • DistributingKEK: This is the first stage of cluster root key rotation, where the new KEK is distributed between the cluster components (usually a short process).
  • EncryptyingDEKs: The new KEK is already in place and the cluster is in the process of re-encrypting all of the existing DEKs in the cluster. This can take time depending on the number of volumes/snapshots in the cluster.

Rotating the Cluster Root Encryption Key

Your organizational security policies determine how often encryption keys should be rotated. When a policy requires rotation, you can invoke an API call that causes the cluster to create a new KEK and re-encrypt the existing DEKs with the new KEK. The cluster always keeps a copy of the “old” KEK for one generation. During key rotation, there is a period (from seconds up to a few hours, depending on the number of volumes and snapshots in the cluster) in which the system re-encrypts the DEKs with the new KEK. During this period, both the new KEK and the old KEK are valid.

During key rotation, all of the existing data encryption keys are re-encrypted, so it is recommended not to run this API too frequently. A cluster configuration parameter sets the minimum interval between key rotations, with a default of 24 hours. With the default setting, after a successful rotation you can rotate the key again only 24 hours later.
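
The minimum-interval rule is simple arithmetic on timestamps. The sketch below assumes you already know when the key was last rotated, expressed as seconds since the Unix epoch; the helper name seconds_until_next_rotation is hypothetical, and whether the cluster counts the interval from the start or the end of a rotation is not stated here.

```shell
#!/usr/bin/env bash
# seconds_until_next_rotation LAST_ROTATION_EPOCH NOW_EPOCH [MIN_INTERVAL_HOURS]
# Prints how many seconds remain before another rotation is allowed
# (0 means a rotation can be requested now).
seconds_until_next_rotation() {
  local last="$1" now="$2" hours="${3:-24}" earliest remaining
  earliest=$(( last + hours * 3600 ))
  remaining=$(( earliest - now ))
  if [ "$remaining" -lt 0 ]; then remaining=0; fi
  echo "$remaining"
}

# Example: last rotation was 20 hours ago, default 24-hour interval.
now=$(date +%s)
last=$(( now - 20 * 3600 ))
echo "wait $(seconds_until_next_rotation "$last" "$now") more seconds"
```

With the example values, four hours (14400 seconds) remain before the next rotation is allowed.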

To get the state of the rotation process, current encryption key generation, and the last date the key was rotated, you can run the get clusterinfo API.

Cluster root encryption key rotation is fully supported from v3.15.x.

It is recommended to run the key rotation API only when the cluster is stable and all nodes are active on the same Lightbits cluster version (e.g., all servers are on v3.14). If too many nodes are inactive, the key rotation process will fail and will have to be triggered again. Running the key rotation API in the middle of an upgrade process could cause unexpected behavior and is not supported.

KEK Rotation API

CLI

REST

Exporting the KEK

It is recommended to store a copy of the KEK offline, outside of the cluster. The KEK encrypts all of the data encryption keys in the cluster, and through them it protects the data on the disks. It is therefore highly recommended to use the getClusterRootKey API to export the KEK and store it in a key vault or HSM.

To keep the KEK secure during export, you are required to create an RSA-4096 or RSA-3072 private/public key pair and submit the public key in the API. The cluster encrypts the KEK with the given public key, and you can then decrypt it with your private key if needed.

Note that RSA-OAEP is used for better security and to validate the hash of the script. If you need to decrypt the KEK, use RSA-OAEP with your private key.
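
The key-pair workflow can be sketched with the openssl command line. Several details here are assumptions, not taken from the product documentation: the file names are hypothetical, SHA-256 is assumed as the OAEP digest, and a locally generated sample secret stands in for the encrypted KEK that the export API would return. The format in which the API expects the public key, and the encoding of the returned encryptedKey, are not covered here, so the returned value may need decoding before decryption.

```shell
#!/usr/bin/env bash
# Sketch of the key-pair workflow with the openssl CLI (assumed installed).
workdir="$(mktemp -d)"

# 1. Create an RSA-3072 key pair (use 4096 for RSA-4096).
#    Keep the private key offline and protected.
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:3072 \
  -out "$workdir/kek-export-private.pem" 2>/dev/null
openssl pkey -in "$workdir/kek-export-private.pem" -pubout \
  -out "$workdir/kek-export-public.pem"

# 2. Stand-in for the cluster: encrypt a sample 32-byte secret with the
#    public key. In real use, the encrypted KEK comes from the export API.
openssl rand -out "$workdir/sample-secret.bin" 32
openssl pkeyutl -encrypt -pubin -inkey "$workdir/kek-export-public.pem" \
  -pkeyopt rsa_padding_mode:oaep -pkeyopt rsa_oaep_md:sha256 \
  -in "$workdir/sample-secret.bin" -out "$workdir/encrypted.bin"

# 3. Decrypt with the private key using RSA-OAEP.
#    Assumption: SHA-256 as the OAEP digest; match whatever the cluster uses.
openssl pkeyutl -decrypt -inkey "$workdir/kek-export-private.pem" \
  -pkeyopt rsa_padding_mode:oaep -pkeyopt rsa_oaep_md:sha256 \
  -in "$workdir/encrypted.bin" -out "$workdir/decrypted.bin"

if cmp -s "$workdir/sample-secret.bin" "$workdir/decrypted.bin"; then
  echo "OAEP round trip succeeded"
fi
```

The sketch leaves its files in a temporary directory so the result can be inspected; in real use, move the private key to secure offline storage and delete the working directory.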

GenerationID is a parameter for KEK rotation. The API can only retrieve KEKs that are currently in the system (the current KEK and the previous generation KEK). By default, if the parameter is not used, you will get back the current KEK. If you want the previous KEK, you can specify the generation in this parameter.

To get the current encryption key generation, you can run the get clusterinfo API.

Input

encryptingKeyGeneration: Leave this empty if you only want the current encryption key.

userPublicKey: A user-generated public key.

Output

KeyObject

encryptedKey: The cluster KEK encrypted with the given public key.

encryptingKeyGeneration: The generation (GenerationID) of the KEK.

The output will be the cluster root key (KEK) that is encrypted with the given public key.

Exporting the KEK API

CLI

Get the current cluster root key (current KEK):

or

REST

Disabling Encryption

Once cluster-level encryption has been activated, it cannot be disabled. Disabling it would result in a scenario where some data remains encrypted while other data does not. The ability to disable encryption will be available in a future release.

Limitations

  1. Encryption cannot be disabled after being enabled.
  2. Encryption can only be enabled on clusters with no data (no volumes or snapshots).
  3. It is recommended to do a fresh install of v3.12 or above before enabling encryption.
  4. Encryption should not be enabled when the cluster is not fully upgraded.
  5. Encryption will not work properly on clusters configured to use NVRAM journaling.
  6. Encryption is currently not supported for systems running IPv6.