Manage Data Replicas
Note
Managing Replication is an EMQX Enterprise feature.
For an EMQX cluster, durable storage achieves high availability through multiple data replicas. If a node crashes, clients can immediately connect to a new node and recover data from the replicas on other nodes. This guide provides instructions on configuring data replication and ensuring high availability for durable storage. It covers two scenarios: setting up a new EMQX cluster with durable storage and upgrading an existing cluster to enable it.
Initial Cluster Setup
During the initial setup of the cluster, several configuration parameters influence how durable storage is established and how data replication starts. These parameters cannot be changed at runtime; modifying them has no effect once the durable storage has been initialized.
Replication Factor
The replication factor, controlled by the durable_storage.messages.replication_factor configuration parameter, determines the number of replicas each shard should have across the cluster. The default value is 3.
Setting the replication factor to an odd number is advisable as it influences the quorum size required for successful write operations. A higher replication factor results in more copies of data distributed across the cluster, thereby enhancing high availability. However, it also increases storage and network overhead due to additional communication needed to achieve consensus.
TIP
In smaller clusters, the replication factor is not strictly enforced. For instance, in a two-node cluster, the effective replication factor is 2, as each shard is replicated on both nodes, eliminating the need for further replication. EMQX allocates replicas of each shard to different nodes in the cluster to ensure redundancy.
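For example, a deployment that prioritizes availability over storage and network overhead could raise the factor in emqx.conf before the cluster is first started; the value 5 below is only an illustrative assumption, not a recommendation:
durable_storage {
  messages {
    # Keep five copies of each shard (odd values keep quorum arithmetic simple)
    replication_factor = 5
  }
}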
Number of Shards
The built-in durable storages are split into shards, which are replicated independently from each other. A higher number of shards allows for more parallel publishing and consuming of MQTT messages. However, each shard consumes system resources, such as file descriptors, and increases the volume of metadata stored per session.
The durable_storage.messages.n_shards parameter controls the number of shards, which remains fixed once the durable storage is initialized.
Number of Sites
The durable_storage.messages.n_sites configuration parameter determines the minimum number of sites that must be online for the durable storage to initialize and start accepting writes. Once this minimum is met, the durable storage begins allocating shards to the available sites in a balanced manner.
The default value is 1, meaning each node may initially consider itself the sole site responsible for data storage. This setup is optimized for single-node EMQX clusters. When the cluster forms, one node's view will eventually dominate, causing other nodes to abandon their stored data.
In multi-node clusters, it is recommended to set the number of sites to the initial cluster size to prevent such conflicts. Note that once the durable storage is initialized, this parameter cannot be changed.
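Putting the three parameters together, a three-node cluster might be bootstrapped with settings like the following in emqx.conf on every node before the cluster is formed; the values shown are an illustrative assumption, not defaults to copy blindly:
durable_storage {
  messages {
    n_sites            = 3   # wait for all three initial nodes before allocating shards
    n_shards           = 16  # fixed once the durable storage is initialized
    replication_factor = 3   # keep each shard on three sites
  }
}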
Change Existing Cluster
Existing clusters may require reconfiguration due to changes in capacity, durability, or client traffic, or the need to decommission old nodes and replace them with new ones. This can be achieved by adding new sites to the set of sites holding durable storage replicas or removing sites that are no longer required.
You can use the emqx ctl CLI with the ds subcommand to view the current shard allocation:
$ emqx ctl ds info
SITES:
...
SHARDS:
...
Add Sites
When a new node joins the cluster, it is assigned a Site ID and can be included in the durable storage. Some shard replica responsibilities will be transferred to the new site, which will then start replicating the data.
$ emqx ctl ds join messages <Site ID>
ok
Depending on the cluster's data volume, joining a new site may take some time. While this process does not compromise the availability of durable storage, it may temporarily affect cluster performance due to the background data transfer between sites.
Changes to the replica set are durably stored, ensuring that node restarts or network partitions do not affect the outcome. The cluster will eventually achieve the desired state consistently.
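Putting the steps together, a typical join sequence might look like the following; the Site ID B2A7DFFC49A10F56 is a hypothetical example:
$ emqx ctl ds info                           # find the new node's Site ID in the SITES section
$ emqx ctl ds join messages B2A7DFFC49A10F56
ok
$ emqx ctl ds info                           # repeat until REPLICA TRANSITIONS lists no pending changes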
Remove Sites
Removing a site involves transferring shard replica responsibilities away from the site being removed. Similar to adding a site, this process can take time and resources.
$ emqx ctl ds leave messages <Site ID>
ok
Removing a site can cause the effective replication factor to drop below the configured value. For example, if the replication factor is 3 and one site in a 3-node cluster is removed, the replication factor will effectively drop to 2. To avoid this risk when permanently replacing a site, it is recommended to add a new site before decommissioning the old one, or to perform both operations simultaneously.
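For instance, to replace a site without letting the effective replication factor drop, you might join the replacement first and only remove the old site afterwards; both Site IDs below are hypothetical:
$ emqx ctl ds join messages B2A7DFFC49A10F56     # replacement site
ok
$ emqx ctl ds leave messages 9F3E1A7B5D2C4E60    # site being decommissioned
ok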
Assign Sites
A series of changes to the set of sites holding durable storage replicas can be performed in a single operation.
$ emqx ctl ds set_replicas messages <Site ID 1> <Site ID 2> ...
This approach minimizes the volume of data transferred between sites, while ensuring that the replication factor is maintained if possible.
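For example, to swap out a single member of a three-site replica set in one step, list the two sites you are keeping together with the new one; all Site IDs below are hypothetical:
$ emqx ctl ds set_replicas messages A1B2C3D4E5F60718 9F3E1A7B5D2C4E60 B2A7DFFC49A10F56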
Recover from Disasters
When disasters occur, knowing how to efficiently recover is crucial to maintaining service continuity. This section provides guidance on recovering from common disaster scenarios.
Complete Loss of a Node
One of the most common disaster scenarios is the complete loss of a node, which can occur due to unrecoverable hardware failure, disk corruption, or even human error.
Restore availability by reallocating shards.
If a node is completely lost, the cluster's availability is compromised to some extent. The first step is to restore availability by reallocating the lost node’s shards to other nodes in the cluster.
You can use the standard leave command to achieve this. This command can still function even if the lost node is unreachable, although the transition may take longer to complete.
$ emqx ctl ds leave messages 5C6028D6CE9459C7 # Here, 5C6028D6CE9459C7 is the lost node's Site ID
Monitor the cluster status and wait for all shard transitions to complete successfully. Ensure there are no more transitions before proceeding to the next step.
$ emqx ctl ds info
<...>
SITES:
  D8894F95DC86DFDB    'emqx@n1.local'        up
  5C6028D6CE9459C7    'emqx@n2.local'    (x) down
<...>
REPLICA TRANSITIONS:
  Shard         Transitions
  messages/0    -5C6028D6CE9459C7 +D8894F95DC86DFDB
<...>
Once all shard transitions are complete, you need to inform the cluster that the lost node will not be returning.
$ emqx ctl ds forget messages 5C6028D6CE9459C7
This step is crucial if you plan to replace the lost node with a new one using the original node name. Failing to do so could result in the cluster recognizing the same node name under two different Site IDs, leading to significant confusion and potential issues.
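Once the stale site has been forgotten, a replacement node (possibly reusing the original node name) can be started, joined to the cluster, and then added to the durable storage as a new site; the Site ID below is hypothetical:
$ emqx ctl ds info                            # the replacement node appears with a fresh Site ID
$ emqx ctl ds join messages A4F0C2E6D8B19357
ok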