# Node Evacuation and Cluster Load Rebalancing
MQTT is a stateful long-lived connection access protocol, which means connections will not be easily disconnected once the connection is established. Therefore, upgrading, maintaining, and scaling cluster nodes will become more challenging. EMQX provides node evacuation and cluster load rebalancing functions to facilitate users' cluster operation and maintenance.
# Node Evacuation
When you need to maintain or upgrade a node in the cluster, directly shutting down the node can result in lost connections and sessions, causing data loss. In addition, this type of operation can cause a large number of devices to go offline and reconnect for a period of time, increasing server load and potentially affecting overall business.
Therefore, EMQX provides node evacuation functionality to help migrate all connection and session data from the node to other nodes in the cluster before shutting down, reducing the impact on the overall business.
# How It Works
The node evacuation works in the following sequence:
- The node to be evacuated stops receiving connections.
- The node to be evacuated gradually disconnects its current clients at a preset rate (specified by
conn-evict-rate
). The disconnected clients use the reconnection mechanism to connect to other nodes (the target nodes) in the cluster. The reconnection mechanisms differ depending on the protocol versions:- MQTT v3.1/v3.1.1 clients: specified by load balancing strategy and the client needs to enable the reconnection mechanism;
- MQTT v5.0 clients: specified by the
redirect-to
parameter.
- Wait for the target node to complete reconnection with the clients and take over the sessions (specified by
wait-takeover
). - After the reconnection waiting time has elapsed, the remaining unclaimed sessions on the nodes to be evacuated will be migrated to the target node:
- The node to which the session will be migrated is specified by
migrate-to
; - The speed of session migration is specified by
sess-evict-rate
.
You can stop the evacuation at any time. If the node to be evacuated closes during the evacuation, the evacuation process will be resumed after the node is restarted.
# Start and Stop Node Evacuation via CLI
You can use CLI command to start the node evacuation, get the evacuation status, and stop the node evacuation.
# Start Node Evacuation
You can use the following CLI command to start the node evacuation:
emqx_ctl rebalance start --evacuation \ [--redirect-to "Host1:Port1 Host2:Port2 ..."] \ [--conn-evict-rate CountPerSec] \ [--migrate-to "node1@host1 node2@host2 ..."] \ [--wait-takeover Secs] \ [--sess-evict-rate CountPerSec]
Copied!
2
3
4
5
6
Parameter | Type | Description |
---|---|---|
--redirect-to | String | The redirected server address during reconnection, for MQTT 5.0 clients; Refer to MQTT 5.0 Specification - Server redirection (opens new window) for more details. |
--conn-evict-rate | Positive integer | Client disconnection rate, count/second; 500 connections per second by default |
--migrate-to | String | Space or comma-separated list of nodes to which sessions will be evacuated |
--wait-takeover | Positive integer | Amount of time in seconds to wait before starting session evacuation; Unit: second, 60 seconds by default |
--sess-evict-rate | Positive integer | Client evacuation rate, count/second; 500 sessions per second by default |
Code Example
If you want to migrate the clients on the node emqx@127.0.0.1
to the nodes emqx2@127.0.0.1
and emqx3@127.0.0.1
, you can execute the following command on the node emqx@127.0.0.1
:
./bin/emqx_ctl rebalance start --evacuation \ --wait-takeover 200 \ --conn-evict-rate 30 \ --sess-evict-rate 30 \ --migrate-to "emqx2@127.0.0.1 emqx3@127.0.0.1" Rebalance(evacuation) started
Copied!
2
3
4
5
6
This command will disconnect existing clients at a rate of 30
connections per second. After all connections are disconnected, it will wait for 200
seconds during which client sessions will be migrated to the reconnected nodes. Afterward, the remaining sessions will be migrated at a rate of 30
sessions per second to the emqx2@127.0.0.1
and emqx3@127.0.0.1
nodes.
# Get Evacuation Status
You can use the following CLI command to get the evacuation status:
emqx_ctl rebalance node-status
Copied!
Below is an example of the returned results:
./bin/emqx_ctl rebalance node-status Rebalance type: rebalance Rebalance state: evicting_conns Coordinator node: 'emqx2@127.0.0.1' Connection eviction rate: 5 connections/second Session eviction rate: 5 sessions/second Connection goal: 504.0 Recipient nodes: ['emqx2@127.0.0.1'] Channel statistics: current_connected: 960 current_disconnected_sessions: 35 current_sessions: 995 initial_connected: 1000 initial_sessions: 1000
Copied!
2
3
4
5
6
7
8
9
10
11
12
13
14
# Stop Node Evacuation
You can use the following CLI command to stop evacuation:
emqx_ctl rebalance stop
Copied!
Below is an example of the returned results:
./bin/emqx_ctl rebalance stop Rebalance(evacuation) stopped
Copied!
2
# Start and Stop Node Evacuation via HTTP API
You can also use HTTP API to start the node evacuation and you need to specify the node to be evacuated in the parameters.
# Start Node Evacuation
For example, you want to evacuate the connections and sessions of the emqx1@127.0.0.1
node to the emqx2@127.0.0.1
and emqx3@127.0.0.1
nodes, disconnect existing clients and sessions at a rate of 5
connections per second, and transfer them to emqx2@127.0.0.1
and emqx3@127.0.0.1
nodes.
Code Example
curl -v -u admin:public -H "Content-Type: application/json" -X POST 'http://127.0.0.1:8081/api/v4/load_rebalance/emqx1@127.0.0.1/evacuation/start' -d '{"conn_evict_rate": 5, "sess_evict_rate": 5, "migrate_to": ["emqx3@127.0.0.1", "emqx2@127.0.0.1"]}' {"data":[],"code":0}
Copied!
2
3
The above statement contains the following fields:
Fields | Type | Required Fields | Description |
---|---|---|---|
nodes | String | Yes | Node name |
redirect_to | Positive integer | Yes | The redirected server address during reconnection, for MQTT 5.0 clients only; Refer to MQTT 5.0 Specification - Server redirection (opens new window) for more details. |
conn_evict_rate | Positive integer | Yes | Client disconnection rate, count/second; 500 connections per second by default |
migrate_to | String | Yes | Space or comma-separated list of nodes to which sessions will be evacuated |
wait_takeover | Positive integer | No | Amount of time in seconds to wait before starting session evacuation; Unit: second, 60 seconds by default |
sess_evict_rate | Positive integer | No | Client evacuation rate, count/second; 500 sessions per second by default |
# Stop Node Evacuation
You can stop the node evacuation using the following code example:
curl -v -u admin:public -H "Content-Type: application/json" -X POST 'http://127.0.0.1:8081/api/v4/load_rebalance/emqx1@127.0.0.1/evacuation/stop' {"data":[],"code":0}
Copied!
2
3
# Rebalancing
Due to the same reason as MQTT being a stateful long-lived connection protocol, connections will not be easily disconnected once the connection is established. Even after scaling up the nodes, existing connections do not automatically shift to the newly added nodes. Consequently, if there are not a significant number of new client connections, the additional nodes may remain underutilized for a long period. In such cases, you need to manually migrate connections from high-load nodes to low-load nodes to achieve cluster load balancing.

# How It Works
Rebalancing is a more complicated process since it involves several nodes.
You can initiate a cluster load rebalancing task on any node. EMQX will automatically calculate the necessary connection migration plan based on the current connection load of each node. It will then migrate the corresponding number of connections and sessions from high-load nodes to low-load nodes to achieve load balancing between nodes. The workflow is as follows:
- Calculate the migration plan and divide the nodes involved in rebalancing (specified by
--nodes
) into source nodes and target nodes:- Source nodes: High-load nodes
- Target nodes: Low-load nodes
- Stop accepting new connections on the source nodes.
- Wait for a period of time (specified by
wait-health-check
) until the load balancer (LB) removes the source nodes from the active backend node list. - Gradually disconnect connected clients on the source nodes until the average number of connections matches that of the target nodes.
- Wait for the target nodes to reconnect with clients and take over the sessions (specified by
wait-takeover
). - After the reconnection waiting time exceeds, the source nodes will migrate the remaining unclaimed sessions to the target nodes at the rate specified by
sess-evict-rate
.
At this point, the load rebalancing task is completed, and the source nodes return to their normal state.
TIP
Rebalancing is ephemeral. If one of the participating nodes crashes, the whole process is aborted on all nodes.
# Start and Stop Rebalancing via CLI
You can use CLI command to start the rebalancing, get the rebalancing status, and stop the rebalancing.
# Start Rebalancing
The command for starting the rebalancing includes the following fields:
rebalance start \ [--nodes "node1@host1 node2@host2"] \ [--wait-health-check Secs] \ [--conn-evict-rate ConnPerSec] \ [--abs-conn-threshold Count] \ [--rel-conn-threshold Fraction] \ [--conn-evict-rate ConnPerSec] \ [--wait-takeover Secs] \ [--sess-evict-rate CountPerSec] \ [--abs-sess-threshold Count] \ [--rel-sess-threshold Fraction]
Copied!
2
3
4
5
6
7
8
9
10
11
Fields | Type | Description |
---|---|---|
--nodes | String | Space or comma-separated list of nodes participating in the rebalance. It may or may not include the coordinator (node on which the command is run) |
--wait-health-check | Positive integer | Specified waiting time (in seconds, default 60 seconds) for the LB to remove the source node from the active backend node list. Once the specified waiting time is exceeded, the load rebalancing task will start. |
--conn-evict-rate | Positive integer | Client disconnection rate on source nodes; 500 connections per second by default |
--abs-conn-threshold | Positive integer | Absolute threshold for checking connection balance; 1000 by default |
--rel-conn-threshold | Number > 1.0 | Relative threshold for checking connection balance; 1.1 by default |
--wait-takeover | Positive integer | Specified waiting time (in seconds, default 60 seconds) for clients to reconnect and take over the sessions after all connections are disconnected. |
--sess-evict-rate | Positive integer | Session evacuation rate on source nodes; 500 sessions per second by default |
--abs-sess-threshold | Positive integer | Absolute threshold for checking session balance; 1000 by default |
--rel-sess-threshold | Number > 1.0 | Relative threshold for checking session balance; 1.1 by default |
Check Session Balance
Connections are considered to be balanced when the following condition holds:
avg(DonorConns) < avg(RecipientConns) + abs_conn_threshold OR avg(DonorConns) < avg(RecipientConns) * rel_conn_threshold
Copied!
2
3
A similar rule is applied to disconnected sessions.
Example
To achieve load rebalancing among the three nodes emqx@127.0.0.1
, emqx2@127.0.0.1
, and emqx3@127.0.0.1
, you can use the following command:
./bin/emqx_ctl rebalance start \ --wait-health-check 10 \ --wait-takeover 60 \ --conn-evict-rate 5 \ --sess-evict-rate 5 \ --abs-conn-threshold 30 \ --abs-sess-threshold 30 \ --nodes "emqx1@127.0.0.1 emqx2@127.0.0.1 emqx3@127.0.0.1" Rebalance started
Copied!
2
3
4
5
6
7
8
9
# Get Rebalance Status
The CLI command for getting rebalance status is:
emqx_ctl rebalance node-status
Copied!
Example:
./bin/emqx_ctl rebalance node-status Node 'emqx1@127.0.0.1': rebalance coordinator Rebalance state: evicting_conns Coordinator node: 'emqx1@127.0.0.1' Donor nodes: ['emqx2@127.0.0.1','emqx3@127.0.0.1'] Recipient nodes: ['emqx1@127.0.0.1'] Connection eviction rate: 5 connections/second Session eviction rate: 5 sessions/second Connection goal: 0.0 Current average donor node connection count: 300.0
Copied!
2
3
4
5
6
7
8
9
10
# Stop Rebalancing
The CLI command to stop rebalancing is:
emqx_ctl rebalance stop
Copied!
Example:
./bin/emqx_ctl rebalance stop Rebalance stopped
Copied!
2
# Start and Stop Rebalancing via HTTP API
All the operations available from the CLI are also available from the API. Start/stop commands require a node as a parameter.
# Start Rebalancing
The request body for rebalancing should includes the following fields:
nodes
conn_evict_rate
sess_evict_rate
wait_takeover
wait_health_check
abs_conn_threshold
rel_conn_threshold
abs_sess_threshold
rel_sess_threshold
The meaning of these fields are the same as those in the corresponding CLI command.
Example:
curl -v -u admin:public -H "Content-Type: application/json" -X POST 'http://127.0.0.1:8081/api/v4/load_rebalance/emqx1@127.0.0.1/start' -d '{"conn_evict_rate": 5, "sess_evict_rate": 5, "nodes": ["emqx1@127.0.0.1", "emqx2@127.0.0.1"]}' {"data":[],"code":0}
Copied!
2
3
# Get Node-Local Status
The code example for obtaining the rebalancing status for the requested node is as follows:
curl -s -u admin:public -H "Content-Type: application/json" -X GET 'http://127.0.0.1:8081/api/v4/load_rebalance/status' { "status": "enabled", "stats": { "initial_sessions": 0, "initial_connected": 0, "current_sessions": 0, "current_connected": 0 }, "state": "waiting_takeover", "session_recipients": [ "emqx3@127.0.0.1", "emqx2@127.0.0.1" ], "session_goal": 0, "session_eviction_rate": 5, "process": "evacuation", "connection_goal": 0, "connection_eviction_rate": 5 }
Copied!
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
# Get Cluster-Wide Status
The code example for obtaining the rebalancing status for a cluster is as follows:
curl -s -u admin:public -H "Content-Type: application/json" -X GET 'http://127.0.0.1:8081/api/v4/load_rebalance/global_status' { "rebalances": [], "evacuations": [ { "node": "emqx1@127.0.0.1", "stats": { "initial_sessions": 0, "initial_connected": 0, "current_sessions": 0, "current_connected": 0 }, "state": "waiting_takeover", "session_recipients": [ "emqx3@127.0.0.1", "emqx2@127.0.0.1" ], "session_goal": 0, "session_eviction_rate": 5, "connection_goal": 0, "connection_eviction_rate": 5 } ] }
Copied!
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
# Stop Rebalancing
The code example for stopping rebalancing is as follows:
curl -v -u admin:public -H "Content-Type: application/json" -X POST 'http://127.0.0.1:8081/api/v4/load_rebalance/emqx1@127.0.0.1/stop' {"data":[],"code":0}
Copied!
2
3
# Load Balancer Integration
During evacuation/rebalance, it is up to the user to provide the necessary configuration for the load balancer (if any). This configuration should help disconnected clients to be directed to the recipient nodes when they reconnect. Without such a configuration, there may be an excess number of disconnections.
To help create that configuration, EMQX provides health check endpoints:
GET /api/v4/load_rebalance/availability_check
Copied!
They respond with 503 HTTP code for the donor or evacuated nodes and 200 HTTP code for nodes operating normally and receiving connections.
For example, the described configuration for Haproxy and a 3-node cluster could look like this:
defaults timeout connect 5s timeout client 60m timeout server 60m listen mqtt bind *:1883 mode tcp maxconn 50000 timeout client 6000s default_backend emqx_cluster backend emqx_cluster mode tcp balance leastconn option httpchk http-check send meth GET uri /api/v4/load_rebalance/availability_check hdr Authorization "Basic YWRtaW46cHVibGlj" server emqx1 127.0.0.1:3001 check port 5001 inter 1000 fall 2 rise 5 weight 1 maxconn 1000 server emqx2 127.0.0.1:3002 check port 5002 inter 1000 fall 2 rise 5 weight 1 maxconn 1000 server emqx3 127.0.0.1:3003 check port 5003 inter 1000 fall 2 rise 5 weight 1 maxconn 1000
Copied!
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Here we have 3 nodes with MQTT listeners on ports 3001, 3002, and 3003 and HTTP listeners on ports 5001, 5002, and 5003, respectively.