
Architecture

EMQX 5.0 redesigns the cluster architecture with Mria, which significantly improves EMQX's horizontal scalability. The new design supports 100,000,000 MQTT connections in a single cluster.

EMQX Mria

In Mria, each node assumes one of two roles: Core or Replicant. Core nodes serve as the data layer for the database. Replicant nodes connect to Core nodes and passively replicate data updates from them. For details on how Core and Replicant nodes work, see EMQX clustering.

By default, all nodes assume the Core role, so the cluster behaves like an EMQX 4.x cluster. This default is recommended for small clusters of 3 nodes or fewer. The Core + Replicant mode is recommended only for clusters with more than 3 nodes.

Enable Core + Replicant Mode

To enable the Core + Replicant mode, designate certain nodes as Replicant nodes by setting the node.role parameter to replicant. You also need to enable an automatic cluster discovery strategy (cluster.discovery_strategy).

TIP

Replicant nodes cannot use the manual discovery strategy to discover Core nodes.

Configuration example:

bash
node {
    ## To set a node as a replicant node:
    role = replicant
}
cluster {
    ## Enable static discovery strategy:
    discovery_strategy = static
    static.seeds = ["emqx@host1.local", "emqx@host2.local"]
}
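The same settings can also be supplied as environment variables, using the naming pattern that appears in the pseudo-distributed example later on this page (an EMQX_ prefix, upper-case letters, and a double underscore for each dot in the config path). The helper below is a minimal sketch of that mapping; the function name emqx_env_name is hypothetical and is not part of EMQX.

```bash
#!/usr/bin/env bash
# Convert a dotted config path such as "node.role" into the
# EMQX_-prefixed environment variable name used for overrides.
# emqx_env_name is an illustrative helper, not an EMQX command.
emqx_env_name() {
    local path="$1"
    # Dots become double underscores; letters are upper-cased.
    printf 'EMQX_%s\n' "$(printf '%s' "$path" | sed 's/\./__/g' | tr '[:lower:]' '[:upper:]')"
}

emqx_env_name node.role                   # prints EMQX_NODE__ROLE
emqx_env_name cluster.discovery_strategy  # prints EMQX_CLUSTER__DISCOVERY_STRATEGY
```

Under that assumption, a Replicant could be started with something like EMQX_NODE__ROLE=replicant EMQX_CLUSTER__DISCOVERY_STRATEGY=static ./bin/emqx start, with list-valued keys such as static.seeds left in the configuration file.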

Monitor and Debug

Mria performance can be monitored using Prometheus metrics or the Erlang console.

Prometheus Indicators

You can integrate with Prometheus to monitor cluster operations. For instructions on integrating with Prometheus, see Log and observability - Integrate with Prometheus.

Core Nodes

| Indicator | Description |
| --- | --- |
| emqx_mria_last_intercepted_trans | Transactions received by the shard since the node started |
| emqx_mria_weight | Instantaneous load of the Core node |
| emqx_mria_replicants | Replicant nodes connected to the Core node. Numbers are grouped per shard. |
| emqx_mria_server_mql | Pending transactions waiting to be sent to the Replicant nodes. Less is optimal. If this indicator shows a growing trend, more Core nodes are needed. |

Replicant Nodes

| Indicator | Description |
| --- | --- |
| emqx_mria_lag | Indicates how far the Replicant lags behind the upstream Core node. Less is better. |
| emqx_mria_bootstrap_time | Startup time of the Replicant node. This value should remain stable if the system operates normally. |
| emqx_mria_bootstrap_num_keys | Number of database records copied from the Core node during startup. This value should remain stable if the system operates normally. |
| emqx_mria_message_queue_len | Queue length during message replication. Should be around 0. |
| emqx_mria_replayq_len | Internal replay queue length on the Replicant nodes. Less is better. |
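To spot-check one of these indicators without a full Prometheus setup, the sketch below sums all samples of a single metric from Prometheus text-format output. The function name mria_metric_sum is hypothetical, and the endpoint and port in the usage comment are assumptions about a default deployment, so adjust them (and add authentication if your deployment requires it).

```bash
#!/usr/bin/env bash
# Sum every sample of one metric from Prometheus text-format input on stdin.
# Assumes each sample line ends with its value (no trailing timestamp).
# mria_metric_sum is an illustrative helper, not an EMQX command.
mria_metric_sum() {
    local name="$1"
    awk -v name="$name" '
        /^#/ { next }                    # skip HELP/TYPE comment lines
        {
            metric = $1
            sub(/[{].*/, "", metric)     # drop the {label="..."} part
            if (metric == name) sum += $NF
        }
        END { print sum + 0 }
    '
}

# Example (assumed endpoint and port; adjust to your deployment):
#   curl -s "http://127.0.0.1:18083/api/v5/prometheus/stats" | mria_metric_sum emqx_mria_lag
```

Because per-shard samples are added together, a steadily growing result for emqx_mria_lag or emqx_mria_server_mql is the signal to look at, rather than any single absolute value.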

Console Commands

You can also monitor the operating status of the cluster by running the command emqx eval 'mria_rlog:status().' on the Erlang console.

If the EMQX cluster is operating normally, the command returns a list of status information, for example, the current log level, the number of messages processed, and the number of messages dropped.

Pseudo-Distributed Cluster

EMQX also provides a pseudo-distributed cluster feature for testing and development purposes. It refers to a cluster setup where multiple instances of EMQX are running on a single machine, with each instance configured as a node in the cluster.

After starting the first node, use the following commands to start the second node and join it to the cluster manually. To avoid port conflicts, some listening ports need to be adjusted:

bash
EMQX_NODE__NAME='emqx2@127.0.0.1' \
    EMQX_LOG__FILE_HANDLERS__DEFAULT__FILE='log2/emqx.log' \
    EMQX_STATSD__SERVER='127.0.0.1:8124' \
    EMQX_LISTENERS__TCP__DEFAULT__BIND='0.0.0.0:1882' \
    EMQX_LISTENERS__SSL__DEFAULT__BIND='0.0.0.0:8882' \
    EMQX_LISTENERS__WS__DEFAULT__BIND='0.0.0.0:8082' \
    EMQX_LISTENERS__WSS__DEFAULT__BIND='0.0.0.0:8085' \
    EMQX_DASHBOARD__LISTENERS__HTTP__BIND='0.0.0.0:18082' \
    EMQX_NODE__DATA_DIR="./data2" \
./bin/emqx start

./bin/emqx ctl cluster join emqx1@127.0.0.1

The example above creates a cluster manually. To create a cluster automatically, refer to the auto clustering section.
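When more than two local nodes are needed, typing the overrides by hand becomes error-prone. The following sketch generates them for node N, reusing the variable names from the manual example. The port scheme (each listener shifted by (N - 1) * 100 from the assumed defaults 1883, 8883, 8083, 8084, and 18083) and the function name emqx_node_env are hypothetical choices for illustration, and other settings such as EMQX_STATSD__SERVER are not covered.

```bash
#!/usr/bin/env bash
# Print "export" lines for pseudo-cluster node N (N >= 1).
# Hypothetical port scheme: every listener port is shifted by (N - 1) * 100
# from the default ports assumed below. Stays collision-free for N <= 8.
emqx_node_env() {
    local n="$1"
    local offset=$(( (n - 1) * 100 ))
    cat <<EOF
export EMQX_NODE__NAME='emqx${n}@127.0.0.1'
export EMQX_NODE__DATA_DIR='./data${n}'
export EMQX_LOG__FILE_HANDLERS__DEFAULT__FILE='log${n}/emqx.log'
export EMQX_LISTENERS__TCP__DEFAULT__BIND='0.0.0.0:$(( 1883 + offset ))'
export EMQX_LISTENERS__SSL__DEFAULT__BIND='0.0.0.0:$(( 8883 + offset ))'
export EMQX_LISTENERS__WS__DEFAULT__BIND='0.0.0.0:$(( 8083 + offset ))'
export EMQX_LISTENERS__WSS__DEFAULT__BIND='0.0.0.0:$(( 8084 + offset ))'
export EMQX_DASHBOARD__LISTENERS__HTTP__BIND='0.0.0.0:$(( 18083 + offset ))'
EOF
}

# Example: start a third node in a subshell so the variables do not leak.
#   ( eval "$(emqx_node_env 3)"; ./bin/emqx start )
```

Printing the assignments instead of exporting them directly makes it easy to review the ports before starting a node.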

The Dashboard is designed under the assumption that all cluster nodes use the same port number. Using distinct ports on a single machine may cause Dashboard UI issues, so this setup is not recommended in production.