End-to-End Tracing Span Details
EMQX provides end-to-end tracing capabilities based on the OpenTelemetry standard. This allows you to monitor the entire lifecycle of MQTT messages and client activities within the EMQX cluster. This page describes the spans generated by EMQX and explains what they reveal about the broker’s internal operations.
Client Lifecycle Spans
These spans trace the primary lifecycle events of an MQTT client.
- client.connect: A root span that traces the client connection process. It starts when a client initiates a connection to the broker and ends when the connection is established or rejected.
- client.disconnect: Traces the client disconnection process. It starts when a client sends a DISCONNECT packet or when the connection is closed for other reasons (e.g., network error, keep-alive timeout).
- client.subscribe: Traces a client's subscription request. It covers the entire process of the broker receiving the SUBSCRIBE packet, processing the subscriptions, and sending a SUBACK packet.
- client.unsubscribe: Traces a client's unsubscription request. It covers the process from receiving the UNSUBSCRIBE packet to sending the UNSUBACK packet.
Authentication and Authorization Spans
These spans reveal how EMQX performs authentication and authorization checks.
- client.authn: Traces the authentication process for a client. This span is a child of the client.connect span.
- client.authn_backend: Traces a specific backend call during authentication (e.g., querying a database, calling an HTTP service). This is a child span of client.authn and is useful for identifying performance bottlenecks in authentication backends.
- client.authz: Traces the authorization process for a client, which occurs on publish or subscribe operations.
- client.authz_backend: Traces a specific backend call during authorization. This is a child span of client.authz.
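As a rough illustration of using the backend spans to spot a slow authentication backend, the sketch below averages the duration of client.authn_backend spans per backend. The record layout, the "backend" key, and the sample durations are all hypothetical; real span attributes depend on your tracing backend and export format.

```python
from collections import defaultdict

# Hypothetical span records, e.g. parsed from a tracing backend export.
# The "backend" and "duration_ms" keys are illustrative, not EMQX attribute names.
records = [
    {"name": "client.authn", "duration_ms": 48.0},
    {"name": "client.authn_backend", "backend": "http", "duration_ms": 41.0},
    {"name": "client.authn_backend", "backend": "built_in_database", "duration_ms": 2.0},
    {"name": "client.authn_backend", "backend": "http", "duration_ms": 39.0},
]


def mean_backend_latency(spans, span_name="client.authn_backend"):
    """Group spans with the given name by backend and return the mean duration of each group."""
    durations = defaultdict(list)
    for span in spans:
        if span["name"] == span_name:
            durations[span["backend"]].append(span["duration_ms"])
    return {backend: sum(values) / len(values) for backend, values in durations.items()}


print(mean_backend_latency(records))
```

Passing span_name="client.authz_backend" would apply the same grouping to authorization backends.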
Message Lifecycle Spans
These spans trace the journey of an MQTT message through the broker.
Ingress (Client to Broker)
- client.publish: A root span that traces a message published by a client to the broker. It starts when the broker receives the PUBLISH packet.
- message.route: A child span of client.publish that traces the routing of the message within the broker to find matching subscribers.
- message.forward: If the message needs to be delivered to a subscriber on another node in the cluster, this span traces the forwarding of the message to that node. It is a child of message.route.
- message.handle_forward: On the receiving node, this span traces the handling of a forwarded message.
Egress (Broker to Client)
- broker.publish: Traces how the broker prepares and publishes a message to a subscriber. It is a child of either message.route or message.handle_forward.
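The parent-child relationships described above can be pictured as a small tree. The following sketch models the cross-node delivery case as a plain dictionary and walks from a span back to its root. It is only an illustration: the parent of message.handle_forward is not stated in the text and is assumed here to be message.forward.

```python
# Parent links for a message delivered to a subscriber on another node.
# None marks a root span.
PARENT = {
    "client.publish": None,
    "message.route": "client.publish",
    "message.forward": "message.route",
    # Assumption: the text does not name this span's parent.
    "message.handle_forward": "message.forward",
    # Cross-node case; for a local subscriber the parent would be message.route.
    "broker.publish": "message.handle_forward",
}


def path_to_root(span, parents=PARENT):
    """Return the chain of span names from the root down to the given span."""
    chain = [span]
    while parents[chain[-1]] is not None:
        chain.append(parents[chain[-1]])
    return list(reversed(chain))


print(" -> ".join(path_to_root("broker.publish")))
# client.publish -> message.route -> message.forward -> message.handle_forward -> broker.publish
```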
QoS Acknowledgement Spans
These spans trace the QoS 1 and QoS 2 acknowledgement flows.
Broker to Publisher
- broker.puback: Traces the broker sending a PUBACK to a publisher (QoS 1).
- broker.pubrec: Traces the broker sending a PUBREC to a publisher (QoS 2).
- broker.pubcomp: Traces the broker sending a PUBCOMP to a publisher (QoS 2), completing the QoS 2 flow from the publisher's side.
Publisher to Broker
- client.pubrel: Traces the broker receiving a PUBREL from a publisher (QoS 2).
Broker to Subscriber
- broker.pubrel: Traces the broker sending a PUBREL to a subscriber (QoS 2).
Subscriber to Broker
- client.puback: Traces the broker receiving a PUBACK from a subscriber (QoS 1).
- client.pubrec: Traces the broker receiving a PUBREC from a subscriber (QoS 2).
- client.pubcomp: Traces the broker receiving a PUBCOMP from a subscriber (QoS 2), completing the QoS 2 flow from the subscriber's side.
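To see how these spans line up with the MQTT acknowledgement handshakes, the sketch below lists the acknowledgement spans one would expect on each side of the broker for a given QoS, in protocol order. The function and table names are illustrative, and the tables assume every acknowledgement span is enabled.

```python
# Acknowledgement spans per QoS level, in the order the MQTT handshake produces them.
PUBLISHER_SIDE = {
    0: [],  # QoS 0 has no acknowledgement
    1: ["broker.puback"],
    2: ["broker.pubrec", "client.pubrel", "broker.pubcomp"],
}
SUBSCRIBER_SIDE = {
    0: [],
    1: ["client.puback"],
    2: ["client.pubrec", "broker.pubrel", "client.pubcomp"],
}


def expected_ack_spans(qos, side):
    """Return the expected acknowledgement span names for one side of the broker."""
    if side == "publisher":
        table = PUBLISHER_SIDE
    elif side == "subscriber":
        table = SUBSCRIBER_SIDE
    else:
        raise ValueError("side must be 'publisher' or 'subscriber'")
    if qos not in table:
        raise ValueError("qos must be 0, 1, or 2")
    return list(table[qos])


print(expected_ack_spans(2, "publisher"))
print(expected_ack_spans(1, "subscriber"))
```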
Rule Engine Spans
These spans trace the execution within the EMQX Rule Engine.
- broker.rule_engine.apply: Traces how a message is evaluated against rules. This is a child of the message.route span.
- broker.rule_engine.action: Traces the execution of a specific action triggered by a matched rule. This is a child of broker.rule_engine.apply.
Broker Internal Spans
These spans trace internal broker operations not initiated directly by clients.
- broker.disconnect: Traces when the broker actively disconnects a client (e.g., due to an administrative action).
- broker.subscribe: Traces an internal subscription process initiated by the broker itself (e.g., due to an administrative action).
- broker.unsubscribe: Traces an internal unsubscription process.
Trace Sampling and Filtering
EMQX's OpenTelemetry integration includes a flexible sampler that allows you to control which traces are generated. This helps to manage the volume of trace data and focus on specific clients, topics, or event types. The decision to sample a trace is made based on the following hierarchy:
1. Trace Context Source: Determines whether EMQX extracts trace context from incoming MQTT packets. This is controlled by the follow_traceparent boolean switch.
   - If true (the default), EMQX attempts to extract trace context from the incoming request (e.g., from the traceparent user property in an MQTT packet). This allows you to link traces that originate from upstream instrumented applications.
   - If false, EMQX ignores any incoming trace context and always starts a new trace.
2. Remote Sampling Decision: If follow_traceparent is true and an incoming request contains a trace context that is already marked as "sampled", EMQX will respect this upstream decision and sample the trace, unless overridden by other rules.
3. Whitelist Rules: If the trace is not already sampled by a remote parent, you can define specific rules to force sampling for certain clients or topics. This is the most direct way to ensure you are tracing the activity you care about.
   - ClientID whitelist: Forces sampling for all root-level activities (connect, subscribe, publish, etc.).
   - Topic whitelist: Forces sampling for messages published to matching topics.
     Note: This rule applies to the start of a trace (e.g., the client.publish span). It does not apply to the broker.publish span, which is responsible for message delivery to subscribers.
4. Ratio-Based Sampling: If no whitelist rule matches, the decision falls back to ratio-based sampling, which is controlled by the sample_ratio configuration. You can configure this ratio (from 0.0 to 1.0) to control the percentage of traces that are captured. A value of 1.0 means 100% of traces will be captured, while 0.0 means none will be (unless they match a whitelist rule).
5. Event Type Switches: Even if a trace is selected by the ratio-based sampler, it is only generated if the relevant event type switch is enabled. These switches act as a global on/off for categories of spans. The available switches are:
   - client_connect_disconnect: A boolean switch to enable or disable tracing for client connect and disconnect events.
   - client_subscribe_unsubscribe: A boolean switch to enable or disable tracing for client subscribe and unsubscribe events.
   - client_messaging: A boolean switch to enable or disable tracing for client message publishing.
   - trace_rule_engine: A boolean switch to enable or disable tracing for the rule engine.
6. Message Trace Level: For spans related to QoS acknowledgements (e.g., PUBACK, PUBREC), you can control their creation based on a QoS level using the msg_trace_level switch.
   - msg_trace_level: This setting can be configured to a specific QoS level (0, 1, or 2) to control which acknowledgement spans are created based on the QoS of the original message.
   For example, if msg_trace_level is set to 1, PUBACK spans will be created for QoS 1 messages. For QoS 2 messages, this setting will generate PUBREC spans, but not PUBREL or PUBCOMP spans. This helps to reduce the verbosity of traces for high-QoS message flows.
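The decision order for starting a trace (remote decision, then whitelists, then ratio and event switch) can be sketched in a few lines of Python. This is a simplified model, not EMQX's implementation. The class and field names other than follow_traceparent, sample_ratio, and the four switch names are hypothetical. Topic matching is reduced to exact string comparison, and the event switches are applied only on the ratio-based path because that is the only interaction the text states. The traceparent value is assumed to follow the W3C Trace Context layout (version-traceid-parentid-flags), where the lowest bit of the flags field marks a sampled trace.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SamplerConfig:
    follow_traceparent: bool = True
    sample_ratio: float = 0.1
    clientid_whitelist: set = field(default_factory=set)  # hypothetical field name
    topic_whitelist: set = field(default_factory=set)     # hypothetical; exact match only
    event_switches: dict = field(default_factory=lambda: {
        "client_connect_disconnect": True,
        "client_subscribe_unsubscribe": True,
        "client_messaging": True,
        "trace_rule_engine": True,
    })


def remote_sampled(traceparent):
    """Return True if a W3C traceparent value has its sampled flag (0x01) set."""
    if not traceparent:
        return False
    parts = traceparent.split("-")
    if len(parts) != 4:
        return False
    try:
        flags = int(parts[3], 16)
    except ValueError:
        return False
    return bool(flags & 0x01)


def should_sample(cfg, event_type, client_id, topic=None, traceparent=None,
                  rng=random.random):
    # Steps 1-2: honor an upstream "sampled" decision when context is followed.
    if cfg.follow_traceparent and remote_sampled(traceparent):
        return True
    # Step 3: whitelist rules force sampling.
    if client_id in cfg.clientid_whitelist:
        return True
    if topic is not None and topic in cfg.topic_whitelist:
        return True
    # Step 4: ratio-based fallback; rng() returns a float in [0.0, 1.0).
    if rng() >= cfg.sample_ratio:
        return False
    # Step 5: a ratio-selected trace still needs its event switch enabled.
    return cfg.event_switches.get(event_type, False)
```

For instance, with sample_ratio set to 0.0 and a client ID of "c1" in the whitelist, should_sample returns True for "c1" and False for any other client that carries no sampled traceparent.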