# End-to-End Tracing Span Details

EMQX provides end-to-end tracing capabilities based on the OpenTelemetry standard. This allows you to monitor the entire lifecycle of MQTT messages and client activities within the EMQX cluster. This page describes the spans generated by EMQX and explains what they reveal about the broker’s internal operations.

## Client Lifecycle Spans

These spans trace the primary lifecycle events of an MQTT client.

- **`client.connect`**: A root span that traces the client connection process. It starts when a client initiates a connection to the broker and ends when the connection is established or rejected.

- **`client.disconnect`**: Traces the client disconnection process. It starts when a client sends a `DISCONNECT` packet or when the connection is closed for other reasons (e.g., network error, keep-alive timeout).

- **`client.subscribe`**: Traces a client's subscription request. It covers the entire process of the broker receiving the `SUBSCRIBE` packet, processing the subscriptions, and sending a `SUBACK` packet.

- **`client.unsubscribe`**: Traces a client's unsubscription request. It covers the process from receiving the `UNSUBSCRIBE` packet to sending the `UNSUBACK` packet.

## Authentication and Authorization Spans

These spans reveal how EMQX performs authentication and authorization checks.

- **`client.authn`**: Traces the authentication process for a client. This span is a child of the `client.connect` span.

- **`client.authn_backend`**: Traces a specific backend call during authentication (e.g., querying a database, calling an HTTP service). This is a child span of `client.authn` and is useful for identifying performance bottlenecks in authentication backends.

- **`client.authz`**: Traces the authorization process for a client, which occurs on publish or subscribe operations.

- **`client.authz_backend`**: Traces a specific backend call during authorization. This is a child span of `client.authz`.

## Message Lifecycle Spans

These spans trace the journey of an MQTT message through the broker.

### Ingress (Client to Broker)

- **`client.publish`**: A root span that traces a message published by a client to the broker. It starts when the broker receives the `PUBLISH` packet.

- **`message.route`**: A child span of `client.publish` that traces the routing of the message within the broker to find matching subscribers.

- **`message.forward`**: If the message needs to be delivered to a subscriber on another node in the cluster, this span traces the forwarding of the message to that node. It is a child of `message.route`.

- **`message.handle_forward`**: On the receiving node, this span traces the handling of a forwarded message.

### Egress (Broker to Client)

- **`broker.publish`**: Traces how the broker prepares and publishes a message to a subscriber. It is a child of either `message.route` or `message.handle_forward`.
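
As a rough illustration of the ingress and egress spans above, the following Python sketch lists the span names you might expect to see, in order, for one message reaching one subscriber, either on the publisher's node or on another node in the cluster. The function name `expected_spans` is hypothetical and not part of any EMQX API; the ordering reflects only the descriptions in this section.

```python
def expected_spans(subscriber_on_remote_node: bool) -> list[str]:
    """Span names, in order, for one published message reaching one subscriber."""
    spans = ["client.publish", "message.route"]
    if subscriber_on_remote_node:
        # The routing node forwards the message; the receiving node handles it.
        spans += ["message.forward", "message.handle_forward"]
    # Delivery to the subscriber happens on whichever node holds the subscription.
    spans.append("broker.publish")
    return spans


print(" -> ".join(expected_spans(subscriber_on_remote_node=False)))
print(" -> ".join(expected_spans(subscriber_on_remote_node=True)))
```

Comparing such a list with the spans in an exported trace can help confirm whether a slow delivery involved a cross-node hop.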

### QoS Acknowledgement Spans

These spans trace the QoS 1 and QoS 2 acknowledgement flows.

#### Broker to Publisher

- **`broker.puback`**: Traces the broker sending a `PUBACK` to a publisher (QoS 1).

- **`broker.pubrec`**: Traces the broker sending a `PUBREC` to a publisher (QoS 2).

- **`broker.pubcomp`**: Traces the broker sending a `PUBCOMP` to a publisher (QoS 2), completing the QoS 2 flow from the publisher's side.

#### Publisher to Broker

- **`client.pubrel`**: Traces the broker receiving a `PUBREL` from a publisher (QoS 2).

#### Broker to Subscriber

- **`broker.pubrel`**: Traces the broker sending a `PUBREL` to a subscriber (QoS 2).

#### Subscriber to Broker

- **`client.puback`**: Traces the broker receiving a `PUBACK` from a subscriber (QoS 1).

- **`client.pubrec`**: Traces the broker receiving a `PUBREC` from a subscriber (QoS 2).

- **`client.pubcomp`**: Traces the broker receiving a `PUBCOMP` from a subscriber (QoS 2), completing the QoS 2 flow from the subscriber's side.
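
The acknowledgement spans above can be summarized per QoS level and per side of the broker. The sketch below is only a lookup table built from the span descriptions in this section and the standard MQTT QoS handshakes; `ACK_SPANS` and `ack_spans` are hypothetical names, and `qos` refers to the QoS of the flow on that side of the broker.

```python
# Acknowledgement spans in handshake order, keyed by QoS and by the side
# of the broker on which the handshake takes place.
ACK_SPANS = {
    1: {
        "publisher": ["broker.puback"],
        "subscriber": ["client.puback"],
    },
    2: {
        "publisher": ["broker.pubrec", "client.pubrel", "broker.pubcomp"],
        "subscriber": ["client.pubrec", "broker.pubrel", "client.pubcomp"],
    },
}


def ack_spans(qos: int, side: str) -> list[str]:
    """Return the acknowledgement span names; QoS 0 has no acknowledgements."""
    return ACK_SPANS.get(qos, {}).get(side, [])


for qos in (0, 1, 2):
    print(qos, ack_spans(qos, "publisher"), ack_spans(qos, "subscriber"))
```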

## Rule Engine Spans

These spans trace the execution within the EMQX Rule Engine.

- **`broker.rule_engine.apply`**: Traces how a message is evaluated against rules. This is a child of `message.route`.

- **`broker.rule_engine.action`**: Traces the execution of a specific action triggered by a matched rule. This is a child of `broker.rule_engine.apply`.
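
When inspecting exported traces, it can be handy to check that observed parent-child pairs agree with the relationships described on this page. The following is a minimal sketch under that assumption: `DOCUMENTED_PARENTS` and `unexpected_parents` are hypothetical names, the map contains only relationships stated above, and the `observed` pairs are made-up sample data rather than real EMQX output.

```python
# Parent relationships exactly as stated on this page. Spans whose parent is
# not stated (for example `message.handle_forward`) are intentionally left out.
DOCUMENTED_PARENTS = {
    "client.authn": {"client.connect"},
    "client.authn_backend": {"client.authn"},
    "client.authz_backend": {"client.authz"},
    "message.route": {"client.publish"},
    "message.forward": {"message.route"},
    "broker.publish": {"message.route", "message.handle_forward"},
    "broker.rule_engine.apply": {"message.route"},
    "broker.rule_engine.action": {"broker.rule_engine.apply"},
}


def unexpected_parents(pairs):
    """Return the (span, parent) pairs that contradict DOCUMENTED_PARENTS."""
    return [
        (span, parent)
        for span, parent in pairs
        if span in DOCUMENTED_PARENTS and parent not in DOCUMENTED_PARENTS[span]
    ]


# Hypothetical (span, parent) pairs read from a trace backend.
observed = [
    ("message.route", "client.publish"),
    ("broker.rule_engine.apply", "message.route"),
    ("broker.rule_engine.action", "message.route"),  # not the documented parent
]
print(unexpected_parents(observed))
```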

## Broker Internal Spans

These spans trace internal broker operations not initiated directly by clients.

- **`broker.disconnect`**: Traces when the broker actively disconnects a client (e.g., due to an administrative action).

- **`broker.subscribe`**: Traces an internal subscription process initiated by the broker itself (e.g., due to an administrative action).

- **`broker.unsubscribe`**: Traces an internal unsubscription process.

## Trace Sampling and Filtering

EMQX's OpenTelemetry integration includes a flexible sampler that allows you to control which traces are generated. This helps to manage the volume of trace data and focus on specific clients, topics, or event types. The sampling decision is evaluated in the following order:

1.  **Trace Context Source**: Determines whether EMQX extracts trace context from incoming MQTT packets. This is controlled by the `follow_traceparent` boolean switch.
    
    -   If `true` (the default), EMQX attempts to extract trace context from the incoming request (e.g., from the `traceparent` user property in an MQTT packet). This allows you to link traces that originate from upstream instrumented applications.
    -   If `false`, EMQX ignores any incoming trace context and always starts a new trace.
    
2.  **Remote Sampling Decision**: If `follow_traceparent` is `true` and an incoming request contains a trace context that is already marked as "sampled", EMQX will respect this upstream decision and sample the trace, unless overridden by other rules.

3.  **Whitelist Rules**: If the trace is not already sampled by a remote parent, you can define specific rules to force sampling for certain clients or topics. This is the most direct way to ensure you are tracing the activity you care about.
    -   **ClientID whitelist**: Forces sampling for all root-level activities (connect, subscribe, publish, etc.) of clients whose client ID is on the list.
    -   **Topic whitelist**: Forces sampling for messages published to matching topics.
        
        > **Note:** This rule applies to the start of a trace (e.g., the `client.publish` span). It does not apply to the `broker.publish` span, which is responsible for message delivery to subscribers.
    
4.  **Ratio-Based Sampling**: If no whitelist rule matches, the decision falls back to ratio-based sampling, which is controlled by the `sample_ratio` configuration.
    
    You can configure this ratio (from `0.0` to `1.0`) to control the percentage of traces that are captured. A value of `1.0` means 100% of traces will be captured, while `0.0` means none will be (unless they match a whitelist rule).
    
5.  **Event Type Switches**: Even if a trace is selected by the ratio-based sampler, it is only generated if the relevant event type switch is enabled. Each switch globally enables or disables one category of spans. The available switches are:
    
    -   `client_connect_disconnect`: A boolean switch to enable or disable tracing for client connect and disconnect events.
    -   `client_subscribe_unsubscribe`: A boolean switch to enable or disable tracing for client subscribe and unsubscribe events.
    -   `client_messaging`: A boolean switch to enable or disable tracing for client message publishing.
    -   `trace_rule_engine`: A boolean switch to enable or disable tracing for the rule engine.
    
6.  **Message Trace Level**: The `msg_trace_level` setting controls which QoS acknowledgement spans (e.g., for `PUBACK` and `PUBREC`) are created.
    - `msg_trace_level`: Accepts a QoS level (`0`, `1`, or `2`) and determines which acknowledgement spans are created, based on the QoS of the original message.
    
      For example, if `msg_trace_level` is set to `1`, `PUBACK` spans are created for QoS 1 messages. For QoS 2 messages, `PUBREC` spans are created, but `PUBREL` and `PUBCOMP` spans are not. This reduces the verbosity of traces for high-QoS message flows.
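
The decision order above can be sketched in Python as follows. This is an illustrative model, not EMQX's implementation, and it rests on several assumptions: the event type switch is treated as a global gate checked first; whitelisted topics are matched as MQTT topic filters with `+` and `#` wildcards; the ratio check uses a random draw; and the behavior of `msg_trace_level` at levels `0` and `2` is inferred from the level `1` example. `SamplerConfig`, its whitelist field names, and all function names are hypothetical. Only the `traceparent` layout (version, trace ID, parent ID, flags, with bit 0 meaning "sampled") follows the W3C Trace Context format.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Optional, Set


@dataclass
class SamplerConfig:
    sample_ratio: float                    # 0.0 to 1.0
    enabled_events: Set[str]               # event type switches set to true
    follow_traceparent: bool = True        # documented default
    clientid_whitelist: Set[str] = field(default_factory=set)  # hypothetical name
    topic_whitelist: Set[str] = field(default_factory=set)     # hypothetical name


def topic_matches(topic_filter: str, topic: str) -> bool:
    """MQTT topic filter match (assumed to be how whitelisted topics match)."""
    f_levels, t_levels = topic_filter.split("/"), topic.split("/")
    for i, level in enumerate(f_levels):
        if level == "#":
            return True
        if i >= len(t_levels) or (level != "+" and level != t_levels[i]):
            return False
    return len(f_levels) == len(t_levels)


def remote_sampled(traceparent: Optional[str]) -> bool:
    """True if a W3C `traceparent` value has the sampled flag (bit 0) set."""
    parts = traceparent.split("-") if traceparent else []
    if len(parts) != 4:
        return False
    try:
        return (int(parts[3], 16) & 0x01) == 1
    except ValueError:
        return False


def should_sample(cfg: SamplerConfig, event_switch: str, client_id: str,
                  topic: Optional[str] = None, traceparent: Optional[str] = None,
                  rand: Callable[[], float] = random.random) -> bool:
    if event_switch not in cfg.enabled_events:      # assumed global gate
        return False
    if cfg.follow_traceparent and remote_sampled(traceparent):
        return True                                 # respect the upstream decision
    if client_id in cfg.clientid_whitelist:
        return True
    if topic is not None and any(topic_matches(f, topic) for f in cfg.topic_whitelist):
        return True
    return rand() < cfg.sample_ratio                # ratio-based fallback


def ack_packets_traced(msg_trace_level: int, qos: int) -> list:
    """Acknowledgement packets that get spans (levels 0 and 2 are inferred)."""
    traced = []
    if msg_trace_level >= 1 and qos == 1:
        traced.append("PUBACK")
    if msg_trace_level >= 1 and qos == 2:
        traced.append("PUBREC")
    if msg_trace_level >= 2 and qos == 2:
        traced += ["PUBREL", "PUBCOMP"]
    return traced


cfg = SamplerConfig(sample_ratio=0.0, enabled_events={"client_messaging"},
                    clientid_whitelist={"sensor-1"}, topic_whitelist={"alerts/#"})
print(should_sample(cfg, "client_messaging", "sensor-1"))
print(should_sample(cfg, "client_messaging", "other", topic="alerts/fire"))
print(ack_packets_traced(1, 2))
```

With `sample_ratio` set to `0.0`, only whitelisted clients, whitelisted topics, or traces already sampled upstream are captured in this model, which is a convenient way to reason about a targeted debugging setup.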