# MQTT Stream

MQTT Stream is a built-in data persistence and query feature in EMQX Edge. Messages on configured topics are side-captured, buffered in memory, and written to disk in batches, without affecting real-time MQTT forwarding. Persisted data can be queried by time range directly from the Dashboard.

## Why MQTT Stream

In edge computing scenarios, large volumes of data are often produced as high-frequency MQTT messages: industrial telemetry, vehicle CAN bus data, and device status metrics.

Relying solely on real-time uplink to the cloud typically introduces the following challenges:

- Critical data may be lost when the network is unstable or unavailable.
- After an incident, it is difficult to reconstruct the data from immediately before and after the event.
- There is no unified mechanism for local buffering and persistence at the edge.

MQTT Stream is designed to address these issues. With MQTT Stream, EMQX Edge can establish a complete local data pipeline at the edge:

> **MQTT ingestion -> Local buffering -> Batch persistence -> Historical query**

This pipeline operates independently of external systems and continues to function in weak or offline network conditions, providing a reliable data foundation for troubleshooting, analysis, and system optimization.
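
The four stages above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not how EMQX Edge implements the feature: the names `Message`, `StreamPipeline`, `batch_size`, and `segments` are hypothetical, topics are compared by exact match, a Python list stands in for the on-disk files, and the flush runs synchronously for simplicity.

```python
from typing import Callable, List, Optional, Set


class Message:
    def __init__(self, topic: str, payload: bytes, ts_ms: int) -> None:
        self.topic = topic
        self.payload = payload
        self.ts_ms = ts_ms  # capture timestamp in milliseconds


class StreamPipeline:
    """Toy model: ingestion -> buffering -> batch persistence -> query."""

    def __init__(self, topics: Set[str], batch_size: int,
                 forward: Optional[Callable[[Message], None]] = None) -> None:
        self.topics = topics            # topics configured for capture (assumed exact match)
        self.batch_size = batch_size    # hypothetical flush threshold
        self.forward = forward or (lambda msg: None)  # real-time delivery path
        self.buffer: List[Message] = []
        self.segments: List[List[Message]] = []  # stand-in for files on disk

    def on_publish(self, msg: Message) -> None:
        self.forward(msg)               # normal forwarding happens for every message
        if msg.topic in self.topics:
            self.buffer.append(msg)     # side-copy only for configured topics
            if len(self.buffer) >= self.batch_size:
                self.flush()

    def flush(self) -> None:
        # Persist the whole buffer as one batch instead of one write per message.
        if self.buffer:
            self.segments.append(self.buffer)
            self.buffer = []

    def query(self, topic: str, start_ms: int, end_ms: int) -> List[Message]:
        # Time-range scan over persisted batches (inclusive bounds).
        return [m for seg in self.segments for m in seg
                if m.topic == topic and start_ms <= m.ts_ms <= end_ms]
```

In this sketch, a message on a topic that is not configured still reaches `forward`, but it never enters `buffer` or `segments`, which mirrors the separation between real-time delivery and persistence described above.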

## Key Features

- **Non-intrusive capture**: Matching messages are side-copied into the pipeline; normal MQTT forwarding is unaffected.
- **In-memory ring buffer**: Messages are batched in memory before writing, avoiding per-message disk I/O.
- **Batch Parquet persistence**: Batches are encoded and written asynchronously to local Parquet files, suited for time-range scans and downstream analytics.
- **Compression and encryption**: Parquet files support multiple compression algorithms and optional at-rest encryption.
- **Dashboard queries**: Historical data is queryable by topic and time window directly from the Dashboard.
- **Offline operation**: Data is buffered and persisted regardless of network availability.
- **Pluggable Stream plugins**: Encoding, storage schema, and query behavior are plugin-defined. Built-in RAW, SPI, and CANP plugins are included; custom plugins are supported.
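
To make the ring-buffer and batching idea concrete, here is a minimal sketch of a fixed-capacity ring buffer that is drained in batches. It is an illustration only: the class `RingBuffer`, the helper `capture`, the capacity and batch size, and the choice to reject pushes when the buffer is full are all assumptions, and `write_batch` is a hypothetical callback standing in for the asynchronous Parquet writer.

```python
from typing import Callable, Iterable, List, Optional


class RingBuffer:
    """Fixed-capacity FIFO ring buffer; push is rejected when full (assumed policy)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots: List[Optional[bytes]] = [None] * capacity
        self._head = 0  # index of the oldest stored item
        self._size = 0  # number of stored items

    def __len__(self) -> int:
        return self._size

    def push(self, item: bytes) -> bool:
        if self._size == len(self._slots):
            return False  # buffer full, caller decides what to do
        tail = (self._head + self._size) % len(self._slots)
        self._slots[tail] = item
        self._size += 1
        return True

    def drain(self, max_items: int) -> List[bytes]:
        # Remove and return up to max_items of the oldest items, in order.
        count = min(max_items, self._size)
        batch: List[bytes] = []
        for _ in range(count):
            batch.append(self._slots[self._head])
            self._slots[self._head] = None
            self._head = (self._head + 1) % len(self._slots)
        self._size -= count
        return batch


def capture(messages: Iterable[bytes], ring: RingBuffer, batch_size: int,
            write_batch: Callable[[List[bytes]], None]) -> None:
    """Buffer messages and hand them to write_batch one batch at a time."""
    for msg in messages:
        ring.push(msg)
        if len(ring) >= batch_size:
            write_batch(ring.drain(batch_size))
    if len(ring) > 0:
        write_batch(ring.drain(len(ring)))  # flush the final partial batch
```

With ten messages and a batch size of four, `write_batch` is called three times rather than ten, which is the per-message I/O saving the feature list refers to.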

## Typical Use Cases

- **Reliable data preservation in weak or offline networks**: In deployments with unstable connectivity, MQTT Stream buffers and persists messages locally to prevent data loss.

- **Post-incident data analysis and troubleshooting**: By storing high-frequency MQTT data in time order, users can replay and analyze historical data to diagnose device or system failures.

- **Local archiving of high-frequency edge data**: For data that is impractical to stream to the cloud in real time, MQTT Stream enables batch buffering and structured local storage while preserving completeness.

- **End-to-end edge-to-cloud data pipelines**: Locally persisted data can later be analyzed, exported, or forwarded to cloud systems, supporting a closed-loop data workflow from edge collection to feedback and optimization.

## Learn More

- **[MQTT Stream User Guide](./mqtt-stream-user-guide.md)**: Configure and operate MQTT Stream, including persistence and querying.
- **[MQTT Stream Quick Start](./mqtt-stream-quick-start)**: Quickly verify data persistence and time-range queries with a minimal setup.
- **[MQTT Stream Design and Implementation](./mqtt-stream-design)**: Understand the internal architecture, data pipeline, and Stream plugin model.

MQTT Stream is available as a commercial add-on feature starting with EMQX Edge 1.4. If you need lightweight local persistence and message stream management at the edge, please [contact us](https://www.emqx.com/en/contact) for more information.
