MQTT Stream
MQTT Stream is a built-in data persistence and query feature in EMQX Edge. Messages on configured topics are side-captured, buffered in memory, and written to disk in batches, without affecting real-time MQTT forwarding. Persisted data can be queried by time range directly from the Dashboard.
Why MQTT Stream
In edge computing scenarios, large volumes of data are often produced as high-frequency MQTT messages: industrial telemetry, vehicle CAN bus data, and device status metrics.
Relying solely on real-time uplink to the cloud typically introduces the following challenges:
- Critical data may be lost when the network is unstable or unavailable.
- After an incident, it is difficult to fully reconstruct the data from before and after the event.
- There is no unified mechanism for local buffering and persistence at the edge.
MQTT Stream is designed to address these issues. With MQTT Stream, EMQX Edge can establish a complete local data pipeline at the edge:
MQTT ingestion -> Local buffering -> Batch persistence -> Historical query
This pipeline operates independently of external systems and continues to function in weak or offline network conditions, providing a reliable data foundation for troubleshooting, analysis, and system optimization.
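The four stages can be illustrated with a small, self-contained sketch. It is not the EMQX Edge implementation: the `StreamPipeline` class, its `batch_size` threshold, and the JSON Lines batch files are all hypothetical stand-ins chosen so the example runs with the Python standard library alone (the real feature writes Parquet files). It only shows the general shape of buffering messages in memory, writing them in batches, and filtering persisted records by topic and time range.

```python
import json
import os
import tempfile
from collections import deque


class StreamPipeline:
    """Toy model of ingestion -> buffering -> batch persistence -> query."""

    def __init__(self, directory, batch_size=3):
        self.directory = directory
        self.batch_size = batch_size  # hypothetical flush threshold
        self.buffer = deque()         # in-memory buffer of pending records
        self.batch_count = 0

    def ingest(self, topic, payload, ts_ms):
        """Buffer one message; flush to disk once a full batch is pending."""
        self.buffer.append({"ts": ts_ms, "topic": topic, "payload": payload})
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write all buffered records to a new batch file in a single write."""
        if not self.buffer:
            return
        name = f"batch-{self.batch_count:06d}.jsonl"  # stand-in for a Parquet file
        with open(os.path.join(self.directory, name), "w", encoding="utf-8") as f:
            f.write("".join(json.dumps(r) + "\n" for r in self.buffer))
        self.buffer.clear()
        self.batch_count += 1

    def query(self, topic, start_ms, end_ms):
        """Return persisted records for topic with start_ms <= ts <= end_ms."""
        results = []
        for name in sorted(os.listdir(self.directory)):
            with open(os.path.join(self.directory, name), encoding="utf-8") as f:
                for line in f:
                    record = json.loads(line)
                    if record["topic"] == topic and start_ms <= record["ts"] <= end_ms:
                        results.append(record)
        return results


def demo():
    """Ingest seven messages, persist them, and query a time window."""
    with tempfile.TemporaryDirectory() as directory:
        pipeline = StreamPipeline(directory, batch_size=3)
        for i in range(7):
            pipeline.ingest("vehicle/can", f"frame-{i}", ts_ms=1000 + i * 100)
        pipeline.flush()  # persist the partial final batch
        return pipeline.query("vehicle/can", 1200, 1500)
```

Calling `demo()` returns the four records whose timestamps fall inside the requested window. Nothing in the sketch depends on a network connection, which mirrors the offline behavior described above.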
Key Features
- Non-intrusive capture: Matching messages are side-copied into the pipeline; normal MQTT forwarding is unaffected.
- In-memory ring buffer: Messages are batched in memory before writing, avoiding per-message disk I/O.
- Batch Parquet persistence: Batches are encoded and written asynchronously to local Parquet files, suited for time-range scans and downstream analytics.
- Compression and encryption: Parquet files support multiple compression algorithms and optional at-rest encryption.
- Dashboard queries: Historical data is queryable by topic and time window directly from the Dashboard.
- Offline operation: Data is buffered and persisted regardless of network availability.
- Pluggable Stream plugins: Encoding, storage schema, and query behavior are plugin-defined. Built-in RAW, SPI, and CANP plugins are included; custom plugins are supported.
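The idea behind non-intrusive capture can be sketched as follows. This is an illustration under stated assumptions, not EMQX Edge code: it assumes that configured topics are matched like standard MQTT topic filters with `+` and `#` wildcards, and the `ToyBroker` class and `topic_matches` helper are hypothetical names. The matcher ignores the special rules for topics that begin with `$`.

```python
def topic_matches(topic_filter, topic):
    """Return True if an MQTT topic matches a filter using + and # wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard matches the rest of the topic
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    return len(filter_levels) == len(topic_levels)


class ToyBroker:
    """Hypothetical broker that side-copies matching messages."""

    def __init__(self, stream_filters):
        self.stream_filters = stream_filters
        self.delivered = []  # stands in for normal forwarding to subscribers
        self.captured = []   # stands in for the stream's in-memory buffer

    def publish(self, topic, payload):
        # Forwarding happens for every message, whether or not it is captured.
        self.delivered.append((topic, payload))
        # Matching messages are additionally copied into the stream buffer.
        if any(topic_matches(f, topic) for f in self.stream_filters):
            self.captured.append((topic, payload))


broker = ToyBroker(["factory/+/telemetry", "vehicle/#"])
broker.publish("factory/line1/telemetry", b"temp=71")
broker.publish("vehicle/can/0x1A0", b"\x01\x02")
broker.publish("office/printer/status", b"idle")
```

After these three publishes, all three messages have been forwarded, while only the two that match a configured filter have been copied into the capture buffer.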
Typical Use Cases
Reliable data preservation in weak or offline networks: In deployments with unstable connectivity, MQTT Stream buffers and persists messages locally to prevent data loss.
Post-incident data analysis and troubleshooting: By storing high-frequency MQTT data in time order, users can replay and analyze historical data to diagnose device or system failures.
Local archiving of high-frequency edge data: For data that is impractical to stream to the cloud in real time, MQTT Stream enables batch buffering and structured local storage while preserving completeness.
End-to-end edge-to-cloud data pipelines: Locally persisted data can later be analyzed, exported, or forwarded to cloud systems, supporting a closed-loop data workflow from edge collection to feedback and optimization.
Learn More
- MQTT Stream User Guide: Configure and operate MQTT Stream, including persistence and querying.
- MQTT Stream Quick Start: Quickly verify data persistence and time-range queries with a minimal setup.
- MQTT Stream Design and Implementation: Understand the internal architecture, data pipeline, and Stream plugin model.
MQTT Stream is available as a commercial add-on feature starting from EMQX Edge 1.4. If you require lightweight local persistence and message stream management at the edge, please contact us for more information.