Ingest MQTT Data into Azure Blob Storage
Azure Blob Storage is Microsoft's cloud-based object storage solution, designed specifically for handling large volumes of unstructured data. Unstructured data refers to data types that do not follow a specific data model or format, such as text files or binary data. EMQX Platform can efficiently store MQTT messages in Blob Storage containers, providing a versatile solution for storing Internet of Things (IoT) data.
This page provides a detailed introduction to the data integration between EMQX Platform and Azure Blob Storage and offers practical guidance on creating the rule and Sink.
How It Works
Azure Blob Storage data integration is a ready-to-use feature of the EMQX Platform that requires only simple configuration to support complex business scenarios. In a typical IoT application, the EMQX Platform acts as the IoT platform responsible for device connectivity and message transmission, while Azure Blob Storage serves as the data storage platform, handling message data storage.
EMQX Platform uses the rules engine and Sinks to forward device events and data to Azure Blob Storage. Applications can then read the data from Azure Blob Storage for further processing. The specific workflow is as follows:
- Device Connection to EMQX Platform: IoT devices trigger an online event upon successfully connecting via the MQTT protocol. The event includes device ID, source IP address, and other property information.
- Device Message Publishing and Receiving: Devices publish telemetry and status data through specific topics. EMQX Platform receives the messages and routes them through the rules engine.
- Rules Engine Processing Messages: The built-in rules engine matches messages and events from specific sources based on topic matching, then processes them, for example by transforming data formats, filtering out specific information, or enriching messages with contextual information.
- Writing to Azure Blob Storage: The rule triggers an action to write the message to the Storage Container. Using the Azure Blob Storage Sink, users can extract data from processing results and send it to Blob Storage. Messages can be stored in text or binary format, or multiple lines of structured data can be aggregated into a single CSV file, depending on the message content and the Sink configuration.
After events and message data are written to the Storage Container, you can connect to Azure Blob Storage to read the data for flexible application development, such as:
- Data archiving: Store device messages as objects in Azure Blob Storage for long-term preservation to meet compliance requirements or business needs.
- Data analysis: Import data from the Storage Container into analytics services like Snowflake for predictive maintenance, device efficiency evaluation, and other analyses (a read-back sketch follows this list).
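As an illustration of the read-back path, the sketch below uses Python with the azure-storage-blob package to iterate over stored message objects. The account URL, key, and JSON payload layout are placeholder assumptions for illustration, not part of the EMQX configuration:

```python
# Minimal read-back sketch (pip install azure-storage-blob).
# Account URL, key, and payload format are placeholders for illustration.
import json

from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://<your-account>.blob.core.windows.net",  # placeholder
    credential="<your-account-key>",  # placeholder
)
container = service.get_container_client("iot-data")

# Iterate over the objects the Sink has written and parse JSON payloads.
for blob in container.list_blobs():
    data = container.download_blob(blob.name).readall()
    try:
        print(blob.name, "->", json.loads(data))
    except ValueError:
        # Objects may also hold binary or CSV content, depending on the Sink.
        print(blob.name, "->", len(data), "bytes (non-JSON)")
```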
Features and Advantages
Using Azure Blob Storage data integration in the EMQX Platform can bring the following features and advantages to your business:
- Message Transformation: Messages can undergo extensive processing and transformation in EMQX Platform rules before being written to Azure Blob Storage, facilitating subsequent storage and use.
- Flexible Data Operations: With the Azure Blob Storage Sink, specific fields of data can be conveniently written into Azure Blob Storage containers, supporting the dynamic setting of containers and object keys for flexible data storage.
- Integrated Business Processes: The Azure Blob Storage Sink allows device data to be combined with the rich ecosystem applications of Azure Blob Storage, enabling more business scenarios like data analysis and archiving.
- Low-Cost Long-Term Storage: Compared to databases, Azure Blob Storage offers a highly available, reliable, and cost-effective object storage service, suitable for long-term storage needs.
These features enable you to build efficient, reliable, and scalable IoT applications and support better business decisions and optimizations.
Before You Start
This section introduces the preparations required before creating an Azure Blob Storage Sink in EMQX.
Prerequisites
- Understand Data Integration.
- Familiarize yourself with Rules.
- Enable the NAT Gateway to support public access to Azure Storage.
Create a Container in Azure Storage
To access Azure Storage, you'll need an Azure subscription. If you don't already have a subscription, create a free account before you begin.
All access to Azure Storage takes place through a storage account. For this quickstart, create a storage account using the Azure portal, Azure PowerShell, or Azure CLI. For help creating a storage account, see Create a storage account.
To create a container in the Azure portal, navigate to your new storage account. In the left menu for the storage account, scroll to the Data storage section and select Containers. Click the + Container button, enter `iot-data` as the name for your new container, and click Create.

Then navigate to Security + networking -> Access keys in the storage account and copy the key. You will need this key to configure the Sink in EMQX.
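If you prefer scripting this step over using the portal, the container can also be created with the Azure SDK. Below is a minimal sketch assuming Python and the azure-storage-blob package; the account URL and key are placeholders:

```python
# Create the iot-data container programmatically
# (pip install azure-storage-blob). Credentials are placeholders.
from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://<your-account>.blob.core.windows.net",  # placeholder
    credential="<your-account-key>",  # placeholder
)

try:
    service.create_container("iot-data")
    print("Container 'iot-data' created.")
except ResourceExistsError:
    print("Container 'iot-data' already exists.")
```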
Create a Connector
Before adding the Azure Blob Storage Sink, you need to create the corresponding connector.
In the deployment menu, select Data Integration, then choose the Azure Blob Storage service under the Data Persistence Services category. If you have already created other connectors, click New Connector, then select the Azure Blob Storage service under the same category.
Connector Name: The system will automatically generate a name for the connector.
Enter the connection information:
- Account Name: Your Storage Account name
- Account Key: Your Storage Account key from the previous step
- Advanced Settings (Optional): Refer to Advanced Configuration.
Click the Test Connection button; if Azure Blob Storage can be accessed, a success message will be returned.
Click the Create button to complete the creation of the connector.
Create a Rule
Next, you need to create a rule that specifies the data to be written and add response actions to forward the processed data to Azure Blob Storage.
Click the new rule icon under the Actions column in the connector list or click New Rule in the Rules List to enter the Create New Rule step page.
Input the following rule SQL in the SQL editor:
```sql
SELECT * FROM "t/#"
```
TIP
If you're new to SQL, you can click SQL Examples and Enable Test to learn and test the results of the rule SQL.
Click Next to start creating actions.
From the Use Connector dropdown, select the connector you previously created.
Set the Container by entering `iot-data`.
Select the Upload Method. The differences between the two methods are as follows:
- Direct Upload: Each time the rule is triggered, data is uploaded directly to Azure Storage according to the preset object key and content. This method is suitable for storing binary or large text data. However, it may generate a large number of files.
- Aggregated Upload: This method packages the results of multiple rule triggers into a single file (such as a CSV file) and uploads it to Azure Storage, making it suitable for storing structured data. It reduces the number of files and improves write efficiency (see the CSV read-back sketch after this list).
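For context on the aggregated method, a CSV object produced this way can later be loaded back as structured rows. This is a minimal sketch assuming Python with pandas and azure-storage-blob; the blob name and credentials are hypothetical:

```python
# Load an aggregated CSV object into a DataFrame
# (pip install azure-storage-blob pandas). All names are placeholders.
import io

import pandas as pd
from azure.storage.blob import BlobClient

blob = BlobClient(
    account_url="https://<your-account>.blob.core.windows.net",  # placeholder
    container_name="iot-data",
    blob_name="aggregated/2024-01-01.csv",  # hypothetical object key
    credential="<your-account-key>",  # placeholder
)

# Each row corresponds to one rule trigger captured in the aggregated file.
df = pd.read_csv(io.BytesIO(blob.download_blob().readall()))
print(df.head())
```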
The configuration parameters differ for each method. Please configure according to the selected method:
Configure advanced settings options as needed (optional); details can be found in Advanced Settings.
Click the Confirm button to complete the action configuration.
In the popup success message box, click Return to Rule List to complete the data integration configuration.
Test the Rule
This section shows how to test the rule configured with the direct upload method.
- Use MQTTX to publish a message to the topic `t/1`:

```bash
mqttx pub -i emqx_c -t t/1 -m '{ "msg": "Hello Azure" }'
```
- After sending a few messages, log in to the Azure portal, navigate to the storage account, and open the `iot-data` container. You should see the uploaded objects in the container.
- Check the runtime data in the EMQX Platform Console. Click the rule ID in the rules list to view the statistics of the rule and all actions under this rule on the runtime statistics page.
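If you want to verify the upload programmatically instead of through the portal, a minimal listing sketch with the azure-storage-blob package (account URL and key are placeholders) is shown below:

```python
# List the objects the Sink has written
# (pip install azure-storage-blob). Credentials are placeholders.
from azure.storage.blob import ContainerClient

container = ContainerClient(
    account_url="https://<your-account>.blob.core.windows.net",  # placeholder
    container_name="iot-data",
    credential="<your-account-key>",  # placeholder
)

for blob in container.list_blobs():
    print(blob.name, blob.size, "bytes")
```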