Skip to content

Data Integration Overview

As a fully managed MQTT message cloud service, EMQX Platform connects Internet of Things (IoT) devices via the MQTT protocol and delivers messages in real time. Building on this foundation, Data Integration enhances EMQX Platform's capability of connecting with other cloud resources, enabling seamless integration of devices with other business systems. EMQX Platform Data Integration not only provides a clear and flexible "configurable" architecture solution but also simplifies the development process. It improves user availability, reduces the coupling between business systems and EMQX Platform, and provides a better infrastructure for data forwarding.

data_integration_intro

How It Works

As devices or applications establish connections, the MQTT broker routes the messages. Upon arrival, these messages are processed by the Rule Engine, a powerful component that utilizes SQL statements for data manipulation. This processed data is then forwarded to the target service by an "Action". Actions are categorized into two types: "Sink", for sending data to a service, and "Source", for receiving data from a service. Presently, the Data Integration operates only in "Sink" mode, facilitating the seamless integration of data into various cloud services. It will support "Source" mode in the future updates.

Connectors

A connector serves as the underlying connection channel for Sink/Source, used to connect to the cloud service products you purchase from the cloud platform. The cloud service products can be message queue services like Kafka or storage services like RDS.

Rules

Rules describe "where data comes from" and "how to filter and process data." Rules use SQL-like statements to write custom data and can use SQL testing to simulate exported data. To learn and understand how to write rule SQL, refer to Rule SQL Writing.

Actions

Actions determine "where the processed data goes." A rule can correspond to one or more actions, and actions need to be set with defined connectors, which means where the data is sent.

Work Flow

Below is the basic process for creating data integrations:

data_integration_intro

  1. Create a connector. You can select the service you need to connect to from the Data Integration initial page in your deployment and configure the connector.
  2. Create a rule to process the data collected from the device. The rule can collect and process data the way you want using SQL statements.
  3. Attach actions to the rule. The processed data will be forwarded to the cloud service through the configured connector when the rule triggers an action.
  4. Test whether the created data integration can run correctly.

Network Setting Required by Deployments

The data integration function in different deployments requires different levels of data source access and networking.

Serverless Deployment

  • Data sources only support public network access. Therefore, before creating a data source, you need to ensure the data source has the capability of public network access and open the security group.

  • Only supports Kafka and HTTP Server connectors.

  • Serverless Data Integration employs a pay-as-you-go mode, explaining as follows:

    EMQX Serverless provides users with a free quota for data integration: up to 1 million rule action executions per month. Should your usage exceed this allocation, a nominal fee of $0.25 is applied for each additional million rule action executions.

    To maintain optimal performance and manageability, the EMQX Platform imposes the following constraints on the creation of connectors, rules, and actions within each deployment:

    CategoryMaximum Allowed
    Total Connectors2
    Total Rules4
    Actions Associated Per Rule1

Dedicated Deployment

  • It is recommended to access data sources through an internal network. Therefore, before creating, you need to configure VPC peering first and also open the security group.
  • If you need to access it through the public network, you can enable a NAT gateway.

BYOC Deployment

  • It is recommended to access data sources through an internal network to improve network security and performance. Before creating, you need to configure a peering connection between the VPC where the resources are located and the VPC where the BYOC deployment is located in the public cloud console, and also open the relevant security group. For related steps, please refer to the Create VPC Peering Connections section.
  • If you need to access resources through the public network, please configure a NAT gateway for the VPC where the BYOC deployment is located in your public cloud console.