Prometheus Monitoring Alerts
Note
This feature is available on Dedicated Plan and BYOC Plan.
EMQX Cloud exposes a Prometheus-compatible API for monitoring critical deployment metrics. This article explains how to configure the Prometheus service, retrieve metrics from the EMQX Cloud API, and view them in Grafana.
API Configuration
In the EMQX Cloud deployment console, find the REST API section on the Overview page and copy the API address. Then click New Application to obtain the APP ID and APP Secret.
Deployment Metrics URI
GET /deployment_metrics
Returns cluster metrics for Prometheus collection.
Query Parameters:
None
Request Message
None
Request Example
curl -u app_id:app_secret -X GET {api}/deployment_metrics
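The same request can be issued from a script. The sketch below uses only the Python standard library and assumes the endpoint accepts HTTP Basic authentication, as the `-u app_id:app_secret` flag in the curl example implies. `build_metrics_request` and `fetch_metrics` are hypothetical helper names, and the API address and credentials are placeholders for the values from your deployment console.

```python
import base64
import urllib.request


def build_metrics_request(api: str, app_id: str, app_secret: str,
                          path: str = "/deployment_metrics") -> urllib.request.Request:
    """Build a GET request carrying a Basic auth header (hypothetical helper)."""
    token = base64.b64encode(f"{app_id}:{app_secret}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(api.rstrip("/") + path, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_metrics(api: str, app_id: str, app_secret: str) -> str:
    """Send the request and return the raw response body as text."""
    request = build_metrics_request(api, app_id, app_secret)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_metrics("{api}", "app_id", "app_secret")` with your own API address and credentials returns the response body for further processing.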
Response
Metrics | Description | Type |
---|---|---|
deployment_emqx_cluster_status | The status of the EMQX cluster | Gauge |
deployment_emqx_sessions_count | The current number of sessions for the cluster | Gauge |
deployment_emqx_connections_count | The current number of connections for the cluster | Gauge |
deployment_emqx_messages_rate | The rate of messages sent and received per second | Gauge |
deployment_emqx_messages_send_rate | The rate of messages sent per second | Gauge |
deployment_emqx_messages_receive_rate | The rate of messages received per second | Gauge |
deployment_emqx_subscriptions_count | The count of subscriptions for the cluster | Gauge |
deployment_emqx_metrics_live_connections_count | The current number of live connections for the current cluster | Gauge |
deployment_emqx_metrics_live_connections_max | The maximum number of live connections for the current cluster | Gauge |
deployment_emqx_metrics_modules_count | The current number of modules for the current cluster | Gauge |
deployment_emqx_metrics_modules_max | The maximum number of modules for the current cluster | Gauge |
deployment_emqx_metrics_users_count | The current number of usernames for the current cluster | Gauge |
deployment_emqx_metrics_connections_count | Current number of connections | Gauge |
deployment_emqx_metrics_connections_max | Maximum number of connections | Gauge |
deployment_emqx_metrics_channels_count | Current number of channels | Gauge |
deployment_emqx_metrics_channels_max | Maximum number of channels | Gauge |
deployment_emqx_metrics_sessions_count | Current number of sessions | Gauge |
deployment_emqx_metrics_sessions_max | Maximum number of sessions | Gauge |
deployment_emqx_metrics_topics_count | Current number of topics | Gauge |
deployment_emqx_metrics_topics_max | Maximum number of topics | Gauge |
deployment_emqx_metrics_suboptions_count | Current number of subscription options | Gauge |
deployment_emqx_metrics_suboptions_max | Maximum number of subscription options | Gauge |
deployment_emqx_metrics_subscribers_count | Current number of subscribers | Gauge |
deployment_emqx_metrics_subscribers_max | Maximum number of subscribers | Gauge |
deployment_emqx_metrics_subscriptions_count | Current number of subscriptions | Gauge |
deployment_emqx_metrics_subscriptions_max | Maximum number of subscriptions | Gauge |
deployment_emqx_metrics_subscriptions_shared_count | Current number of shared subscriptions | Gauge |
deployment_emqx_metrics_subscriptions_shared_max | Maximum number of shared subscriptions | Gauge |
deployment_emqx_metrics_routes_count | Current number of routes | Gauge |
deployment_emqx_metrics_routes_max | Maximum number of routes | Gauge |
deployment_emqx_metrics_retained_count | Current number of retained messages | Gauge |
deployment_emqx_metrics_retained_max | Historical maximum number of retained messages | Gauge |
deployment_emqx_metrics_client_authenticate_success | client.authenticate hook trigger times with success | Counter |
deployment_emqx_metrics_bytes_received | Number of received bytes | Counter |
deployment_emqx_metrics_bytes_sent | Number of sent bytes | Counter |
deployment_emqx_metrics_packets_received | Number of received packets | Counter |
deployment_emqx_metrics_packets_sent | Number of sent packets | Counter |
deployment_emqx_metrics_packets_connect_received | Number of received CONNECT packets | Counter |
deployment_emqx_metrics_packets_connack_auth_error | Number of sent CONNACK messages with auth error | Counter |
deployment_emqx_metrics_packets_connack_error | Number of sent CONNACK packets with error | Counter |
deployment_emqx_metrics_packets_connack_sent | Number of sent CONNACK packets | Counter |
deployment_emqx_metrics_packets_publish_received | Number of received PUBLISH packets | Counter |
deployment_emqx_metrics_packets_publish_sent | Number of sent PUBLISH packets | Counter |
deployment_emqx_metrics_packets_publish_inuse | Number of PUBLISH packets in use | Counter |
deployment_emqx_metrics_packets_publish_auth_error | Number of PUBLISH packets with auth error | Counter |
deployment_emqx_metrics_packets_publish_error | Number of PUBLISH packets with error | Counter |
deployment_emqx_metrics_packets_publish_dropped | Number of dropped PUBLISH packets | Counter |
deployment_emqx_metrics_packets_puback_received | Number of received PUBACK packets | Counter |
deployment_emqx_metrics_packets_puback_sent | Number of sent PUBACK packets | Counter |
deployment_emqx_metrics_packets_puback_inuse | Number of PUBACK packets in use | Counter |
deployment_emqx_metrics_packets_puback_missed | Number of missed PUBACK packets | Counter |
deployment_emqx_metrics_packets_pubrec_received | Number of received PUBREC packets | Counter |
deployment_emqx_metrics_packets_pubrec_sent | Number of sent PUBREC packets | Counter |
deployment_emqx_metrics_packets_pubrec_inuse | Number of PUBREC packets in use | Counter |
deployment_emqx_metrics_packets_pubrec_missed | Number of missed PUBREC packets | Counter |
deployment_emqx_metrics_packets_pubrel_received | Number of received PUBREL packets | Counter |
deployment_emqx_metrics_packets_pubrel_sent | Number of sent PUBREL packets | Counter |
deployment_emqx_metrics_packets_pubrel_missed | Number of missed PUBREL packets | Counter |
deployment_emqx_metrics_packets_pubcomp_received | Number of received PUBCOMP packets | Counter |
deployment_emqx_metrics_packets_pubcomp_sent | Number of sent PUBCOMP packets | Counter |
deployment_emqx_metrics_packets_pubcomp_inuse | Number of PUBCOMP packets in use | Counter |
deployment_emqx_metrics_packets_pubcomp_missed | Number of missed PUBCOMP packets | Counter |
deployment_emqx_metrics_packets_subscribe_received | Number of received SUBSCRIBE packets | Counter |
deployment_emqx_metrics_packets_subscribe_error | Number of SUBSCRIBE packets with error | Counter |
deployment_emqx_metrics_packets_subscribe_auth_error | Number of SUBSCRIBE packets with auth error | Counter |
deployment_emqx_metrics_packets_suback_sent | Number of sent SUBACK packets | Counter |
deployment_emqx_metrics_packets_unsubscribe_received | Number of received UNSUBSCRIBE packets | Counter |
deployment_emqx_metrics_packets_unsubscribe_error | Number of UNSUBSCRIBE packets with error | Counter |
deployment_emqx_metrics_packets_unsuback_sent | Number of sent UNSUBACK packets | Counter |
deployment_emqx_metrics_packets_pingreq_received | Number of received PINGREQ packets | Counter |
deployment_emqx_metrics_packets_pingresp_sent | Number of sent PINGRESP packets | Counter |
deployment_emqx_metrics_packets_disconnect_received | Number of received DISCONNECT packets | Counter |
deployment_emqx_metrics_packets_disconnect_sent | Number of sent DISCONNECT packets | Counter |
deployment_emqx_metrics_packets_auth_received | Number of received AUTH packets | Counter |
deployment_emqx_metrics_packets_auth_sent | Number of sent AUTH packets | Counter |
deployment_emqx_metrics_delivery_dropped_too_large | Number of dropped too large deliveries | Counter |
deployment_emqx_metrics_delivery_dropped_queue_full | Number of dropped deliveries due to full queue | Counter |
deployment_emqx_metrics_delivery_dropped_qos0_msg | Number of dropped QoS 0 messages | Counter |
deployment_emqx_metrics_delivery_dropped_expired | Number of expired message deliveries | Counter |
deployment_emqx_metrics_delivery_dropped_no_local | Number of deliveries dropped due to the No Local subscription option | Counter |
deployment_emqx_metrics_delivery_dropped | Total number of dropped deliveries | Counter |
deployment_emqx_metrics_messages_delayed | Number of delayed messages | Counter |
deployment_emqx_metrics_messages_delivered | Number of delivered messages | Counter |
deployment_emqx_metrics_messages_dropped | Number of dropped messages | Counter |
deployment_emqx_metrics_messages_dropped_no_subscribers | Number of messages dropped due to no subscribers | Counter |
deployment_emqx_metrics_messages_dropped_await_pubrel_timeout | Number of messages dropped awaiting PUBREL timeout | Counter |
deployment_emqx_metrics_messages_forward | Number of forwarded messages | Counter |
deployment_emqx_metrics_messages_publish | Number of published messages | Counter |
deployment_emqx_metrics_messages_qos0_received | Number of QoS 0 messages received | Counter |
deployment_emqx_metrics_messages_qos2_received | Number of QoS 2 messages received | Counter |
deployment_emqx_metrics_messages_qos1_received | Number of QoS 1 messages received | Counter |
deployment_emqx_metrics_messages_qos0_sent | Number of QoS 0 messages sent | Counter |
deployment_emqx_metrics_messages_qos1_sent | Number of QoS 1 messages sent | Counter |
deployment_emqx_metrics_messages_qos2_sent | Number of QoS 2 messages sent | Counter |
deployment_emqx_metrics_messages_received | Total number of received messages | Counter |
deployment_emqx_metrics_messages_sent | Total number of sent messages | Counter |
deployment_emqx_metrics_messages_retained | Number of retained messages | Counter |
deployment_emqx_metrics_messages_acked | Number of acknowledged messages | Counter |
deployment_emqx_metrics_client_connect | Number of client connections | Counter |
deployment_emqx_metrics_client_authenticate | Number of client authentications | Counter |
deployment_emqx_metrics_client_connack | Number of CONNACK packets sent | Counter |
deployment_emqx_metrics_client_connected | Number of successful client connections | Counter |
deployment_emqx_metrics_client_disconnected | Number of client disconnections | Counter |
deployment_emqx_metrics_client_check_acl | Number of ACL checks performed | Counter |
deployment_emqx_metrics_client_subscribe | Number of client subscriptions | Counter |
deployment_emqx_metrics_client_unsubscribe | Number of client unsubscriptions | Counter |
deployment_emqx_metrics_client_auth_success | Number of successful client authentications | Counter |
deployment_emqx_metrics_client_auth_success_anonymous | Number of successful anonymous client authentications | Counter |
deployment_emqx_metrics_client_auth_failure | Number of failed client authentications | Counter |
deployment_emqx_metrics_client_acl_allow | Number of ACL allow decisions | Counter |
deployment_emqx_metrics_client_acl_deny | Number of ACL deny decisions | Counter |
deployment_emqx_metrics_client_acl_cache_hit | Number of ACL cache hits | Counter |
deployment_emqx_metrics_session_created | Number of sessions created | Counter |
deployment_emqx_metrics_session_discarded | Number of discarded sessions | Counter |
deployment_emqx_metrics_session_resumed | Number of resumed sessions | Counter |
deployment_emqx_metrics_session_takeovered | Number of sessions taken over | Counter |
deployment_emqx_metrics_session_terminated | Number of terminated sessions | Counter |
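To inspect these metrics outside Prometheus, the response can be parsed directly. The sketch below assumes the endpoint returns the Prometheus text exposition format (one `name value` sample per line, plus `#` comment lines), which is what the `type=prometheus` parameter in the scrape configuration suggests. `parse_samples` is a hypothetical helper, and the sample payload and its values are invented for illustration.

```python
from typing import Dict


def parse_samples(text: str) -> Dict[str, float]:
    """Parse unlabelled samples from Prometheus text exposition format."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Labelled samples (name{...} value) are out of scope for this sketch.
        if "{" in line:
            continue
        parts = line.split()
        if len(parts) >= 2:
            samples[parts[0]] = float(parts[1])
    return samples


# Invented payload; real values come from GET /deployment_metrics.
SAMPLE = """\
# TYPE deployment_emqx_connections_count gauge
deployment_emqx_connections_count 42
# TYPE deployment_emqx_metrics_messages_received counter
deployment_emqx_metrics_messages_received 1000
"""

metrics = parse_samples(SAMPLE)
print(metrics["deployment_emqx_connections_count"])  # 42.0
```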
Data Integration Metrics URI
GET /deployment_metrics/data_integration
Returns data integration metrics for Prometheus collection.
Query Parameters:
None
Request Message
None
Request Example
curl -u app_id:app_secret -X GET {api}/deployment_metrics/data_integration
Response
- Each resource status
- Each rule status
- The matching rate of each rule
- The count of different match statuses for each rule ("matched", "passed", "failed", "exception", "no_result")
- The count of all action processing statuses under each rule ("success", "failed", "taken")
Metrics | Description | Type | Variable Labels |
---|---|---|---|
deployment_emqx_resource_status | The current status of a specific resource. | Gauge | resource_id |
deployment_emqx_rule_status | The current status of a specific rule. | Gauge | rule_id |
deployment_emqx_rule_matched_rate | The matching rate of a specific rule. | Gauge | rule_id |
deployment_emqx_rule_matched_count | The total matches for a specific rule. | Gauge | rule_id, match_status |
deployment_emqx_rule_action_execution_count | The execution count for actions of a specific rule. | Gauge | rule_id, execution_status |
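The variable labels make it possible to break a metric down per rule. As a rough illustration, the sketch below groups `deployment_emqx_rule_matched_count` samples by `rule_id` and `match_status`. It assumes the response uses the Prometheus text exposition format with labels written as `name{label="value"} value`. The helper name, the rule ID `rule-a`, and all sample values are invented.

```python
import re
from collections import defaultdict
from typing import Dict

# name{label="value",...} value
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def matched_counts_by_rule(text: str) -> Dict[str, Dict[str, float]]:
    """Group deployment_emqx_rule_matched_count samples by rule_id, then match_status."""
    result: Dict[str, Dict[str, float]] = defaultdict(dict)
    for line in text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if not match or match.group("name") != "deployment_emqx_rule_matched_count":
            continue
        labels = dict(LABEL.findall(match.group("labels")))
        if "rule_id" not in labels or "match_status" not in labels:
            continue
        result[labels["rule_id"]][labels["match_status"]] = float(match.group("value"))
    return dict(result)


# Invented payload; real values come from GET /deployment_metrics/data_integration.
SAMPLE = """\
deployment_emqx_rule_status{rule_id="rule-a"} 1
deployment_emqx_rule_matched_count{rule_id="rule-a",match_status="matched"} 10
deployment_emqx_rule_matched_count{rule_id="rule-a",match_status="failed"} 2
"""

print(matched_counts_by_rule(SAMPLE))
```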
Prometheus Configuration
Install Prometheus
```bash
wget -c https://github.com/prometheus/prometheus/releases/download/v2.35.0-rc0/prometheus-2.35.0-rc0.linux-amd64.tar.gz
tar xvfz prometheus-*.tar.gz
```
Modify configuration file
Go to the monitoring directory specified for your Prometheus service and modify the scrape_configs section of the configuration file prometheus.yml as shown in the example below.
```yaml
scrape_configs:
  - job_name: 'emqx_cloud_metrics_deployment'
    scheme: 'https'
    static_configs:
      - targets: [ 'xxxx:8443' ]
    metrics_path: "/api/deployment_metrics"
    params:
      type: [ "prometheus" ]
    basic_auth:
      username: 'APP ID'
      password: 'APP Secret'
  - job_name: 'emqx_cloud_metrics_data_integration'
    scheme: 'https'
    static_configs:
      - targets: [ 'xxxx:8443' ]
    metrics_path: "/api/deployment_metrics/data_integration"
    params:
      type: [ "prometheus" ]
    basic_auth:
      username: 'APP ID'
      password: 'APP Secret'
```
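For each job, Prometheus combines `scheme`, the target, `metrics_path`, and `params` into the URL it scrapes. The sketch below reproduces that composition so you can see which address a job will request. `scrape_url` is a hypothetical helper, and `xxxx:8443` remains the placeholder from the example configuration.

```python
from urllib.parse import urlencode


def scrape_url(scheme: str, target: str, metrics_path: str, params: dict) -> str:
    """Compose the URL a scrape job requests: scheme://target/metrics_path?params."""
    # doseq=True expands list values such as ["prometheus"] into type=prometheus.
    query = urlencode(params, doseq=True)
    url = f"{scheme}://{target}{metrics_path}"
    return f"{url}?{query}" if query else url


# Values copied from the first job in the example configuration.
print(scrape_url("https", "xxxx:8443", "/api/deployment_metrics", {"type": ["prometheus"]}))
# https://xxxx:8443/api/deployment_metrics?type=prometheus
```

Requesting the resulting URL with `curl -u` and your APP ID and APP Secret is a quick way to check the endpoint before starting Prometheus.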
Launch and check service status
Launch Prometheus
```bash
./prometheus --config.file=prometheus.yml
```
Open your Prometheus service at its host IP and port, e.g. x.x.x.x:9090, and check Status > Targets to confirm that the new scrape_configs entries have been loaded. If a target shows an error, check the configuration file and restart the Prometheus service.
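The target status can also be checked from a script through the Prometheus HTTP API endpoint `/api/v1/targets`. The sketch below lists every active target whose health is not `up`. The helper names are hypothetical, and the Prometheus address passed to `fetch_targets` is whatever host and port your server listens on.

```python
import json
import urllib.request
from typing import Dict, List


def unhealthy_targets(payload: dict) -> List[Dict[str, str]]:
    """Return job, scrape URL and last error for every active target that is not up."""
    problems = []
    for target in payload.get("data", {}).get("activeTargets", []):
        if target.get("health") != "up":
            problems.append({
                "job": target.get("labels", {}).get("job", ""),
                "scrapeUrl": target.get("scrapeUrl", ""),
                "lastError": target.get("lastError", ""),
            })
    return problems


def fetch_targets(prometheus: str) -> dict:
    """Query the targets endpoint of a Prometheus server, e.g. http://x.x.x.x:9090."""
    url = prometheus.rstrip("/") + "/api/v1/targets"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

An empty list from `unhealthy_targets(fetch_targets("http://x.x.x.x:9090"))` means both EMQX Cloud jobs are being scraped successfully.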
Grafana Configuration
Install and launch Grafana
```bash
wget https://dl.grafana.com/enterprise/release/grafana-enterprise-8.4.6.linux-amd64.tar.gz
tar -zxvf grafana-enterprise-8.4.6.linux-amd64.tar.gz
# Run the next command from inside the extracted Grafana directory.
./bin/grafana-server
```
Configure Grafana
Open the Grafana dashboard at the host IP and port, e.g. x.x.x.x:3000. The initial username and password are both admin. You can change the password when you log in for the first time.
Import Grafana Data Templates
EMQX Cloud provides template files for Grafana dashboards. These templates display all EMQX Cloud monitoring data, and you can import them directly into Grafana to chart the monitoring status of your deployment.
The template is available at emqx_prometheus/grafana_template/EMQ.json. You can import the EMQ.json file with Upload JSON file, or paste its contents into the Import via panel json field.
Metrics Details
Once the system has been running for a while, Grafana displays the data that Prometheus collects from EMQX Cloud, including historical statistics for the number of clients, subscriptions, topics, and messages, along with other business information. You can view the chart for each metric and inspect detailed values at a specific point in time.
Prometheus tracks the following metrics data for your EMQX Cloud deployment:
Metrics | Type | Description |
---|---|---|
emqx_vm_used_memory | gauge | Memory used by the VM |
emqx_vm_total_memory | gauge | Total memory allocated by the VM |
emqx_vm_run_queue | gauge | Run queue size |
emqx_vm_process_messages_in_queues | gauge | Total number of messages waiting in process queues |
emqx_vm_cpu_use | gauge | CPU usage |
emqx_vm_cpu_idle | gauge | CPU idle rate |
emqx_topics_max | gauge | Maximum number of topics in history |
emqx_topics_count | gauge | Number of current topics |
emqx_subscriptions_shared_max | gauge | Historical maximum number of shared subscriptions |
emqx_subscriptions_shared_count | gauge | Current number of shared subscriptions |
emqx_subscriptions_max | gauge | Historical maximum number of subscription relationships |
emqx_subscriptions_count | gauge | Current number of subscription relationships |
emqx_subscribers_max | gauge | Historical maximum number of subscribers |
emqx_subscribers_count | gauge | Current number of subscribers |
emqx_suboptions_max | gauge | Historical maximum number of subscription configuration items |
emqx_suboptions_count | gauge | Current number of subscription configuration items |
emqx_sessions_max | gauge | Historical maximum number of sessions |
emqx_sessions_count | gauge | Number of current sessions |
emqx_session_terminated | counter | Sessions terminated |
emqx_session_takeovered | counter | Sessions taken over |
emqx_session_resumed | counter | Sessions resumed |
emqx_session_discarded | counter | Sessions discarded |
emqx_session_created | counter | Sessions created |
emqx_routes_max | gauge | Historical maximum number of routes |
emqx_routes_count | gauge | Current number of routes |
emqx_retained_max | gauge | Historical maximum number of retained messages |
emqx_retained_count | gauge | Current number of retained messages |
emqx_packets_unsubscribe_received | counter | Number of UNSUBSCRIBE packets received |
emqx_packets_unsubscribe_error | counter | Number of UNSUBSCRIBE packets rejected |
emqx_packets_unsuback_sent | counter | Number of UNSUBACK packets sent |
emqx_packets_subscribe_received | counter | Number of SUBSCRIBE packets received |
emqx_packets_subscribe_error | counter | Number of SUBSCRIBE packets rejected |
emqx_packets_subscribe_auth_error | counter | Number of SUBSCRIBE packets rejected (ACL check failed) |
emqx_packets_suback_sent | counter | Number of SUBACK packets sent |
emqx_packets_sent | counter | Number of packets sent |
emqx_packets_received | counter | Number of packets received |
emqx_packets_pubrel_sent | counter | Number of PUBREL packets sent |
emqx_packets_pubrel_received | counter | Number of PUBREL packets received |
emqx_packets_pubrel_missed | counter | Number of PUBREL packets rejected (PacketId not found) |
emqx_packets_pubrec_sent | counter | Number of PUBREC packets sent |
emqx_packets_pubrec_received | counter | Number of PUBREC packets received |
emqx_packets_pubrec_missed | counter | Number of PUBREC packets rejected (PacketId not found) |
emqx_packets_pubrec_inuse | counter | Number of PUBREC packets rejected (PacketId occupied) |
emqx_packets_publish_sent | counter | Number of PUBLISH packets sent |
emqx_packets_publish_received | counter | Number of PUBLISH packets received |
emqx_packets_publish_inuse | counter | Number of PUBLISH packets rejected (PacketId occupied) |
emqx_packets_publish_error | counter | Number of PUBLISH packets with errors |
emqx_packets_publish_dropped | counter | Number of PUBLISH packets discarded |
emqx_packets_publish_auth_error | counter | Number of PUBLISH packets rejected (ACL check failed) |
emqx_packets_pubcomp_sent | counter | Number of PUBCOMP packets sent |
emqx_packets_pubcomp_received | counter | Number of PUBCOMP packets received |
emqx_packets_pubcomp_missed | counter | Number of PUBCOMP packets rejected (PacketId not found) |
emqx_packets_pubcomp_inuse | counter | Number of PUBCOMP packets rejected (PacketId occupied) |
emqx_packets_puback_sent | counter | Number of PUBACK packets sent |
emqx_packets_puback_received | counter | Number of PUBACK packets received |
emqx_packets_puback_missed | counter | Number of PUBACK packets rejected (PacketId not found) |
emqx_packets_puback_inuse | counter | Number of PUBACK packets rejected (PacketId occupied) |
emqx_packets_pingresp_sent | counter | Number of PINGRESP packets sent |
emqx_packets_pingreq_received | counter | Number of PINGREQ packets received |
emqx_packets_disconnect_sent | counter | Number of DISCONNECT packets sent |
emqx_packets_disconnect_received | counter | Number of DISCONNECT packets received |
emqx_packets_connect | counter | Number of CONNECT packets received |
emqx_packets_connack_sent | counter | Number of CONNACK packets sent |
emqx_packets_connack_error | counter | Number of CONNACK packets sent that indicate a connection failure |
emqx_packets_connack_auth_error | counter | Number of CONNACK packets sent that indicate an authentication failure |
emqx_packets_auth_sent | counter | Number of AUTH packets sent |
emqx_packets_auth_received | counter | Number of AUTH packets received |
emqx_messages_sent | counter | Total number of messages sent |
emqx_messages_retained | counter | Total number of messages stored as retained messages |
emqx_messages_received | counter | Total number of messages received |
emqx_messages_qos2_sent | counter | Total number of QoS2 messages sent |
emqx_messages_qos2_received | counter | Total number of QoS2 messages received |
emqx_messages_qos1_sent | counter | Total number of QoS1 messages sent |
emqx_messages_qos1_received | counter | Total number of QoS1 messages received |
emqx_messages_qos0_sent | counter | Total number of QoS0 messages sent |
emqx_messages_qos0_received | counter | Total number of QoS0 messages received |
emqx_messages_publish | counter | Total number of messages published |
emqx_messages_forward | counter | Total number of messages forwarded across nodes |
emqx_messages_dropped_no_subscribers | counter | Total number of messages discarded because there were no subscribers |
emqx_messages_dropped_expired | counter | Total number of expired messages discarded |
emqx_messages_dropped | counter | Total number of messages discarded |
emqx_messages_delivered | counter | Total number of messages delivered |
emqx_messages_delayed | counter | Total number of messages stored for delayed publishing |
emqx_messages_acked | counter | Total number of messages acknowledged |
emqx_delivery_dropped_too_large | counter | Number of deliveries discarded because the message was too large |
emqx_delivery_dropped_queue_full | counter | Number of deliveries discarded because the message queue was full |
emqx_delivery_dropped_qos0_msg | counter | Number of QoS 0 messages discarded on delivery |
emqx_delivery_dropped_no_local | counter | Number of deliveries discarded due to no_local |
emqx_delivery_dropped_expired | counter | Number of deliveries discarded because the message expired |
emqx_delivery_dropped | counter | Total number of deliveries discarded |
emqx_connections_max | gauge | Historical maximum number of connections |
emqx_connections_count | gauge | Current number of connections |
emqx_cluster_nodes_stopped | gauge | Number of stopped nodes in the cluster |
emqx_cluster_nodes_running | gauge | Number of running nodes in the cluster |
emqx_client_unsubscribe | counter | Number of client unsubscriptions |
emqx_client_subscribe | counter | Number of client subscriptions |
emqx_client_disconnected | counter | Number of client disconnections |
emqx_client_connected | counter | Number of successful client connections |
emqx_client_check_acl | counter | Number of ACL checks initiated by clients |
emqx_client_authenticate | counter | Number of client authentication attempts |
emqx_client_auth_anonymous | counter | Number of anonymous client logins |
emqx_bytes_sent | counter | Total number of bytes sent |
emqx_bytes_received | counter | Total bytes received |
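Counter metrics only ever increase (apart from resets, for example after a restart), so they are usually read as a rate of change rather than as raw values. In Prometheus this is typically done with a query such as `rate(emqx_messages_received[5m])`. The sketch below shows a simplified version of the idea for two samples. It is not Prometheus's exact algorithm, the helper name is hypothetical, and the sample values are invented.

```python
def per_second_rate(prev_value: float, prev_ts: float,
                    curr_value: float, curr_ts: float) -> float:
    """Average per-second increase of a counter between two samples.

    If the counter went backwards, assume it was reset to zero in between
    and count only the increase since the reset.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    increase = curr_value - prev_value if curr_value >= prev_value else curr_value
    return increase / elapsed


# Invented samples of emqx_messages_received taken 15 seconds apart.
print(per_second_rate(1000, 0, 1300, 15))  # 20.0
```

Gauge metrics such as emqx_connections_count need no such conversion, because each sample is already the current value.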