The most scalable open-source MQTT broker for IoT, IIoT, and connected vehicles
#12983 Added a new rule engine event, $events/client_check_authn_complete, for the authentication completion event.
#13180 Improved client message handling performance when EMQX is running on Erlang/OTP 26 and increased message throughput by 10% in fan-in mode.
#13191 Upgraded EMQX Docker images to run on Erlang/OTP 26.
EMQX had been running on Erlang/OTP 26 since v5.5 except for docker images which were on Erlang/OTP 25. Now all releases are on Erlang/OTP 26.
#13242 Significantly increased the startup speed of EMQX dashboard listener.
#13156 Resolved an issue where the Dashboard Monitoring pages would crash following the update to EMQX v5.7.0.
#13164 Fixed HTTP authorization request body encoding.
Before this fix, the HTTP authorization request body encoding format was taken from the accept header. The fix is to respect the content-type header instead. Also added the access templating variable for v4 compatibility. The access code of the SUBSCRIBE action is 1 and that of the PUBLISH action is 2.
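As an illustration of the access variable from #13164, the following Python sketch shows how an HTTP authorization service might decode it. Only the codes 1 (SUBSCRIBE) and 2 (PUBLISH) come from the note above; the function name and the handling of any other value are assumptions made for this example.

```python
# Access codes as stated in the release note; everything else is hypothetical.
ACCESS_CODES = {1: "subscribe", 2: "publish"}


def decode_access(code):
    """Map a v4-style access code (int or numeric string) to an action name."""
    try:
        return ACCESS_CODES.get(int(code), "unknown")
    except (TypeError, ValueError):
        # Non-numeric input is treated as unknown rather than raising.
        return "unknown"
```

A service receiving the rendered template value as a string can pass it straight to decode_access.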
#13238 Improved the logged error messages when an HTTP authorization request with an unsupported content-type header is returned.
#13258 Fixed an issue where the MQTT-SN gateway would not restart correctly due to incorrect startup order of gateway dependencies.
#13273 Fixed and improved handling of URIs in several configurations. The fix includes the following improvement details:
URIs such as https://example.com?q=x were mistakenly rejected. These URIs are now properly recognized as valid.
#13276 Fixed an issue in the durable message storage mechanism where parts of the internal storage state were not correctly persisted during the setup of new storage generations. The concept of "generation" is used internally and is crucial for managing message expiration and cleanup. This issue could have manifested as messages being lost after a restart of EMQX.
#13291 Fixed an issue where durable storage sites that were down were reported as up.
#13290 Fixed an issue where the command $ bin/emqx ctl rules show rule_0hyd would produce no output when used to display rules with a data integration action attached.
#13293 Improved the restoration process from data backups by automating the re-indexing of imported retained messages. Previously, re-indexing required manual intervention using the emqx ctl retainer reindex start CLI command after importing a data backup file.
This fix also extended the functionality to allow exporting retained messages to a backup file when retainer.backend.storage_type is configured as ram. Previously, only setups with disc as the storage type supported exporting retained messages.
#13140 Fixed an issue that caused text traces for the republish action to crash and not display correctly.
#13148 Fixed an issue where a 500 HTTP status code could be returned by /connectors/:connector-id/start when there is a timeout waiting for the resource to be connected.
#13181 EMQX now forcefully shuts down the connector process when an attempt to stop a connector times out. This fix also improved the clarity of error messages when disabling an action or source fails due to an unresponsive underlying connector.
#13216 Respect the clientid_prefix config for MQTT bridges. Since EMQX v5.4.1, MQTT client IDs are restricted to a maximum of 23 bytes. Previously, the system factored the clientid_prefix into the hash of the original, longer client ID, affecting the final shortened ID. The fix includes the following change details:
disconnect_after_expire option. When enabled, the client will be disconnected after the JWT token expires.
Note: This is a breaking change. This option is enabled by default, so the default behavior is changed. Previously, clients with valid JWTs could connect to the broker and stay connected even after the JWT token expired. Now, the client will be disconnected after the JWT token expires. To preserve the previous behavior, set disconnect_after_expire to false.
unescape function has been added to the rule engine SQL language to handle the expansion of escape sequences in strings. It was added because string literals in the SQL language don't support any escape codes (e.g., \n and \t). This enhancement allows for more flexible string manipulation within SQL expressions.
#12872 Implemented the Client Attributes feature. It allows setting additional properties for each client using key-value pairs. Property values can be generated from MQTT client connection information (such as username, client ID, TLS certificate) or set from data accompanying successful authentication returns. Properties can be used in EMQX for authentication, authorization, data integration, and MQTT extension functions. Compared to using static properties like client ID directly, client properties offer greater flexibility in various business scenarios and simplify development.
Initialization of client_attrs
The client_attrs fields can be initially populated from one of the following clientinfo fields:
cn: The common name from the TLS client's certificate.
dn: The distinguished name from the TLS client's certificate, that is, the certificate "Subject".
clientid: The MQTT client ID provided by the client.
username: The username provided by the client.
user_property: Extract a property value from 'User-Property' of the MQTT CONNECT packet.
Extension through Authentication Responses
Additional attributes may be merged into client_attrs from authentication responses. Supported authentication backends include:
client_attrs field.
client_attrs claim within the JWT.
Usage in Authentication and Authorization
If client_attrs is initialized before authentication, it can be used in external authentication requests. For instance, ${client_attrs.property1} can be used within request templates directed at an HTTP server for authenticity validation.
client_attrs can be utilized in authorization configurations or request templates, enhancing flexibility and control. Examples include:
In acl.conf, use {allow, all, all, ["${client_attrs.namespace}/#"]} to apply permissions based on the namespace attribute.
${client_attrs.namespace} can be used within request templates to dynamically include client attributes.
#12910 Added plugin configuration management and schema validation. For EMQX enterprise edition, one can also annotate the schema with metadata to facilitate UI rendering in the Dashboard. See more details in the plugin template and plugin documentation.
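To make the ${client_attrs.NAME} placeholders from #12872 concrete, here is a minimal Python sketch of how such a template could be expanded. It is not EMQX's implementation: the render function, the allowed attribute-name characters, and the choice to leave unknown placeholders untouched are all assumptions for illustration.

```python
import re

# Hypothetical pattern: attribute names limited to letters, digits and underscores.
_PLACEHOLDER = re.compile(r"\$\{client_attrs\.([A-Za-z0-9_]+)\}")


def render(template, client_attrs):
    """Expand ${client_attrs.NAME} placeholders; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return str(client_attrs.get(name, match.group(0)))

    return _PLACEHOLDER.sub(substitute, template)


# A topic filter like the one in the acl.conf example above.
topic_filter = render("${client_attrs.namespace}/#", {"namespace": "tenant-a"})
```

With the attributes shown, topic_filter becomes tenant-a/#, which is the per-namespace permission the acl.conf rule expresses.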
#12923 Provided a more specific error when importing data in the wrong format into the built-in authentication database.
#12940 Added the ignore_readonly argument to the PUT /configs API.
Before this change, EMQX would return 400 (BAD_REQUEST) if the raw config included read-only root keys (cluster, rpc, and node).
After this enhancement, the API can be called as PUT /configs?ignore_readonly=true. In this case, EMQX ignores the read-only root config keys and applies the rest. For observability purposes, an info level message is logged if any read-only keys are dropped.
Also fixed an exception raised when the config has bad HOCON syntax (previously returning 500). Bad syntax now causes the API to return 400 (BAD_REQUEST).
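The effect of ignore_readonly=true in #12940 can be pictured with a small Python sketch. It works on an already-parsed config represented as a dict, which is an assumption; the function name is hypothetical, and only the three read-only root keys come from the note above.

```python
# Read-only root keys named in the release note.
READONLY_ROOTS = ("cluster", "rpc", "node")


def split_readonly(config):
    """Return (applicable, dropped): the config without read-only roots,
    and the sorted list of read-only root keys that were removed."""
    applicable = {k: v for k, v in config.items() if k not in READONLY_ROOTS}
    dropped = sorted(k for k in config if k in READONLY_ROOTS)
    return applicable, dropped
```

The dropped list corresponds to what the note says is logged at info level; without ignore_readonly, a non-empty list would instead lead to a 400 response.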
#12957 Started building packages for macOS 14 (Apple Silicon) and Ubuntu 24.04 Noble Numbat (LTS).
#12887 Fixed MQTT enhanced authentication with SASL SCRAM.
#12962 TLS clients can now verify the server hostname against a wildcard certificate. For example, if a certificate is issued for host *.example.com, TLS clients are able to verify server hostnames like srv1.example.com.
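The wildcard check in #12962 can be sketched in Python as follows. This assumes the common convention that a leading * stands for exactly one hostname label; the rules actually applied by EMQX and the underlying TLS library may differ, and the function name is hypothetical.

```python
def wildcard_matches(pattern, hostname):
    """Check a hostname against a certificate name whose leftmost label may be '*'."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        # The wildcard covers one label only, so label counts must agree.
        return False
    if p_labels[0] == "*":
        return bool(h_labels[0]) and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels
```

Under this convention, *.example.com matches srv1.example.com but neither example.com nor a.b.example.com.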
emqx_retainer application. Previously, client disconnection while receiving retained messages could cause a process leak.
#12653 The rule engine function bin2hexstr now supports bitstring inputs with a bit size that is not divisible by 8. Such bitstrings can be returned by the rule engine function subbits.
#12657 The rule engine SQL-based language previously did not allow putting any expressions as array elements in array literals (only constants and variable references were allowed). This has now been fixed so that one can use any expressions as array elements. The following is now permitted, for example:
select
[21 + 21, abs(-abs(-2)), [1 + 1], 4] as my_array
from "t/#"
#12932 Previously, if an HTTP action request received a 503 (Service Unavailable) status, it was marked as a failure and the request was not retried. This has now been fixed so that the request is retried a configurable number of times.
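The retry behavior described in #12932 follows a familiar pattern, sketched here in Python. The send callable, the max_retries parameter name, and the absence of any delay between attempts are assumptions for the example, not details of the EMQX HTTP action.

```python
def send_with_retries(send, max_retries=3):
    """Call send() until it returns a status other than 503 or retries run out.

    send is a hypothetical zero-argument callable returning an HTTP status code.
    Returns (last_status, attempts_made).
    """
    attempts = 0
    while True:
        status = send()
        attempts += 1
        if status != 503 or attempts > max_retries:
            return status, attempts
```

With max_retries=3 the request is attempted at most four times: the initial try plus three retries.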
#12948 Fixed an issue where sensitive HTTP header values like Authorization would be substituted by ****** after updating a connector.
#13118 Fixed a performance issue in the rule engine template rendering.
subscribers.count and subscribers.max now contain shared subscribers. Previously, they contained only non-shared subscribers.
#12812 Made resource health checks non-blocking operations. This means that operations such as updating or removing a resource won't be blocked by a lengthy running health check.
#12830 Made channel (action/source) health checks non-blocking operations. This means that operations such as updating or removing an action/source data integration won't be blocked by a lengthy running health check.
#12993 Fixed listener config update API when handling an unknown zone.
Before this fix, when a listener config was updated with an unknown zone, for example {"zone": "unknown"}, the change would be accepted, causing all clients to crash when connected.
After this fix, updating the listener with an unknown zone name will get a "Bad request" response.
#13012 The MQTT listeners config option access_rules has been improved in the following ways:
#13041 Improved HTTP authentication error log message. If HTTP content-type header is missing for POST method, it now emits a meaningful error message instead of a less readable exception with stack trace.
#13077 This fix makes EMQX only read action configurations from the global configuration when the connector starts or restarts, and instead stores the latest configurations for the actions in the connector. Previously, updates to action configurations would sometimes not take effect without disabling and enabling the action. This means that an action could sometimes run with the old (previous) configuration even though it would look like the action configuration has been updated successfully.
#13090 Attempting to start an action or source whose connector is disabled will no longer attempt to start the connector itself.
#12909 Fixed UDP listener process handling on errors or closure. The fix ensures the UDP listener is cleanly stopped and restarted as needed if these error conditions occur.
#13001 Fixed an issue where the syskeeper forwarder would never reconnect when the connection was lost.
#13010 Fixed the issue where the JT/T 808 gateway could not correctly reply to the REGISTER_ACK message when requesting authentication from the registration service failed.
#12947 For JWT authentication, a new boolean option disconnect_after_expire has been added with the default value set to true. When enabled, the client will be disconnected after the JWT token expires.
Previously, clients with valid JWTs could connect to the broker and stay connected even after the JWT token expired. Now, the client will be disconnected after the JWT token expires. To preserve the previous behavior, set disconnect_after_expire to false.
#12957 Stopped building packages for macOS 12.
#12759 EMQX now automatically removes invalid backup files that fail during upload due to schema validation errors. This fix ensures that only valid configuration files are displayed and stored, enhancing system reliability.
#12766 Renamed the message_queue_too_long error reason to mailbox_overflow. The latter is consistent with the corresponding config parameter: force_shutdown.max_mailbox_size.
#12773 Upgraded HTTP client libraries.
The HTTP client library (gun-1.3) incorrectly appended a :portnumber suffix to the Host header for standard ports (http on port 80, https on port 443). This could cause compatibility issues with servers or gateways performing strict Host header checks (e.g., AWS Lambda, Alibaba Cloud HTTP gateways), leading to errors such as InvalidCustomDomain.NotFound or "The specified CustomDomain does not exist."
#12802 Improved how EMQX handles node removal from clusters via the emqx ctl cluster leave command. Previously, nodes could unintentionally rejoin the same cluster (unless it was stopped) if the configured cluster discovery_strategy was not manual. With the latest update, executing the cluster leave command now automatically disables cluster discovery for the node, preventing it from rejoining. To re-enable cluster discovery, use the emqx ctl discovery enable command or simply restart the node.
#12814 Improved error handling for the /clients/{clientid}/mqueue_messages and /clients/{clientid}/inflight_messages APIs in EMQX. These updates address:
{"code":"INTERNAL_ERROR","message":"timeout"}, and log additional details for troubleshooting.
{"code": "CLIENT_SHUTDOWN", "message": "Client connection has been shutdown"}. This ensures clearer feedback when client connections are interrupted.
#12824 Updated the statistics metrics subscribers.count and subscribers.max to include shared subscribers. Previously, these metrics accounted only for non-shared subscribers.
#12826 Fixed issues related to the import functionality of source data integrations and retained messages in EMQX. Before this update:
sources.mqtt category with specific connectors and parameters such as QoS and topics.
mnesia table for retained messages was not supported.
#12843 Fixed cluster_rpc_commit transaction ID cleanup procedure on replicator nodes after executing the emqx ctl cluster leave command. Previously, failing to properly clear these transaction IDs impeded configuration updates on the core node.
#12885 Fixed an issue in EMQX where users were unable to view "Retained Messages" under the "Monitoring" menu in the Dashboard.
The "Retained messages" backend API uses the qlc library. This problem was due to a permission issue where the qlc library's file_sorter function tried to use a non-writable directory, /opt/emqx, to store temporary files, resulting from recent changes in directory ownership permissions in Docker deployments.
This update modifies the ownership settings of the /opt/emqx directory to emqx:emqx, ensuring that all necessary operations, including retained messages retrieval, can proceed without access errors.
#12882 Fixed an issue with the RocketMQ action in EMQX data integration, ensuring that messages are correctly routed to their configured topics. Previously, when multiple actions shared a single RocketMQ connector, all messages were mistakenly sent to the topic configured for the first action. This fix involves starting a distinct set of RocketMQ workers for each topic, preventing cross-topic message delivery errors.
#12251 Optimized the performance of the RocksDB-based persistent sessions, achieving a reduction in RAM usage and database request frequency. Key improvements include:
#12326 Enhanced session tracking with registration history. EMQX now has the capability to monitor the history of session registrations, including those that have expired. By configuring broker.session_history_retain, EMQX retains records of expired sessions for a specified duration.
Session count API: Use the API GET /api/v5/sessions_count?since=1705682238 to obtain a count of sessions across the cluster that remained active since the given UNIX epoch timestamp (with seconds precision). This enhancement aids in analyzing session activity over time.
Metrics expansion with cluster sessions gauge: A new gauge metric, cluster_sessions, is added to better track the number of sessions within the cluster. This metric is also integrated into Prometheus for easy monitoring:
# TYPE emqx_cluster_sessions_count gauge
emqx_cluster_sessions_count 1234
NOTE: Please consider this metric as an approximate estimation. Due to the asynchronous nature of data collection and calculation, exact precision may vary.
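For the session count API in #12326, the since parameter is a UNIX timestamp in seconds. The Python sketch below builds such a request URL from a datetime; the base URL and function name are assumptions for illustration, and no request is actually sent.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def sessions_count_url(base_url, since):
    """Build the sessions_count URL for a timezone-aware datetime (seconds precision)."""
    ts = int(since.timestamp())
    return f"{base_url}/api/v5/sessions_count?{urlencode({'since': ts})}"


# 2024-01-19 16:37:18 UTC corresponds to the timestamp used in the note above.
url = sessions_count_url(
    "http://localhost:18083",  # hypothetical host and port
    datetime(2024, 1, 19, 16, 37, 18, tzinfo=timezone.utc),
)
```

Passing a timezone-aware datetime avoids depending on the local time zone of the machine that builds the URL.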
#12338 Introduced a time-based garbage collection mechanism to the RocksDB-based persistent session backend. This feature ensures more efficient management of stored messages, optimizing storage utilization and system performance by automatically purging outdated messages.
#12398 Exposed the swagger_support option in the Dashboard configuration, allowing for the enabling or disabling of the Swagger API documentation.
#12467 Started supporting cluster discovery using AAAA DNS record type.
#12483 Renamed emqx ctl conf cluster_sync tnxid ID to emqx ctl conf cluster_sync inspect ID.
For backward compatibility, tnxid is kept, but it is considered deprecated and will be removed in 5.7.
#12499 Enhanced client banning capabilities with extended rules, including:
clientid against a specified regular expression.
username against a specified regular expression.
Important Notice: Implementing a large number of broad matching rules (not specific to an individual clientid, username, or host) may affect system performance. It's advised to use these extended ban rules judiciously to maintain optimal system efficiency.
#12509 Implemented API to re-order all authenticators / authorization sources.
#12517 Configuration files have been upgraded to accommodate multi-line string values, preserving indentation for enhanced readability and maintainability. This improvement utilizes """~ and ~""" markers to quote indented lines, offering a structured and clear way to define complex configurations. For example:
rule_xlu4 {
  sql = """~
    SELECT
      *
    FROM
      "t/#"
  ~"""
}
See HOCON 0.42.0 release notes for details.
#12520 Implemented log throttling. The feature reduces the volume of logged events that could potentially flood the system by dropping all but the first occurrence of an event within a configured time window. Log throttling is applied to the following log events that are critical yet prone to repetition:
authentication_failure
authorization_permission_denied
cannot_publish_to_topic_due_to_not_authorized
cannot_publish_to_topic_due_to_quota_exceeded
connection_rejected_due_to_license_limit_reached
dropped_msg_due_to_mqueue_is_full
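The throttling rule in #12520 (keep the first occurrence of an event per time window, drop the rest) can be sketched in Python as below. This is a simplified model, not EMQX's implementation: the class name, the per-event window bookkeeping, and passing the current time explicitly are assumptions chosen to keep the example deterministic.

```python
class LogThrottle:
    """Allow only the first occurrence of each event within a time window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self._window_start = {}  # event name -> time its current window began

    def should_log(self, event, now):
        """Return True if the event should be logged at time `now` (in seconds)."""
        start = self._window_start.get(event)
        if start is None or now - start >= self.window:
            # First occurrence, or the previous window has elapsed: open a new one.
            self._window_start[event] = now
            return True
        return False
```

Each event name is tracked separately, so a burst of authentication_failure events does not suppress an unrelated authorization_permission_denied event.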
#12561 Implemented HTTP APIs to get the list of client's in-flight and message queue (mqueue) messages. These APIs facilitate detailed insights and effective control over message queues and in-flight messaging, ensuring efficient message handling and monitoring.
To get the first chunk of data:
GET /clients/{clientid}/mqueue_messages?limit=100
GET /clients/{clientid}/inflight_messages?limit=100
Alternatively, for the first chunks without specifying a start position:
GET /clients/{clientid}/mqueue_messages?limit=100&position=none
GET /clients/{clientid}/inflight_messages?limit=100&position=none
To get the next chunk of data:
GET /clients/{clientid}/mqueue_messages?limit=100&position={position}
GET /clients/{clientid}/inflight_messages?limit=100&position={position}
Where {position} is a value (opaque string token) of the meta.position field from the previous response.
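The position-based paging of the mqueue_messages and inflight_messages APIs can be consumed with a loop like the following Python sketch. The fetch_page callable stands in for the HTTP call and is hypothetical, as are the data key in the response and the stop condition (an empty page or a missing position); only limit, position, and meta.position come from the text above.

```python
def iter_messages(fetch_page, limit=100):
    """Yield messages page by page, following the opaque meta.position token.

    fetch_page(limit=..., position=...) is a hypothetical callable returning
    a dict shaped like {"data": [...], "meta": {"position": "..."}}.
    """
    position = "none"  # start without a position, as in the first-chunk examples
    while True:
        page = fetch_page(limit=limit, position=position)
        yield from page["data"]
        position = page.get("meta", {}).get("position")
        if not page["data"] or position is None:
            break
```

The token is passed back unchanged on each request, since the text describes it as opaque.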
Ordering and Prioritization:
#12590 Removed mfa metadata from log messages to improve clarity.
#12641 Improved text log formatter fields order. The new fields order is as follows:
tag > clientid > msg > peername > username > topic > [other fields]
#12670 Added the shared_subscriptions field to the /monitor_current and /monitor_current/nodes/:node endpoints.
#12679 Upgraded docker image base from Debian 11 to Debian 12.
#12700 Started supporting "b" and "B" unit in bytesize HOCON fields. For example, all three fields below will have the value of 1024 bytes:
bytesize_field = "1024b"
bytesize_field2 = "1024B"
bytesize_field3 = 1024
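The three equivalent spellings in #12700 can be modeled with a short Python sketch. It only covers the cases named above (a bare integer and the "b"/"B" suffix); other units accepted by bytesize fields are deliberately not handled, and the function name is hypothetical.

```python
import re


def parse_bytes(value):
    """Parse a byte count given as an int or as a string with an optional b/B suffix."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"\s*(\d+)\s*[bB]?\s*", value)
    if match is None:
        raise ValueError(f"unsupported bytesize value: {value!r}")
    return int(match.group(1))
```

All three example values parse to the same number, 1024.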
#12719 The /clients API has been upgraded to accommodate queries for multiple clientids and usernames simultaneously, offering a more flexible and powerful tool for monitoring client connections. Additionally, this update introduces the capability to customize which client information fields are included in the API response, optimizing for specific monitoring needs.
Examples of Multi-Client/Username Queries:
/clients?clientid=client1&clientid=client2
/clients?username=user11&username=user2
/clients?clientid=client1&clientid=client2&username=user1&username=user2
Examples of Selecting Fields for the Response:
/clients?fields=all (Note: Omitting the fields parameter defaults to returning all fields.)
/clients?fields=clientid,username
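Query strings like the /clients examples above repeat the clientid and username parameters once per value. A minimal Python sketch for building them, with a hypothetical helper name, could look like this:

```python
from urllib.parse import urlencode


def clients_query(clientids=(), usernames=(), fields=()):
    """Build a /clients path with repeated clientid/username parameters."""
    params = [("clientid", c) for c in clientids]
    params += [("username", u) for u in usernames]
    if fields:
        # Field names are sent as one comma-separated value.
        params.append(("fields", ",".join(fields)))
    query = urlencode(params, safe=",")  # keep commas readable in the fields list
    return "/clients" + ("?" + query if query else "")
```

Using a list of pairs rather than a dict is what allows the same parameter name to appear more than once.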
#12381 Added new SQL functions: map_keys(), map_values(), map_to_entries(), join_to_string(), join_to_sql_values_string(), is_null_var(), is_not_null_var().
For more information on the functions and their usage, refer to the Built-in SQL Functions documentation.
#12336 Performance enhancement. Created a dedicated async task handler pool to handle client session cleanup tasks.
#12725 Implemented REST API to list the available source types.
#12746 Added the username log field. If an MQTT client is connected with a non-empty username, the logs and traces will include the username field.
#12785 Added the timestamp_format configuration option to log handlers. This new option allows for the following settings:
auto: Automatically determines the timestamp format based on the log formatter being used. Utilizes the rfc3339 format for text formatters, and the epoch format for JSON formatters.
epoch: Represents timestamps in microseconds precision Unix epoch format.
rfc3339: Uses RFC3339 compliant format for date-time strings. For example, 2024-03-26T11:52:19.777087+00:00.
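The three timestamp_format settings from #12785 can be illustrated with a Python sketch. The function and its formatter argument are hypothetical; the mapping of auto to rfc3339 for text and epoch for JSON, and the microsecond precision of epoch, follow the description above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_timestamp(dt, timestamp_format, formatter="text"):
    """Render a timezone-aware datetime according to a timestamp_format setting."""
    if timestamp_format == "auto":
        timestamp_format = "rfc3339" if formatter == "text" else "epoch"
    if timestamp_format == "epoch":
        # Integer microseconds since the Unix epoch, avoiding float rounding.
        return (dt - EPOCH) // timedelta(microseconds=1)
    if timestamp_format == "rfc3339":
        return dt.isoformat()
    raise ValueError(f"unknown timestamp_format: {timestamp_format!r}")


sample = datetime(2024, 3, 26, 11, 52, 19, 777087, tzinfo=timezone.utc)
text_value = format_timestamp(sample, "auto", formatter="text")
```

For the sample instant, text_value is the same string as the rfc3339 example in the note.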
#11868 Fixed a bug where will messages were not published after session takeover.
#12347 Implemented an update to ensure that messages processed by the Rule SQL for the MQTT egress data bridge are always rendered as valid, even in scenarios where the data is incomplete or lacks certain placeholders defined in the bridge configuration. This adjustment prevents messages from being incorrectly deemed invalid and subsequently discarded by the MQTT egress data bridge, as was the case previously.
When variables in payload and topic templates are undefined, they are now rendered as empty strings instead of the literal undefined string.
#12472 Fixed an issue where certain read operations on the /api/v5/actions/ and /api/v5/sources/ endpoints might result in a 500 error code during rolling upgrades.
#12492 EMQX now returns the Receive-Maximum property in the CONNACK message for MQTT v5 clients, aligning with protocol expectations. The limit for the number of inflight (unacknowledged) messages is the minimum of the client's Receive-Maximum setting and the server's max_inflight configuration. Previously, the determined value was not sent back to the client in the CONNACK message.
#12500 The GET /clients and GET /client/:clientid HTTP APIs have been updated to include disconnected persistent sessions in their responses.
NOTE: A current known issue with these enhanced API responses is that the total client count provided may exceed the actual number of clients due to the inclusion of disconnected sessions.
#12513 Changed the level of several flooding log events from warning to info.
#12530 Improved the error reporting for frame_too_large events and malformed CONNECT packet parsing failures. These updates now provide additional information, aiding in the troubleshooting process.
#12541 Introduced a new configuration validation step for autocluster by DNS records to ensure compatibility between node.name and cluster.discovery_strategy. Specifically, when utilizing the dns strategy with either a or aaaa record types, it is mandatory for all nodes to use a (static) IP address as the host name.
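The kind of check described in #12541 can be approximated in Python with the standard ipaddress module. This sketch assumes node names have the form name@host; the function name is hypothetical and the real validation in EMQX may be stricter or differ in detail.

```python
import ipaddress


def host_is_static_ip(node_name):
    """Return True if the host part of a name@host node name is an IP literal."""
    _, _, host = node_name.partition("@")
    try:
        ipaddress.ip_address(host)  # accepts both IPv4 and IPv6 literals
        return True
    except ValueError:
        return False
```

A node named emqx@192.168.0.10 would pass this check, while emqx@node1.example.com would not.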
#12562 Added a new configuration root: durable_storage. This configuration tree contains the settings related to the new persistent session feature.
#12566 Enhanced the bootstrap file for REST API keys:
Empty lines within the file are now skipped, eliminating the previous behavior of generating an error.
API keys specified in the bootstrap file are assigned the highest precedence. In cases where a new key from the bootstrap file conflicts with an existing key, the older key will be automatically removed to ensure that the bootstrap keys take effect without issue.
#12646 Fixed an issue with the rule engine's date-time string parser. Previously, time zone adjustments were only effective for date-time strings specified with second-level precision.
#12652 Fixed a discrepancy where the subbits functions with 4 and 5 parameters, despite being documented, were missing from the actual implementation. These functions have now been added.
#12663 Fixed an issue where the emqx_vm_cpu_use and emqx_vm_cpu_idle metrics, accessible via the Prometheus endpoint /prometheus/stats, were inaccurately reflecting the average CPU usage since the operating system boot. These metrics now accurately represent the current CPU usage and idle time, providing more relevant and timely data for monitoring purposes.
#12668 Refactored the SQL function date_to_unix_ts() by using calendar:datetime_to_gregorian_seconds/1.
This change also added validation for the input date format.
#12672 Changed the process for generating the node boot configuration by incorporating the loading of {data_dir}/configs/cluster.hocon. Previously, changes to logging configurations made via the Dashboard and saved in {data_dir}/configs/cluster.hocon were only applied after the initial boot configuration was generated using etc/emqx.conf, leading to potential loss of some log segment files due to late reconfiguration.
Now, both {data_dir}/configs/cluster.hocon and etc/emqx.conf are loaded concurrently, with settings from emqx.conf taking precedence, to create the boot configuration.
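The precedence rule in #12672 (values from emqx.conf win over cluster.hocon) amounts to a deep merge. The Python sketch below models both files as already-parsed dicts, which is an assumption, and the function names are hypothetical.

```python
def deep_merge(base, override):
    """Recursively merge two dicts; values from `override` win on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def boot_config(cluster_hocon, emqx_conf):
    """Combine both sources, with emqx.conf settings taking precedence."""
    return deep_merge(cluster_hocon, emqx_conf)
```

Log settings saved from the Dashboard therefore reach the boot configuration unless emqx.conf sets the same key.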
#12696 Fixed an issue where attempting to reconnect an action or source could lead to wrong error messages being returned in the HTTP API.
#12714 Fixed inaccuracies in several metrics reported by the /prometheus/stats endpoint of the Prometheus API. The correction applies to the following metrics:
emqx_cluster_sessions_count
emqx_cluster_sessions_max
emqx_cluster_nodes_running
emqx_cluster_nodes_stopped
emqx_subscriptions_shared_count
emqx_subscriptions_shared_max
Additionally, this fix rectified an issue within the /stats endpoint concerning the subscriptions.shared.count and subscriptions.shared.max fields. Previously, these values failed to update promptly following a client's disconnection or unsubscription from a Shared-Subscription.
#12715 Fixed a crash that could occur during configuration updates if the connector for the ingress data integration source had active channels.
#12740 Fixed an issue when durable sessions could not be kicked out.
#12768 Addressed a startup failure issue in EMQX version 5.4.0 and later, particularly noted during rolling upgrades from versions before 5.4.0. The issue was related to the initialization of the routing schema when both v1 and v2 routing tables were empty.
The node now attempts to retrieve the routing schema version in use across the cluster instead of using the v2 routing table by default when local routing tables are found empty at startup. This approach mitigates potential conflicts and reduces the chances of diverging routing storage schemas among cluster nodes, especially in a mixed-version cluster scenario.
If a conflict is detected in a running cluster, EMQX writes instructions on how to manually resolve it in the log as part of the error message with critical severity. The same error message and instructions will also be written on standard error to make sure this message will not get lost even if no log handler is configured.
#12786 Added a strict check that prevents replicant nodes from connecting to core nodes running with a different version of EMQX application. This check ensures that during the rolling upgrades, the replicant nodes can only work when at least one core node is running the same EMQX release version.
#12576 Starting from 5.6, the "Configuration Manual" document will no longer include the bridges config root.
A bridge is now either action + connector for egress data integration, or source + connector for ingress data integration.
Please note that the bridges config (in cluster.hocon) and the REST API path api/v5/bridges still work, but are considered deprecated.
#12634 Triple-quote string values in HOCON config files no longer support escape sequences.
The detailed information can be found in this pull request. Here is a summary of the impact on EMQX users:
Generated configs (cluster.hocon): there is no compatibility issue.
Hand-written configs (such as emqx.conf): a thorough review is needed to check whether escape sequences are used (such as \n, \r, \t and \\). If so, such strings should be changed to regular quotes (one pair of ") instead of triple-quotes.
#12326 Enhanced session tracking with registration history. EMQX now has the capability to monitor the history of session registrations, including those that have expired. By configuring broker.session_history_retain, EMQX retains records of expired sessions for a specified duration.
Session count API: Use the API GET /api/v5/sessions_count?since=1705682238 to obtain a count of sessions across the cluster that remained active since the given UNIX epoch timestamp (with seconds precision). This enhancement aids in analyzing session activity over time.
Metrics expansion with cluster sessions gauge: A new gauge metric, cluster_sessions, is added to better track the number of sessions within the cluster. This metric is also integrated into Prometheus for easy monitoring:
# TYPE emqx_cluster_sessions_count gauge
emqx_cluster_sessions_count 1234
NOTE: Please consider this metric as an approximate estimation. Due to the asynchronous nature of data collection and calculation, exact precision may vary.
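As a rough illustration of the session count API above, the following sketch builds the request URL from a point in time. The base URL `http://localhost:18083` and the helper name `sessions_count_url` are assumptions made for this example; authentication is left out.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def sessions_count_url(base_url: str, since: datetime) -> str:
    """Build the sessions_count URL (hypothetical helper, not part of EMQX)."""
    # The API expects a UNIX epoch timestamp with seconds precision.
    since_epoch = int(since.timestamp())
    query = urlencode({"since": since_epoch})
    return f"{base_url.rstrip('/')}/api/v5/sessions_count?{query}"


# The base URL is an assumption; adjust it to your deployment.
url = sessions_count_url(
    "http://localhost:18083",
    datetime.fromtimestamp(1705682238, tz=timezone.utc),
)
print(url)
```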
#12398 Exposed the swagger_support option in the Dashboard configuration, allowing for the enabling or disabling of the Swagger API documentation.
#12467 Added support for cluster discovery using the AAAA DNS record type.
#12483 Renamed emqx ctl conf cluster_sync tnxid ID to emqx ctl conf cluster_sync inspect ID.
For backward compatibility, tnxid is kept, but it is considered deprecated and will be removed in 5.7.
#12495 Introduced new AWS S3 connector and action.
#12499 Enhanced client banning capabilities with extended rules, including:
Matching clientid against a specified regular expression.
Matching username against a specified regular expression.
Important Notice: Implementing a large number of broad matching rules (not specific to an individual clientid, username, or host) may affect system performance. It's advised to use these extended ban rules judiciously to maintain optimal system efficiency.
#12509 Implemented API to re-order all authenticators / authorization sources.
#12517 Configuration files have been upgraded to accommodate multi-line string values, preserving indentation for enhanced readability and maintainability. This improvement utilizes """~ and ~""" markers to quote indented lines, offering a structured and clear way to define complex configurations. For example:
rule_xlu4 {
  sql = """~
    SELECT
      *
    FROM
      "t/#"
  ~"""
}
See HOCON 0.42.0 release notes for details.
#12520 Implemented log throttling. The feature reduces the volume of logged events that could potentially flood the system by dropping all but the first occurrence of an event within a configured time window. Log throttling is applied to the following log events that are critical yet prone to repetition:
authentication_failure
authorization_permission_denied
cannot_publish_to_topic_due_to_not_authorized
cannot_publish_to_topic_due_to_quota_exceeded
connection_rejected_due_to_license_limit_reached
dropped_msg_due_to_mqueue_is_full
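The throttling behavior described above (keep the first occurrence of an event, drop repeats until the time window ends) can be sketched as follows. This is only an illustration of the idea, not EMQX's implementation; the class name `LogThrottle` and its per-event window bookkeeping are assumptions.

```python
import time
from typing import Callable, Dict


class LogThrottle:
    """Allow the first occurrence of each event per time window (illustrative sketch)."""

    def __init__(self, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._window_start: Dict[str, float] = {}

    def should_log(self, event: str) -> bool:
        now = self.clock()
        start = self._window_start.get(event)
        # First occurrence ever, or the previous window has elapsed: log it
        # and open a new window for this event.
        if start is None or now - start >= self.window_seconds:
            self._window_start[event] = now
            return True
        # Repeat within the current window: drop it.
        return False


throttle = LogThrottle(window_seconds=60)
for _ in range(3):
    if throttle.should_log("authentication_failure"):
        print("authentication_failure logged")
```

Only the first of the three calls prints, because the other two fall inside the same window.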
#12561 Implemented HTTP APIs to list a client's in-flight and message queue (mqueue) messages. These APIs provide detailed insight into message queues and in-flight messaging for monitoring purposes.
To get the first chunk of data:
GET /clients/{clientid}/mqueue_messages?limit=100
GET /clients/{clientid}/inflight_messages?limit=100
Alternatively, for the first chunks without specifying a start position:
GET /clients/{clientid}/mqueue_messages?limit=100&position=none
GET /clients/{clientid}/inflight_messages?limit=100&position=none
To get the next chunk of data:
GET /clients/{clientid}/mqueue_messages?limit=100&position={position}
GET /clients/{clientid}/inflight_messages?limit=100&position={position}
Where {position} is a value (opaque string token) of meta.position field from the previous response.
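The chunked retrieval above can be sketched as a loop that feeds each response's meta.position into the next request. The `fetch` callable, the `data` field name, and the stop condition (an empty page or a missing position) are assumptions for illustration; the source only specifies the paths and the meta.position token.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import quote


def page_path(clientid: str, kind: str, limit: int,
              position: Optional[str] = None) -> str:
    """Build the request path; `kind` is mqueue_messages or inflight_messages."""
    path = f"/clients/{quote(clientid, safe='')}/{kind}?limit={limit}"
    if position is not None:
        # The position is an opaque token taken from the previous response.
        path += f"&position={quote(position, safe='')}"
    return path


def iter_messages(fetch: Callable[[str], Dict], clientid: str, kind: str,
                  limit: int = 100) -> Iterator[dict]:
    """Yield messages chunk by chunk. `fetch` maps a path to a decoded JSON body."""
    position: Optional[str] = None
    while True:
        page = fetch(page_path(clientid, kind, limit, position))
        data = page.get("data", [])  # "data" is an assumed field name
        yield from data
        next_position = page.get("meta", {}).get("position")
        # Assumed stop condition: nothing returned, or no new position token.
        if not data or next_position is None or next_position == position:
            break
        position = next_position
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client or authentication scheme.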
Ordering and Prioritization:
#12590 Removed mfa metadata from log messages to improve clarity.
#12641 Improved text log formatter fields order. The new fields order is as follows:
tag > clientid > msg > peername > username > topic > [other fields]
#12670 Added field shared_subscriptions to endpoint /monitor_current and /monitor_current/nodes/:node.
#12679 Upgraded docker image base from Debian 11 to Debian 12.
#12700 Added support for the "b" and "B" units in bytesize HOCON fields.
For example, all three fields below will have the value of 1024 bytes:
bytesize_field = "1024b"
bytesize_field2 = "1024B"
bytesize_field3 = 1024
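To make the equivalence concrete, here is a minimal sketch of a parser that treats a bare integer and the "b"/"B" suffixes the same way. The function `parse_bytesize` is hypothetical and handles only the forms shown above, not other units.

```python
def parse_bytesize(value) -> int:
    """Return the size in bytes for a bare integer or a "<n>b" / "<n>B" string."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    text = str(value).strip()
    # Strip a single trailing byte unit, if present.
    if text[-1:] in ("b", "B"):
        text = text[:-1]
    # Anything other than plain digits (for example another unit) raises ValueError.
    return int(text)


for raw in ("1024b", "1024B", 1024):
    print(repr(raw), "->", parse_bytesize(raw))
```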
#12719 The /clients API has been upgraded to accommodate queries for multiple clientids and usernames simultaneously, offering a more flexible and powerful tool for monitoring client connections. Additionally, this update introduces the capability to customize which client information fields are included in the API response, optimizing for specific monitoring needs.
Examples of Multi-Client/Username Queries:
/clients?clientid=client1&clientid=client2
/clients?username=user11&username=user2
/clients?clientid=client1&clientid=client2&username=user1&username=user2
Examples of Selecting Fields for the Response:
/clients?fields=all (Note: Omitting the fields parameter defaults to returning all fields.)
/clients?fields=clientid,username
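The query strings above repeat the clientid and username parameters once per value and pass fields as a comma-separated list. A small sketch of building them with the standard library follows; the helper name `clients_query` is made up for this example.

```python
from typing import Iterable
from urllib.parse import urlencode


def clients_query(clientids: Iterable[str] = (), usernames: Iterable[str] = (),
                  fields: Iterable[str] = ()) -> str:
    """Build a /clients path with repeated clientid/username parameters."""
    params = [("clientid", c) for c in clientids]
    params += [("username", u) for u in usernames]
    fields = list(fields)
    if fields:
        # Fields are sent as one comma-separated value.
        params.append(("fields", ",".join(fields)))
    # safe="," keeps the comma readable instead of encoding it as %2C.
    query = urlencode(params, safe=",")
    return "/clients" + ("?" + query if query else "")


print(clients_query(clientids=["client1", "client2"], usernames=["user1", "user2"]))
print(clients_query(fields=["clientid", "username"]))
```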
#12330 The Cassandra bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12353 The OpenTSDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12376 The Kinesis bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12386 The GreptimeDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12423 The RabbitMQ bridge has been split into connector, action and source components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12425 The ClickHouse bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12439 The Oracle bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12449 The TDEngine bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12488 The RocketMQ bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12512 The HStreamDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically; however, it is recommended to do the upgrade manually, as new fields have been added to the configuration.
#12543 The DynamoDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12595 The Kafka Consumer bridge has been split into connector and source components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12619 The Microsoft SQL Server bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.
#12381 Added new SQL functions: map_keys(), map_values(), map_to_entries(), join_to_string(), join_to_sql_values_string(), is_null_var(), is_not_null_var().
For more information on the functions and their usage, refer to the Built-in SQL Functions documentation.
#12427 Introduced the capability to specify a limit on the number of Kafka partitions that can be used for Kafka data integration.
#12577 Updated the service_account_json field for both the GCP PubSub Producer and Consumer connectors to accept JSON-encoded strings. The previous format (a HOCON map) is still supported but not encouraged.
#12581 Added JSON schema to schema registry.
#12602 Enhanced health checking for IoTDB connector, using its ping API instead of just checking for an existing socket connection.
#12336 Refined the approach to managing asynchronous tasks by segregating the cleanup of channels into its own dedicated pool. This separation addresses performance issues encountered during channels cleanup under conditions of high network latency, ensuring that such tasks do not impede the efficiency of other asynchronous operations, such as route cleanup.
#12725 Implemented REST API to list the available source types.
#12746 Added username log field. If an MQTT client is connected with a non-empty username, the logs and traces will include the username field.
#12785 Added timestamp_format configuration option to log handlers. This new option allows for the following settings:
auto: Automatically determines the timestamp format based on the log formatter being used. Utilizes rfc3339 format for text formatters, and epoch format for JSON formatters.
epoch: Represents timestamps in microseconds precision Unix epoch format.
rfc3339: Uses RFC3339 compliant format for date-time strings. For example, 2024-03-26T11:52:19.777087+00:00.
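To illustrate how the three timestamp_format settings relate, the sketch below renders the same instant in both formats and resolves auto according to the formatter. The function names and the "text"/"json" formatter labels are assumptions for this example, not EMQX code.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def resolve_format(timestamp_format: str, formatter: str) -> str:
    """Resolve auto: rfc3339 for text formatters, epoch for JSON formatters."""
    if timestamp_format == "auto":
        return "epoch" if formatter == "json" else "rfc3339"
    return timestamp_format


def format_timestamp(moment: datetime, timestamp_format: str, formatter: str):
    resolved = resolve_format(timestamp_format, formatter)
    if resolved == "epoch":
        # Integer microseconds since the Unix epoch (no float rounding).
        return (moment - EPOCH) // timedelta(microseconds=1)
    # RFC 3339 date-time string with microsecond precision.
    return moment.isoformat(timespec="microseconds")


moment = datetime(2024, 3, 26, 11, 52, 19, 777087, tzinfo=timezone.utc)
print(format_timestamp(moment, "auto", "text"))
print(format_timestamp(moment, "auto", "json"))
```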
#11868 Fixed a bug where will messages were not published after session takeover.
#12347 Implemented an update to ensure that messages processed by the Rule SQL for the MQTT egress data bridge are always rendered as valid, even in scenarios where the data is incomplete or lacks certain placeholders defined in the bridge configuration. This adjustment prevents messages from being incorrectly deemed invalid and subsequently discarded by the MQTT egress data bridge, as was the case previously.
When variables in payload and topic templates are undefined, they are now rendered as empty strings instead of the literal undefined string.
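The rendering rule can be pictured with a small template function that substitutes an empty string for any variable missing from the data. This is a simplified sketch: the `render` helper is hypothetical and only looks up flat keys in `${name}` placeholders.

```python
import re

# Matches placeholders of the form ${name}.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def render(template: str, data: dict) -> str:
    """Render placeholders; undefined variables become empty strings."""
    def replace(match: re.Match) -> str:
        value = data.get(match.group(1))
        # Missing variables render as "" rather than the literal "undefined".
        return "" if value is None else str(value)

    return PLACEHOLDER.sub(replace, template)


print(render("devices/${clientid}/${missing}", {"clientid": "c1"}))
```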
#12472 Fixed an issue where certain read operations on /api/v5/actions/ and /api/v5/sources/ endpoints might result in a 500 error code during the process of rolling upgrades.
#12492 EMQX now returns the Receive-Maximum property in the CONNACK message for MQTT v5 clients, aligning with protocol expectations. This implementation considers the minimum value of the client's Receive-Maximum setting and the server's max_inflight configuration as the limit for the number of inflight (unacknowledged) messages permitted. Previously, the determined value was not sent back to the client in the CONNACK message.
NOTE: A current known issue with these enhanced API responses is that the total client count provided may exceed the actual number of clients due to the inclusion of disconnected sessions.
#12505 Upgraded the Kafka producer client wolff from version 1.10.1 to 1.10.2. This latest version maintains a long-lived metadata connection for each connector, optimizing EMQX's performance by reducing the frequency of establishing new connections for action and connector health checks.
#12513 Changed the level of several flooding log events from warning to info.
#12530 Improved the error reporting for frame_too_large events and malformed CONNECT packet parsing failures. These updates now provide additional information, aiding in the troubleshooting process.
#12541 Introduced a new configuration validation step for autocluster by DNS records to ensure compatibility between node.name and cluster.discovery_strategy. Specifically, when utilizing the dns strategy with either a or aaaa record types, it is mandatory for all nodes to use a (static) IP address as the host name.
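The validation rule above can be approximated by checking that the host part of the node name parses as an IP address. The sketch below assumes node names of the form name@host; the function names and the `record_type` parameter are illustrative, not EMQX's actual validation code.

```python
import ipaddress


def host_is_ip(node_name: str) -> bool:
    """Return True when the host part of name@host is an IPv4 or IPv6 address."""
    _, _, host = node_name.partition("@")
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def validate_node_name(node_name: str, strategy: str, record_type: str) -> None:
    """Raise ValueError if DNS discovery with a/aaaa records gets a non-IP host."""
    if strategy == "dns" and record_type in ("a", "aaaa") and not host_is_ip(node_name):
        raise ValueError(
            f"node name {node_name!r} must use an IP address as the host "
            f"for dns discovery with {record_type} records"
        )


validate_node_name("emqx@10.0.0.1", "dns", "a")  # passes silently
```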
#12566 Enhanced the bootstrap file for REST API keys:
Empty lines within the file are now skipped, eliminating the previous behavior of generating an error.
API keys specified in the bootstrap file are assigned the highest precedence. In cases where a new key from the bootstrap file conflicts with an existing key, the older key will be automatically removed to ensure that the bootstrap keys take effect without issue.
#12646 Fixed an issue with the rule engine's date-time string parser. Previously, time zone adjustments were only effective for date-time strings specified with second-level precision.
#12652 Fixed a discrepancy where the subbits functions with 4 and 5 parameters, despite being documented, were missing from the actual implementation. These functions have now been added.
#12663 Fixed an issue where the emqx_vm_cpu_use and emqx_vm_cpu_idle metrics, accessible via the Prometheus endpoint /prometheus/stats, were inaccurately reflecting the average CPU usage since the operating system boot. This fix ensures that these metrics now accurately represent the current CPU usage and idle time, providing more relevant and timely data for monitoring purposes.
#12668 Refactored the SQL function date_to_unix_ts() by using calendar:datetime_to_gregorian_seconds/1.
This change also added validation for the input date format.
#12672 Changed the process for generating the node boot configuration by incorporating the loading of {data_dir}/configs/cluster.hocon. Previously, changes to logging configurations made via the Dashboard and saved in {data_dir}/configs/cluster.hocon were only applied after the initial boot configuration was generated using etc/emqx.conf, leading to potential loss of some log segment files due to late reconfiguration.
Now, both {data_dir}/configs/cluster.hocon and etc/emqx.conf are loaded concurrently, with settings from emqx.conf taking precedence, to create the boot configuration.
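The precedence rule for the boot configuration can be pictured as a merge in which values from emqx.conf override those from cluster.hocon. The sketch below assumes a recursive (deep) merge of nested maps, and the config keys are made up for illustration; the source only states which file takes precedence.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Merge two nested dicts; values from `override` win on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative values only, standing in for the parsed contents of each file.
cluster_hocon = {"log": {"file": {"level": "debug", "rotation_count": 20}}}
emqx_conf = {"log": {"file": {"level": "warning"}}}

boot_config = deep_merge(cluster_hocon, emqx_conf)
print(boot_config)
```

Here the level comes from emqx.conf while the setting that only exists in cluster.hocon is retained.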
#12696 Fixed an issue where attempting to reconnect an action or source could lead to wrong error messages being returned in the HTTP API.
#12714 Fixed inaccuracies in several metrics reported by the /prometheus/stats endpoint of the Prometheus API. The correction applies to the following metrics:
emqx_cluster_sessions_count
emqx_cluster_sessions_max
emqx_cluster_nodes_running
emqx_cluster_nodes_stopped
emqx_subscriptions_shared_count
emqx_subscriptions_shared_max
Additionally, this fix rectified an issue within the /stats endpoint concerning the subscriptions.shared.count and subscriptions.shared.max fields. Previously, these values failed to update promptly following a client's disconnection or unsubscription from a Shared-Subscription.
#12390 Fixed an issue where the /license API request may crash during cluster joining processes.
#12411 Fixed a bug where null values would be inserted as 1853189228 in int columns in Cassandra data integration.
#12522 Refined the parsing process for Kafka bootstrap hosts to exclude spaces following commas, addressing connection timeouts and DNS resolution failures due to malformed host entries.
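The kind of parsing described in #12522 can be sketched as splitting on commas and trimming whitespace around each entry, so that a space after a comma no longer becomes part of the host name. The function name `parse_bootstrap_hosts` and the host values are hypothetical.

```python
from typing import List


def parse_bootstrap_hosts(hosts: str) -> List[str]:
    """Split a comma-separated host list, dropping surrounding spaces and empty entries."""
    return [entry.strip() for entry in hosts.split(",") if entry.strip()]


print(parse_bootstrap_hosts("kafka-1:9092, kafka-2:9092,kafka-3:9092"))
```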
#12656 Implemented a topic verification step for creating GCP PubSub Producer actions, ensuring failure notifications when the topic doesn't exist or provided credentials lack sufficient permissions.
#12678 Enhanced the DynamoDB connector to clearly report the reason for connection failures, improving upon the previous lack of error insights.
#12681 Fixed a security issue where secrets could be logged at debug level when sending messages to a RocketMQ bridge/action.
#12715 Fixed a crash that could occur during configuration updates if the connector for the ingress data integration source had active channels.
#12767 Fixed issues encountered during upgrades from 5.0.1 to 5.5.1, specifically related to Kafka Producer configurations that led to upgrade failures. The correction ensures that Kafka Producer configurations are accurately transformed into the new action and connector configuration format required by EMQX version 5.5.1 and beyond.
#12471 Fixed an issue where data integration configurations failed to load correctly during upgrades from EMQX version 5.0.2 to newer releases.
) instead of triple-quotes.#12471 Fixed an issue that data integration configurations failed to load correctly during upgrades from EMQX version 5.0.2 to newer releases.
#12598 Fixed an issue that users were unable to subscribe to or unsubscribe from shared topic filters via HTTP API.
The affected APIs include:
/clients/:clientid/subscribe
/clients/:clientid/subscribe/bulk
/clients/:clientid/unsubscribe
/clients/:clientid/unsubscribe/bulk
#12601 Fixed an issue where logs of the LDAP driver were not being captured. Now, all logs are recorded at the info level.
#12606 The Prometheus API experienced crashes when the specified SSL certificate file did not exist in the given path. Now, when an SSL certificate file is missing, the emqx_cert_expiry_at metric will report a value of 0, indicating the non-existence of the certificate.
#12608 Fixed a function_clause error in the IoTDB action caused by the absence of a payload field in query data.
#12610 Fixed an issue where connections to the LDAP connector could unexpectedly disconnect after a certain period of time.
#12620 Redacted sensitive information in HTTP headers to exclude authentication and authorization credentials from debug level logs in the HTTP Server connector, mitigating potential security risks.
#12632 Fixed an issue where the rule engine's SQL built-in function date_to_unix_ts produced incorrect results for dates starting from March 1st on leap years.
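A quick way to sanity-check date-to-timestamp conversions around the leap day is to compare consecutive dates, which must be exactly 86400 seconds apart. The sketch below uses Python's standard library as a reference; `utc_unix_ts` is a hypothetical helper and is unrelated to the rule engine's implementation.

```python
import calendar


def utc_unix_ts(year: int, month: int, day: int,
                hour: int = 0, minute: int = 0, second: int = 0) -> int:
    """Convert a UTC date-time to a Unix timestamp in seconds."""
    return calendar.timegm((year, month, day, hour, minute, second))


# In a leap year, February 29 and March 1 must be exactly one day apart.
feb_29 = utc_unix_ts(2024, 2, 29)
mar_01 = utc_unix_ts(2024, 3, 1)
print(mar_01 - feb_29)
```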