Emqx Versions Save

The most scalable open-source MQTT broker for IoT, IIoT, and connected vehicles

v5.7.1

1 week ago

Enhancements

  • #12983 Add new rule engine event $events/client_check_authn_complete for authentication completion event.

  • #13180 Improved client message handling performance when EMQX is running on Erlang/OTP 26 and increased message throughput by 10% in fan-in mode.

  • #13191 Upgraded EMQX Docker images to run on Erlang/OTP 26.

    EMQX had been running on Erlang/OTP 26 since v5.5 except for docker images which were on Erlang/OTP 25. Now all releases are on Erlang/OTP 26.

  • #13242 Significantly increased the startup speed of EMQX dashboard listener.

Bug Fixes

  • #13156 Resolved an issue where the Dashboard Monitoring pages would crash following the update to EMQX v5.7.0.

  • #13164 Fixed HTTP authorization request body encoding.

    Before this fix, the HTTP authorization request body encoding format was taken from the accept header. The fix is to respect the content-type header instead. Also added access templating variable for v4 compatibility. The access code of SUBSCRIBE action is 1 and PUBLISH action is 2.

  • #13238 Improved the logged error messages when an HTTP authorization request with an unsupported content-type header is returned.

  • #13258 Fix an issue where the MQTT-SN gateway would not restart correctly due to incorrect startup order of gateway dependencies.

  • #13273 Fixed and improved handling of URIs in several configurations. The fix includes the following improvement details:

    • Authentication and authorization configurations: Corrected a previous error where valid pathless URIs such as https://example.com?q=x were mistakenly rejected. These URIs are now properly recognized as valid.
    • Connector configurations: Enhanced checks to ensure that URIs with potentially problematic components, such as user info or fragment parts, are no longer erroneously accepted.
  • #13276 Fixed an issue in the durable message storage mechanism where parts of the internal storage state were not correctly persisted during the setup of new storage generations. The concept of "generation" is used internally and is crucial for managing message expiration and cleanup. This could have manifested as messages being lost after a restart of EMQX.

  • #13291 Fixed an issue where durable storage sites that were down being reported as up.

  • #13290 Fixed an issue where the command $ bin/emqx ctl rules show rule_0hyd would produce no output when used to display rules with a data integration action attached.

  • #13293 Improved the restoration process from data backups by automating the re-indexing of imported retained messages. Previously, re-indexing required manual intervention using the emqx ctl retainer reindex start CLI command after importing a data backup file.

    This fix also extended the functionality to allow exporting retained messages to a backup file when the retainer.backend.storage_type is configured as ram. Previously, only setups with disc as the storage type supported exporting retained messages.

  • #13140 Fixed an issue that caused text traces for the republish action to crash and not display correctly.

  • #13148 Fixed an issue where a 500 HTTP status code could be returned by /connectors/:connector-id/start when there is a timeout waiting for the resource to be connected.

  • #13181 EMQX now forcefully shut down the connector process when attempting to stop a connector, if such operation times out. This fix also improved the clarity of error messages when disabling an action or source fails due to an unresponsive underlying connector.

  • #13216 Respect clientid_prefix config for MQTT bridges. Since EMQX v5.4.1, the MQTT client IDs are restricted to a maximum of 23 bytes. Previously, the system factored the clientid_prefix into the hash of the original, longer client ID, affecting the final shortened ID. The fix includes the following change details:

    • Without Prefix: The behavior remains unchanged. EMQX hashes the long client IDs (exceeding 23 bytes) to fit within the 23-byte limit.
    • With Prefix:
      • Prefix ≤ 19 bytes: The prefix is retained, and the remaining portion of the client ID is hashed into a 4-byte space, ensuring the total length does not exceed 23 bytes.
      • Prefix ≥ 20 bytes: EMQX will not attempt to shorten the client ID, fully preserving the configured prefix regardless of length.

v5.7.0

1 month ago

Enhancements

Security

  • #12947 For JWT authentication, support new disconnect_after_expire option. When enabled, the client will be disconnected after the JWT token expires.

Note: This is a breaking change. This option is enabled by default, so the default behavior is changed. Previously, the clients with actual JWTs could connect to the broker and stay connected even after the JWT token expired. Now, the client will be disconnected after the JWT token expires. To preserve the previous behavior, set disconnect_after_expire to false.

Data Processing and Integration

  • #12671 An unescape function has been added to the rule engine SQL language to handle the expansion of escape sequences in strings. This addition has been done because string literals in the SQL language don't support any escape codes (e.g., \n and \t). This enhancement allows for more flexible string manipulation within SQL expressions.

Extensibility

  • #12872 Implemented Client Attributes feature. It allows setting additional properties for each client using key-value pairs. Property values can be generated from MQTT client connection information (such as username, client ID, TLS certificate) or set from data accompanying successful authentication returns. Properties can be used in EMQX for authentication, authorization, data integration, and MQTT extension functions. Compared to using static properties like client ID directly, client properties offer greater flexibility in various business scenarios, simplifying the development process and enhancing adaptability and efficiency in development work. Initialization of client_attrs The client_attrs fields can be initially populated from one of the following clientinfo fields:

    • cn: The common name from the TLS client's certificate.
    • dn: The distinguished name from the TLS client's certificate, that is, the certificate "Subject".
    • clientid: The MQTT client ID provided by the client.
    • username: The username provided by the client.
    • user_property: Extract a property value from 'User-Property' of the MQTT CONNECT packet.

    Extension through Authentication Responses Additional attributes may be merged into client_attrs from authentication responses. Supported authentication backends include:

    • HTTP: Attributes can be included in the JSON object of the HTTP response body through a client_attrs field.
    • JWT: Attributes can be included via a client_attrs claim within the JWT.

    Usage in Authentication and Authorization If client_attrs is initialized before authentication, it can be used in external authentication requests. For instance, ${client_attrs.property1} can be used within request templates directed at an HTTP server for authenticity validation.

    • The client_attrs can be utilized in authorization configurations or request templates, enhancing flexibility and control. Examples include: In acl.conf, use {allow, all, all, ["${client_attrs.namespace}/#"]} to apply permissions based on the namespace attribute.
    • In other authorization backends, ${client_attrs.namespace} can be used within request templates to dynamically include client attributes.
  • #12910 Added plugin configuration management and schema validation. For EMQX enterprise edition, one can also annotate the schema with metadata to facilitate UI rendering in the Dashboard. See more details in the plugin template and plugin documentation.

Operations and Management

  • #12923 Provided more specific error when importing wrong format into builtin authenticate database.

  • #12940 Added ignore_readonly argument to PUT /configs API. Before this change, EMQX would return 400 (BAD_REQUEST) if the raw config included read-only root keys (cluster, rpc, and node). After this enhancement it can be called as PUT /configs?ignore_readonly=true, EMQX will in this case ignore readonly root config keys, and apply the rest. For observability purposes, an info level message is logged if any readonly keys are dropped. Also fixed an exception when config has bad HOCON syntax (returns 500). Now bad syntax will cause the API to return 400 (BAD_REQUEST).

  • #12957 Started building packages for macOS 14 (Apple Silicon) and Ubuntu 24.04 Noble Numbat (LTS).

Bug Fixes

Security

  • #12887 Fixed MQTT enhanced auth with sasl scram.

  • #12962 TLS clients can now verify server hostname against wildcard certificate. For example, if a certificate is issued for host *.example.com, TLS clients is able to verify server hostnames like srv1.example.com.

MQTT

  • #12996 Fixed process leak in emqx_retainer application. Previously, client disconnection while receiving retained messages could cause a process leak.

Data Processing and Integration

  • #12653 The rule engine function bin2hexstr now supports bitstring inputs with a bit size that is not divisible by 8. Such bitstrings can be returned by the rule engine function subbits.

  • #12657 The rule engine SQL-based language previously did not allow putting any expressions as array elements in array literals (only constants and variable references were allowed). This has now been fixed so that one can use any expressions as array elements. The following is now permitted, for example:

    select
    [21 + 21, abs(-abs(-2)), [1 + 1], 4] as my_array
    from "t/#"
    
  • #12932 Previously, if a HTTP action request received a 503 (Service Unavailable) status, it was marked as a failure and the request was not retried. This has now been fixed so that the request is retried a configurable number of times.

  • #12948 Fixed an issue where sensitive HTTP header values like Authorization would be substituted by ****** after updating a connector.

  • #13118 Fix a performance issue in the rule engine template rendering.

Observability

  • #12765 Make sure stats subscribers.count subscribers.max contains shared-subscribers. It only contains non-shared subscribers previously.

Operations and Management

  • #12812 Made resource health checks non-blocking operations. This means that operations such as updating or removing a resource won't be blocked by a lengthy running health check.

  • #12830 Made channel (action/source) health checks non-blocking operations. This means that operations such as updating or removing an action/source data integration won't be blocked by a lengthy running health check.

  • #12993 Fixed listener config update API when handling an unknown zone. Before this fix, when a listener config is updated with an unknown zone, for example {"zone": "unknown"}, the change would be accepted, causing all clients to crash whens connected. After this fix, updating the listener with an unknown zone name will get a "Bad request" response.

  • #13012 The MQTT listerners config option access_rules has been improved in the following ways:

    • The listener no longer crash with an incomprehensible error message if a non-valid access rule is configured. Instead a configuration error is generated.
    • One can now add several rules in a single string by separating them by comma (for example, "allow 10.0.1.0/24, deny all").
  • #13041 Improved HTTP authentication error log message. If HTTP content-type header is missing for POST method, it now emits a meaningful error message instead of a less readable exception with stack trace.

  • #13077 This fix makes EMQX only read action configurations from the global configuration when the connector starts or restarts, and instead stores the latest configurations for the actions in the connector. Previously, updates to action configurations would sometimes not take effect without disabling and enabling the action. This means that an action could sometimes run with the old (previous) configuration even though it would look like the action configuration has been updated successfully.

  • #13090 Attempting to start an action or source whose connector is disabled will no longer attempt to start the connector itself.

Gateways

  • #12909 Fixed UDP listener process handling on errors or closure, The fix ensures the UDP listener is cleanly stopped and restarted as needed if these error conditions occur.

  • #13001 Fixed an issue where the syskeeper forwarder would never reconnect when the connection was lost.

  • #13010 Fixed the issue where the JT/T 808 gateway could not correctly reply to the REGISTER_ACK message when requesting authentication from the registration service failed.

Breaking Changes

  • #12947 For JWT authentication, a new boolean option disconnect_after_expire has been added with default value set to true. When enabled, the client will be disconnected after the JWT token expires.

    Previously, the clients with actual JWTs could connect to the broker and stay connected even after the JWT token expired. Now, the client will be disconnected after the JWT token expires. To preserve the previous behavior, set disconnect_after_expire to false.

  • #12957 Stopped building packages for macOS 12.

v5.6.1

2 months ago

Bug Fixes

  • #12759 EMQX now automatically removes invalid backup files that fail during upload due to schema validation errors. This fix ensures that only valid configuration files are displayed and stored, enhancing system reliability.

  • #12766 Renamed message_queue_too_long error reason to mailbox_overflow

    mailbox_overflow. The latter is consistent with the corresponding config parameter: force_shutdown.max_mailbox_size.

  • #12773 Upgraded HTTP client libraries.

    The HTTP client library (gun-1.3) incorrectly appended a :portnumber suffix to the Host header for standard ports (http on port 80, https on port 443). This could cause compatibility issues with servers or gateways performing strict Host header checks (e.g., AWS Lambda, Alibaba Cloud HTTP gateways), leading to errors such as InvalidCustomDomain.NotFound or "The specified CustomDomain does not exist."

  • #12802 Improved how EMQX handles node removal from clusters via the emqx ctl cluster leave command. Previously, nodes could unintentionally rejoin the same cluster (unless it was stopped) if the configured cluster discovery_strategy was not manual. With the latest update, executing the cluster leave command now automatically disables cluster discovery for the node, preventing it from rejoining. To re-enable cluster discovery, use the emqx ctl discovery enable command or simply restart the node.

  • #12814 Improved error handling for the /clients/{clientid}/mqueue_messages and /clients/{clientid}/inflight_messages APIs in EMQX. These updates address:

    • Internal Timeout: If EMQX fails to retrieve the list of Inflight or Mqueue messages within the default 5-second timeout, likely under heavy system load, the API will return 500 error with the response {"code":"INTERNAL_ERROR","message":"timeout"}, and log additional details for troubleshooting.
    • Client Shutdown: Should the client connection be terminated during an API call, the API now returns a 404 error, with the response {"code": "CLIENT_SHUTDOWN", "message": "Client connection has been shutdown"}. This ensures clearer feedback when client connections are interrupted.
  • #12824 Updated the statistics metrics subscribers.count and subscribers.max to include shared subscribers. Previously, these metrics accounted only for non-shared subscribers.

  • #12826 Fixed issues related to the import functionality of source data integrations and retained messages in EMQX. Before this update:

    • The data integration sources specified in backup files were not being imported. This included configurations under the sources.mqtt category with specific connectors and parameters such as QoS and topics.
    • Importing the mnesia table for retained messages was not supported.
  • #12843 Fixed cluster_rpc_commit transaction ID cleanup procedure on replicator nodes after executing the emqx ctl cluster leave command. Previously, failing to properly clear these transaction IDs impeded configuration updates on the core node.

  • #12885 Fixed an issue in EMQX where users were unable to view "Retained Messages" under the "Monitoring" menu in the Dashboard.

    The "Retained messages" backend API uses the qlc library. This problem was due to a permission issue where the qlc library's file_sorter function tried to use a non-writable directory, /opt/emqx, to store temporary files, resulting from recent changes in directory ownership permissions in Docker deployments.

    This update modifies the ownership settings of the /opt/emqx directory to emqx:emqx, ensuring that all necessary operations, including retained messages retrieval, can proceed without access errors.

e5.6.1

2 months ago

Bug Fixes

  • #12759 EMQX now automatically removes invalid backup files that fail during upload due to schema validation errors. This fix ensures that only valid configuration files are displayed and stored, enhancing system reliability.

  • #12766 Renamed message_queue_too_long error reason to mailbox_overflow

    mailbox_overflow. The latter is consistent with the corresponding config parameter: force_shutdown.max_mailbox_size.

  • #12773 Upgraded HTTP client libraries.

    The HTTP client library (gun-1.3) incorrectly appended a :portnumber suffix to the Host header for standard ports (http on port 80, https on port 443). This could cause compatibility issues with servers or gateways performing strict Host header checks (e.g., AWS Lambda, Alibaba Cloud HTTP gateways), leading to errors such as InvalidCustomDomain.NotFound or "The specified CustomDomain does not exist."

  • #12802 Improved how EMQX handles node removal from clusters via the emqx ctl cluster leave command. Previously, nodes could unintentionally rejoin the same cluster (unless it was stopped) if the configured cluster discovery_strategy was not manual. With the latest update, executing the cluster leave command now automatically disables cluster discovery for the node, preventing it from rejoining. To re-enable cluster discovery, use the emqx ctl discovery enable command or simply restart the node.

  • #12814 Improved error handling for the /clients/{clientid}/mqueue_messages and /clients/{clientid}/inflight_messages APIs in EMQX. These updates address:

    • Internal Timeout: If EMQX fails to retrieve the list of Inflight or Mqueue messages within the default 5-second timeout, likely under heavy system load, the API will return 500 error with the response {"code":"INTERNAL_ERROR","message":"timeout"}, and log additional details for troubleshooting.
    • Client Shutdown: Should the client connection be terminated during an API call, the API now returns a 404 error, with the response {"code": "CLIENT_SHUTDOWN", "message": "Client connection has been shutdown"}. This ensures clearer feedback when client connections are interrupted.
  • #12824 Updated the statistics metrics subscribers.count and subscribers.max to include shared subscribers. Previously, these metrics accounted only for non-shared subscribers.

  • #12826 Fixed issues related to the import functionality of source data integrations and retained messages in EMQX. Before this update:

    • The data integration sources specified in backup files were not being imported. This included configurations under the sources.mqtt category with specific connectors and parameters such as QoS and topics.
    • Importing the mnesia table for retained messages was not supported.
  • #12843 Fixed cluster_rpc_commit transaction ID cleanup procedure on replicator nodes after executing the emqx ctl cluster leave command. Previously, failing to properly clear these transaction IDs impeded configuration updates on the core node.

  • #12882 Fixed an issue with the RocketMQ action in EMQX data integration, ensuring that messages are correctly routed to their configured topics. Previously, when multiple actions shared a single RocketMQ connector, all messages were mistakenly sent to the topic configured for the first action. This fix involves starting a distinct set of RocketMQ workers for each topic, preventing cross-topic message delivery errors.

  • #12885 Fixed an issue in EMQX where users were unable to view "Retained Messages" under the "Monitoring" menu in the Dashboard.

    The "Retained messages" backend API uses the qlc library. This problem was due to a permission issue where the qlc library's file_sorter function tried to use a non-writable directory, /opt/emqx, to store temporary files, resulting from recent changes in directory ownership permissions in Docker deployments.

    This update modifies the ownership settings of the /opt/emqx directory to emqx:emqx, ensuring that all necessary operations, including retained messages retrieval, can proceed without access errors.

v5.6.0

3 months ago

Enhancements

  • #12251 Optimized the performance of the RocksDB-based persistent sessions, achieving a reduction in RAM usage and database request frequency. Key improvements include:

    • Introduced dirty session state to avoid frequent mria transactions.
    • Introduced an intermediate buffer for the persistent messages.
    • Used separate tracks of PacketIds for QoS1 and QoS2 messages.
    • Limited the number of continuous ranges of inflight messages to 1 per stream.
  • #12326 Enhanced session tracking with registration history. EMQX now has the capability to monitor the history of session registrations, including those that have expired. By configuring broker.session_history_retain, EMQX retains records of expired sessions for a specified duration.

    • Session count API: Use the API GET /api/v5/sessions_count?since=1705682238 to obtain a count of sessions across the cluster that remained active since the given UNIX epoch timestamp (with seconds precision). This enhancement aids in analyzing session activity over time.

    • Metrics expansion with cluster sessions gauge: A new gauge metric, cluster_sessions, is added to better track the number of sessions within the cluster. This metric is also integrated into Prometheus for easy monitoring:

      # TYPE emqx_cluster_sessions_count gauge
      emqx_cluster_sessions_count 1234
      

      NOTE: Please consider this metric as an approximate estimation. Due to the asynchronous nature of data collection and calculation, exact precision may vary.

  • #12338 Introduced a time-based garbage collection mechanism to the RocksDB-based persistent session backend. This feature ensures more efficient management of stored messages, optimizing storage utilization and system performance by automatically purging outdated messages.

  • #12398 Exposed the swagger_support option in the Dashboard configuration, allowing for the enabling or disabling of the Swagger API documentation.

  • #12467 Started supporting cluster discovery using AAAA DNS record type.

  • #12483 Renamed emqx ctl conf cluster_sync tnxid ID to emqx ctl conf cluster_sync inspect ID.

    For backward compatibility, tnxid is kept, but considered deprecated and will be removed in 5.7.

  • #12499 Enhanced client banning capabilities with extended rules, including:

    • Matching clientid against a specified regular expression.
    • Matching client's username against a specified regular expression.
    • Matching client's peer address against a CIDR range.

    Important Notice: Implementing a large number of broad matching rules (not specific to an individual clientid, username, or host) may affect system performance. It's advised to use these extended ban rules judiciously to maintain optimal system efficiency.

  • #12509 Implemented API to re-order all authenticators / authorization sources.

  • #12517 Configuration files have been upgraded to accommodate multi-line string values, preserving indentation for enhanced readability and maintainability. This improvement utilizes """~ and ~""" markers to quote indented lines, offering a structured and clear way to define complex configurations. For example:

    rule_xlu4 {
      sql = """~
        SELECT
          *
        FROM
          "t/#"
      ~"""
    }
    

    See HOCON 0.42.0 release notes for details.

  • #12520 Implemented log throttling. The feature reduces the volume of logged events that could potentially flood the system by dropping all but the first occurance of an event within a configured time window. Log throttling is applied to the following log events that are critical yet prone to repetition:

    • authentication_failure
    • authorization_permission_denied
    • cannot_publish_to_topic_due_to_not_authorized
    • cannot_publish_to_topic_due_to_quota_exceeded
    • connection_rejected_due_to_license_limit_reached
    • dropped_msg_due_to_mqueue_is_full
  • #12561 Implemented HTTP APIs to get the list of client's in-flight and message queue (mqueue) messages. These APIs facilitate detailed insights and effective control over message queues and in-flight messaging, ensuring efficient message handling and monitoring.

    To get the first chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100
    • GET /clients/{clientid}/inflight_messages?limit=100

    Alternatively, for the first chunks without specifying a start position:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position=none
    • GET /clients/{clientid}/inflight_messages?limit=100&position=none

    To get the next chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position={position}
    • GET /clients/{clientid}/inflight_messages?limit=100&position={position}

    Where {position} is a value (opaque string token) of meta.position field from the previous response.

    Ordering and Prioritization:

    • Mqueue Messages: These are prioritized and sequenced based on their queue order (FIFO), from higher to lower priority. By default, mqueue messages carry a uniform priority level of 0.
    • In-Flight Messages: Sequenced by the timestamp of their insertion into the in-flight storage, from oldest to newest.
  • #12590 Removed mfa meta data from log messages to improve clarity.

  • #12641 Improved text log formatter fields order. The new fields order is as follows:

    tag > clientid > msg > peername > username > topic > [other fields]

  • #12670 Added field shared_subscriptions to endpoint /monitor_current and /monitor_current/nodes/:node.

  • #12679 Upgraded docker image base from Debian 11 to Debian 12.

  • #12700 Started supporting "b" and "B" unit in bytesize hocon fields. For example, all three fields below will have the value of 1024 bytes:

    bytesize_field = "1024b"
    bytesize_field2 = "1024B"
    bytesize_field2 = 1024
    
  • #12719 The /clients API has been upgraded to accommodate queries for multiple clientids and usernames simultaneously, offering a more flexible and powerful tool for monitoring client connections. Additionally, this update introduces the capability to customize which client information fields are included in the API response, optimizing for specific monitoring needs.

    Examples of Multi-Client/Username Queries:

    • To query multiple clients by ID: /clients?clientid=client1&clientid=client2
    • To query multiple users: /clients?username=user11&username=user2
    • To combine multiple client IDs and usernames in one query: /clients?clientid=client1&clientid=client2&username=user1&username=user2

    Examples of Selecting Fields for the Response:

    • To include all fields in the response: /clients?fields=all (Note: Omitting the fields parameter defaults to returning all fields.)
    • To specify only certain fields: /clients?fields=clientid,username
  • #12381 Added new SQL functions: map_keys(), map_values(), map_to_entries(), join_to_string(), join_to_string(), join_to_sql_values_string(), is_null_var(), is_not_null_var().

    For more information on the functions and their usage, refer to Built-in SQL Functions the documentation.

  • #12336 Performance enhancement. Created a dedicated async task handler pool to handle client session cleanup tasks.

  • #12725 Implemented REST API to list the available source types.

  • #12746 Added username log field. If MQTT client is connected with a non-empty username the logs and traces will include username field.

  • #12785 Added timestamp_format configuration option to log handlers. This new option allows for the following settings:

    • auto: Automatically determines the timestamp format based on the log formatter being used. Utilizes rfc3339 format for text formatters, and epoch format for JSON formatters.

    • epoch: Represents timestamps in microseconds precision Unix epoch format.

    • rfc3339: Uses RFC3339 compliant format for date-time strings. For example, 2024-03-26T11:52:19.777087+00:00.

Bug Fixes

  • #11868 Fixed a bug where will messages were not published after session takeover.

  • #12347 Implemented an update to ensure that messages processed by the Rule SQL for the MQTT egress data bridge are always rendered as valid, even in scenarios where the data is incomplete or lacks certain placeholders defined in the bridge configuration. This adjustment prevents messages from being incorrectly deemed invalid and subsequently discarded by the MQTT egress data bridge, as was the case previously.

    When variables in payload and topic templates are undefined, they are now rendered as empty strings instead of the literal undefined string.

  • #12472 Fixed an issue where certain read operations on /api/v5/actions/ and /api/v5/sources/ endpoints might result in a 500 error code during the process of rolling upgrades.

  • #12492 EMQX now returns the Receive-Maximum property in the CONNACK message for MQTT v5 clients, aligning with protocol expectations. This implementation considers the minimum value of the client's Receive-Maximum setting and the server's max_inflight configuration as the limit for the number of inflight (unacknowledged) messages permitted. Previously, the determined value was not sent back to the client in the CONNACK message.

  • #12500 The GET /clients and GET /client/:clientid HTTP APIs have been updated to include disconnected persistent sessions in their responses.

    NOTE: A current known issue with these enhanced API responses is that the total client count provided may exceed the actual number of clients due to the inclusion of disconnected sessions.

  • #12513 Changed the level of several flooding log events from warning to info.

  • #12530 Improved the error reporting for frame_too_large events and malformed CONNECT packet parsing failures. These updates now provide additional information, aiding in the troubleshooting process.

  • #12541 Introduced a new configuration validation step for autocluster by DNS records to ensure compatibility between node.name and cluster.discover_strategy. Specifically, when utilizing the dns strategy with either a or aaaa record types, it is mandatory for all nodes to use a (static) IP address as the host name.

  • #12562 Added a new configuration root: durable_storage. This configuration tree contains the settings related to the new persistent session feature.

  • #12566 Enhanced the bootstrap file for REST API keys:

    • Empty lines within the file are now skipped, eliminating the previous behavior of generating an error.

    • API keys specified in the bootstrap file are assigned the highest precedence. In cases where a new key from the bootstrap file conflicts with an existing key, the older key will be automatically removed to ensure that the bootstrap keys take effect without issue.

  • #12646 Fixed an issue with the rule engine's date-time string parser. Previously, time zone adjustments were only effective for date-time strings specified with second-level precision.

  • #12652 Fixed a discrepancy where the subbits functions with 4 and 5 parameters, despite being documented, were missing from the actual implementation. These functions have now been added.

  • #12663 Fixed an issue where the emqx_vm_cpu_use and emqx_vm_cpu_idle metrics, accessible via the Prometheus endpoint /prometheus/stats, were inaccurately reflecting the average CPU usage since the operating system boot. This fix ensures that these metrics now accurately represent the current CPU usage and idle, providing more relevant and timely data for monitoring purposes.

  • #12668 Refactored the SQL function date_to_unix_ts() by using calendar:datetime_to_gregorian_seconds/1. This change also added validation for the input date format.

  • #12672 Changed the process for generating the node boot configuration by incorporating the loading of {data_dir}/configs/cluster.hocon. Previously, changes to logging configurations made via the Dashboard and saved in {data_dir}/configs/cluster.hocon were only applied after the initial boot configuration was generated using etc/emqx.conf, leading to potential loss of some log segment files due to late reconfiguration.

    Now, both {data_dir}/configs/cluster.hocon and etc/emqx.conf are loaded concurrently, with settings from emqx.conf taking precedence, to create the boot configuration.

  • #12696 Fixed an issue where attempting to reconnect an action or source could lead to wrong error messages being returned in the HTTP API.

  • #12714 Fixed inaccuracies in several metrics reported by the /prometheus/stats endpoint of the Prometheus API. The correction applies to the following metrics:

    • emqx_cluster_sessions_count
    • emqx_cluster_sessions_max
    • emqx_cluster_nodes_running
    • emqx_cluster_nodes_stopped
    • emqx_subscriptions_shared_count
    • emqx_subscriptions_shared_max

    Additionally, this fix rectified an issue within the /stats endpoint concerning the subscriptions.shared.count and subscriptions.shared.max fields. Previously, these values failed to update promptly following a client's disconnection or unsubscription from a Shared-Subscription.

  • #12715 Fixed a crash that could occur during configuration updates if the connector for the ingress data integration source had active channels.

  • #12740 Fixed an issue when durable sessions could not be kicked out.

  • #12768 Addressed a startup failure issue in EMQX version 5.4.0 and later, particularly noted during rolling upgrades from versions before 5.4.0. The issue was related to the initialization of the routing schema when both v1 and v2 routing tables were empty.

    The node now attempts to retrieve the routing schema version in use across the cluster instead of using the v2 routing table by default when local routing tables are found empty at startup. This approach mitigates potential conflicts and reduces the chances of diverging routing storage schemas among cluster nodes, especially in a mixed-version cluster scenario.

    If conflict is detected in a running cluster, EMQX writes instructions on how to manually resolve it in the log as part of the error message with critical severity. The same error message and instructions will also be written on standard error to make sure this message will not get lost even if no log handler is configured.

  • #12786 Added a strict check that prevents replicant nodes from connecting to core nodes running with a different version of EMQX application. This check ensures that during the rolling upgrades, the replicant nodes can only work when at least one core node is running the same EMQX release version.

Breaking changes

  • #12576 Starting from 5.6, the "Configuration Manual" document will no longer include the bridges config root.

    A bridge is now either action + connector for egress data integration, or source + connector for ingress data integration. Please note that the bridges config (in cluster.hocon) and the REST API path api/v5/bridges still works, but considered deprecated.

  • #12634 Triple-quote string values in HOCON config files no longer support escape sequence.

    The detailed information can be found in this pull request. Here is a summary of the impact on EMQX users:

    • EMQX 5.6 is the first version to generate triple-quote strings in cluster.hocon, meaning for generated configs, there is no compatibility issue.
    • For user hand-crafted configs (such as emqx.conf) a thorough review is needed to inspect if escape sequences are used (such as \n, \r, \t and \\), if yes, such strings should be changed to regular quotes (one pair of ") instead of triple-quotes.

e5.6.0

3 months ago

Enhancements

  • #12326 Enhanced session tracking with registration history. EMQX now has the capability to monitor the history of session registrations, including those that have expired. By configuring broker.session_history_retain, EMQX retains records of expired sessions for a specified duration.

    • Session count API: Use the API GET /api/v5/sessions_count?since=1705682238 to obtain a count of sessions across the cluster that remained active since the given UNIX epoch timestamp (with seconds precision). This enhancement aids in analyzing session activity over time.

    • Metrics expansion with cluster sessions gauge: A new gauge metric, cluster_sessions, is added to better track the number of sessions within the cluster. This metric is also integrated into Prometheus for easy monitoring:

      # TYPE emqx_cluster_sessions_count gauge
      emqx_cluster_sessions_count 1234
      

      NOTE: Please consider this metric as an approximate estimation. Due to the asynchronous nature of data collection and calculation, exact precision may vary.

  • #12398 Exposed the swagger_support option in the Dashboard configuration, allowing for the enabling or disabling of the Swagger API documentation.

  • #12467 Started supporting cluster discovery using AAAA DNS record type.

  • #12483 Renamed emqx ctl conf cluster_sync tnxid ID to emqx ctl conf cluster_sync inspect ID.

    For backward compatibility, tnxid is kept, but considered deprecated and will be removed in 5.7.

  • #12495 Introduced new AWS S3 connector and action.

  • #12499 Enhanced client banning capabilities with extended rules, including:

    • Matching clientid against a specified regular expression.
    • Matching client's username against a specified regular expression.
    • Matching client's peer address against a CIDR range.

    Important Notice: Implementing a large number of broad matching rules (not specific to an individual clientid, username, or host) may affect system performance. It's advised to use these extended ban rules judiciously to maintain optimal system efficiency.

  • #12509 Implemented API to re-order all authenticators / authorization sources.

  • #12517 Configuration files have been upgraded to accommodate multi-line string values, preserving indentation for enhanced readability and maintainability. This improvement utilizes """~ and ~""" markers to quote indented lines, offering a structured and clear way to define complex configurations. For example:

    rule_xlu4 {
      sql = """~
        SELECT
          *
        FROM
          "t/#"
      ~"""
    }
    

    See HOCON 0.42.0 release notes for details.

  • #12520 Implemented log throttling. The feature reduces the volume of logged events that could potentially flood the system by dropping all but the first occurance of an event within a configured time window. Log throttling is applied to the following log events that are critical yet prone to repetition:

    • authentication_failure
    • authorization_permission_denied
    • cannot_publish_to_topic_due_to_not_authorized
    • cannot_publish_to_topic_due_to_quota_exceeded
    • connection_rejected_due_to_license_limit_reached
    • dropped_msg_due_to_mqueue_is_full
  • #12561 Implemented HTTP APIs to get the list of client's in-flight and message queue (mqueue) messages. These APIs facilitate detailed insights and effective control over message queues and in-flight messaging, ensuring efficient message handling and monitoring.

    To get the first chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100
    • GET /clients/{clientid}/inflight_messages?limit=100

    Alternatively, for the first chunks without specifying a start position:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position=none
    • GET /clients/{clientid}/inflight_messages?limit=100&position=none

    To get the next chunk of data:

    • GET /clients/{clientid}/mqueue_messages?limit=100&position={position}
    • GET /clients/{clientid}/inflight_messages?limit=100&position={position}

    Where {position} is a value (opaque string token) of meta.position field from the previous response.

    Ordering and Prioritization:

    • Mqueue Messages: These are prioritized and sequenced based on their queue order (FIFO), from higher to lower priority. By default, mqueue messages carry a uniform priority level of 0.
    • In-Flight Messages: Sequenced by the timestamp of their insertion into the in-flight storage, from oldest to newest.
  • #12590 Removed mfa meta data from log messages to improve clarity.

  • #12641 Improved text log formatter fields order. The new fields order is as follows:

    tag > clientid > msg > peername > username > topic > [other fields]

  • #12670 Added field shared_subscriptions to endpoint /monitor_current and /monitor_current/nodes/:node.

  • #12679 Upgraded docker image base from Debian 11 to Debian 12.

  • #12700 Started supporting "b" and "B" unit in bytesize hocon fields.

    For example, all three fields below will have the value of 1024 bytes:

    bytesize_field = "1024b"
    bytesize_field2 = "1024B"
    bytesize_field2 = 1024
    
  • #12719 The /clients API has been upgraded to accommodate queries for multiple clientids and usernames simultaneously, offering a more flexible and powerful tool for monitoring client connections. Additionally, this update introduces the capability to customize which client information fields are included in the API response, optimizing for specific monitoring needs.

    Examples of Multi-Client/Username Queries:

    • To query multiple clients by ID: /clients?clientid=client1&clientid=client2
    • To query multiple users: /clients?username=user11&username=user2
    • To combine multiple client IDs and usernames in one query: /clients?clientid=client1&clientid=client2&username=user1&username=user2

    Examples of Selecting Fields for the Response:

    • To include all fields in the response: /clients?fields=all (Note: Omitting the fields parameter defaults to returning all fields.)
    • To specify only certain fields: /clients?fields=clientid,username
  • #12330 The Cassandra bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12353 The OpenTSDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12376 The Kinesis bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12386 The GreptimeDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12423 The RabbitMQ bridge has been split into connector, action and source components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12425 The ClickHouse bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12439 The Oracle bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12449 The TDEngine bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12488 The RocketMQ bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12512 The HStreamDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically, however, it is recommended to do the upgrade manually as new fields have been added to the configuration.

  • #12543 The DynamoDB bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12595 The Kafka Consumer bridge has been split into connector and source components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12619 The Microsoft SQL Server bridge has been split into connector and action components. They are backwards compatible with the bridge HTTP API. Configuration will be upgraded automatically.

  • #12381 Added new SQL functions: map_keys(), map_values(), map_to_entries(), join_to_string(), join_to_string(), join_to_sql_values_string(), is_null_var(), is_not_null_var().

    For more information on the functions and their usage, refer to Built-in SQL Functions the documentation.

  • #12427 Introduced the capability to specify a limit on the number of Kafka partitions that can be used for Kafka data integration.

  • #12577 Updated the service_account_json field for both the GCP PubSub Producer and Consumer connectors to accept JSON-encoded strings. Now, it's possible to set this field to a JSON-encoded string. Using the previous format (a HOCON map) is still supported but not encouraged.

  • #12581 Added JSON schema to schema registry.

    JSON Schema supports Draft 03, Draft 04 and Draft 05.

  • #12602 Enhanced health checking for IoTDB connector, using its ping API instead of just checking for an existing socket connection.

  • #12336 Refined the approach to managing asynchronous tasks by segregating the cleanup of channels into its own dedicated pool. This separation addresses performance issues encountered during channels cleanup under conditions of high network latency, ensuring that such tasks do not impede the efficiency of other asynchronous operations, such as route cleanup.

  • #12725 Implemented REST API to list the available source types.

  • #12746 Added username log field. If MQTT client is connected with a non-empty username the logs and traces will include username field.

  • #12785 Added timestamp_format configuration option to log handlers. This new option allows for the following settings:

    • auto: Automatically determines the timestamp format based on the log formatter being used. Utilizes rfc3339 format for text formatters, and epoch format for JSON formatters.

    • epoch: Represents timestamps in microseconds precision Unix epoch format.

    • rfc3339: Uses RFC3339 compliant format for date-time strings. For example, 2024-03-26T11:52:19.777087+00:00.

Bug Fixes

  • #11868 Fixed a bug where will messages were not published after session takeover.

  • #12347 Implemented an update to ensure that messages processed by the Rule SQL for the MQTT egress data bridge are always rendered as valid, even in scenarios where the data is incomplete or lacks certain placeholders defined in the bridge configuration. This adjustment prevents messages from being incorrectly deemed invalid and subsequently discarded by the MQTT egress data bridge, as was the case previously.

    When variables in payload and topic templates are undefined, they are now rendered as empty strings instead of the literal undefined string.

  • #12472 Fixed an issue where certain read operations on /api/v5/actions/ and /api/v5/sources/ endpoints might result in a 500 error code during the process of rolling upgrades.

  • #12492 EMQX now returns the Receive-Maximum property in the CONNACK message for MQTT v5 clients, aligning with protocol expectations. This implementation considers the minimum value of the client's Receive-Maximum setting and the server's max_inflight configuration as the limit for the number of inflight (unacknowledged) messages permitted. Previously, the determined value was not sent back to the client in the CONNACK message.

    NOTE: A current known issue with these enhanced API responses is that the total client count provided may exceed the actual number of clients due to the inclusion of disconnected sessions.

  • #12505 Upgraded the Kafka producer client wolff from version 1.10.1 to 1.10.2. This latest version maintains a long-lived metadata connection for each connector, optimizing EMQX's performance by reducing the frequency of establishing new connections for action and connector health checks.

  • #12513 Changed the level of several flooding log events from warning to info.

  • #12530 Improved the error reporting for frame_too_large events and malformed CONNECT packet parsing failures. These updates now provide additional information, aiding in the troubleshooting process.

  • #12541 Introduced a new configuration validation step for autocluster by DNS records to ensure compatibility between node.name and cluster.discover_strategy. Specifically, when utilizing the dns strategy with either a or aaaa record types, it is mandatory for all nodes to use a (static) IP address as the host name.

  • #12566 Enhanced the bootstrap file for REST API keys:

    • Empty lines within the file are now skipped, eliminating the previous behavior of generating an error.

    • API keys specified in the bootstrap file are assigned the highest precedence. In cases where a new key from the bootstrap file conflicts with an existing key, the older key will be automatically removed to ensure that the bootstrap keys take effect without issue.

  • #12646 Fixed an issue with the rule engine's date-time string parser. Previously, time zone adjustments were only effective for date-time strings specified with second-level precision.

  • #12652 Fixed a discrepancy where the subbits functions with 4 and 5 parameters, despite being documented, were missing from the actual implementation. These functions have now been added.

  • #12663 Fixed an issue where the emqx_vm_cpu_use and emqx_vm_cpu_idle metrics, accessible via the Prometheus endpoint /prometheus/stats, were inaccurately reflecting the average CPU usage since the operating system boot. This fix ensures that these metrics now accurately represent the current CPU usage and idle, providing more relevant and timely data for monitoring purposes.

  • #12668 Refactored the SQL function date_to_unix_ts() by using calendar:datetime_to_gregorian_seconds/1. This change also added validation for the input date format.

  • #12672 Changed the process for generating the node boot configuration by incorporating the loading of {data_dir}/configs/cluster.hocon. Previously, changes to logging configurations made via the Dashboard and saved in {data_dir}/configs/cluster.hocon were only applied after the initial boot configuration was generated using etc/emqx.conf, leading to potential loss of some log segment files due to late reconfiguration.

    Now, both {data_dir}/configs/cluster.hocon and etc/emqx.conf are loaded concurrently, with settings from emqx.conf taking precedence, to create the boot configuration.

  • #12696 Fixed an issue where attempting to reconnect an action or source could lead to wrong error messages being returned in the HTTP API.

  • #12714 Fixed inaccuracies in several metrics reported by the /prometheus/stats endpoint of the Prometheus API. The correction applies to the following metrics:

    • emqx_cluster_sessions_count
    • emqx_cluster_sessions_max
    • emqx_cluster_nodes_running
    • emqx_cluster_nodes_stopped
    • emqx_subscriptions_shared_count
    • emqx_subscriptions_shared_max

    Additionally, this fix rectified an issue within the /stats endpoint concerning the subscriptions.shared.count and subscriptions.shared.max fields. Previously, these values failed to update promptly following a client's disconnection or unsubscription from a Shared-Subscription.

  • #12390 Fixed an issue where the /license API request may crash during cluster joining processes.

  • #12411 Fixed a bug where null values would be inserted as 1853189228 in int columns in Cassandra data integration.

  • #12522 Refined the parsing process for Kafka bootstrap hosts to exclude spaces following commas, addressing connection timeouts and DNS resolution failures due to malformed host entries.

  • #12656 Implemented a topic verification step for creating GCP PubSub Producer actions, ensuring failure notifications when the topic doesn't exist or provided credentials lack sufficient permissions.

  • #12678 Enhanced the DynamoDB connector to clearly report the reason for connection failures, improving upon the previous lack of error insights.

  • #12681 Fixed a security issue where secrets could be logged at debug level when sending messages to a RocketMQ bridge/action.

  • #12715 Fixed a crash that could occur during configuration updates if the connector for the ingress data integration source had active channels.

  • #12767 Fixed issues encountered during upgrades from 5.0.1 to 5.5.1, specifically related to Kafka Producer configurations that led to upgrade failures. The correction ensures that Kafka Producer configurations are accurately transformed into the new action and connector configuration format required by EMQX version 5.5.1 and beyond.

  • #12768 Addressed a startup failure issue in EMQX version 5.4.0 and later, particularly noted during rolling upgrades from versions before 5.4.0. The issue was related to the initialization of the routing schema when both v1 and v2 routing tables were empty.

    The node now attempts to retrieve the routing schema version in use across the cluster instead of using the v2 routing table by default when local routing tables are found empty at startup. This approach mitigates potential conflicts and reduces the chances of diverging routing storage schemas among cluster nodes, especially in a mixed-version cluster scenario.

    If conflict is detected in a running cluster, EMQX writes instructions on how to manually resolve it in the log as part of the error message with critical severity. The same error message and instructions will also be written on standard error to make sure this message will not get lost even if no log handler is configured.

  • #12786 Added a strict check that prevents replicant nodes from connecting to core nodes running with a different version of EMQX application. This check ensures that during the rolling upgrades, the replicant nodes can only work when at least one core node is running the same EMQX release version.

Breaking changes

  • #12576 Starting from 5.6, the "Configuration Manual" document will no longer include the bridges config root.

    A bridge is now either action + connector for egress data integration, or source + connector for ingress data integration. Please note that the bridges config (in cluster.hocon) and the REST API path api/v5/bridges still works, but considered deprecated.

  • #12634 Triple-quote string values in HOCON config files no longer support escape sequence.

    The detailed information can be found in this pull request. Here is a summary of the impact on EMQX users:

    • EMQX 5.6 is the first version to generate triple-quote strings in cluster.hocon, meaning for generated configs, there is no compatibility issue.
    • For user hand-crafted configs (such as emqx.conf) a thorough review is needed to inspect if escape sequences are used (such as \n, \r, \t and \\), if yes, such strings should be changed to regular quotes (one pair of ") instead of triple-quotes.

e5.5.1

3 months ago

Enhancements

  • #12497 Improved MongoDB connector performance, resulting in more efficient database interactions. This enhancement is supported by improvements in the MongoDB Erlang driver as well (mongodb-erlang PR).

Bug Fixes

  • #12471 Fixed an issue that data integration configurations failed to load correctly during upgrades from EMQX version 5.0.2 to newer releases.

  • #12598 Fixed an issue that users were unable to subscribe to or unsubscribe from shared topic filters via HTTP API.

    The affected APIs include:

    • /clients/:clientid/subscribe

    • /clients/:clientid/subscribe/bulk

    • /clients/:clientid/unsubscribe

    • /clients/:clientid/unsubscribe/bulk

  • #12601 Fixed an issue where logs of the LDAP driver were not being captured. Now, all logs are recorded at the info level.

  • #12606 The Prometheus API experienced crashes when the specified SSL certificate file did not exist in the given path. Now, when an SSL certificate file is missing, the emqx_cert_expiry_at metric will report a value of 0, indicating the non-existence of the certificate.

  • #12608 Fixed a function_clause error in the IoTDB action caused by the absence of a payload field in query data.

  • #12610 Fixed an issue where connections to the LDAP connector could unexpectedly disconnect after a certain period of time.

  • #12620 Redacted sensitive information in HTTP headers to exclude authentication and authorization credentials from debug level logs in the HTTP Server connector, mitigating potential security risks.

  • #12632 Fixed an issue where the rule engine's SQL built-in function date_to_unix_ts produced incorrect results for dates starting from March 1st on leap years.

v5.5.1

3 months ago

5.5.1

Bug Fixes

  • #12471 Fixed an issue that data integration configurations failed to load correctly during upgrades from EMQX version 5.0.2 to newer releases.

  • #12598 Fixed an issue that users were unable to subscribe to or unsubscribe from shared topic filters via HTTP API.

    The affected APIs include:

    • /clients/:clientid/subscribe

    • /clients/:clientid/subscribe/bulk

    • /clients/:clientid/unsubscribe

    • /clients/:clientid/unsubscribe/bulk

  • #12601 Fixed an issue where logs of the LDAP driver were not being captured. Now, all logs are recorded at the info level.

  • #12606 The Prometheus API experienced crashes when the specified SSL certificate file did not exist in the given path. Now, when an SSL certificate file is missing, the emqx_cert_expiry_at metric will report a value of 0, indicating the non-existence of the certificate.

  • #12620 Redacted sensitive information in HTTP headers to exclude authentication and authorization credentials from debug level logs in the HTTP Server connector, mitigating potential security risks.

  • #12632 Fixed an issue where the rule engine's SQL built-in function date_to_unix_ts produced incorrect timestamp results for dates starting from March 1st on leap years.

e5.6.0-alpha.2

3 months ago

v5.5.1-rc.1

4 months ago