Altinity Stable Build for ClickHouse 21.3.13.9 Release Notes
A few months ago we certified ClickHouse 21.1 as an Altinity Stable release. Since then, we have worked on newer releases and run them in-house. We completed several new features, and even more have been added by community contributors. We were running ClickHouse 21.3 to power our public datasets at Altinity.Cloud instance and testing it in our environments. As of 21.3.13.9 we are confident in certifying 21.3 as an Altinity Stable release.
This release is a relatively small upgrade since the previous Altinity Stable release. But even a small upgrade includes 742 pull requests from 133 contributors; these mostly focused on performance and stability improvements. There are several important new features as well! Please look below for detailed release notes.
Major new features since the previous stable release 21.1.x
A new release introduces a lot of changes and new features. These are organized by feature in the Appendix, so refer to this section for more detail. The following new features are worth mentioning on the front page:
- SQL features:
- Window functions! See our blog article for an introduction.
- Support multiple levels nesting in the Nested
datatype
.
- Security features:
- LDAP roles mapping a).
- Apply row-level security as a separate
PREWHERE
step a). - Server side keys support for S3 table function a).
- Cluster improvements:
- Hedged requests! That allows to reduce tail latencies on large clusters by running the same query at different replicas. This is controlled by
use_hedged_requests
andmax_parallel_replicas
settings. - Allow inserts into a specific shard via a distributed table. This is controlled by the
insert_shard_id
setting. - Allow inserts into
cluster()
table function and specify a sharding key. - Replicated database engine (experimental). Allows to replicate DDL statements across the cluster. It is not production ready in this release.
- Hedged requests! That allows to reduce tail latencies on large clusters by running the same query at different replicas. This is controlled by
- MergeTree features:
- Table level concurrency control. Can be controlled by the
max_concurrent_queries
merge tree setting.
- Table level concurrency control. Can be controlled by the
- Integrations:
- PostgreSQL table engine, table function and dictionary source.
As usual with ClickHouse, there are many performance and operational improvements in different server components.
a)- contributed by Altinity developers.
Backward Incompatible Changes
The following changes are backward incompatible and require user attention during an upgrade:
- It is no longer allowed to create
MergeTree
tables in the old syntax with tableTTL
; this syntax is now ignored. Attaching old tables is still possible. - Floating point columns and values are disallowed in:
- Partitioning key – these can be turned back on with the
allow_floating_point_partition_key
merge tree setting. - Bitwise functions.
- Partitioning key – these can be turned back on with the
- Excessive parenthesis in type definitions are no longer supported, example:
Array((UInt8))
.
Upgrade Notes
There were several changes between versions that may affect the rolling upgrade of big clusters. Upgrading only part of the cluster is not recommended.
- Now replicas that are processing the
ALTER TABLE ATTACH PART[ITION]
command search theirdetached/
folders before fetching the data from other replicas. As an implementation detail, a new commandATTACH_PART
is introduced in the replicated log. Parts are searched and compared by their checksums.ATTACH PART[ITION]
queries may not work during cluster upgrade.
Notice
It’s not possible to rollback to an older ClickHouse version after executingALTER ... ATTACH
query in the new version as the old servers would fail to pass the ATTACH_PART
entry in the replicated log.
- If you are upgrading RPM packages from https://packagecloud.io/Altinity/clickhouse-altinity-stable you will have to switch the repository since the packaging structure has been changed. Please refer to these instructions: https://kb.altinity.com/altinity-kb-setup-and-maintenance/altinity-packaging-compatibility-greater-than-21.x-and-earlier.
- In the release notes for 21.1 we recommended disabling
optimize_on_insert
.This recommendation stays for 21.3 as well as inserts into Replacing, Summing, Collapsing and AggregatingMergeTree can produce empty parts.
Other Important Changes
Some new ClickHouse features are now enabled by default. It may lead to a change in behaviour, so review those carefully and disable features that may affect your system:
distributed_aggregation_memory_efficient
enable_global_with_statement
optimize_normalize_count_variants
optimize_respect_aliases
optimize_rewrite_sum_if_to_count_if
Known issues in 21.3.13.9
The development team continues to improve the quality of the 21.3 release. The following issues still exist in the 21.3.13.9 version and may affect ClickHouse operation. Please inspect them carefully to decide if those are applicable to your applications:
- Replication may get stuck due to clear empty parts in rare cases. This bug existed in 20.8 and 21.1 versions as well: https://github.com/ClickHouse/ClickHouse/issues/23292 (Fixed in 21.7).
- Queries to a Merge engine table with a
JOIN
may be very slow due to not respected partition and primary keys. The bug existed in 20.8 and 21.1 versions as well: https://github.com/ClickHouse/ClickHouse/issues/22226 (fixed in 21.5). - Queries with multiple
WITH
andJOIN
clauses may run significantly slower that in previous releases: https://github.com/ClickHouse/ClickHouse/issues/21067. - Skipping indices are not used in rare cases leading to increased query time: https://github.com/ClickHouse/ClickHouse/issues/20346.
ClickHouse Altinity Stable releases are based on the community versions. Community builds can be downloaded from repo.clickhouse.tech.
ClickHouse Altinity builds may be found at:
- Deb packages: https://packagecloud.io/Altinity/altinity-stable
- RPM packages: https://packagecloud.io/Altinity/altinity-stable
- Docker image: altinity/clickhouse-server:21.3.13.9.altinity_stable
Please contact us at info@altinity.com if you experience any issues with the upgrade.
Appendix
New functions
- DateTime functions:
- Compatibility with MySQL:
DATE, DAY, DAYOFMONTH, DAYOFWEEK, DAYOFYEAR, HOUR, MINUTE, MONTH, QUARTER, SECOND, YEAR
parseDateTimeBestEffortUSOrNull
parseDateTimeBestEffortUSOrZero
timezoneOffset
- Compatibility with MySQL:
- String functions:
decodeXMLComponent, extractTextFromHTML
- Map functions:
mapContains, mapKeys, mapValues
- Aggregate functions
- New combinator
-SimpleState
deltaSum
runningConcurrency
- New combinator
- Functions to work with new Map data type:
map
- Math/Statistics:
sign
- Type related:
reinterpret, reinterpretAsInt128, reinterpretAsInt256, reinterpretAsUInt256
- Internal ClickHouse:
connectionID, connection_id
normalizeQueryKeepNames, normalizedQueryHashKeepNames
New table functions
postgresql
file
New table engines
PostgreSQL
New metrics and events
system.metrics.
- PartsCommitted
- PartsCompact
- PartsDeleteOnDestroy
- PartsDeleting
- PartsInMemory
- PartsOutdated
- PartsPreCommitted
- PartsTemporary
- PartsWide
system.events:
- DistributedDelayedInserts
- DistributedDelayedInsertsMilliseconds
- DistributedRejectedInserts
- HedgedRequestsChangeReplica
New system tables
- distributed_ddl_queue
New columns in system tables
- metric_log.
- ProfileEvent_DistributedDelayedInserts, ProfileEvent_DistributedRejectedInserts, ProfileEvent_DistributedDelayedInsertsMilliseconds, ProfileEvent_HedgedRequestsChangeReplica, CurrentMetric_PartsTemporary, CurrentMetric_PartsPreCommitted, CurrentMetric_PartsCommitted, CurrentMetric_PartsOutdated, CurrentMetric_PartsDeleting, CurrentMetric_PartsDeleteOnDestroy, CurrentMetric_PartsWide, CurrentMetric_PartsCompact, CurrentMetric_PartsInMemory
- Parts:
- group_by_ttl_info.expression, group_by_ttl_info.min, group_by_ttl_info.max, rows_where_ttl_info.expression, rows_where_ttl_info.min, rows_where_ttl_info.max
- processes.http_referer
- query_log:
- http_referer, log_comment, used_aggregate_functions, used_aggregate_function_combinators, used_database_engines, used_data_type_families, used_dictionaries, used_formats, used_functions, used_storages, used_table_functions
- query_thread_log.http_referer
- quota_limits:
- max_query_selects, max_query_inserts
- quota_usage:
- query_selects, max_query_selects, query_inserts, max_query_inserts
- quotas_usage:
- query_selects, max_query_selects, query_inserts, max_query_inserts
System.merge_tree_settings added/changed
Name | Old value | New value | Description |
---|---|---|---|
allow_floating_point_partition_key | 0 | Allow floating point as partition key. | |
inactive_parts_to_delay_insert | 0 | If table contains at least that many inactive parts in single partition, artificially slow down insert into table. | |
inactive_parts_to_throw_insert | 0 | If more than this number of inactive parts are in a single partition, throw the ‘Too many inactive parts … ’ exception. |
|
max_concurrent_queries | 0 | Max number of concurrently executed queries related to the MergeTree table (0 - disabled). Queries will still be limited by other max_concurrent_queries settings. |
|
min_marks_to_honor_max_concurrent_queries | 0 | Minimal number of marks to honor the MergeTree-level’s max_concurrent_queries (0 - disabled). Queries will still be limited by other max_concurrent_queries settings. |
system.settings added/changed
Name | Old value | New value | Description |
---|---|---|---|
allow_changing_replica_until_first_data_packet | 0 | Allow HedgedConnections to change the replica until receiving the first data packet. | |
allow_experimental_database_replicated | 0 | Allows creating databases with the Replicated engine. | |
allow_experimental_query_deduplication | 0 | Allows sending parts’ UUIDs for a query in order to deduplicate data parts, if any. | |
async_socket_for_remote | 1 | 0 | Asynchronously read from socket executing remote query. |
background_fetches_pool_size | 3 | 8 | Number of threads performing background fetches for replicated tables. Only has meaning at server startup. |
checksum_on_read | 1 | Validates checksums on reading. It is enabled by default and should be always enabled in production. Please do not expect any benefits in disabling this setting. It may only be used for experiments and benchmarks. These settings are only applicable for tables of the MergeTree family. Checksums are always validated for other table engines and when receiving data over the network. | |
database_replicated_ddl_output | 1 | Returns table with query execution status as a result of a DDL query. | |
database_replicated_initial_query_timeout_sec | 300 | How long an initial DDL query should wait for a Replicated database to process previous DDL queue entries. | |
distributed_aggregation_memory_efficient | 0 | 1 | The memory-saving mode of distributed aggregation is enabled. |
enable_global_with_statement | 0 | 1 | Propagate WITH statements to UNION queries and all subqueries. |
engine_file_empty_if_not_exists | 0 | Allows selecting data from a file engine table without a file. | |
engine_file_truncate_on_insert | 0 | Enables or disables truncate before insert in file engine tables. | |
flatten_nested | 1 | If true, columns of type Nested will be flatten to separate array columns instead of one array of tuples. |
|
hedged_connection_timeout_ms | 100 | Connection timeout for establishing connection with replica for Hedged requests. | |
insert_in_memory_parts_timeout | 600000 | REMOVED. | |
insert_shard_id | 0 | If non zero, when inserting into a distributed table, the data will be inserted into the shard insert_shard_id synchronously. Possible values range from 1 to shards_number of the corresponding distributed table. |
|
log_comment | Log comment into system.query_log table and server log. It can be set to an arbitrary string no longer than max_query_size . |
||
normalize_function_names | 0 | Normalize function names to their canonical names. | |
optimize_normalize_count_variants | 1 | Rewrite aggregate functions that semantically equals count() as count() . |
|
optimize_respect_aliases | 1 | If it is set to true, it will respect aliases in WHERE/GROUP BY/ORDER BY , that will help with partition pruning/secondary indexes/optimize_aggregation_in_order/optimize_read_in_order/optimize_trivial_count . |
|
optimize_rewrite_sum_if_to_count_if | 1 | Rewrite sumIf() and sum(if()) function countIf() function when logically equivalent. |
|
periodic_live_view_refresh | 60 | Interval after which a periodically refreshed live view is forced to refresh. | |
query_plan_max_optimizations_to_apply | 10000 | Limit the total number of optimizations applied to the query plan. If zero, ignored. If the limit is reached, throw an exception. | |
receive_data_timeout_ms | 2000 | Connection timeout for receiving the first packet of data or packet with positive progress from replica. | |
s3_max_connections | 1024 | The maximum number of connections per server. | |
sleep_in_send_data | 0 | Time to sleep in sending data in TCPHandler . |
|
sleep_in_send_tables_status | 0 | Time to sleep in sending tables the status response in TCPHandler . |
|
unknown_packet_in_send_data | 0 | Sends unknown packet instead of N-th data packet. |
|
use_hedged_requests | 0 | Use hedged requests for distributed queries. |
Also, please refer to the release notes from the development team available at the following URLs:
- 21.2 release notes: https://clickhouse.tech/docs/en/whats-new/changelog/#clickhouse-release-21-2
- 21.3 release notes: https://clickhouse.tech/docs/en/whats-new/changelog/#clickhouse-release-21-3-lts