A few months ago we certified ClickHouse 21.1 as an Altinity Stable release. Since then, we have worked on newer releases and run them in-house. We completed several new features, and even more have been added by community contributors. We were running ClickHouse 21.3 to power our public datasets at Altinity.Cloud instance and testing it in our environments. As of 21.3.13.9 we are confident in certifying 21.3 as an Altinity Stable release.
This release is a relatively small upgrade since the previous Altinity Stable release. But even a small upgrade includes 742 pull requests from 133 contributors; these mostly focused on performance and stability improvements. There are several important new features as well! Please look below for detailed release notes; read these carefully before upgrading. There are additional notes for point releases.
- Detailed release notes and upgrade instructions for version 21.3.13.9 - Read these first!
- 21.3.20.2 - This is the latest release of version 21.3. We recommend that you upgrade to this release. (But read the detailed release notes and upgrade instructions first.)
- 21.3.17.3
Major new features since the previous stable release 21.1.x
Released 2021-06-29
A new release introduces a lot of changes and new features. These are organized by feature in the Appendix, so refer to this section for more detail. The following new features are worth mentioning on the front page:
- SQL features:
- Window functions! See our blog article for an introduction.
- Support multiple levels nesting in the Nested
datatype
.
- Security features:
- LDAP roles mapping a).
- Apply row-level security as a separate
PREWHERE
step a). - Server side keys support for S3 table function a)>.
- Cluster improvements:
- Hedged requests! That allows to reduce tail latencies on large clusters by running the same query at different replicas. This is controlled by
use_hedged_requests
andmax_parallel_replicas
settings. - Allow inserts into a specific shard via a distributed table. This is controlled by the
insert_shard_id
setting. - Allow inserts into
cluster()
table function and specify a sharding key. - Replicated database engine (experimental). Allows to replicate DDL statements across the cluster. It is not production ready in this release.
- Hedged requests! That allows to reduce tail latencies on large clusters by running the same query at different replicas. This is controlled by
- MergeTree features:
- Table level concurrency control. Can be controlled by the
max_concurrent_queries
merge tree setting.
- Table level concurrency control. Can be controlled by the
- Integrations:
- PostgreSQL table engine, table function and dictionary source.
As usual with ClickHouse, there are many performance and operational improvements in different server components.
a) Contributed by Altinity developers.
Backward Incompatible Changes
The following changes are backward incompatible and require user attention during an upgrade:
- It is no longer allowed to create
MergeTree
tables in the old syntax with tableTTL
; this syntax is now ignored. Attaching old tables is still possible. - Floating point columns and values are disallowed in:
- Partitioning key – these can be turned back on with the
allow_floating_point_partition_key
merge tree setting. - Bitwise functions.
- Partitioning key – these can be turned back on with the
- Excessive parenthesis in type definitions are no longer supported, example:
Array((UInt8))
.
Upgrade Notes
There were several changes between versions that may affect the rolling upgrade of big clusters. Upgrading only part of the cluster is not recommended.
- Now replicas that are processing the
ALTER TABLE ATTACH PART[ITION]
command search theirdetached/
folders before fetching the data from other replicas. As an implementation detail, a new commandATTACH_PART
is introduced in the replicated log. Parts are searched and compared by their checksums.ATTACH PART[ITION]
queries may not work during cluster upgrade.
Notice
It’s not possible to rollback to an older ClickHouse version after executingALTER ... ATTACH
query in the new version as the old servers would fail to pass the ATTACH_PART
entry in the replicated log.
- If you are upgrading RPM packages from https://packagecloud.io/Altinity/clickhouse-altinity-stable you will have to switch the repository since the packaging structure has been changed. Please refer to these instructions: https://kb.altinity.com/altinity-kb-setup-and-maintenance/altinity-packaging-compatibility-greater-than-21.x-and-earlier.
- In the release notes for 21.1 we recommended disabling
optimize_on_insert
.This recommendation stays for 21.3 as well as inserts into Replacing, Summing, Collapsing and AggregatingMergeTree can produce empty parts.
Other Important Changes
Some new ClickHouse features are now enabled by default. It may lead to a change in behaviour, so review those carefully and disable features that may affect your system:
distributed_aggregation_memory_efficient
enable_global_with_statement
optimize_normalize_count_variants
optimize_respect_aliases
optimize_rewrite_sum_if_to_count_if
Known issues in 21.3.13.9
The development team continues to improve the quality of the 21.3 release. The following issues still exist in the 21.3.13.9 version and may affect ClickHouse operation. Please inspect them carefully to decide if those are applicable to your applications:
- Replication may get stuck due to clear empty parts in rare cases. This bug existed in 20.8 and 21.1 versions as well: https://github.com/ClickHouse/ClickHouse/issues/23292 (Fixed in 21.7).
- Queries to a Merge engine table with a
JOIN
may be very slow due to not respected partition and primary keys. The bug existed in 20.8 and 21.1 versions as well: https://github.com/ClickHouse/ClickHouse/issues/22226 (fixed in 21.5). - Queries with multiple
WITH
andJOIN
clauses may run significantly slower that in previous releases: https://github.com/ClickHouse/ClickHouse/issues/21067. - Skipping indices are not used in rare cases leading to increased query time: https://github.com/ClickHouse/ClickHouse/issues/20346.
ClickHouse Altinity Stable releases are based on the community versions. Community builds can be downloaded from repo.clickhouse.tech.
ClickHouse Altinity builds may be found at:
- Deb packages: https://packagecloud.io/Altinity/altinity-stable
- RPM packages: https://packagecloud.io/Altinity/altinity-stable
- Docker image: altinity/clickhouse-server:21.3.13.9.altinity_stable
Please contact us at info@altinity.com if you experience any issues with the upgrade.
Appendix
New functions
- DateTime functions:
- Compatibility with MySQL:
DATE, DAY, DAYOFMONTH, DAYOFWEEK, DAYOFYEAR, HOUR, MINUTE, MONTH, QUARTER, SECOND, YEAR
parseDateTimeBestEffortUSOrNull
parseDateTimeBestEffortUSOrZero
timezoneOffset
- Compatibility with MySQL:
- String functions:
decodeXMLComponent, extractTextFromHTML
- Map functions:
mapContains, mapKeys, mapValues
- Aggregate functions
- New combinator
-SimpleState
deltaSum
runningConcurrency
- New combinator
- Functions to work with new Map data type:
map
- Math/Statistics:
sign
- Type related:
reinterpret, reinterpretAsInt128, reinterpretAsInt256, reinterpretAsUInt256
- Internal ClickHouse:
connectionID, connection_id
normalizeQueryKeepNames, normalizedQueryHashKeepNames
New table functions
postgresql
file
New table engines
PostgreSQL
New metrics and events
system.metrics.
- PartsCommitted
- PartsCompact
- PartsDeleteOnDestroy
- PartsDeleting
- PartsInMemory
- PartsOutdated
- PartsPreCommitted
- PartsTemporary
- PartsWide
system.events:
- DistributedDelayedInserts
- DistributedDelayedInsertsMilliseconds
- DistributedRejectedInserts
- HedgedRequestsChangeReplica
New system tables
- distributed_ddl_queue
New columns in system tables
- metric_log.
- ProfileEvent_DistributedDelayedInserts, ProfileEvent_DistributedRejectedInserts, ProfileEvent_DistributedDelayedInsertsMilliseconds, ProfileEvent_HedgedRequestsChangeReplica, CurrentMetric_PartsTemporary, CurrentMetric_PartsPreCommitted, CurrentMetric_PartsCommitted, CurrentMetric_PartsOutdated, CurrentMetric_PartsDeleting, CurrentMetric_PartsDeleteOnDestroy, CurrentMetric_PartsWide, CurrentMetric_PartsCompact, CurrentMetric_PartsInMemory
- Parts:
- group_by_ttl_info.expression, group_by_ttl_info.min, group_by_ttl_info.max, rows_where_ttl_info.expression, rows_where_ttl_info.min, rows_where_ttl_info.max
- processes.http_referer
- query_log:
- http_referer, log_comment, used_aggregate_functions, used_aggregate_function_combinators, used_database_engines, used_data_type_families, used_dictionaries, used_formats, used_functions, used_storages, used_table_functions
- query_thread_log.http_referer
- quota_limits:
- max_query_selects, max_query_inserts
- quota_usage:
- query_selects, max_query_selects, query_inserts, max_query_inserts
- quotas_usage:
- query_selects, max_query_selects, query_inserts, max_query_inserts
System.merge_tree_settings added/changed
Name | Old value | New value | Description |
---|---|---|---|
allow_floating_point_partition_key | 0 | Allow floating point as partition key. | |
inactive_parts_to_delay_insert | 0 | If table contains at least that many inactive parts in single partition, artificially slow down insert into table. | |
inactive_parts_to_throw_insert | 0 | If more than this number of inactive parts are in a single partition, throw the ‘Too many inactive parts … ’ exception. |
|
max_concurrent_queries | 0 | Max number of concurrently executed queries related to the MergeTree table (0 - disabled). Queries will still be limited by other max_concurrent_queries settings. |
|
min_marks_to_honor_max_concurrent_queries | 0 | Minimal number of marks to honor the MergeTree-level’s max_concurrent_queries (0 - disabled). Queries will still be limited by other max_concurrent_queries settings. |
system.settings added/changed
Name | Old value | New value | Description |
---|---|---|---|
allow_changing_replica_until_first_data_packet | 0 | Allow HedgedConnections to change the replica until receiving the first data packet. | |
allow_experimental_database_replicated | 0 | Allows creating databases with the Replicated engine. | |
allow_experimental_query_deduplication | 0 | Allows sending parts’ UUIDs for a query in order to deduplicate data parts, if any. | |
async_socket_for_remote | 1 | 0 | Asynchronously read from socket executing remote query. |
background_fetches_pool_size | 3 | 8 | Number of threads performing background fetches for replicated tables. Only has meaning at server startup. |
checksum_on_read | 1 | Validates checksums on reading. It is enabled by default and should be always enabled in production. Please do not expect any benefits in disabling this setting. It may only be used for experiments and benchmarks. These settings are only applicable for tables of the MergeTree family. Checksums are always validated for other table engines and when receiving data over the network. | |
database_replicated_ddl_output | 1 | Returns table with query execution status as a result of a DDL query. | |
database_replicated_initial_query_timeout_sec | 300 | How long an initial DDL query should wait for a Replicated database to process previous DDL queue entries. | |
distributed_aggregation_memory_efficient | 0 | 1 | The memory-saving mode of distributed aggregation is enabled. |
enable_global_with_statement | 0 | 1 | Propagate WITH statements to UNION queries and all subqueries. |
engine_file_empty_if_not_exists | 0 | Allows selecting data from a file engine table without a file. | |
engine_file_truncate_on_insert | 0 | Enables or disables truncate before insert in file engine tables. | |
flatten_nested | 1 | If true, columns of type Nested will be flatten to separate array columns instead of one array of tuples. |
|
hedged_connection_timeout_ms | 100 | Connection timeout for establishing connection with replica for Hedged requests. | |
insert_in_memory_parts_timeout | 600000 | REMOVED. | |
insert_shard_id | 0 | If non zero, when inserting into a distributed table, the data will be inserted into the shard insert_shard_id synchronously. Possible values range from 1 to shards_number of the corresponding distributed table. |
|
log_comment | Log comment into system.query_log table and server log. It can be set to an arbitrary string no longer than max_query_size . |
||
normalize_function_names | 0 | Normalize function names to their canonical names. | |
optimize_normalize_count_variants | 1 | Rewrite aggregate functions that semantically equals count() as count() . |
|
optimize_respect_aliases | 1 | If it is set to true, it will respect aliases in WHERE/GROUP BY/ORDER BY , that will help with partition pruning/secondary indexes/optimize_aggregation_in_order/optimize_read_in_order/optimize_trivial_count . |
|
optimize_rewrite_sum_if_to_count_if | 1 | Rewrite sumIf() and sum(if()) function countIf() function when logically equivalent. |
|
periodic_live_view_refresh | 60 | Interval after which a periodically refreshed live view is forced to refresh. | |
query_plan_max_optimizations_to_apply | 10000 | Limit the total number of optimizations applied to the query plan. If zero, ignored. If the limit is reached, throw an exception. | |
receive_data_timeout_ms | 2000 | Connection timeout for receiving the first packet of data or packet with positive progress from replica. | |
s3_max_connections | 1024 | The maximum number of connections per server. | |
sleep_in_send_data | 0 | Time to sleep in sending data in TCPHandler . |
|
sleep_in_send_tables_status | 0 | Time to sleep in sending tables the status response in TCPHandler . |
|
unknown_packet_in_send_data | 0 | Sends unknown packet instead of N-th data packet. |
|
use_hedged_requests | 0 | Use hedged requests for distributed queries. |
Also, please refer to the release notes from the development team available at the following URLs:
- 21.2 release notes: https://clickhouse.com/docs/en/whats-new/changelog/2021#clickhouse-release-212
- 21.3 release notes: https://clickhouse.com/docs/en/whats-new/changelog/2021#clickhouse-release-213-lts
21.3.20.2
Released 2022-02-10
Changes Compared to Altinity Stable Build 21.3.17.3
Bug Fixes
- Integer overflow to resize the arrays causes heap corrupt. #33024 (Rajkumar Varada).
- fix crash when used fuzzBits with multiply same FixedString, Close #32737. #32755 (SuperDJY).
- Number of active replicas might be determined incorrectly when inserting with quorum if setting
replicated_can_become_leader
is disabled on some replicas. It’s fixed. #32157 (tavplubix). - Fix possible assertion
../src/IO/ReadBuffer.h:58: bool DB::ReadBuffer::next(): Assertion '!hasPendingData()' failed.
in TSKV format. #31804 (Kruglov Pavel). - Fix crash when function
dictGet
with type is used for dictionary attribute when type isNullable
. Fixes #30980. #31800 (Maksim Kita). - Fix usage of
Buffer
table engine with typeMap
. Fixes #30546. #31742 (Anton Popov). - Fix race in JSONEachRowWithProgress output format when data and lines with progress are mixed in output. #31736 (Kruglov Pavel).
- Settings
input_format_allow_errors_num
andinput_format_allow_errors_ratio
did not work for parsing of domain types, such asIPv4
, it’s fixed. Fixes #31686. #31697 (tavplubix). - Remove not like function into RPNElement. #31169 (sundyli).
- Using
formatRow
function with not row formats led to segfault. Don’t allow to use this function with such formats (because it doesn’t make sense). #31001 (Kruglov Pavel). - Functions for case-insensitive search in UTF8 strings like
positionCaseInsensitiveUTF8
andcountSubstringsCaseInsensitiveUTF8
might find substrings that actually does not match, it’s fixed. #30663 (tavplubix). - Fixed segfault which might happen if session expired during execution of REPLACE PARTITION. #30432 (tavplubix).
- Fix
[I]LIKE
function. Closes #28661. #30244 (Nikolay Degterinsky). - FlatDictionary, HashedDictionary fix bytes_allocated calculation for nullable attributes. #30238 (Maksim Kita).
- Support nullable arguments in function
initializeAggregation
. #30177 (Anton Popov). - Fix crash of sample by
tuple()
, closes #30004. #30016 (flynn). - Fix concurrent access to
LowCardinality
duringGROUP BY
(leads to SIGSEGV). #29782 (Azat Khuzhin). - Condition in filter predicate could be lost after push-down optimisation. #29625 (Nikolai Kochetov).
- Fix null deference for
GROUP BY WITH TOTALS HAVING
(when the column fromHAVING
wasn’t selected). #29553 (Azat Khuzhin). - Fix connection timeouts (
send_timeout
/receive_timeout
). #29282 (Azat Khuzhin). - Fix possible
Table columns structure in ZooKeeper is different from local table structure
exception while recreating or creating new replicas ofReplicatedMergeTree
, when one of table columns have default expressions with case-insensitive functions. #29266 (Anton Popov). - Fix segfault while inserting into column with type LowCardinality(Nullable) in Avro input format. #29132 (Kruglov Pavel).
- Fix the number of threads used in
GLOBAL IN
subquery (it was executed in single threads since #19414 bugfix). #28997 (Nikolai Kochetov). - Fix invalid constant type conversion when nullable or lowcardinality primary key is used. #28636 (Amos Bird).
- Fix
ORDER BY ... WITH FILL
with setTO
andFROM
and no rows in result set. #30888 (Anton Popov). - Fixed Apache Avro Union type index out of boundary issue in Apache Avro binary format. #33022 (Harry Lee).
- Fix null pointer dereference in low cardinality data when deserializing LowCardinality data in the Native format. #33021 (Harry Lee).
- Quota limit was not reached, but the limit was exceeded. This PR fixes #31174. #31656 (sunny).
- Quota limit was not reached, but the limit was exceeded. This PR fixes #31174. #31337 (sunny).
- Fix shutdown of
AccessControlManager
. Now there can’t be reloading of the configuration after AccessControlManager has been destroyed. This PR fixes the flaky test test_user_directories/test.py::test_relative_path. #29951 (Vitaly Baranov).
Performance Improvements
- Improved performance of JSON and XML output formats. #31673 (alexey-milovidov).
Other Improvements
- Updated zoneinfo files to 2021c. #29925 (alexey-milovidov).
- Use real tmp file instead of predefined “rows_sources” for vertical merges. This avoids generating garbage directories in tmp disks. #28299 (Amos Bird).
- Use separate
clickhouse-bridge
group and user for bridge processes. Set oom_score_adj so the bridges will be first subjects for OOM killer. Set set maximum RSS to 1 GiB. Closes #23861. #25280 (Kseniia Sumarokova).
Changes compared to Community Build 21.3.20.1-lts
- Applying 21.3 customizations for 21.3.20 release #113 (Vasily Nemkov).
- Improvement: Retries on HTTP connection drops in S3 (ClickHouse#22988)
- Bug Fix: Fixed possible error ‘Cannot read from istream at offset 0’ when reading a file from DiskS3 (https://github.com/ClickHouse/ClickHouse/pull/24885)
- Improvement: Don’t silently ignore errors and don’t count delays in ReadBufferFromS3 (https://github.com/ClickHouse/ClickHouse/pull/27484)
- Bug Fix: Fixed a bug with accessing S3 buckets when the object URL contained ‘+’ sign (ClickHouse#23822)
- Not for changelog: fixed tests (ClickHouse #27176, #23976 #26249).
Test Results
21.3.17.3
Released 2021-10-14
Changes Since v21.3.17.2-lts
Changes since the Long Term Support release of ClickHouse v21.3.17.2-lts to Altinity Stable 21.3.17.3:
- Retries on HTTP connection drops in S3 (ClickHouse#22988).
- Added sleep with backoff between read retries from S3. (ClickHouse#24311).
- Fixed possible error
Cannot read from istream at offset 0
when reading a file from DiskS3 (ClickHouse#24885). - Don’t silently ignore errors and don’t count delays in ReadBufferFromS3 (ClickHouse#27484).
- Fixed a bug with accessing S3 buckets when the object URL contained
+
sign (ClickHouse#23822). - Other minor things/tests fixed: