Altinity Stable Build for ClickHouse 23.8
Here are the detailed release notes and upgrade instructions for version 23.8.8. There are a lot of new features, but also many things that changed their behavior. Read these carefully before upgrading.
- Detailed release notes and upgrade instructions for version 23.8.8 - Read these first!
Major new features in 23.8 since the previous Altinity Stable release 23.3
Released 2023-12-27
A new release introduces a lot of changes and new functions. It is very hard to pick the most essential ones, so refer to the full list in the Appendix. The following major features are worth mentioning on the front page:
-
Few features that were experimental are now graduated as production ready:
- Projections #49719
- Query Results cache #47977. See our Caching Guide for examples.
- Geographical data types:
Point
,Ring
,Polygon
, andMultiPolygon
#50022
-
SQL features:
- Overlay databases for external data:
CREATE DATABASE files ENGINE = Filesystem('subdir/')
CREATE DATABASE bucket ENGINE = S3('https://clickhouse-public-datasets.s3.amazonaws.com/')
-
MergeTree improvements:
- There were a lot of improvements in order to reduce memory usage for merges, loading parts. FINAL, index analysis and others.
- Added setting
async_insert
for MergeTree tables. It has the same meaning as query-level settingasync_insert
and enables asynchronous inserts for a specific table. #49122.
-
Replication improvements:
-
Security & access control
- ROW POLICY can be applied to all tables in a database. #47640. a)
- Added server-side encryption using KMS keys with S3 tables, and the
header
setting with S3 disks. #48724. - Add server and format settings
display_secrets_in_show_and_select
for displaying secrets of tables, databases, table functions, and dictionaries. Add privilegedisplaySecretsInShowAndSelect
controlling which users can view secrets. #46528. - Adding the
grants
field in the users.xml file, which allows specifying grants for users. #49381. GRANT CURRENT GRANTS
-
Object storage
- Support proxy for S3 access a) – read our article for more detail
- Native support for GCP via
gcs
table function - Native support for Azure via
azureBlobStorage
andazureBlobStorageCluster
table functions
-
Operational:
- New tables for introspection:
system.zookeeper_connection
system.kafka_consumers
- Server and query level IO throttling for different operations (#48242):
- server settings:
max_remote_read_network_bandwidth_for_server
,max_remote_write_network_bandwidth_for_server
,max_local_read_bandwidth_for_server
,max_local_write_bandwidth_for_server
,max_backup_bandwidth_for_server
- profile settings:
max_remote_read_network_bandwidth
,max_remote_write_network_bandwidth
,max_local_read_bandwidth
,max_local_write_bandwidth
,max_backup_bandwidth
.
- server settings:
- New tables for introspection:
As usual with ClickHouse, there are many performance and operational improvements in different server components.
a) Contributed by Altinity developers.
Major changes that require attention
use_environment_credentials
is enabled by default since 23.4. That blocks access to anonymous S3 buckets. The recommended way accessing public buckets is NOSIGN keyword supported for both S3 table function and engine.- Compression of mark files and primary keys is enabled by default. It produces different binary representations of those files. This is controlled by
compress_marks
andcompress_primary_key
merge tree settings.- It is not possible to downgrade from 23.8 to 22.8 or earlier
- If you upgrade from versions prior to 22.9, you should either upgrade all replicas at once or disable the compression before upgrade, or upgrade through an intermediate version, where the compressed marks are supported but not enabled by default, such as 23.3.
- sparse columns enabled by default. It produces different binary representation, so if used downgrade to versions older than 22.1 might not be possible.. This can be disabled by
ratio_of_defaults_for_sparse_serialization=1.
max_concurrent_queries
default value has been increased from 100 to 1000. It may be too high in same cases,max_concurrent_queries
*max_threads
should be lower than ~8000.- Maximum number of parts per partition restriction (“too many parts” error) has been relaxed:
parts_to_delay_insert=1000
(was: 150)parts_to_throw_insert=3000
(was: 300)- There is a new setting
max_avg_part_size_for_too_many_parts=1073741824
, so ‘parts_to_throw’ restriction is applied only for partitions of 1TB size or above. This may result in slower performance, so changing ‘parts_to_throw’ to smaller values is recommended.
- Metadata cache has been removed
- The new feature
system.kafka_consumers
enables collecting librdkafka statistics every 3 seconds by default. If you have Kafka tables which you don’t read from, that can lead to collecting those unprocessed stat messages in memory. Collection is disabled in Altinity.Stable Build 23.8.8 and upstream build 23.8.9. You may turn it on and off using server level Kafka setting:
<kafka>
<statistics_interval_ms> <!-- Set non-zero value to enable -->
0
</statistics_interval_ms>
</kafka>
Backward Incompatible Changes
The following changes are backward incompatible and require user attention during an upgrade:
allow_experimental_query_cache
profile setting has been removed (the feature is production ready) #52685. ClickHouse will not start if it is defined, so please removed it before an upgrade- Several features have been removed:
- The
microseconds
column is removed from thesystem.text_log
, and from thesystem.metric_log
, because they are redundant in the presence of theevent_time_microseconds
column. #53601
Upgrade Notes
There were several changes between versions that may affect the rolling upgrade of big clusters. Upgrading only part of the cluster is not recommended.
- Upgrading from 20.3 and older to 22.9 and newer should be done through an intermediate version if there are any
ReplicatedMergeTree
tables; otherwise, the server with the new version will not start. #40641. Here is a possible upgrade path if you are upgrading from 20.3: 20.3 -> 22.8 -> 23.3 -> 23.8 - If you upgrade from versions prior to 22.9, you should either upgrade all replicas at once or disable
compress_marks
andcompress_primary_key
merge tree settings before upgrade, or upgrade through an intermediate version, where the compressed marks are supported but not enabled by default, such as 23.3. - Downgrading from 23.8 to version 23.5 or below may fail due to changes in sparse columns serialization, see #55153 for possible workaround.
Known Issues in 23.8.x
The development team continues to improve the quality of the 23.8 release. The following issues still exist in the 23.8.8 version and may affect ClickHouse operation. Please inspect them carefully to decide if those are applicable to your applications.
General stability issues:
- Some queries do not respect
max_threads
and may eventually start thousands of threads: https://github.com/ClickHouse/ClickHouse/issues/53287 – the workaround is to disableprefer_localhost_replica
setting. - Crash in projections analysis when
force_index_by_date
is enabled: https://github.com/ClickHouse/ClickHouse/issues/58620
Some queries that worked in 23.3 may fail in 23.8:
- topKWeightedState has changed that may result in inconsistent data after an upgrade: https://github.com/ClickHouse/ClickHouse/issues/55997
There are a few performance regressions:
- merge() table with IN operator is much slower: https://github.com/ClickHouse/ClickHouse/issues/51846 – fixed in Altinity Build 23.8.8.21
- Degraded query performance with
query_plan_optimize_primary_key
enabled: https://github.com/ClickHouse/ClickHouse/issues/53027
You may also look into a GitHub issues using v23.8-affected label.
Other Important Changes
Several settings have changed their defaults:
distributed_ddl_entry_format_version=5
(was: 3) – enables open telemetry and initial_query_id pass through. #52128- downgrading to versions with unprocessed DDL entries can lead to errors
connect_timeout_with_failover_ms=1000
(was: 50)connect_timeout_with_failover_secure_ms=1000
(was: 100)hedged_connection_timeout_ms=50
(was: 100)http_max_field_name_size=131072
(was: 1048576)http_max_field_value_size=131072
(was: 1048576)http_receive_timeout=30
(was: 180)http_send_timeout=30
(was: 180)
Some new ClickHouse features and optimizations are now enabled by default. It may lead to a change in behavior, so review those carefully and disable features that may affect your system:
merge_tree_settings.allow_vertical_merges_from_compact_to_wide_parts
async_query_sending_for_remote
check_dictionary_primary_key
- projection metadata now is checked on table attach #52361
move_all_conditions_to_prewhere,
enable_multiple_prewhere_read_steps
,move_primary_key_columns_to_end_of_prewhere
enable_memory_bound_merging_of_aggregation_results
– if you upgrade from version prior to 22.12, we recommend setting this flag tofalse
until the upgrade is finished.optimize_distinct_in_order
optimize_count_from_files
optimize_use_projections, optimize_use_implicit_projections
system.query_views_log
now collects information about MVs that are pushed from background threads, like Kafka / Rabbit etc. #46668. You can have more (useful!) data there after the upgrade.
LZ4 & ZSTD compression libraries were upgraded, checksum mismatch when having replicas with older versions are possible
In the previous releases we recommended disabling optimize_on_insert
. This recommendation stays for 23.8 as well as inserts into Summing
and AggregatingMergeTree
can slow down.
Changes in Altinity Stable build Compared to Upstream Build
ClickHouse Altinity Stable builds are open source and are based on the upstream LTS versions. Altinity.Stable 23.8.8 is based on upstream 23.8.8.20-lts, but we have additionally backported several fixes:
- Fix key analysis (with set) for Merge engine #54905 via #341
- Fix partition pruning of extra columns in set #55172 via #342
- Fix FINAL produces invalid read ranges in a rare case #54934 via #343
- Fix incorrect free space accounting for least_used JBOD policy #56030 via #344
- Fix ALTER COLUMN with ALIAS #56493 via #345 (backported into 23.8.9 upstream)
- Disable system.kafka_consumers by default due to possible live memory leak #57822 via #346 (backported into 23.8.9 upstream)
Let’s Install!
Linux packages can be found at https://packages.clickhouse.com for upstream builds, and at https://builds.altinity.cloud for Altinity Stable builds.
Note: naming schema for Altinity.Stable build packages has been changed since 21.8.x.
21.8.x | 22.3.x and later versions |
<package>_<ver>.altinitystable_all.deb
|
<package>_<ver>.altinitystable_amd64.deb
|
<package>-<ver>.altinitystable-2.noarch.rpm
|
<package>-<ver>.altinitystable.x86_64.rpm
|
Docker images for the upstream version should be referenced as clickhouse/clickhouse-server:23.8.8.20
Altinity Stable build images are available as altinity/clickhouse-server:23.8.8.21.altinitystable
.
For more information on installing ClickHouse from either the Altinity Builds or the upstream builds, see the ClickHouse Altinity Stable Release Build Install Guide.
Please contact us at info@altinity.com if you experience any issues with the upgrade.
Appendix
New table functions
- azureBlobStorage, azureBlobStorageCluster
- gcs
- redis
- urlCluster
New table engines
- AzureBlobStorage
- Redis
- S3Queue
New functions
COMING SOON
New formats
One
– doesn’t read any data and always returns a single row with columndummy
with typeUInt8
and value0
likesystem.one
. It can be used together with_file/_path
virtual columns to list files in file/s3/url/hdfs/etc table functions without reading any dataParquetMetadata
– providers introspection into Parquet files, see this excellent blog article for detailPrettyJSONEachRow
PrettyJSONLines
PrettyNDJSON
RowBinaryWithDefaults
New system tables
system.jemalloc_bins
system.kafka_consumers
system.user_processes
– query statistics grouped by usermerge_tree_metadata_cache
was removed
New columns in system tables
system.clusters: database_shard_name, database_replica_name, is_active, name
(alias tocluster
)system.filesystem_cache: cache_name, key, file_size
system.functions: syntax, arguments, returned_value, examples, categories
– those are populated for new functions only, but eventually may be filled in for allsystem.merge_tree_settings/replicated_merge_tree_settings/settings/server_settings: is_obsolete
system.merges: partitions, total_size_uncompressed
system.metrics: name
(alias tometric
)system.parts: primary_key_size, part_name
(alias toname
)system.parts_columns: column_modification_time, part_name
(alias toname
)system.projection_parts: part_name
(alias toname
)system.projection_parts_columns: column_modification_time, part_name
(alias toname
)system.query_cache: compressed
system.replicas: zookeeper_name, lost_part_count
system.storage_policies: perform_ttl_on_insert, load_balancing
New metrics and events
system.events:
Several groups of events were added:
COMING SOON
system.metrics:
COMING SOON
Also, please refer to the interim release notes from the development team available at the following URLs:
- https://github.com/ClickHouse/ClickHouse/blob/master/docs/changelogs/v23.4.1.1943-stable.md
- https://github.com/ClickHouse/ClickHouse/blob/master/docs/changelogs/v23.5.1.3174-stable.md
- https://github.com/ClickHouse/ClickHouse/blob/master/docs/changelogs/v23.6.1.1524-stable.md
- https://github.com/ClickHouse/ClickHouse/blob/master/docs/changelogs/v23.7.1.2470-stable.md
- https://github.com/ClickHouse/ClickHouse/blob/master/docs/changelogs/v23.8.1.2992-lts.md