Altinity Stable for ClickHouse 22.8.13 Release Notes
A few months ago, we certified the ClickHouse 22.3 release. It was delivered together with the Altinity Stable build for ClickHouse. Since then many things have happened to ClickHouse. On the Altinity side we continued to put in features but really focused on build process and testing. We started testing the new ClickHouse LTS release 22.8 as soon as it was out in late August. It took us several months to confirm 22.8 is ready for production use and to make sure upgrades go smoothly. As of 22.8.13 we are confident in certifying 22.3 as an Altinity Stable release.
This release is a significant upgrade since the previous Altinity Stable release. It includes more than 2000 pull requests from 280 contributors. Please look below for detailed release notes.
Major new features in 22.8 since the previous stable release 22.3
A new release introduces a lot of changes and new functions. It is very hard to pick the most essential ones, so refer to the full list in the Appendix. The following major features are worth mentioning on the front page:
- SQL features:
- GROUPING SETS
- Experimental support for
SQL DELETE
(lightweight deletes) – partially contributed by Altinity - Interpolate extension for
ORDER BY … WITH FILL
- Support
SELECT ... INTO OUTFILE
andSTDOUT
. INTERVAL
support for all kinds of time intervals a)- Support for non-constant argument for
LIKE
,ILIKE
and match functions - Support expressions with window functions
- Security features:
- Base58 encoding/decoding a)
- Named collections were extended to cover more data sources
- Replication and Cluster improvements:
- Support inserts into system.zookeeper table
- Integrations:
- ‘mongodb’ table function to insert into and select from MongoDB databases
- MeiliSearch engine integration (experimental)
- New ‘direct’ join algorithm to work efficiently with key-value databases
- Remote file system and object storage features:
- Proper support of Google GCS for S3 table function a)
- Write cache for remote file systems
- Caches management commands:
DESCRIBE CACHE
,SHOW CACHES
,SYSTEM DROP FILESYSTEM CACHE
- Other:
- Store metadata cache in RocksDB for faster startup times
- Add simple chart visualization to the built-in Play interface
system.settings_changes
table to track changes between versions (incomplete)
As usual with ClickHouse, there are many performance and operational improvements in different server components.
a) - contributed by Altinity developers.
Major changes that require attention
Ordinary database deprecation
The Ordinary database engine and the old storage definition syntax for *MergeTree tables are deprecated. It has following consequences:
-
If the system database has the Ordinary engine it will be automatically converted to Atomic on server startup.
-
default_database_engine
setting is deprecated and does nothing. New databases are always created as Atomic ones if the engine is not specified. -
By default it’s not possible to create new databases with the Ordinary engine. There is a setting to keep old behavior:
allow_deprecated_database_ordinary=1
. -
Setting
allow_deprecated_syntax_for_merge_tree
allows to create MergeTree tables in old syntax, but these settings may be removed in future releases. -
There is a way to convert existing databases from Ordinary to Atomic. In order to do that, create an empty convert_ordinary_to_atomic file in the flags directory and all Ordinary databases will be converted automatically on the next server start.
Background pools
Configuration of background pools have been moved from profile to server settings. However, due to the ClickHouse bug, pools need to be defined in both config.xml and users.xml.
- If pools were modified in your ClickHouse cluster, you need to add a section in config.xml.
- If you need to adjust pools in 22.8, you need to do in two places now
For example:
config.xml
—--------
<background_pool_size>18</background_pool_size>
<background_move_pool_size>18</background_move_pool_size>
<background_schedule_pool_size>18</background_schedule_pool_size>
<background_fetches_pool_size>18</background_fetches_pool_size>
<background_distributed_schedule_pool_size>18</background_distributed_schedule_pool_size>
<background_common_pool_size>18</background_common_pool_size>
And:
users.xml
—--------
<profiles>
<default>
<background_pool_size>18</background_pool_size>
<background_move_pool_size>18</background_move_pool_size>
<background_schedule_pool_size>18</background_schedule_pool_size>
<background_fetches_pool_size>18</background_fetches_pool_size>
<background_distributed_schedule_pool_size>18</background_distributed_schedule_pool_size>
</default>
</profiles>
Zero-copy replication for remote file systems
Zero-copy replication was incidentally turned on by default in 22.3. In 22.8 it is turned off. So if you were using Disk S3 on a replicated cluster, you will see an increase in S3 storage, double writes and double merges. This is still an experimental feature, it can be enabled with allow_remote_fs_zero_copy_replication
setting.
Changes in s3 multipart upload
Since 22.6 uploads to s3 are executed in more threads and in more aggressive manner, which can create a higher pressure on the network and saturate some s3-compatible storages (e.g. Minio). Better control of that was added only in 22.9.
Backward Incompatible Changes
The following changes are backward incompatible and require user attention during an upgrade:
- Extended range of Date32 and DateTime64 to support dates from the year 1900 to 2299. In previous versions, the supported interval was only from the year 1925 to 2283. The implementation is using the proleptic Gregorian calendar (which is conformant with ISO 8601:2004 (clause 3.2.1 The Gregorian calendar)) instead of accounting for historical transitions from the Julian to the Gregorian calendar. This change affects implementation-specific behavior for out-of-range arguments. E.g. if in previous versions the value of 1899-01-01 was clamped to 1925-01-01, in the new version it will be clamped to 1900-01-01. It changes the behavior of rounding with toStartOfInterval if you pass INTERVAL 3 QUARTER up to one quarter because the intervals are counted from an implementation-specific point of time.
- Now, all relevant dictionary sources respect the
remote_url_allow_hosts setting
. It was already done for HTTP, Cassandra, Redis. Added ClickHouse, MongoDB, MySQL, PostgreSQL. Host is checked only for dictionaries created from DDL. - ClickHouse x86 binaries now require support for AVX instructions, i.e. a CPU not older than Intel Sandy Bridge / AMD Bulldozer, both released in 2011.
- Make the remote filesystem cache composable, allow not to evict certain files (regarding idx, mrk, ..), delete the old cache version. Now it is possible to configure cache over Azure blob storage disk, over Local disk, over StaticWeb disk, etc. This PR is marked backward incompatible because cache configuration changes and in order for cache to work need to update the config file. Old cache will still be used with new configuration. The server will startup fine with the old cache configuration.
- Remove support for octal number literals in SQL. In previous versions they were parsed as Float64.
- Changes how settings using seconds as type are parsed to support floating point values (for example:
max_execution_time=0.5
). Infinity or NaN values will throw an exception. - Changed format of binary serialization of columns of experimental type Object. New format is more convenient to implement by third-party clients.
- LIKE patterns with trailing escape symbol (’') are now disallowed (as mandated by the SQL standard).
- Do not allow SETTINGS after FORMAT for INSERT queries. There is compatibility
setting parser_settings_after_format_compact
to accept such queries, but it is turned OFF by default. - Function yandexConsistentHash (consistent hashing algorithm by Konstantin “kostik” Oblakov) is renamed to kostikConsistentHash. The old name is left as an alias for compatibility. Although this change is backward compatible, we may remove the alias in subsequent releases, that’s why it’s recommended to update the usages of this function in your apps.
Upgrade Notes
There were several changes between versions that may affect the rolling upgrade of big clusters. Upgrading only part of the cluster is not recommended.
- If you run different ClickHouse versions on a cluster with AArch64 CPU or mix AArch64 and amd64 on a cluster, and use distributed queries with GROUP BY multiple keys of fixed-size type that fit in 256 bits but don’t fit in 64 bits, and the size of the result is huge, the data will not be fully aggregated in the result of these queries during upgrade. Workaround: upgrade with downtime instead of a rolling upgrade.
Rolling upgrade from 20.4 and older is impossible because the “leader election” mechanism is removed from ReplicatedMergeTree.
Known Issues in 22.8.x
The development team continues to improve the quality of the 22.8 release. The following issues still exist in the 22.8.13 version and may affect ClickHouse operation. Please inspect them carefully to decide if those are applicable to your applications.
General stability issues (fixed in Altinity.Stable build 22.8.13):
- https://github.com/ClickHouse/ClickHouse/issues/43816 – deadlock in ReplicatedMergeTreeQueue/MergeTreeBackgroundExecutor when waiting for concurrently executing entry
- https://github.com/ClickHouse/ClickHouse/issues/43865 – crash related to DNS and host_regexp
- https://github.com/ClickHouse/ClickHouse/issues/43867 – merges can be assigned to parts with no quorum when insert_quorum is used (fix is backported into Altinity.Stable build)
Some queries that worked in 22.3 may fail in 22.8:
- https://github.com/ClickHouse/ClickHouse/issues/42669 – queries may fail because enable_positional_arguments was enabled by default (fixed in 22.8.14)
There are a few performance regressions:
- https://github.com/ClickHouse/ClickHouse/issues/44500 – performance degradation related to Map columns
- https://github.com/ClickHouse/ClickHouse/issues/44402 performance degradation of realtime queries because
collect_hash_table_stats_during_aggregation
was enabled by default
You may also look into a GitHub issues using v22.8-affected label.
Other Important Changes
Several settings has changed their defaults:
- The new default value for
local_filesystem_read_method
is pread_threadpool. In some cases this can lead to performance degradation (see Degraded performance with local_filesystem_read_method = pread_threadpool) if you notice that 22.8 is reading data slower, setlocal_filesystem_read_method = pread
allow_remote_fs_zero_copy_replication
is now 0 (was 1)format_csv_allow_single_quotes
is now 0 (was 1) – if it is set to true, allow strings in single quotes.distributed_ddl_entry_format_version
is now 3 (was 1)log_query_threads
is now 0 (was 1)join_algorithm
is now ‘default’ (was ‘hash’) – ‘default’ means ‘hash’ or ‘direct’ if possible, so this change only applies to new ‘direct’ joins.max_download_threads
is now 4 – that affects number of parallel threads for URL and S3 table engines and table functions
Some new ClickHouse features and optimizations are now enabled by default. It may lead to a change in behavior, so review those carefully and disable features that may affect your system:
allow_experimental_geo_types
– allow geo data types such as Point, Ring, Polygon, MultiPolygonoutput_format_json_named_tuples_as_objects
– it allows to serialize named tuples as JSON objects in JSON formats.collect_hash_table_stats_during_aggregation
– enable collecting hash table statistics to optimize memory allocation, this may lead to performance degradationenable_positional_arguments
– enable positional arguments in ORDER BY, GROUP BY and LIMIT BYinput_format_skip_unknown_fields
– skip columns with unknown names from input data (it works for JSONEachRow, -WithNames, -WithNamesAndTypes and TSKV formats). Previously, ClickHouse threw an exception.odbc_bridge_use_connection_pooling
– use connection pooling in ODBC bridge. If set to false, a new connection is created every timeoptimize_distinct_in_order
– enable DISTINCT optimization if some columns in DISTINCT form a prefix of sorting. For example, prefix of sorting key in merge tree or ORDER BY statementoptimize_multiif_to_if
– replace ‘multiIf’ with only one condition to ‘if’.optimize_read_in_window_order
– enable ORDER BY optimization in window clause for reading data in corresponding order in MergeTree tables.optimize_sorting_by_input_stream_properties
– optimize sorting by sorting properties of input streamoutput_format_json_named_tuples_as_objects
– serialize named tuple columns as JSON objects.
ClickHouse embedded monitoring is since 21.8. It now collects host level metrics, and stores them every second in the table system.asynchronious_metric_log
. This can be visible as an increase of background writes, storage usage, etc. To return to the old rate of metrics refresh / flush, adjust those settings in config.xml
:
<asynchronous_metrics_update_period_s>
60
</asynchronous_metrics_update_period_s>
<asynchronous_metric_log>
<flush_interval_milliseconds>
60000
</flush_interval_milliseconds>
</asynchronous_metric_log>
In the previous releases we recommended disabling optimize_on_insert
. This recommendation stays for 22.8 as well as inserts into Summing
and AggregatingMergeTree
can slow down.
Changes in Altinity Stable build Compared to Community Build
ClickHouse Altinity Stable builds are based on the community LTS versions. Altinity.Stable 22.8.13.21 is based on community 22.8.13.20-lts, but we have additionally backported several fixes:
- #42134 Improve lost replica recovery (ReplicatedMergeTree)
- #40620 Base58 fix handling leading 0 / '1'
- #40485 Add support for extended (chunked) arrays for Parquet format
- #42321 Increase request_timeout_ms for s3 disks.
- #45629 Cancel c-ares failed requests and retry on system interrupts to prevent callbacks with dangling references and premature resolution failures
Let’s Install!
Linux packages can be found at:
- https://packages.clickhouse.com for Upstream builds, and at
- https://builds.altinity.cloud for Altinity Stable builds.
Note: naming schema for Altinity.Stable build packages has been changed since 21.8.x.
21.8.x | 22.3.x and later versions |
---|---|
<package>_<ver>.altinitystable_all.deb | <package>_<ver>.altinitystable_amd64.deb |
<package>-<ver>.altinitystable-2.noarch.rpm | <package>-<ver>.altinitystable.x86_64.rpm |
Docker images for community versions have been moved from ‘yandex’ to ‘clickhouse’ organization, and should be referenced as ‘clickhouse/clickhouse-server:22.8
’.
Altinity Stable build images are available as ‘altinity/clickhouse-server:22.8.13.21.altinitystable
’.
Mac users are welcome to use Homebrew Formulae. Ready-to-use bottles are available for both M1 and Intel Macs running Monterey.
For more information on installing ClickHouse from either the Altinity Builds or the Upstream Builds, see the ClickHouse Altinity Stable Release Build Install Guide.
Please contact us at info@altinity.com if you experience any issues with the upgrade.
Appendix
New table functions
- MeiliSearch – renamed to meilisearch (lowercase) in 22.9
- mongodb
- viewIfPermitted
New table engines
- MeiliSearch
- NATS
New functions
- L2SquaredDistance, L2SquaredNorm, distanceL2Squared, normL2Squared
- LAST_DAY, toLastDayOfMonth
- addMicroseconds, addMilliseconds, addNanoseconds, subtractMicroseconds, subtractMilliseconds, subtractNanoseconds, toIntervalMicrosecond, toIntervalMillisecond, toIntervalNanosecond
- arrayFirstOrNull, arrayLastOrNull
- base58Decode, base58Encode
- flattenTuple
- getTypeSerializationStreams
- Multiple h3* geo functions
- hashid
- isNullable
- makeDate, makeDate32, makeDateTime, makeDateTime64
- meiliMatch
- minSampleSizeContinous, minSampleSizeConversion
- nonNegativeDerivative
- nowInBlock
- th_value
- parseDateTime64BestEffortUS, parseDateTime64BestEffortUSOrNull, parseDateTime64BestEffortUSOrZero
- parseTimeDelta
- revision
- showCertificate
- toStartOfFiveMinutes, toStartOfMicrosecond, toStartOfMillisecond, toStartOfNanosecond
- transactionID, transactionLatestSnapshot, transactionOldestSnapshot
- wyHash64
New formats
- JSONColumns
- JSONColumnsWithMetadata
- JSONCompactColumns
- JSONLines
- MySQLDump
- NDJSON
- PrettyCompactNoEscapesMonoBlock
- PrettyMonoBlock
- PrettyNoEscapesMonoBlock
- PrettySpaceMonoBlock
- PrettySpaceNoEscapesMonoBlock
- Prometheus
- ProtobufList
- SQLInsert
New system tables
system.backups
system.certificates
system.filesystem_cache
system.merge_tree_metadata_cache
system.remote_data_paths
system.schema_inference_cache
system.settings_changes
New metrics and events
system.events
Several groups of events were added:
- Cache introspection events: Cached*
- Kafka engine introspection events (finally!): Kafka*
- MergeTree metadata cache events: MergeTreeMetadataCache*
- Keeper events: Keeper*
system.metrics
- CacheDetachedFileSegments
- CacheFileSegments
- FilesystemCacheElements
- FilesystemCacheReadBuffers
- FilesystemCacheSize
- KafkaAssignedPartitions
- KafkaBackgroundReads
- KafkaConsumers
- KafkaConsumersInUse
- KafkaConsumersWithAssignment
- KafkaLibrdkafkaThreads
- KafkaProducers
- KafkaWrites
- KeeperAliveConnections
- KeeperOutstandingRequests
- S3Requests
Changes in default configuration files https://gist.github.com/filimonov/d60f749be2581b151e6bdaa830bc8111#file-default_config_changes-txt
Also, please refer to the release notes from the development team available at the following URLs:
- 22.4 Release notes: https://clickhouse.com/docs/en/whats-new/changelog/2022/#-clickhouse-release-224-2022-04-19
- 22.5 Release notes: https://clickhouse.com/docs/en/whats-new/changelog/2022/#-clickhouse-release-225-2022-05-19
- 22.6 Release notes: https://clickhouse.com/docs/en/whats-new/changelog/2022/#-clickhouse-release-226-2022-06-16
- 22.7 Release notes: https://clickhouse.com/docs/en/whats-new/changelog/2022/#-clickhouse-release-227-2022-07-21
- 22.8 Release notes: https://clickhouse.com/docs/en/whats-new/changelog/2022/#-clickhouse-release-228-2022-08-18