Bringing it all Together
Querying data lakes with swarm clusters
At this point we’ve used all the major features of Antalya. We’ve seen how swarm clusters can speed up queries, and we’ve seen how Hybrid tables can save you money by letting you put your cold data into object storage. Here are the key technologies Antalya delivers:
- Swarm clusters - Swarm clusters give us the benefits of horizontal scaling and caching. Swarms are also cheaper than clusters of regular nodes, so we save money as well.
- Hybrid tables - Hybrid tables save us money, letting us store cold data in cheaper object storage.
- Inserting data - When we run an `INSERT` statement, that statement is handled by the regular cluster and the data goes into block storage.
- Exporting data - When we run `ALTER TABLE EXPORT PART`, our data is written as Parquet files in object storage. We use `ice` to update the Iceberg catalog based on the data and metadata in the Parquet files, but that’s outside this picture. (At some point, Iceberg catalogs will be updated automatically without using `ice`.)
- Searching a Hybrid table - When we search the Hybrid table, ClickHouse uses the watermark (`2025-01-01` in the example) to determine what data to search.
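To make the watermark idea concrete, here is a minimal conceptual sketch in Python. It is not how ClickHouse or Antalya implements Hybrid tables. The function name `plan_hybrid_scan` and the "hot" and "cold" labels are hypothetical, and the sketch assumes that rows dated on or after the watermark live in block storage while older rows live in object storage.

```python
from datetime import date, timedelta

# Watermark from the example above; everything else here is illustrative.
WATERMARK = date(2025, 1, 1)


def plan_hybrid_scan(start: date, end: date, watermark: date = WATERMARK) -> dict:
    """Split the inclusive date range [start, end] at the watermark.

    Assumption: rows dated on or after the watermark are "hot" (block
    storage); rows dated before it are "cold" (Parquet in object storage).
    """
    if start > end:
        raise ValueError("start must not be after end")
    plan = {}
    if start < watermark:
        # The cold range stops the day before the watermark, or at `end`
        # if the whole query falls before the watermark.
        plan["cold"] = (start, min(end, watermark - timedelta(days=1)))
    if end >= watermark:
        # The hot range begins at the watermark, or at `start` if later.
        plan["hot"] = (max(start, watermark), end)
    return plan


# A query that straddles the watermark touches both segments.
print(plan_hybrid_scan(date(2024, 12, 1), date(2025, 1, 31)))
```

A query entirely on one side of the watermark yields a plan with a single entry, which is the sense in which the watermark determines what data to search.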
Summary
Project Antalya adds significant benefits to what is already the premier open-source analytics platform. You can use swarm clusters and hybrid tables together or separately, making your queries faster while lowering your storage and computing costs.