ClickHouse® Cluster Settings
ClickHouse® clusters configured on Kubernetes have several options based on the Kubernetes Custom Resource settings. Your cluster may have particular requirements to best fit your organization's needs.
For an example of a configuration file using each of these settings, see the 99-clickhouseinstallation-max.yaml file as a template. This assumes that you have installed the clickhouse-operator.
Initial Settings
The first section sets the cluster kind and api.
Parent | Setting | Type | Description |
---|---|---|---|
None | kind | String | Specifies the type of cluster to install. In this case, ClickHouse. Valid value: ClickHouseInstallation |
None | metadata | Object | Assigns metadata values for the cluster |
metadata | name | String | The name of the resource. |
metadata | labels | Array | Labels applied to the resource. |
metadata | annotations | Array | Annotations applied to the resource. |
Initial Settings Example
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "clickhouse-installation-max"
  labels:
    label1: label1_value
    label2: label2_value
  annotations:
    annotation1: annotation1_value
    annotation2: annotation2_value
.spec.defaults
The .spec.defaults section represents default values for the sections that follow.
Parent | Setting | Type | Description |
---|---|---|---|
defaults | replicasUseFQDN | String (yes or no) | Determines whether replicas are addressed by their fully qualified domain names. |
defaults | distributedDDL | String | Sets the <yandex><distributed_ddl></distributed_ddl></yandex> configuration settings. For more information, see Distributed DDL Queries (ON CLUSTER Clause). |
defaults | templates | Array | Sets the pod template types. This is where the template is declared, then defined in the .spec.configuration later. |
.spec.defaults Example
defaults:
  replicasUseFQDN: "no"
  distributedDDL:
    profile: default
  templates:
    podTemplate: clickhouse-v18.16.1
    dataVolumeClaimTemplate: default-volume-claim
    logVolumeClaimTemplate: default-volume-claim
    serviceTemplate: chi-service-template
.spec.configuration
The .spec.configuration section represents sources for ClickHouse configuration files. For more information, see the ClickHouse Configuration Files page.
.spec.configuration Example
configuration:
  users:
    readonly/profile: readonly
    # <users>
    #   <readonly>
    #     <profile>readonly</profile>
    #   </readonly>
    # </users>
    test/networks/ip:
      - "127.0.0.1"
      - "::/0"
    # <users>
    #   <test>
    #     <networks>
    #       <ip>127.0.0.1</ip>
    #       <ip>::/0</ip>
    #     </networks>
    #   </test>
    # </users>
    test/profile: default
    test/quotas: default
.spec.configuration.zookeeper
The .spec.configuration.zookeeper section defines the ZooKeeper settings, and is expanded into the <yandex><zookeeper></zookeeper></yandex> configuration section. For more information, see the ClickHouse ZooKeeper settings.
.spec.configuration.zookeeper Example
zookeeper:
  nodes:
    - host: zookeeper-0.zookeepers.zoo3ns.svc.cluster.local
      port: 2181
    - host: zookeeper-1.zookeepers.zoo3ns.svc.cluster.local
      port: 2181
    - host: zookeeper-2.zookeepers.zoo3ns.svc.cluster.local
      port: 2181
  session_timeout_ms: 30000
  operation_timeout_ms: 10000
  root: /path/to/zookeeper/node
  identity: user:password
.spec.configuration.profiles
The .spec.configuration.profiles section defines the ClickHouse profiles that are stored in <yandex><profiles></profiles></yandex>. For more information, see the ClickHouse Server Settings page.
.spec.configuration.profiles Example
profiles:
  readonly/readonly: 1
expands into
<profiles>
  <readonly>
    <readonly>1</readonly>
  </readonly>
</profiles>
.spec.configuration.users
The .spec.configuration.users section defines the users, and is stored in <yandex><users></users></yandex>. For more information, see the Configuration Files page.
.spec.configuration.users Example
users:
  test/networks/ip:
    - "127.0.0.1"
    - "::/0"
expands into
<users>
  <test>
    <networks>
      <ip>127.0.0.1</ip>
      <ip>::/0</ip>
    </networks>
  </test>
</users>
.spec.configuration.settings
The .spec.configuration.settings section sets other ClickHouse settings, such as compression. For more information, see the ClickHouse Server Settings page.
.spec.configuration.settings Example
settings:
  compression/case/method: "zstd"
  # <compression>
  #   <case>
  #     <method>zstd</method>
  #   </case>
  # </compression>
.spec.configuration.files
The .spec.configuration.files section creates custom files used in the cluster. These are used for custom configurations, such as the ClickHouse External Dictionary.
.spec.configuration.files Example
files:
  dict1.xml: |
    <yandex>
      <!-- ref to file /etc/clickhouse-data/config.d/source1.csv -->
    </yandex>
  source1.csv: |
    a1,b1,c1,d1
    a2,b2,c2,d2
spec:
  configuration:
    settings:
      dictionaries_config: config.d/*.dict
    files:
      dict_one.dict: |
        <yandex>
          <dictionary>
            <name>one</name>
            <source>
              <clickhouse>
                <host>localhost</host>
                <port>9000</port>
                <user>default</user>
                <password/>
                <db>system</db>
                <table>one</table>
              </clickhouse>
            </source>
            <lifetime>60</lifetime>
            <layout><flat/></layout>
            <structure>
              <id>
                <name>dummy</name>
              </id>
              <attribute>
                <name>one</name>
                <expression>dummy</expression>
                <type>UInt8</type>
                <null_value>0</null_value>
              </attribute>
            </structure>
          </dictionary>
        </yandex>
.spec.configuration.clusters
The .spec.configuration.clusters section defines the ClickHouse clusters to be installed.
clusters:
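As a sketch, a minimal clusters entry might look like the following. The cluster name sample and the 2x2 dimensions are illustrative, and the template names reuse ones declared elsewhere in this document; the layout options themselves are covered in the sections below.

```yaml
configuration:
  clusters:
    # A single cluster named "sample" (illustrative name) with two shards
    # of two replicas each. The referenced templates are assumed to be
    # declared under .spec.templates.
    - name: sample
      templates:
        podTemplate: clickhouse-v18.16.1
        dataVolumeClaimTemplate: default-volume-claim
      layout:
        shardsCount: 2
        replicasCount: 2
```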
Clusters and Layouts
The .clusters.layout setting defines the ClickHouse layout of a cluster. This can be general, or very granular, depending on your requirements. For full information, see Cluster Deployment.
Templates
The podTemplate is used to define the specific pods in the cluster, mainly the ones that will be running ClickHouse. The volumeClaimTemplate defines the storage volumes. Both of these settings are applied per replica.
Basic Dimensions
Basic dimensions are used to define the cluster definitions without specifying particular details of the shards or nodes.
Parent | Setting | Type | Description |
---|---|---|---|
.clusters.layout | shardsCount | Number | The number of shards for the cluster. |
.clusters.layout | replicasCount | Number | The number of replicas for the cluster. |
Basic Dimensions Example
In this example, the podTemplate defines ClickHouse containers in a cluster called all-counts with three shards and two replicas.
- name: all-counts
  templates:
    podTemplate: clickhouse-v18.16.1
    dataVolumeClaimTemplate: default-volume-claim
    logVolumeClaimTemplate: default-volume-claim
  layout:
    shardsCount: 3
    replicasCount: 2
This is expanded into the following configuration. The IP addresses and DNS configuration are assigned by Kubernetes and the operator.
<yandex>
  <remote_servers>
    <all-counts>
      <shard>
        <internal_replication>true</internal_replication>
        <replica>
          <host>192.168.1.1</host>
          <port>9000</port>
        </replica>
        <replica>
          <host>192.168.1.2</host>
          <port>9000</port>
        </replica>
      </shard>
      <shard>
        <internal_replication>true</internal_replication>
        <replica>
          <host>192.168.1.3</host>
          <port>9000</port>
        </replica>
        <replica>
          <host>192.168.1.4</host>
          <port>9000</port>
        </replica>
      </shard>
      <shard>
        <internal_replication>true</internal_replication>
        <replica>
          <host>192.168.1.5</host>
          <port>9000</port>
        </replica>
        <replica>
          <host>192.168.1.6</host>
          <port>9000</port>
        </replica>
      </shard>
    </all-counts>
  </remote_servers>
</yandex>
Specified Dimensions
The templates section can also be used to specify more than just the general layout. The exact definitions of the shards and replicas can be defined as well.
In this example, shard0 has replicasCount specified, while shard1 has three replicas explicitly defined, with the possibility to customize each replica.
- name: customized
  templates:
    podTemplate: clickhouse-v18.16.1
    dataVolumeClaimTemplate: default-volume-claim
    logVolumeClaimTemplate: default-volume-claim
  layout:
    shards:
      - name: shard0
        replicasCount: 3
        weight: 1
        internalReplication: Disabled
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
      - name: shard1
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        replicas:
          - name: replica0
          - name: replica1
          - name: replica2
Other examples are combinations, where some replicas are defined but only one is explicitly differentiated with a different podTemplate.
- name: customized
  templates:
    podTemplate: clickhouse-v18.16.1
    dataVolumeClaimTemplate: default-volume-claim
    logVolumeClaimTemplate: default-volume-claim
  layout:
    shards:
      - name: shard2
        replicasCount: 3
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        replicas:
          - name: replica0
            port: 9000
            templates:
              podTemplate: clickhouse-v19.11.3.11
              dataVolumeClaimTemplate: default-volume-claim
              logVolumeClaimTemplate: default-volume-claim
.spec.templates.serviceTemplates
The .spec.templates.serviceTemplates section represents Kubernetes Service templates, with additional fields. At the top level is generateName, which is used to explicitly specify the name of the Service to be created. generateName understands macros for the service level of the object being created. The service levels are defined as:
- CHI
- Cluster
- Shard
- Replica
The macros and the service levels where they apply are:
Setting | CHI | Cluster | Shard | Replica | Description |
---|---|---|---|---|---|
{chi} | X | X | X | X | ClickHouseInstallation name |
{chiID} | X | X | X | X | Short hashed ClickHouseInstallation name (experimental) |
{cluster} | | X | X | X | Cluster name |
{clusterID} | | X | X | X | Short hashed cluster name (experimental) |
{clusterIndex} | | X | X | X | 0-based index of the cluster in the CHI (experimental) |
{shard} | | | X | X | Shard name |
{shardID} | | | X | X | Short hashed shard name (experimental) |
{shardIndex} | | | X | X | 0-based index of the shard in the cluster (experimental) |
{replica} | | | | X | Replica name |
{replicaID} | | | | X | Short hashed replica name (experimental) |
{replicaIndex} | | | | X | 0-based index of the replica in the shard (experimental) |
.spec.templates.serviceTemplates Example
templates:
  serviceTemplates:
    - name: chi-service-template
      # generateName understands different sets of macros,
      # depending on the level of the object for which the Service is being created:
      #
      # For CHI-level Service:
      # 1. {chi} - ClickHouseInstallation name
      # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
      #
      # For Cluster-level Service:
      # 1. {chi} - ClickHouseInstallation name
      # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
      # 3. {cluster} - cluster name
      # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
      # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
      #
      # For Shard-level Service:
      # 1. {chi} - ClickHouseInstallation name
      # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
      # 3. {cluster} - cluster name
      # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
      # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
      # 6. {shard} - shard name
      # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
      # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
      #
      # For Replica-level Service:
      # 1. {chi} - ClickHouseInstallation name
      # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
      # 3. {cluster} - cluster name
      # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
      # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
      # 6. {shard} - shard name
      # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
      # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
      # 9. {replica} - replica name
      # 10. {replicaID} - short hashed replica name (BEWARE, this is an experimental feature)
      # 11. {replicaIndex} - 0-based index of the replica in the shard (BEWARE, this is an experimental feature)
      generateName: "service-{chi}"
      # type ObjectMeta struct from k8s.io/meta/v1
      metadata:
        labels:
          custom.label: "custom.value"
        annotations:
          cloud.google.com/load-balancer-type: "Internal"
          service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
          service.beta.kubernetes.io/azure-load-balancer-internal: "true"
          service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
          service.beta.kubernetes.io/cce-load-balancer-internal-vpc: "true"
      # type ServiceSpec struct from k8s.io/core/v1
      spec:
        ports:
          - name: http
            port: 8123
          - name: client
            port: 9000
        type: LoadBalancer
.spec.templates.volumeClaimTemplates
The .spec.templates.volumeClaimTemplates section defines the PersistentVolumeClaims. For more information, see the Kubernetes PersistentVolumeClaim page.
.spec.templates.volumeClaimTemplates Example
templates:
  volumeClaimTemplates:
    - name: default-volume-claim
      # type PersistentVolumeClaimSpec struct from k8s.io/core/v1
      spec:
        # 1. If storageClassName is not specified, the default StorageClass
        #    (which must be specified by the cluster administrator) is used for provisioning.
        # 2. If storageClassName is set to an empty string (''), no storage class is used and
        #    dynamic provisioning is disabled for this PVC. Existing "Available" PVs
        #    (that do not have a specified storageClassName) will be considered for binding to the PVC.
        #storageClassName: gold
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
.spec.templates.podTemplates
The .spec.templates.podTemplates section defines the Pod Templates. For more information, see the Kubernetes Pod Templates page.
The following additional sections have been defined for the ClickHouse cluster:
- zone
- distribution
Together, zone and distribution define the zoned layout of ClickHouse instances over nodes. They ensure that affinity.nodeAffinity and affinity.podAntiAffinity are set.
.spec.templates.podTemplates Example
To place ClickHouse instances in the AWS us-east-1a availability zone with one ClickHouse instance per host:
zone:
  values:
    - "us-east-1a"
distribution: "OnePerHost"
To place ClickHouse instances on nodes labeled clickhouse=allow, with one ClickHouse instance per host:
zone:
  key: "clickhouse"
  values:
    - "allow"
distribution: "OnePerHost"
Or the distribution can be Unspecified:
templates:
  podTemplates:
    # multiple pod templates make it possible to update the version smoothly
    # pod template for ClickHouse v18.16.1
    - name: clickhouse-v18.16.1
      # We may need to label nodes with the clickhouse=allow label for this example to run
      # See ./label_nodes.sh for this purpose
      zone:
        key: "clickhouse"
        values:
          - "allow"
      # Shortcut version for AWS installations
      #zone:
      #  values:
      #    - "us-east-1a"
      # Possible values for distribution are:
      #  Unspecified
      #  OnePerHost
      distribution: "Unspecified"
      # type PodSpec struct {} from k8s.io/core/v1
      spec:
        containers:
          - name: clickhouse
            image: yandex/clickhouse-server:18.16.1
            volumeMounts:
              - name: default-volume-claim
                mountPath: /var/lib/clickhouse
            resources:
              requests:
                memory: "64Mi"
                cpu: "100m"
              limits:
                memory: "64Mi"
                cpu: "100m"
References
- custom-resource
- 99-clickhouseinstallation-max.yaml
- server-settings_zookeeper
- ClickHouse settings
- ClickHouse external_dicts_dict
- Kubernetes service
- Kubernetes persistentvolumeclaims
- Kubernetes pod-templates