Configuration Guide

How to configure your Altinity Kubernetes Operator cluster.

Depending on your organization's needs and environment, you can adjust the Altinity Kubernetes Operator settings or your cluster settings to best fit your deployment.

1 - ClickHouse Operator Settings

Settings and configurations for the Altinity Kubernetes Operator

Altinity Kubernetes Operator 0.18 and greater

For versions 0.18 and later, modify the Altinity Kubernetes Operator settings in the etc-clickhouse-operator-files section of the clickhouse-operator-install-bundle.yaml file. This section holds the config.yaml settings, which control the user configuration and other operator settings. For more information, see the sample config.yaml for the Altinity Kubernetes Operator.
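As a rough sketch of where these settings live, the etc-clickhouse-operator-files section of the bundle is a Kubernetes ConfigMap that carries config.yaml as one of its data entries. The namespace and the placeholder content below are assumptions for illustration; consult the bundle for your operator version for the actual keys.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: etc-clickhouse-operator-files
  namespace: kube-system   # assumption: use the namespace the operator is installed in
data:
  config.yaml: |
    # Operator settings go here. The available keys depend on the operator
    # version; see the sample config.yaml for the Altinity Kubernetes Operator.
```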

Altinity Kubernetes Operator before 0.18

For versions before 0.18, the Altinity Kubernetes Operator settings can be modified through the clickhouse-operator-install-bundle.yaml file in the section marked ClickHouse Settings Section.

New User Settings

| Setting | Default Value | Description |
|---|---|---|
| chConfigUserDefaultProfile | default | Sets the default profile used when creating new users. |
| chConfigUserDefaultQuota | default | Sets the default quota used when creating new users. |
| chConfigUserDefaultNetworksIP | ::1, 127.0.0.1, 0.0.0.0 | Specifies the networks that the user can connect from. Note that 0.0.0.0 allows access from all networks. |
| chConfigUserDefaultPassword | default | The initial password for new users. |
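As an illustration, the defaults listed above could be written as the following YAML. This is a sketch that assumes the settings appear as flat top-level keys in the operator configuration; verify the layout against the configuration for your operator version.

```yaml
# Defaults applied to newly created ClickHouse users (values taken from the table).
chConfigUserDefaultProfile: default
chConfigUserDefaultQuota: default
chConfigUserDefaultNetworksIP:
  - "::1"
  - "127.0.0.1"
  - "0.0.0.0"   # allows access from all networks
chConfigUserDefaultPassword: "default"
```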

ClickHouse Operator Settings

The ClickHouse Operator role can connect to the ClickHouse database to perform the following:

  • Metrics requests
  • Schema Maintenance
  • Drop DNS Cache

Additional users can be created with this role by modifying the users.d XML files.

| Setting | Default Value | Description |
|---|---|---|
| chUsername | clickhouse_operator | The username for the ClickHouse Operator user. |
| chPassword | clickhouse_operator_password | The default password for the ClickHouse Operator user. |
| chPort | 8123 | The port that the ClickHouse Operator user connects to. |
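Changing the operator's credentials from their defaults might look like the following. The flat key layout is an assumption, and the password shown is a placeholder, not a recommended value.

```yaml
# Account the operator uses for metrics requests, schema maintenance, and dropping the DNS cache.
chUsername: clickhouse_operator
chPassword: replace-with-a-strong-password   # placeholder value
chPort: 8123
```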

Log Parameters

The Log Parameters section sets the options for log outputs and levels.

| Setting | Default Value | Description |
|---|---|---|
| logtostderr | true | If set to true, submits logs to stderr instead of log files. |
| alsologtostderr | false | If true, submits logs to stderr as well as log files. |
| v | 1 | Sets the V-leveled logging level. |
| stderrthreshold | "" | The error threshold. Errors at or above this level will be submitted to stderr. |
| vmodule | "" | A comma-separated list of modules and their verbosity levels in the form {module name}={log level}. For example: "module1=2,module2=3". |
| log_backtrace_at | "" | A file:line location. When logging reaches that location, a stack backtrace is emitted. |
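A sketch of the log parameters with a raised verbosity for two modules follows. The module names come from the example in the table and are placeholders; writing the values as quoted strings and the flat key layout are assumptions.

```yaml
# Log to stderr only, with a default verbosity of 1.
logtostderr: "true"
alsologtostderr: "false"
v: "1"
stderrthreshold: ""
# Per-module verbosity overrides; module1 and module2 are placeholder names.
vmodule: "module1=2,module2=3"
log_backtrace_at: ""
```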

Runtime Parameters

The Runtime Parameters section sets the resources allocated for processes such as reconcile functions.

| Setting | Default Value | Description |
|---|---|---|
| reconcileThreadsNumber | 10 | The number of threads allocated to manage reconcile requests. |
| reconcileWaitExclude | false | ??? |
| reconcileWaitInclude | false | ??? |

Template Parameters

The Template Parameters section sets connection values, user default settings, and other values. These values are based on ClickHouse configurations. For full details, see the ClickHouse documentation page.

2 - ClickHouse Cluster Settings

Settings and configurations for clusters and nodes

ClickHouse clusters that are configured on Kubernetes have several options based on the Kubernetes Custom Resources settings. Your cluster may have particular requirements to best fit your organization's needs.

For an example of a configuration file using each of these settings, see the 99-clickhouseinstallation-max.yaml file as a template.

This assumes that you have installed the clickhouse-operator.

Initial Settings

The first section sets the resource kind and API version.

| Parent | Setting | Type | Description |
|---|---|---|---|
| None | kind | String | Specifies the type of resource to install. In this case, a ClickHouse cluster. Valid value: ClickHouseInstallation |
| None | metadata | Object | Assigns metadata values for the cluster. |
| metadata | name | String | The name of the resource. |
| metadata | labels | Object (key-value pairs) | Labels applied to the resource. |
| metadata | annotations | Object (key-value pairs) | Annotations applied to the resource. |

Initial Settings Example

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "clickhouse-installation-max"
  labels:
    label1: label1_value
    label2: label2_value
  annotations:
    annotation1: annotation1_value
    annotation2: annotation2_value

.spec.defaults

The .spec.defaults section sets default values for the sections that follow .spec.defaults.

| Parent | Setting | Type | Description |
|---|---|---|---|
| defaults | replicasUseFQDN | Yes or No | Whether replicas are specified by FQDN in <host></host>. |
| defaults | distributedDDL | String | Sets the <yandex><distributed_ddl></distributed_ddl></yandex> configuration settings. For more information, see Distributed DDL Queries (ON CLUSTER Clause). |
| defaults | templates | Array | Sets the default templates, such as the pod template. The templates are referenced here by name and defined later in .spec.templates. |

.spec.defaults Example

  defaults:
    replicasUseFQDN: "no"
    distributedDDL:
      profile: default
    templates:
      podTemplate: clickhouse-v18.16.1
      dataVolumeClaimTemplate: default-volume-claim
      logVolumeClaimTemplate: default-volume-claim
      serviceTemplate: chi-service-template

.spec.configuration

The .spec.configuration section represents sources for ClickHouse configuration files. For more information, see the ClickHouse Configuration Files page.

.spec.configuration Example

  configuration:
    users:
      readonly/profile: readonly
      #     <users>
      #        <readonly>
      #          <profile>readonly</profile>
      #        </readonly>
      #     </users>
      test/networks/ip:
        - "127.0.0.1"
        - "::/0"
      #     <users>
      #        <test>
      #          <networks>
      #            <ip>127.0.0.1</ip>
      #            <ip>::/0</ip>
      #          </networks>
      #        </test>
      #     </users>
      test/profile: default
      test/quotas: default

.spec.configuration.zookeeper

.spec.configuration.zookeeper defines the ZooKeeper settings and is expanded into the <yandex><zookeeper></zookeeper></yandex> configuration section. For more information, see ClickHouse Zookeeper settings.

.spec.configuration.zookeeper Example

    zookeeper:
      nodes:
        - host: zookeeper-0.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
        - host: zookeeper-1.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
        - host: zookeeper-2.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
      session_timeout_ms: 30000
      operation_timeout_ms: 10000
      root: /path/to/zookeeper/node
      identity: user:password

.spec.configuration.profiles

.spec.configuration.profiles defines the ClickHouse profiles that are stored in <yandex><profiles></profiles></yandex>. For more information, see the ClickHouse Server Settings page.

.spec.configuration.profiles Example

    profiles:
      readonly/readonly: 1

expands into

      <profiles>
        <readonly>
          <readonly>1</readonly>
        </readonly>
      </profiles>
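The same slash-separated convention applies to other profile settings: the part before the slash is the profile name and the part after it is the setting. As a hedged illustration, assuming a profile named default and ClickHouse's max_memory_usage setting (neither appears in the example above), the entry and its expected expansion would look like this:

```yaml
configuration:
  profiles:
    # <profile name>/<setting name>: <value>
    default/max_memory_usage: 10000000000
    #     <profiles>
    #        <default>
    #          <max_memory_usage>10000000000</max_memory_usage>
    #        </default>
    #     </profiles>
```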

.spec.configuration.users

.spec.configuration.users defines the users and is stored in <yandex><users></users></yandex>. For more information, see the Configuration Files page.

.spec.configuration.users Example

    users:
      test/networks/ip:
        - "127.0.0.1"
        - "::/0"

expands into

     <users>
        <test>
          <networks>
            <ip>127.0.0.1</ip>
            <ip>::/0</ip>
          </networks>
        </test>
     </users>

.spec.configuration.settings

.spec.configuration.settings sets other ClickHouse settings, such as compression. For more information, see the ClickHouse Server Settings page.

.spec.configuration.settings Example

    settings:
      compression/case/method: "zstd"
      #     <compression>
      #       <case>
      #         <method>zstd</method>
      #       </case>
      #     </compression>

.spec.configuration.files

.spec.configuration.files creates custom files used in the cluster. These are used for custom configurations, such as the ClickHouse External Dictionary.

.spec.configuration.files Example

    files:
      dict1.xml: |
        <yandex>
            <!-- ref to file /etc/clickhouse-data/config.d/source1.csv -->
        </yandex>
      source1.csv: |
        a1,b1,c1,d1
        a2,b2,c2,d2

The following example stores a dictionary definition as a custom file and points dictionaries_config at it:

spec:
  configuration:
    settings:
      dictionaries_config: config.d/*.dict
    files:
      dict_one.dict: |
        <yandex>
          <dictionary>
            <name>one</name>
            <source>
              <clickhouse>
                <host>localhost</host>
                <port>9000</port>
                <user>default</user>
                <password/>
                <db>system</db>
                <table>one</table>
              </clickhouse>
            </source>
            <lifetime>60</lifetime>
            <layout><flat/></layout>
            <structure>
              <id>
                <name>dummy</name>
              </id>
              <attribute>
                <name>one</name>
                <expression>dummy</expression>
                <type>UInt8</type>
                <null_value>0</null_value>
              </attribute>
            </structure>
          </dictionary>
        </yandex>

.spec.configuration.clusters

.spec.configuration.clusters defines the ClickHouse clusters to be installed.

    clusters:

Clusters and Layouts

.clusters.layout defines the ClickHouse layout of a cluster. This can be general, or very granular depending on your requirements. For full information, see Cluster Deployment.

Templates

podTemplate defines the pods in the cluster, mainly the ones that run ClickHouse. The VolumeClaimTemplate settings define the storage volumes. Both kinds of settings are applied per replica.

Basic Dimensions

Basic dimensions define the size of the cluster without specifying particular details of the shards or nodes.

| Parent | Setting | Type | Description |
|---|---|---|---|
| .clusters.layout | shardsCount | Number | The number of shards for the cluster. |
| .clusters.layout | replicasCount | Number | The number of replicas for the cluster. |

Basic Dimensions Example

In this example, the cluster all-counts uses the pod template clickhouse-v18.16.1 and is laid out as three shards with two replicas each.

      - name: all-counts
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shardsCount: 3
          replicasCount: 2

This is expanded into the following configuration. The IP addresses and DNS configuration are assigned by Kubernetes and the operator.

<yandex>
    <remote_servers>
        <all-counts>
        
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.2</host>
                    <port>9000</port>
                </replica>
            </shard>
            
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.3</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.4</host>
                    <port>9000</port>
                </replica>
            </shard>
            
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.5</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.6</host>
                    <port>9000</port>
                </replica>
            </shard>
            
        </all-counts>
    </remote_servers>
</yandex>

Specified Dimensions

The templates section can also be used to specify more than just the general layout. The exact definitions of the shards and replicas can be defined as well.

In this example, shard0 has replicasCount specified, while shard1 has three replicas explicitly listed, which makes it possible to customize each replica.

      - name: all-counts
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shardsCount: 3
          replicasCount: 2
      - name: customized
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shards:
            - name: shard0
              replicasCount: 3
              weight: 1
              internalReplication: Disabled
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim

            - name: shard1
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim
              replicas:
                - name: replica0
                - name: replica1
                - name: replica2

Other configurations combine both approaches. In this example the shard sets replicasCount, but only one replica is explicitly defined, and it uses a different podTemplate.

      - name: customized
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shards:
            - name: shard2
              replicasCount: 3
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim
              replicas:
                - name: replica0
                  port: 9000
                  templates:
                    podTemplate: clickhouse-v19.11.3.11
                    dataVolumeClaimTemplate: default-volume-claim
                    logVolumeClaimTemplate: default-volume-claim

.spec.templates.serviceTemplates

.spec.templates.serviceTemplates represents Kubernetes Service templates, with additional fields.

At the top level is generateName, which explicitly specifies the name of the service to be created. generateName understands a set of macros that depends on the service level of the object being created. The service levels are:

  • CHI
  • Cluster
  • Shard
  • Replica

The macros and the service levels where they apply are:

| Setting | CHI | Cluster | Shard | Replica | Description |
|---|---|---|---|---|---|
| {chi} | X | X | X | X | ClickHouseInstallation name |
| {chiID} | X | X | X | X | short hashed ClickHouseInstallation name (Experimental) |
| {cluster} |  | X | X | X | The cluster name |
| {clusterID} |  | X | X | X | short hashed cluster name (BEWARE, this is an experimental feature) |
| {clusterIndex} |  | X | X | X | 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature) |
| {shard} |  |  | X | X | shard name |
| {shardID} |  |  | X | X | short hashed shard name (BEWARE, this is an experimental feature) |
| {shardIndex} |  |  | X | X | 0-based index of the shard in the cluster (BEWARE, this is an experimental feature) |
| {replica} |  |  |  | X | replica name |
| {replicaID} |  |  |  | X | short hashed replica name (BEWARE, this is an experimental feature) |
| {replicaIndex} |  |  |  | X | 0-based index of the replica in the shard (BEWARE, this is an experimental feature) |

.spec.templates.serviceTemplates Example

  templates:
    serviceTemplates:
      - name: chi-service-template
        # generateName understands different sets of macros,
        # depending on the level of the object for which the Service is being created:
        #
        # For CHI-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        #
        # For Cluster-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        #
        # For Shard-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        # 6. {shard} - shard name
        # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
        # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
        #
        # For Replica-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        # 6. {shard} - shard name
        # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
        # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
        # 9. {replica} - replica name
        # 10. {replicaID} - short hashed replica name (BEWARE, this is an experimental feature)
        # 11. {replicaIndex} - 0-based index of the replica in the shard (BEWARE, this is an experimental feature)
        generateName: "service-{chi}"
        # type ObjectMeta struct from k8s.io/meta/v1
        metadata:
          labels:
            custom.label: "custom.value"
          annotations:
            cloud.google.com/load-balancer-type: "Internal"
            service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
            service.beta.kubernetes.io/azure-load-balancer-internal: "true"
            service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
            service.beta.kubernetes.io/cce-load-balancer-internal-vpc: "true"
        # type ServiceSpec struct from k8s.io/core/v1
        spec:
          ports:
            - name: http
              port: 8123
            - name: client
              port: 9000
          type: LoadBalancer
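As a further illustration of the macros, a service template intended for replica-level services could combine several macros so that each generated name is unique. This is a sketch only: the template name replica-service-template is hypothetical, ClusterIP is simply one valid Kubernetes Service type, and how the template is attached at the replica level is not shown here.

```yaml
templates:
  serviceTemplates:
    - name: replica-service-template   # hypothetical template name
      # All four macros are available at the replica level.
      generateName: "service-{chi}-{cluster}-{shard}-{replica}"
      spec:
        ports:
          - name: http
            port: 8123
          - name: client
            port: 9000
        type: ClusterIP
```

With the installation name clickhouse-installation-max, a cluster named all-counts, and a shard and replica named shard0 and replica0, such a pattern would be expected to produce a name along the lines of service-clickhouse-installation-max-all-counts-shard0-replica0.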

.spec.templates.volumeClaimTemplates

.spec.templates.volumeClaimTemplates defines the PersistentVolumeClaims. For more information, see the Kubernetes PersistentVolumeClaim page.

.spec.templates.volumeClaimTemplates Example

  templates:
    volumeClaimTemplates:
      - name: default-volume-claim
        # type PersistentVolumeClaimSpec struct from k8s.io/core/v1
        spec:
          # 1. If storageClassName is not specified, the default StorageClass
          # (must be specified by the cluster administrator) is used for provisioning.
          # 2. If storageClassName is set to an empty string (""), no storage class is used and
          # dynamic provisioning is disabled for this PVC. Existing "Available" PVs
          # (that do not have a specified storageClassName) are considered for binding to the PVC.
          #storageClassName: gold
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi

.spec.templates.podTemplates

.spec.templates.podTemplates defines the Pod Templates. For more information, see the Kubernetes Pod Templates page.

The following additional sections have been defined for the ClickHouse cluster:

  1. zone
  2. distribution

zone and distribution together define the zoned layout of ClickHouse instances over nodes. They ensure that affinity.nodeAffinity and affinity.podAntiAffinity are set.
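For orientation, zone and distribution relate to standard Kubernetes affinity rules along the following lines. This is only a sketch of the kind of PodSpec affinity involved, assuming a zone with key clickhouse and value allow together with a OnePerHost distribution. The app: clickhouse pod label is hypothetical, and the rules the operator actually generates may differ.

```yaml
affinity:
  # zone: schedule pods only on nodes that carry the label clickhouse=allow.
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: "clickhouse"
              operator: In
              values:
                - "allow"
  # distribution OnePerHost: never co-locate two matching pods on the same node.
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: clickhouse   # hypothetical pod label
        topologyKey: "kubernetes.io/hostname"
```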

.spec.templates.podTemplates Example

To place ClickHouse instances in the AWS us-east-1a availability zone with one ClickHouse instance per host:

        zone:
          values:
            - "us-east-1a"
        distribution: "OnePerHost"

To place ClickHouse instances on nodes labeled clickhouse=allow with one ClickHouse instance per host:

        zone:
          key: "clickhouse"
          values:
            - "allow"
        distribution: "OnePerHost"

Or the distribution can be Unspecified:

  templates:
    podTemplates:
      # multiple pod templates make it possible to update versions smoothly
      # pod template for ClickHouse v18.16.1
      - name: clickhouse-v18.16.1
        # We may need to label nodes with clickhouse=allow label for this example to run
        # See ./label_nodes.sh for this purpose
        zone:
          key: "clickhouse"
          values:
            - "allow"
        # Shortcut version for AWS installations
        #zone:
        #  values:
        #    - "us-east-1a"

        # Possible values for distribution are:
        # Unspecified
        # OnePerHost
        distribution: "Unspecified"

        # type PodSpec struct {} from k8s.io/core/v1
        spec:
          containers:
            - name: clickhouse
              image: yandex/clickhouse-server:18.16.1
              volumeMounts:
                - name: default-volume-claim
                  mountPath: /var/lib/clickhouse
              resources:
                requests:
                  memory: "64Mi"
                  cpu: "100m"
                limits:
                  memory: "64Mi"
                  cpu: "100m"
