Operator Guide

Installation and Management of clickhouse-operator for Kubernetes

The Altinity Kubernetes Operator is an open source project managed and maintained by Altinity Inc. This Operator Guide is created to help users with installation, configuration, maintenance, and other important tasks.

1 - Installation Guide

Basic and custom installation instructions of the clickhouse-operator

Depending on your organization and its needs, there are different ways of installing the Kubernetes clickhouse-operator.

1.1 - Basic Installation Guide

The simple method of installing the Altinity Kubernetes Operator

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

To install the Altinity Kubernetes Operator for Kubernetes:

  1. Deploy the Altinity Kubernetes Operator from the manifest directly from GitHub. It is recommended that the version be specified during installation - this ensures maximum compatibility and that all replicated environments work from the same version. For more information on installing other versions of the Altinity Kubernetes Operator, see the Specific Version Installation Guide.

    The most current version is 0.18.3:

kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
  2. The following will be displayed on a successful installation. For more information on the resources created in the installation, see Altinity Kubernetes Operator Resources.
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created
  3. Verify the installation by running:
kubectl get pods --namespace kube-system

The following will be displayed on a successful installation, with your particular image:

NAME                                   READY   STATUS    RESTARTS      AGE
clickhouse-operator-857c69ffc6-ttnsj   2/2     Running   0             4s
coredns-78fcd69978-nthp2               1/1     Running   4 (23h ago)   51d
etcd-minikube                          1/1     Running   4 (23h ago)   51d
kube-apiserver-minikube                1/1     Running   4 (23h ago)   51d
kube-controller-manager-minikube       1/1     Running   4 (23h ago)   51d
kube-proxy-lsggn                       1/1     Running   4 (23h ago)   51d
kube-scheduler-minikube                1/1     Running   4 (23h ago)   51d
storage-provisioner                    1/1     Running   9 (23h ago)   51d

1.2 - Custom Installation Guide

How to install a customized Altinity Kubernetes Operator

Users who need to customize their Altinity Kubernetes Operator namespace, or who cannot connect directly to GitHub from the installation environment, can perform a custom install.

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

Script Install into Namespace

By default, the Altinity Kubernetes Operator is installed into the kube-system namespace when using the Basic Installation instructions. To install into a different namespace, use the following command, replacing {custom namespace here} with the namespace to use:

curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator-web-installer/clickhouse-operator-install.sh | OPERATOR_NAMESPACE={custom_namespace_here} bash

For example, to install into the test-clickhouse-operator namespace, use:

curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator-web-installer/clickhouse-operator-install.sh | OPERATOR_NAMESPACE=test-clickhouse-operator bash
Setup ClickHouse Operator into 'test-clickhouse-operator' namespace
No 'test-clickhouse-operator' namespace found. Going to create
namespace/test-clickhouse-operator created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com created
customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com created
serviceaccount/clickhouse-operator created
clusterrole.rbac.authorization.k8s.io/clickhouse-operator-test-clickhouse-operator configured
clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-test-clickhouse-operator configured
configmap/etc-clickhouse-operator-files created
configmap/etc-clickhouse-operator-confd-files created
configmap/etc-clickhouse-operator-configd-files created
configmap/etc-clickhouse-operator-templatesd-files created
configmap/etc-clickhouse-operator-usersd-files created
deployment.apps/clickhouse-operator created
service/clickhouse-operator-metrics created

If no OPERATOR_NAMESPACE value is set, then the Altinity Kubernetes Operator will be installed into kube-system.

Manual Install into Namespace

Organizations that cannot access GitHub directly from the environment where the Altinity Kubernetes Operator is being installed can perform a manual install through the following steps:

  1. Download the install template file: clickhouse-operator-install-template.yaml.

  2. Edit the file and set the OPERATOR_NAMESPACE value. (A command-line substitution sketch is shown after this list.)

  3. Use the following command, replacing {your file name} with the name of your YAML file:

    namespace = "custom-clickhouse-operator"
    bash("sed -i s/'${OPERATOR_NAMESPACE}'/test-clickhouse-operator/ clickhouse-operator-install-template.yaml", add_to_text=False)
    bash(f"kubectl apply -f clickhouse-operator-install-template.yaml", add_to_text=False)
    
    try:
    
        retry(bash, timeout=60, delay=1)("kubectl get pods --namespace test-clickhouse-operator "
            "-o=custom-columns=NAME:.metadata.name,STATUS:.status.phase",
            exitcode=0, message="Running", lines=slice(1, None),
            fail_message="not all pods in Running state", add_to_text=true)
    
    finally:
        bash(f"kubectl delete namespace test-clickhouse-operator', add_to_text=False)
    
    kubectl apply -f {your file name}
    

    For example:

    kubectl apply -f customtemplate.yaml
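
If you prefer to substitute the namespace from the command line rather than editing the template by hand, a sketch like the following can be used. It assumes the template contains the ${OPERATOR_NAMESPACE} placeholder (as in the envsubst example below) and that the output file name is your choice:

sed 's/${OPERATOR_NAMESPACE}/custom-clickhouse-operator/g' clickhouse-operator-install-template.yaml > clickhouse-operator-install.yaml
kubectl apply -f clickhouse-operator-install.yaml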
    

Alternatively, instead of using the install template, enter the following into your console (bash is used below, modify depending on your particular shell). Change the OPERATOR_NAMESPACE value to match your namespace.

# Namespace to install operator into
OPERATOR_NAMESPACE="${OPERATOR_NAMESPACE:-clickhouse-operator}"
# Namespace to install metrics-exporter into
METRICS_EXPORTER_NAMESPACE="${OPERATOR_NAMESPACE}"

# Operator's docker image
OPERATOR_IMAGE="${OPERATOR_IMAGE:-altinity/clickhouse-operator:latest}"
# Metrics exporter's docker image
METRICS_EXPORTER_IMAGE="${METRICS_EXPORTER_IMAGE:-altinity/metrics-exporter:latest}"

# Setup Altinity Kubernetes Operator into specified namespace
kubectl apply --namespace="${OPERATOR_NAMESPACE}" -f <( \
    curl -s https://raw.githubusercontent.com/Altinity/clickhouse-operator/master/deploy/operator/clickhouse-operator-install-template.yaml | \
        OPERATOR_IMAGE="${OPERATOR_IMAGE}" \
        OPERATOR_NAMESPACE="${OPERATOR_NAMESPACE}" \
        METRICS_EXPORTER_IMAGE="${METRICS_EXPORTER_IMAGE}" \
        METRICS_EXPORTER_NAMESPACE="${METRICS_EXPORTER_NAMESPACE}" \
        envsubst \
)

Verify Installation

To verify the Altinity Kubernetes Operator is running in your namespace, use the following command:

kubectl get pods -n clickhouse-operator
NAME                                   READY   STATUS    RESTARTS   AGE
clickhouse-operator-5d9496dd48-8jt8h   2/2     Running   0          16s

1.3 - Source Build Guide - 0.18 and Up

How to build the Altinity Kubernetes Operator from source code

Organizations that prefer to build the software directly from source code can compile the Altinity Kubernetes Operator and install it into a Docker container through the following process. The following procedure is available for versions of the Altinity Kubernetes Operator 0.18.0 and up.

Binary Build

Binary Build Requirements

  • go-lang compiler: Go.
  • Go mod Package Manager.
  • The source code from the Altinity Kubernetes Operator repository. This can be downloaded using git clone https://github.com/altinity/clickhouse-operator.

Binary Build Instructions

  1. Switch working dir to clickhouse-operator.

  2. Install the Go toolchain if it is not already present, for example with sudo apt install -y golang on Debian-based systems.

  3. Build the sources with go build -o ./clickhouse-operator cmd/operator/main.go.

This creates the Altinity Kubernetes Operator binary. This binary is only used within a Kubernetes environment.
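
Putting the steps together, a minimal end-to-end sketch of the binary build (assuming git and the Go toolchain from the requirements above are already installed) is:

    git clone https://github.com/altinity/clickhouse-operator
    cd clickhouse-operator
    go build -o ./clickhouse-operator cmd/operator/main.go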

Docker Image Build and Usage

Docker Build Requirements

Install Docker Buildx CLI plugin

  1. Download the Docker Buildx binary from the releases page on GitHub

  2. Create folder structure for plugin

    mkdir -p ~/.docker/cli-plugins/
    
  3. Rename the relevant binary and copy it to the destination matching your OS

    mv buildx-v0.7.1.linux-amd64  ~/.docker/cli-plugins/docker-buildx
    
  4. On Unix environments, it may also be necessary to make it executable with chmod +x:

    chmod +x ~/.docker/cli-plugins/docker-buildx
    
  5. Set buildx as the default builder

    docker buildx install
    
  6. Create config.json file to enable the plugin

    touch ~/.docker/config.json
    
  7. Add the following to config.json to enable the plugin

    echo '{"experimental": "enabled"}' >> ~/.docker/config.json
    

Docker Build Instructions

  1. Switch working dir to clickhouse-operator

  2. Build the Docker image with: docker build -f dockerfile/operator/Dockerfile -t altinity/clickhouse-operator:dev .

  3. Register the freshly built Docker image inside the Kubernetes environment with the following (a quick check of the result is sketched after this list):

    docker save altinity/clickhouse-operator | (eval $(minikube docker-env) && docker load)
    
  4. Install the Altinity Kubernetes Operator as described in either the Basic Installation Guide or the Custom Installation Guide.
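
To confirm the image is visible inside the minikube Docker environment before installing, a check along these lines can be used (this assumes minikube, as in step 3 above):

    eval $(minikube docker-env) && docker images | grep clickhouse-operator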

1.4 - Specific Version Installation Guide

How to install a specific version of the Altinity Kubernetes Operator

Users may want to install a specific version of the Altinity Kubernetes Operator for a variety of reasons: to maintain parity between different environments, to preserve the version between replicas, or other reasons.

The following procedures detail how to install a specific version of the Altinity Kubernetes Operator in the default Kubernetes namespace kube-system. For instructions on performing custom installations based on the namespace and other settings, see the Custom Installation Guide.

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

Altinity Kubernetes Operator Versions After 0.17.0

To install a specific version of the Altinity Kubernetes Operator after version 0.17.0:

  1. Apply the manifest with kubectl directly from the GitHub Altinity Kubernetes Operator repository, or download the manifest and apply it locally. The format for the URL is:

    https://github.com/Altinity/clickhouse-operator/raw/{OPERATOR_VERSION}/deploy/operator/clickhouse-operator-install-bundle.yaml
    

    Replace the {OPERATOR_VERSION} with the version to install. For example, for the Altinity Kubernetes Operator version 0.18.3, the URL would be:

    https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml

    The command to apply the manifest through kubectl is:

    kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
    
    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com configured
    serviceaccount/clickhouse-operator created
    clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
    clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system created
    configmap/etc-clickhouse-operator-files created
    configmap/etc-clickhouse-operator-confd-files created
    configmap/etc-clickhouse-operator-configd-files created
    configmap/etc-clickhouse-operator-templatesd-files created
    configmap/etc-clickhouse-operator-usersd-files created
    deployment.apps/clickhouse-operator created
    service/clickhouse-operator-metrics created
    
  2. Verify the installation is complete and the clickhouse-operator pod is running:

    kubectl get pods --namespace kube-system
    

    A similar result to the following will be displayed on a successful installation:

    NAME                                   READY   STATUS    RESTARTS      AGE
    clickhouse-operator-857c69ffc6-q8qrr   2/2     Running   0             5s
    coredns-78fcd69978-nthp2               1/1     Running   4 (23h ago)   51d
    etcd-minikube                          1/1     Running   4 (23h ago)   51d
    kube-apiserver-minikube                1/1     Running   4 (23h ago)   51d
    kube-controller-manager-minikube       1/1     Running   4 (23h ago)   51d
    kube-proxy-lsggn                       1/1     Running   4 (23h ago)   51d
    kube-scheduler-minikube                1/1     Running   4 (23h ago)   51d
    storage-provisioner                    1/1     Running   9 (23h ago)   51d
    
  3. To verify the version of the Altinity Kubernetes Operator, use the following command:

    kubectl get pods -l app=clickhouse-operator --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s "[[:space:]]" | sort | uniq -c
    
    1 altinity/clickhouse-operator:0.18.3 altinity/metrics-exporter:0.18.3
    

1.5 - Upgrade Guide

How to upgrade the Altinity Kubernetes Operator

The Altinity Kubernetes Operator can be upgraded at any time by applying the new manifest from the Altinity Kubernetes Operator GitHub repository.

The following procedures detail how to install a specific version of the Altinity Kubernetes Operator in the default Kubernetes namespace kube-system. For instructions on performing custom installations based on the namespace and other settings, see the Custom Installation Guide.

Requirements

The Altinity Kubernetes Operator for Kubernetes has the following requirements:

Instructions

The following instructions are based on installations of the Altinity Kubernetes Operator greater than version 0.16.0. In the following examples, Altinity Kubernetes Operator version 0.16.0 has been installed and will be upgraded to 0.18.3.

For instructions on installing specific versions of the Altinity Kubernetes Operator, see the Specific Version Installation Guide.

  1. Deploy the Altinity Kubernetes Operator from the manifest directly from GitHub. It is recommended that the version be specified during the installation for maximum compatibility. In this example, the version being upgraded to is 0.18.3:

    kubectl apply -f https://github.com/Altinity/clickhouse-operator/raw/0.18.3/deploy/operator/clickhouse-operator-install-bundle.yaml
    
  2. The following will be displayed on a successful installation. For more information on the resources created in the installation, see Altinity Kubernetes Operator Resources.

    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallations.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseinstallationtemplates.clickhouse.altinity.com configured
    customresourcedefinition.apiextensions.k8s.io/clickhouseoperatorconfigurations.clickhouse.altinity.com configured
    serviceaccount/clickhouse-operator configured
    clusterrole.rbac.authorization.k8s.io/clickhouse-operator-kube-system configured
    clusterrolebinding.rbac.authorization.k8s.io/clickhouse-operator-kube-system configured
    configmap/etc-clickhouse-operator-files configured
    configmap/etc-clickhouse-operator-confd-files configured
    configmap/etc-clickhouse-operator-configd-files configured
    configmap/etc-clickhouse-operator-templatesd-files configured
    configmap/etc-clickhouse-operator-usersd-files configured
    deployment.apps/clickhouse-operator configured
    service/clickhouse-operator-metrics configured
    
  3. Verify the installation by running:

    kubectl get pods --namespace kube-system

    The following will be displayed on a successful installation, with your particular image:
    
    NAME                                   READY   STATUS    RESTARTS       AGE
    clickhouse-operator-857c69ffc6-dqt5l   2/2     Running   0              29s
    coredns-78fcd69978-nthp2               1/1     Running   3 (14d ago)    50d
    etcd-minikube                          1/1     Running   3 (14d ago)    50d
    kube-apiserver-minikube                1/1     Running   3 (2m6s ago)   50d
    kube-controller-manager-minikube       1/1     Running   3 (14d ago)    50d
    kube-proxy-lsggn                       1/1     Running   3 (14d ago)    50d
    kube-scheduler-minikube                1/1     Running   3 (2m6s ago)   50d
    storage-provisioner                    1/1     Running   7 (48s ago)    50d
    
  4. To verify the version of the Altinity Kubernetes Operator, use the following command:

    kubectl get pods -l app=clickhouse-operator -n kube-system -o jsonpath="{.items[*].spec.containers[*].image}" | tr -s "[[:space:]]" | sort | uniq -c
    
    1 altinity/clickhouse-operator:0.18.3 altinity/metrics-exporter:0.18.3
    

2 - Configuration Guide

How to configure your Altinity Kubernetes Operator cluster.

Depending on your organization’s needs and environment, you can modify the Altinity Kubernetes Operator settings or your cluster settings to best fit your needs.

2.1 - ClickHouse Operator Settings

Settings and configurations for the Altinity Kubernetes Operator

Altinity Kubernetes Operator 0.18 and greater

For versions of the Altinity Kubernetes Operator 0.18 and later, the Altinity Kubernetes Operator settings can be modified through the clickhouse-operator-install-bundle.yaml file in the section etc-clickhouse-operator-files. This section provides the config.yaml used for user configuration and other operator settings. For more information, see the sample config.yaml for the Altinity Kubernetes Operator.
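
After installation, these settings live in the ConfigMap created by the install bundle. As a sketch (assuming the default kube-system install), the deployed config.yaml can be inspected or edited in place; depending on the operator version, the clickhouse-operator pod may need to be restarted to pick up changes:

kubectl -n kube-system get configmap etc-clickhouse-operator-files -o yaml
kubectl -n kube-system edit configmap etc-clickhouse-operator-files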

Altinity Kubernetes Operator before 0.18

For versions before 0.18, the Altinity Kubernetes Operator settings can be modified through clickhouse-operator-install-bundle.yaml file in the section marked ClickHouse Settings Section.

New User Settings

Setting                         Default Value             Description
chConfigUserDefaultProfile      default                   Sets the default profile used when creating new users.
chConfigUserDefaultQuota        default                   Sets the default quota used when creating new users.
chConfigUserDefaultNetworksIP   ::1, 127.0.0.1, 0.0.0.0   Specifies the networks that the user can connect from. Note that 0.0.0.0 allows access from all networks.
chConfigUserDefaultPassword     default                   The initial password for new users.
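
As an illustration only, these settings appear as keys in the operator's config.yaml. The exact layout can differ between releases, so treat this fragment as an assumption and check the sample config.yaml referenced above:

chConfigUserDefaultProfile: default
chConfigUserDefaultQuota: default
chConfigUserDefaultNetworksIP:
  - "::1"
  - "127.0.0.1"
chConfigUserDefaultPassword: "default"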

ClickHouse Operator Settings

The ClickHouse Operator role can connect to the ClickHouse database to perform the following:

  • Metrics requests
  • Schema Maintenance
  • Drop DNS Cache

Additional users can be created with this role by modifying the usersd XML files.

Setting      Default Value                  Description
chUsername   clickhouse_operator            The username for the ClickHouse Operator user.
chPassword   clickhouse_operator_password   The default password for the ClickHouse Operator user.
chPort       8123                           The IP port for the ClickHouse Operator user.
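
For example, the credentials above can be checked against a running ClickHouse pod over the HTTP interface. This is only a sketch; clickhouse-host is a placeholder for the pod or service name in your cluster:

curl "http://clickhouse-host:8123/" --user clickhouse_operator:clickhouse_operator_password --data-binary "SELECT 1"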

Log Parameters

The Log Parameters section sets the options for log outputs and levels.

Setting            Default Value   Description
logtostderr        true            If set to true, submits logs to stderr instead of log files.
alsologtostderr    false           If true, submits logs to stderr as well as log files.
v                  1               Sets the V-leveled logging level.
stderrthreshold    ""              The error threshold. Errors at or above this level will be submitted to stderr.
vmodule            ""              A comma-separated list of modules and their verbose level, in the form {module name}={log level}. For example: "module1=2,module2=3".
log_backtrace_at   ""              Location to store the stack backtrace.

Runtime Parameters

The Runtime Parameters section sets the resources allocated for processes such as reconcile functions.

Setting                  Default Value   Description
reconcileThreadsNumber   10              The number of threads allocated to manage reconcile requests.
reconcileWaitExclude     false           ???
reconcileWaitInclude     false           ???

Template Parameters

Template Parameters sets connection details, user default settings, and other values based on ClickHouse configurations. For full details, see the ClickHouse documentation page.

2.2 - ClickHouse Cluster Settings

Settings and configurations for clusters and nodes

ClickHouse clusters that are configured on Kubernetes have several options based on the Kubernetes Custom Resources settings. Your cluster may have particular requirements to best fit your organization's needs.

For an example of a configuration file using each of these settings, see the 99-clickhouseinstallation-max.yaml file as a template.

This assumes that you have installed the clickhouse-operator.

Initial Settings

The first section sets the cluster kind and api.

Parent     Setting       Type     Description
None       kind          String   Specifies the type of cluster to install. In this case, ClickHouse. Valid value: ClickHouseInstallation
None       metadata      Object   Assigns metadata values for the cluster.
metadata   name          String   The name of the resource.
metadata   labels        Array    Labels applied to the resource.
metadata   annotations   Array    Annotations applied to the resource.

Initial Settings Example

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "clickhouse-installation-max"
  labels:
    label1: label1_value
    label2: label2_value
  annotations:
    annotation1: annotation1_value
    annotation2: annotation2_value

.spec.defaults

The .spec.defaults section defines default values for the sections that follow .spec.defaults.

Parent     Setting           Type         Description
defaults   replicasUseFQDN   [Yes | No]   Determines whether replicas are addressed by their fully qualified domain names (FQDN).
defaults   distributedDDL    String       Sets the <yandex><distributed_ddl></distributed_ddl></yandex> configuration settings. For more information, see Distributed DDL Queries (ON CLUSTER Clause).
defaults   templates         Array        Sets the pod template types. This is where the template is declared, then defined in the .spec.configuration later.

.spec.defaults Example

  defaults:
    replicasUseFQDN: "no"
    distributedDDL:
      profile: default
    templates:
      podTemplate: clickhouse-v18.16.1
      dataVolumeClaimTemplate: default-volume-claim
      logVolumeClaimTemplate: default-volume-claim
      serviceTemplate: chi-service-template

.spec.configuration

.spec.configuration section represents sources for ClickHouse configuration files. For more information, see the ClickHouse Configuration Files page.

.spec.configuration Example

  configuration:
    users:
      readonly/profile: readonly
      #     <users>
      #        <readonly>
      #          <profile>readonly</profile>
      #        </readonly>
      #     </users>
      test/networks/ip:
        - "127.0.0.1"
        - "::/0"
      #     <users>
      #        <test>
      #          <networks>
      #            <ip>127.0.0.1</ip>
      #            <ip>::/0</ip>
      #          </networks>
      #        </test>
      #     </users>
      test/profile: default
      test/quotas: default

.spec.configuration.zookeeper

.spec.configuration.zookeeper defines the zookeeper settings, and is expanded into the <yandex><zookeeper></zookeeper></yandex> configuration section. For more information, see ClickHouse Zookeeper settings.

.spec.configuration.zookeeper Example

    zookeeper:
      nodes:
        - host: zookeeper-0.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
        - host: zookeeper-1.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
        - host: zookeeper-2.zookeepers.zoo3ns.svc.cluster.local
          port: 2181
      session_timeout_ms: 30000
      operation_timeout_ms: 10000
      root: /path/to/zookeeper/node
      identity: user:password

.spec.configuration.profiles

.spec.configuration.profiles defines the ClickHouse profiles that are stored in <yandex><profiles></profiles></yandex>. For more information, see the ClickHouse Server Settings page.

.spec.configuration.profiles Example

    profiles:
      readonly/readonly: 1

expands into

      <profiles>
        <readonly>
          <readonly>1</readonly>
        </readonly>
      </profiles>

.spec.configuration.users

.spec.configuration.users defines the users and is stored in <yandex><users></users></yandex>. For more information, see the Configuration Files page.

.spec.configuration.users Example

  users:
    test/networks/ip:
        - "127.0.0.1"
        - "::/0"

expands into

     <users>
        <test>
          <networks>
            <ip>127.0.0.1</ip>
            <ip>::/0</ip>
          </networks>
        </test>
     </users>

.spec.configuration.settings

.spec.configuration.settings sets other ClickHouse settings such as compression, etc. For more information, see the ClickHouse Server Settings page.

.spec.configuration.settings Example

    settings:
      compression/case/method: "zstd"
#      <compression>
#       <case>
#         <method>zstd</method>
#      </case>
#      </compression>

.spec.configuration.files

.spec.configuration.files creates custom files used in the cluster. These are used for custom configurations, such as the ClickHouse External Dictionary.

.spec.configuration.files Example

    files:
      dict1.xml: |
        <yandex>
            <!-- ref to file /etc/clickhouse-data/config.d/source1.csv -->
        </yandex>        
      source1.csv: |
        a1,b1,c1,d1
        a2,b2,c2,d2        
spec:
  configuration:
    settings:
      dictionaries_config: config.d/*.dict
    files:
      dict_one.dict: |
        <yandex>
          <dictionary>
        <name>one</name>
        <source>
            <clickhouse>
                <host>localhost</host>
                <port>9000</port>
                <user>default</user>
                <password/>
                <db>system</db>
                <table>one</table>
            </clickhouse>
        </source>
        <lifetime>60</lifetime>
        <layout><flat/></layout>
        <structure>
            <id>
                <name>dummy</name>
            </id>
            <attribute>
                <name>one</name>
                <expression>dummy</expression>
                <type>UInt8</type>
                <null_value>0</null_value>
            </attribute>
        </structure>
        </dictionary>
        </yandex>        

.spec.configuration.clusters

.spec.configuration.clusters defines the ClickHouse clusters to be installed.

    clusters:

Clusters and Layouts

.clusters.layout defines the ClickHouse layout of a cluster. This can be general, or very granular depending on your requirements. For full information, see Cluster Deployment.

Templates

podTemplate is used to define the specific pods in the cluster, mainly the ones that will be running ClickHouse. The VolumeClaimTemplate defines the storage volumes. Both of these settings are applied per replica.

Basic Dimensions

Basic dimensions are used to define the cluster definitions without specifying particular details of the shards or nodes.

Parent             Setting         Type     Description
.clusters.layout   shardsCount     Number   The number of shards for the cluster.
.clusters.layout   replicasCount   Number   The number of replicas for the cluster.

Basic Dimensions Example

In this example, the podTemplate defines ClickHouse containers in a cluster called all-counts with three shards and two replicas.

      - name: all-counts
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shardsCount: 3
          replicasCount: 2

This is expanded into the following configuration. The IP addresses and DNS configuration are assigned by k8s and the operator.

<yandex>
    <remote_servers>
        <all-counts>
        
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.2</host>
                    <port>9000</port>
                </replica>
            </shard>
            
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.3</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.4</host>
                    <port>9000</port>
                </replica>
            </shard>
            
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>192.168.1.5</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>192.168.1.6</host>
                    <port>9000</port>
                </replica>
            </shard>
            
        </all-counts>
    </remote_servers>
</yandex>

Specified Dimensions

The templates section can also be used to specify more than just the general layout. The exact definitions of the shards and replicas can be defined as well.

In this example, shard0 has its replica count set with replicasCount, while shard1 has 3 replicas explicitly specified, which makes it possible to customize each replica.

        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shardsCount: 3
          replicasCount: 2
      - name: customized
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shards:
            - name: shard0
              replicasCount: 3
              weight: 1
              internalReplication: Disabled
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim

            - name: shard1
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim
              replicas:
                - name: replica0
                - name: replica1
                - name: replica2

Other examples combine the two approaches: some replicas are defined by count, and only one is explicitly differentiated with a different podTemplate.

      - name: customized
        templates:
          podTemplate: clickhouse-v18.16.1
          dataVolumeClaimTemplate: default-volume-claim
          logVolumeClaimTemplate: default-volume-claim
        layout:
          shards:
            - name: shard2
              replicasCount: 3
              templates:
                podTemplate: clickhouse-v18.16.1
                dataVolumeClaimTemplate: default-volume-claim
                logVolumeClaimTemplate: default-volume-claim
              replicas:
                - name: replica0
                  port: 9000
                  templates:
                    podTemplate: clickhouse-v19.11.3.11
                    dataVolumeClaimTemplate: default-volume-claim
                    logVolumeClaimTemplate: default-volume-claim

.spec.templates.serviceTemplates

.spec.templates.serviceTemplates represents Kubernetes Service templates, with additional fields.

At the top level is generateName, which is used to explicitly specify the name of the service to be created. generateName understands macros for the service level of the object being created. The service levels are defined as:

  • CHI
  • Cluster
  • Shard
  • Replica

The macros and the service levels where they apply are:

Setting          CHI   Cluster   Shard   Replica   Description
{chi}            X     X         X       X         ClickHouseInstallation name
{chiID}          X     X         X       X         Short hashed ClickHouseInstallation name (experimental)
{cluster}              X         X       X         The cluster name
{clusterID}            X         X       X         Short hashed cluster name (experimental)
{clusterIndex}         X         X       X         0-based index of the cluster in the CHI (experimental)
{shard}                          X       X         Shard name
{shardID}                        X       X         Short hashed shard name (experimental)
{shardIndex}                     X       X         0-based index of the shard in the cluster (experimental)
{replica}                                X         Replica name
{replicaID}                              X         Short hashed replica name (experimental)
{replicaIndex}                           X         0-based index of the replica in the shard (experimental)

.spec.templates.serviceTemplates Example

  templates:
    serviceTemplates:
      - name: chi-service-template
        # generateName understands different sets of macros,
        # depending on the level of the object, for which Service is being created:
        #
        # For CHI-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        #
        # For Cluster-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        #
        # For Shard-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        # 6. {shard} - shard name
        # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
        # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
        #
        # For Replica-level Service:
        # 1. {chi} - ClickHouseInstallation name
        # 2. {chiID} - short hashed ClickHouseInstallation name (BEWARE, this is an experimental feature)
        # 3. {cluster} - cluster name
        # 4. {clusterID} - short hashed cluster name (BEWARE, this is an experimental feature)
        # 5. {clusterIndex} - 0-based index of the cluster in the CHI (BEWARE, this is an experimental feature)
        # 6. {shard} - shard name
        # 7. {shardID} - short hashed shard name (BEWARE, this is an experimental feature)
        # 8. {shardIndex} - 0-based index of the shard in the cluster (BEWARE, this is an experimental feature)
        # 9. {replica} - replica name
        # 10. {replicaID} - short hashed replica name (BEWARE, this is an experimental feature)
        # 11. {replicaIndex} - 0-based index of the replica in the shard (BEWARE, this is an experimental feature)
        generateName: "service-{chi}"
        # type ObjectMeta struct from k8s.io/meta/v1
        metadata:
          labels:
            custom.label: "custom.value"
          annotations:
            cloud.google.com/load-balancer-type: "Internal"
            service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
            service.beta.kubernetes.io/azure-load-balancer-internal: "true"
            service.beta.kubernetes.io/openstack-internal-load-balancer: "true"
            service.beta.kubernetes.io/cce-load-balancer-internal-vpc: "true"
        # type ServiceSpec struct from k8s.io/core/v1
        spec:
          ports:
            - name: http
              port: 8123
            - name: client
              port: 9000
          type: LoadBalancer

.spec.templates.volumeClaimTemplates

.spec.templates.volumeClaimTemplates defines the PersistentVolumeClaims. For more information, see the Kubernetes PersistentVolumeClaim page.

.spec.templates.volumeClaimTemplates Example

  templates:
    volumeClaimTemplates:
      - name: default-volume-claim
        # type PersistentVolumeClaimSpec struct from k8s.io/core/v1
        spec:
          # 1. If storageClassName is not specified, default StorageClass
          # (must be specified by cluster administrator) would be used for provisioning
          # 2. If storageClassName is set to an empty string (‘’), no storage class will be used
          # dynamic provisioning is disabled for this PVC. Existing, “Available”, PVs
          # (that do not have a specified storageClassName) will be considered for binding to the PVC
          #storageClassName: gold
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi

.spec.templates.podTemplates

.spec.templates.podTemplates defines the Pod Templates. For more information, see the Kubernetes Pod Templates.

The following additional sections have been defined for the ClickHouse cluster:

  1. zone
  2. distribution

zone and distribution together define the zoned layout of ClickHouse instances over nodes. These ensure that affinity.nodeAffinity and affinity.podAntiAffinity are set.

.spec.templates.podTemplates Example

To place ClickHouse instances in the AWS us-east-1a availability zone with one ClickHouse instance per host:

        zone:
          values:
            - "us-east-1a"
        distribution: "OnePerHost"

To place ClickHouse instances on nodes labeled as clickhouse=allow with one ClickHouse per host:

        zone:
          key: "clickhouse"
          values:
            - "allow"
        distribution: "OnePerHost"

Or the distribution can be Unspecified:

  templates:
    podTemplates:
      # multiple pod templates makes possible to update version smoothly
      # pod template for ClickHouse v18.16.1
      - name: clickhouse-v18.16.1
        # We may need to label nodes with clickhouse=allow label for this example to run
        # See ./label_nodes.sh for this purpose
        zone:
          key: "clickhouse"
          values:
            - "allow"
        # Shortcut version for AWS installations
        #zone:
        #  values:
        #    - "us-east-1a"

        # Possible values for distribution are:
        # Unspecified
        # OnePerHost
        distribution: "Unspecified"

        # type PodSpec struct {} from k8s.io/core/v1
        spec:
          containers:
            - name: clickhouse
              image: yandex/clickhouse-server:18.16.1
              volumeMounts:
                - name: default-volume-claim
                  mountPath: /var/lib/clickhouse
              resources:
                requests:
                  memory: "64Mi"
                  cpu: "100m"
                limits:
                  memory: "64Mi"
                  cpu: "100m"


3 - Resources

Altinity Kubernetes Operator Resources Details

The Altinity Kubernetes Operator creates the following resources on installation to support its functions:

  • Custom Resource Definition
  • Service account
  • Cluster Role Binding
  • Deployment

Custom Resource Definition

The Kubernetes API is extended with the new Custom Resource Definition kind: ClickHouseInstallation.

To check the Custom Resource Definition:

kubectl get customresourcedefinitions

Expected result:

NAME                                                       CREATED AT
clickhouseinstallations.clickhouse.altinity.com            2022-02-09T17:20:39Z
clickhouseinstallationtemplates.clickhouse.altinity.com    2022-02-09T17:20:39Z
clickhouseoperatorconfigurations.clickhouse.altinity.com   2022-02-09T17:20:39Z

Service Account

The new Service Account clickhouse-operator allows services running from within Pods to authenticate against the apiserver using that Service Account.

To check the Service Account:

kubectl get serviceaccounts -n kube-system

Expected result

NAME                                 SECRETS   AGE
attachdetach-controller              1         23d
bootstrap-signer                     1         23d
certificate-controller               1         23d
clickhouse-operator                  1         5s
clusterrole-aggregation-controller   1         23d
coredns                              1         23d
cronjob-controller                   1         23d
daemon-set-controller                1         23d
default                              1         23d
deployment-controller                1         23d
disruption-controller                1         23d
endpoint-controller                  1         23d
endpointslice-controller             1         23d
endpointslicemirroring-controller    1         23d
ephemeral-volume-controller          1         23d
expand-controller                    1         23d
generic-garbage-collector            1         23d
horizontal-pod-autoscaler            1         23d
job-controller                       1         23d
kube-proxy                           1         23d
namespace-controller                 1         23d
node-controller                      1         23d
persistent-volume-binder             1         23d
pod-garbage-collector                1         23d
pv-protection-controller             1         23d
pvc-protection-controller            1         23d
replicaset-controller                1         23d
replication-controller               1         23d
resourcequota-controller             1         23d
root-ca-cert-publisher               1         23d
service-account-controller           1         23d
service-controller                   1         23d
statefulset-controller               1         23d
storage-provisioner                  1         23d
token-cleaner                        1         23d
ttl-after-finished-controller        1         23d
ttl-controller                       1         23d

Cluster Role Binding

The Cluster Role Binding clickhouse-operator grants permissions defined in a role to a set of users.

Roles are granted to users, groups, or service accounts. These permissions are granted cluster-wide with a ClusterRoleBinding.

To check the Cluster Role Binding:

kubectl get clusterrolebinding

Expected result

NAME                                                   ROLE                                                                               AGE
clickhouse-operator-kube-system                        ClusterRole/clickhouse-operator-kube-system                                        5s
cluster-admin                                          ClusterRole/cluster-admin                                                          23d
kubeadm:get-nodes                                      ClusterRole/kubeadm:get-nodes                                                      23d
kubeadm:kubelet-bootstrap                              ClusterRole/system:node-bootstrapper                                               23d
kubeadm:node-autoapprove-bootstrap                     ClusterRole/system:certificates.k8s.io:certificatesigningrequests:nodeclient       23d
kubeadm:node-autoapprove-certificate-rotation          ClusterRole/system:certificates.k8s.io:certificatesigningrequests:selfnodeclient   23d
kubeadm:node-proxier                                   ClusterRole/system:node-proxier                                                    23d
minikube-rbac                                          ClusterRole/cluster-admin                                                          23d
storage-provisioner                                    ClusterRole/system:persistent-volume-provisioner                                   23d
system:basic-user                                      ClusterRole/system:basic-user                                                      23d
system:controller:attachdetach-controller              ClusterRole/system:controller:attachdetach-controller                              23d
system:controller:certificate-controller               ClusterRole/system:controller:certificate-controller                               23d
system:controller:clusterrole-aggregation-controller   ClusterRole/system:controller:clusterrole-aggregation-controller                   23d
system:controller:cronjob-controller                   ClusterRole/system:controller:cronjob-controller                                   23d
system:controller:daemon-set-controller                ClusterRole/system:controller:daemon-set-controller                                23d
system:controller:deployment-controller                ClusterRole/system:controller:deployment-controller                                23d
system:controller:disruption-controller                ClusterRole/system:controller:disruption-controller                                23d
system:controller:endpoint-controller                  ClusterRole/system:controller:endpoint-controller                                  23d
system:controller:endpointslice-controller             ClusterRole/system:controller:endpointslice-controller                             23d
system:controller:endpointslicemirroring-controller    ClusterRole/system:controller:endpointslicemirroring-controller                    23d
system:controller:ephemeral-volume-controller          ClusterRole/system:controller:ephemeral-volume-controller                          23d
system:controller:expand-controller                    ClusterRole/system:controller:expand-controller                                    23d
system:controller:generic-garbage-collector            ClusterRole/system:controller:generic-garbage-collector                            23d
system:controller:horizontal-pod-autoscaler            ClusterRole/system:controller:horizontal-pod-autoscaler                            23d
system:controller:job-controller                       ClusterRole/system:controller:job-controller                                       23d
system:controller:namespace-controller                 ClusterRole/system:controller:namespace-controller                                 23d
system:controller:node-controller                      ClusterRole/system:controller:node-controller                                      23d
system:controller:persistent-volume-binder             ClusterRole/system:controller:persistent-volume-binder                             23d
system:controller:pod-garbage-collector                ClusterRole/system:controller:pod-garbage-collector                                23d
system:controller:pv-protection-controller             ClusterRole/system:controller:pv-protection-controller                             23d
system:controller:pvc-protection-controller            ClusterRole/system:controller:pvc-protection-controller                            23d
system:controller:replicaset-controller                ClusterRole/system:controller:replicaset-controller                                23d
system:controller:replication-controller               ClusterRole/system:controller:replication-controller                               23d
system:controller:resourcequota-controller             ClusterRole/system:controller:resourcequota-controller                             23d
system:controller:root-ca-cert-publisher               ClusterRole/system:controller:root-ca-cert-publisher                               23d
system:controller:route-controller                     ClusterRole/system:controller:route-controller                                     23d
system:controller:service-account-controller           ClusterRole/system:controller:service-account-controller                           23d
system:controller:service-controller                   ClusterRole/system:controller:service-controller                                   23d
system:controller:statefulset-controller               ClusterRole/system:controller:statefulset-controller                               23d
system:controller:ttl-after-finished-controller        ClusterRole/system:controller:ttl-after-finished-controller                        23d
system:controller:ttl-controller                       ClusterRole/system:controller:ttl-controller                                       23d
system:coredns                                         ClusterRole/system:coredns                                                         23d
system:discovery                                       ClusterRole/system:discovery                                                       23d
system:kube-controller-manager                         ClusterRole/system:kube-controller-manager                                         23d
system:kube-dns                                        ClusterRole/system:kube-dns                                                        23d
system:kube-scheduler                                  ClusterRole/system:kube-scheduler                                                  23d
system:monitoring                                      ClusterRole/system:monitoring                                                      23d
system:node                                            ClusterRole/system:node                                                            23d
system:node-proxier                                    ClusterRole/system:node-proxier                                                    23d
system:public-info-viewer                              ClusterRole/system:public-info-viewer                                              23d
system:service-account-issuer-discovery                ClusterRole/system:service-account-issuer-discovery                                23d
system:volume-scheduler                                ClusterRole/system:volume-scheduler                                                23d

Cluster Role Binding Example

As an example, the role cluster-admin is granted to a service account clickhouse-operator:

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: clickhouse-operator
    namespace: kube-system

Deployment

The Deployment clickhouse-operator runs in the kube-system namespace.

To check the Deployment:

kubectl get deployments --namespace kube-system

Expected result

NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
clickhouse-operator   1/1     1            1           5s
coredns               1/1     1            1           23d


4 - Networking Connection Guides

How to connect your ClickHouse Kubernetes cluster network.

Organizations can connect their clickhouse-operator based ClickHouse cluster to their network depending on their environment. The following guides assist users in setting up connections based on their environment.

4.1 - MiniKube Networking Connection Guide

How to connect your ClickHouse Kubernetes cluster network.

Organizations that have set up the Altinity Kubernetes Operator using minikube can connect it to an external network through the following steps.

Prerequisites

The following guide is based on an installed Altinity Kubernetes Operator cluster using minikube for an Ubuntu Linux operating system.

Network Connection Guide

The proper way to connect to the ClickHouse cluster is through the LoadBalancer created during the ClickHouse cluster creation process. For example, the following ClickHouse cluster has 2 shards with one replica each, applied to the namespace test:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 1

This generates the following services in the namespace test:

kubectl get service -n test
NAME                      TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
chi-demo-01-demo-01-0-0   ClusterIP      None          <none>        8123/TCP,9000/TCP,9009/TCP      22s
chi-demo-01-demo-01-1-0   ClusterIP      None          <none>        8123/TCP,9000/TCP,9009/TCP      5s
clickhouse-demo-01        LoadBalancer   10.96.67.44   <pending>     8123:32766/TCP,9000:31368/TCP   38s

The LoadBalancer alternates which of the ClickHouse shards to connect to, and is where all ClickHouse clients should connect.

To open a connection from external networks to the LoadBalancer, use the kubectl port-forward command in the following format:

kubectl port-forward service/{LoadBalancer Service} -n {NAMESPACE} --address={IP ADDRESS} {TARGET PORT}:{INTERNAL PORT}

Replacing the following:

  • LoadBalancer Service: the LoadBalancer service to connect external ports to the Kubernetes environment.
  • NAMESPACE: The namespace for the LoadBalancer.
  • IP ADDRESS: The IP address to bind the service to on the machine running minikube, or 0.0.0.0 to bind all IP addresses on the minikube server to the specified port.
  • TARGET PORT: The external port that users will connect to.
  • INTERNAL PORT: The port within the Altinity Kubernetes Operator network.

The kubectl port-forward command must be kept running in the terminal, or placed into the background with the & operator.

In the example above, the following settings will be used to bind all IP addresses on the minikube server to the service clickhouse-demo-01 for ports 9000 and 8123 in the background:

kubectl port-forward service/clickhouse-demo-01 -n test --address=0.0.0.0 9000:9000 8123:8123 &

To test the connection, connect to the external IP address via curl. For ClickHouse HTTP, OK will be returned, while for port 9000 a notice requesting use of port 8123 will be displayed:

curl http://localhost:9000
Handling connection for 9000
Port 9000 is for clickhouse-client program
You must use port 8123 for HTTP.
curl http://localhost:8123
Handling connection for 8123
Ok.

Once verified, connect to the ClickHouse cluster via either HTTP or ClickHouse TCP as needed.
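
For example, assuming the clickhouse-client binary is installed on the machine where the ports were forwarded, a native TCP connection can be opened with the host and port from the port-forward example above:

clickhouse-client --host localhost --port 9000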

5 - Storage Guide

How to configure storage options for the Altinity Kubernetes Operator

Altinity Kubernetes Operator users have different options regarding persistent storage depending on their environment and situation. The following guides detail how to set up persistent storage for local and cloud storage environments.

5.1 - Persistent Storage Overview

Allocate persistent storage for Altinity Kubernetes Operator clusters

Users setting up storage in their local environments can establish persistent volumes in different formats based on their requirements.

Allocating Space

Space is allocated through the Kubernetes PersistentVolume object. ClickHouse clusters established with the Altinity Kubernetes Operator then use the PersistentVolumeClaim to receive persistent storage.

The PersistentVolume can be set in one of two ways:

  • Manually: Manual allocations set the storage area before the ClickHouse cluster is created. Space is then requested through a PersistentVolumeClaim when the ClickHouse cluster is created. (A sketch of a manually provisioned volume is shown below.)
  • Dynamically: Space is allocated when the ClickHouse cluster is created through the PersistentVolumeClaim, and the Kubernetes controlling software manages the process for the user.

For more information on how persistent volumes are managed in Kubernetes, see the Kubernetes documentation Persistent Volumes.
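
As an illustration of the manual path, a minimal PersistentVolume manifest might look like the following sketch. The name, capacity, storage class, and hostPath here are illustrative assumptions, not values taken from this guide:

apiVersion: v1
kind: PersistentVolume
metadata:
  # illustrative name; choose one that matches your naming scheme
  name: clickhouse-pv-example
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  # assumed storage class; must match the claim that will bind to this volume
  storageClassName: standard
  # hostPath is only suitable for single-node test environments such as minikube
  hostPath:
    path: /mnt/data/clickhouse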

Storage Types

Data for ClickHouse clusters can be stored in the following ways:

No Persistent Storage

If no persistent storage claim template is specified, then no persistent storage will be allocated. When Kubernetes is stopped or a new manifest is applied, all previous data will be lost.

In this example, two shards are specified but have no persistent storage allocated:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "no-persistent"
spec:
  configuration:
    clusters:
      - name: "no-persistent"
        layout:
          shardsCount: 2
          replicasCount: 1
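
To try this example, save the manifest to a file (the name no-persistent.yaml below is only an example) and apply it to the namespace test:

kubectl apply -n test -f no-persistent.yaml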

When applied to the namespace test, no persistent storage is found:

kubectl -n test get pv
No resources found

Cluster Wide Storage

If neither the dataVolumeClaimTemplate nor the logVolumeClaimTemplate is specified (see below), then all data is stored under the requested volumeClaimTemplate. This includes all information stored in each pod.

In this example, two shards are specified, each with one volume of storage that is used by the entire pod:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "cluster-storage"
spec:
  configuration:
    clusters:
      - name: "cluster-storage"
        layout:
          shardsCount: 2
          replicasCount: 1
        templates:
            volumeClaimTemplate: cluster-storage-vc-template
  templates:
    volumeClaimTemplates:
      - name: cluster-storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi

When applied to the namespace test, the following persistent volumes are found. Note that each pod has 500Mi of storage:

kubectl -n test get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                        STORAGECLASS   REASON   AGE
pvc-6e70c36a-f170-47b5-93a6-88175c62b8fe   500Mi      RWO            Delete           Bound    test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-1-0-0   standard                21s
pvc-ca002bc4-0ad2-4358-9546-0298eb8b2152   500Mi      RWO            Delete           Bound    test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-0-0-0   standard                39s
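
The claims behind these volumes can also be listed directly; this should show one PersistentVolumeClaim per pod, bound to the volumes above:

kubectl -n test get pvc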

Cluster Wide Split Storage

Applying the dataVolumeClaimTemplate and logVolumeClaimTemplate template types to an Altinity Kubernetes Operator-controlled ClickHouse cluster allows specific data from each ClickHouse pod to be stored in a particular persistent volume:

  • dataVolumeClaimTemplate: Sets the storage volume for the ClickHouse node data. In a traditional ClickHouse server environment, this would be allocated to /var/lib/clickhouse.
  • logVolumeClaimTemplate: Sets the storage volume for ClickHouse node log files. In a traditional ClickHouse server environment, this would be allocated to /var/log/clickhouse-server.

This allows different storage capacities for log data versus ClickHouse database data, and captures only those specific paths rather than the entire pod.

In this example, two shards have different storage capacity for dataVolumeClaimTemplate and logVolumeClaimTemplate:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "cluster-split-storage"
spec:
  configuration:
    clusters:
      - name: "cluster-split"
        layout:
          shardsCount: 2
          replicasCount: 1
        templates:
          dataVolumeClaimTemplate: data-volume-template
          logVolumeClaimTemplate: log-volume-template
  templates:
    volumeClaimTemplates:
      - name: data-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
      - name: log-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Mi

In this case, retrieving the PersistentVolume allocations shows two storage volumes per pod based on the specifications in the manifest:

kubectl -n test get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                     STORAGECLASS   REASON   AGE
pvc-0b02c5ba-7ca1-4578-b3d9-ff8bb67ad412   100Mi      RWO            Delete           Bound    test/log-volume-template-chi-cluster-split-storage-cluster-split-1-0-0    standard                21s
pvc-4095b3c0-f550-4213-aa53-a08bade7c62c   100Mi      RWO            Delete           Bound    test/log-volume-template-chi-cluster-split-storage-cluster-split-0-0-0    standard                40s
pvc-71384670-c9db-4249-ae7e-4c5f1c33e0fc   500Mi      RWO            Delete           Bound    test/data-volume-template-chi-cluster-split-storage-cluster-split-1-0-0   standard                21s
pvc-9e3fb3fa-faf3-4a0e-9465-8da556cb9eec   500Mi      RWO            Delete           Bound    test/data-volume-template-chi-cluster-split-storage-cluster-split-0-0-0   standard                40s
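
To confirm where these volumes are mounted, a check can be run inside one of the pods. The pod name below assumes the operator's usual chi-{installation}-{cluster}-{shard}-{replica}-0 naming and that the container image provides df; adjust the name to match the pods in your namespace:

kubectl -n test exec chi-cluster-split-storage-cluster-split-0-0-0 -- df -h /var/lib/clickhouse /var/log/clickhouse-server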

Pod Mount Based Storage

PersistentVolume objects can be mounted directly into the pod at the specified mountPath. Any data outside those mount points is not stored when the container is stopped, unless it is covered by another PersistentVolumeClaim.

In the following example, each of the two shards in the ClickHouse cluster has its volumes tied to specific mount points:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "pod-split-storage"
spec:
  configuration:
    clusters:
      - name: "pod-split"
        # Templates are specified for this cluster explicitly
        templates:
          podTemplate: pod-template-with-volumes
        layout:
          shardsCount: 2
          replicasCount: 1

  templates:
    podTemplates:
      - name: pod-template-with-volumes
        spec:
          containers:
            - name: clickhouse
              image: yandex/clickhouse-server:21.8
              volumeMounts:
                - name: data-storage-vc-template
                  mountPath: /var/lib/clickhouse
                - name: log-storage-vc-template
                  mountPath: /var/log/clickhouse-server

    volumeClaimTemplates:
      - name: data-storage-vc-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
      - name: log-storage-vc-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Mi
When applied to the namespace test, the PersistentVolume allocations again show two storage volumes per pod, matching the mount points in the manifest:

kubectl -n test get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                                                 STORAGECLASS   REASON   AGE
pvc-37be9f84-7ba5-404e-8299-e95a291014a8   500Mi      RWO            Delete           Bound    test/data-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0   standard                24s
pvc-5b2f8694-326d-41cb-94ec-559725947b45   100Mi      RWO            Delete           Bound    test/log-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0    standard                24s
pvc-84768e78-e44e-4295-8355-208b07330707   500Mi      RWO            Delete           Bound    test/data-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0   standard                43s
pvc-9e123af7-01ce-4ab8-9450-d8ca32b1e3a6   100Mi      RWO            Delete           Bound    test/log-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0    standard                43s