Second Cluster - Persistent Storage
We’ve shown how to create ClickHouse® clusters in Kubernetes and how to add Zookeeper so we can create replicas of clusters. Now we’re going to show how to set up persistent storage so you can change your cluster configurations without losing your hard work.
If you don’t already have the test namespace from the previous sections, create it now:
kubectl create namespace test
namespace/test created
The examples here are built from the Altinity Kubernetes Operator for ClickHouse examples, simplified down for our demonstrations.
IMPORTANT NOTE
The Altinity Stable® builds for ClickHouse do not use the latest tag. We highly encourage organizations to install a specific version of Altinity Stable builds to maximize compatibility. For information on the latest Altinity Stable Docker images, see the Altinity Stable for ClickHouse Docker page.
Create a new file called sample05.yaml with the following:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 2
        templates:
          podTemplate: clickhouse-stable
          volumeClaimTemplate: storage-vc-template
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:21.8.10.1.altinitystable
    volumeClaimTemplates:
      - name: storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
Those who have followed the previous examples will recognize the clusters being created, but there are some new additions:
- volumeClaimTemplate: This is setting up storage, and we’re specifying the class as standard (see the quick check after this list). For full details on the different storage classes see the kubectl Storage Class documentation.
- storage: We’re going to give our cluster 1 Gigabyte of storage, enough for our sample systems. If you need more space, that can be upgraded by changing these settings.
- podTemplate: Here we’ll specify what our pod types are going to be. We’re using a specific Altinity Stable build of the ClickHouse server container, but other versions can be specified to best fit your needs. For more information, see the Altinity Kubernetes Operator for ClickHouse operator guide.
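The standard storage class used above is common, but storage class names vary between Kubernetes providers, so it’s worth confirming what your cluster offers. This quick check is an addition to the walkthrough, not part of the original example:
# List the storage classes available in your cluster. If "standard" is not
# among them, substitute one of the listed names in storage-vc-template.
kubectl get storageclass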
Save your new configuration file and install it. If you’ve been following this guide and already have the namespace test operating, this will update it:
kubectl apply -f sample05.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 created
Verify that it completes by checking the installation and services in this namespace, and you should have similar results:
kubectl -n test get chi -o wide
NAME VERSION CLUSTERS SHARDS HOSTS TASKID STATUS UPDATED ADDED DELETED DELETE ENDPOINT AGE
demo-01 0.18.3 1 2 4 57ec3f87-9950-4e5e-9b26-13680f66331d Completed 4 clickhouse-demo-01.test.svc.cluster.local 108s
kubectl get service -n test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
chi-demo-01-demo-01-0-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 81s
chi-demo-01-demo-01-0-1 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 63s
chi-demo-01-demo-01-1-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 45s
chi-demo-01-demo-01-1-1 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 8s
clickhouse-demo-01 LoadBalancer 10.104.236.138 <pending> 8123:31281/TCP,9000:30052/TCP 98s
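You can also confirm that a persistent volume claim was created and bound for each pod. The exact claim names depend on the operator’s naming convention, so treat the naming pattern as an assumption rather than a guarantee:
# Each ClickHouse pod should have a bound 1Gi claim created from storage-vc-template.
kubectl -n test get pvc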
Testing persistent storage
Everything is running, so let’s verify that our storage is working. We’ll exec into one of the pods that was created and check its available storage:
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- df -h
Filesystem Size Used Avail Use% Mounted on
overlay 32G 26G 4.0G 87% /
tmpfs 64M 0 64M 0% /dev
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/sda2 32G 26G 4.0G 87% /etc/hosts
shm 64M 0 64M 0% /dev/shm
tmpfs 7.7G 12K 7.7G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 3.9G 0 3.9G 0% /proc/acpi
tmpfs 3.9G 0 3.9G 0% /proc/scsi
tmpfs 3.9G 0 3.9G 0% /sys/firmware
And we can see we have about 1 Gigabyte of storage allocated to our cluster.
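If the persistent volume doesn’t stand out in the full df listing above, you can ask about the ClickHouse data directory specifically. This assumes the operator mounts the volume at ClickHouse’s default data path of /var/lib/clickhouse:
# Show only the filesystem backing the ClickHouse data directory.
kubectl -n test exec chi-demo-01-demo-01-0-0-0 -- df -h /var/lib/clickhouse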
Let’s add some data to it. Nothing major, just enough to show that we can store information, then change the configuration and verify the data stays.
Exit out of your cluster and launch clickhouse-client on your LoadBalancer. We’re going to create a database, then create a table in the database, then show both.
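If the LoadBalancer’s external IP is still pending, as in the service listing above, one convenient way to get a client prompt is to exec into one of the pods, which already includes clickhouse-client. This is an alternative added here for convenience:
# Open an interactive ClickHouse client session inside one of the cluster pods.
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- clickhouse-client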
SHOW DATABASES
┌─name────┐
│ default │
│ system │
└─────────┘
CREATE DATABASE teststorage
CREATE TABLE teststorage.test AS system.one ENGINE = Distributed('demo-01', 'system', 'one')
SHOW DATABASES
┌─name────────┐
│ default │
│ system │
│ teststorage │
└─────────────┘
SELECT * FROM teststorage.test
┌─dummy─┐
│ 0 │
└───────┘
┌─dummy─┐
│ 0 │
└───────┘
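If you want the persistence test to cover rows as well as schema, you can store a little real data too. This is an optional extra; the teststorage.events table and its values are only an example and are not part of the original walkthrough:
# Create a small MergeTree table on this replica and insert a couple of rows.
kubectl -n test exec chi-demo-01-demo-01-0-0-0 -- clickhouse-client -q "CREATE TABLE teststorage.events (id UInt32, msg String) ENGINE = MergeTree ORDER BY id"
kubectl -n test exec chi-demo-01-demo-01-0-0-0 -- clickhouse-client -q "INSERT INTO teststorage.events VALUES (1, 'hello'), (2, 'world')"
After the reconfiguration below, running SELECT * FROM teststorage.events the same way should return those rows if the volume survived.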
If you followed the instructions from Zookeeper and Replicas, you’ll recall that when we updated the configuration of our sample cluster at the end, all of the tables and data we made were deleted. Let’s recreate that experiment now with a new configuration.
Create a new file called sample06.yaml. We’re going to reduce the shards and replicas to 1:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-stable
          volumeClaimTemplate: storage-vc-template
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:21.8.10.1.altinitystable
    volumeClaimTemplates:
      - name: storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
Update the cluster with the following:
kubectl apply -f sample06.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured
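Scaling down takes a little while. If you want to watch the extra pods terminate before moving on (an optional step), you can watch the namespace and press Ctrl+C once only chi-demo-01-demo-01-0-0-0 remains:
# Watch pod status changes in the test namespace as the cluster scales down.
kubectl -n test get pods --watch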
Wait until the configuration is done and the extra pods are spun down, then check the installation status and the storage available on the remaining pod:
kubectl -n test get chi -o wide
NAME VERSION CLUSTERS SHARDS HOSTS TASKID STATUS UPDATED ADDED DELETED DELETE ENDPOINT AGE
demo-01 0.18.3 1 1 1 776c1a82-44e1-4c2e-97a7-34cef629e698 Completed 4 clickhouse-demo-01.test.svc.cluster.local 2m56s
kubectl -n test exec -it chi-demo-01-demo-01-0-0-0 -- df -h
Filesystem Size Used Avail Use% Mounted on
overlay 32G 26G 4.0G 87% /
tmpfs 64M 0 64M 0% /dev
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/sda2 32G 26G 4.0G 87% /etc/hosts
shm 64M 0 64M 0% /dev/shm
tmpfs 7.7G 12K 7.7G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 3.9G 0 3.9G 0% /proc/acpi
tmpfs 3.9G 0 3.9G 0% /proc/scsi
tmpfs 3.9G 0 3.9G 0% /sys/firmware
Storage is still there. We can test whether our databases are still available by logging in with clickhouse-client again:
SHOW DATABASES
┌─name────────┐
│ default │
│ system │
│ teststorage │
└─────────────┘
SELECT * FROM teststorage.test
┌─dummy─┐
│ 0 │
└───────┘
All of our databases and tables are there.
There are different ways of allocating storage, such as separate volumes for data and for logging, or multiple data volumes for your cluster nodes, but this will get you started running your own ClickHouse cluster on Kubernetes in your favorite environment.