Persistent Storage Overview
Users setting up storage in their local environments can establish persistent volumes in different formats based on their requirements.
Allocating Space
Space is allocated through the Kubernetes PersistentVolume object. ClickHouse clusters established with the Altinity Kubernetes Operator then use a PersistentVolumeClaim to receive persistent storage.
The PersistentVolume can be set in one of two ways:
- Manually: Manual allocations set the storage area before the ClickHouse cluster is created. Space is then requested through a PersistentVolumeClaim when the ClickHouse cluster is created. A minimal example of a manually provisioned volume follows below.
- Dynamically: Space is allocated through the PersistentVolumeClaim when the ClickHouse cluster is created, and the Kubernetes controlling software manages the process for the user.
For more information on how persistent volumes are managed in Kubernetes, see the Kubernetes documentation Persistent Volumes.
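For manual allocation, the PersistentVolume is created before the ClickHouse cluster. The following is a minimal sketch of such a volume, assuming a hostPath-backed directory and the standard storage class used in the later examples; the volume name and path are placeholders for illustration:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: clickhouse-manual-pv       # placeholder name
spec:
  storageClassName: standard
  capacity:
    storage: 500Mi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/clickhouse-data     # assumed local directory on the node

When the cluster is created, a claim with a matching storage class, access mode, and a request no larger than the volume's capacity can bind to this volume.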
Storage Types
Data for ClickHouse clusters is stored in the following ways:
No Persistent Storage
If no persistent storage claim template is specified, then no persistent storage will be allocated. When Kubernetes is stopped or a new manifest is applied, all previous data will be lost.
In this example, two shards are specified but no persistent storage is allocated:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "no-persistent"
spec:
  configuration:
    clusters:
      - name: "no-persistent"
        layout:
          shardsCount: 2
          replicasCount: 1
When applied to the namespace test, no persistent storage is found:
kubectl -n test get pv
No resources found
Cluster Wide Storage
If neither the dataVolumeClaimTemplate nor the logVolumeClaimTemplate is specified (see below), then all data is stored under the requested volumeClaimTemplate. This includes all information stored in each pod.
In this example, two shards are specified, each with one volume of storage used by the entire pod:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "cluster-storage"
spec:
  configuration:
    clusters:
      - name: "cluster-storage"
        layout:
          shardsCount: 2
          replicasCount: 1
        templates:
          volumeClaimTemplate: cluster-storage-vc-template
  templates:
    volumeClaimTemplates:
      - name: cluster-storage-vc-template
        spec:
          storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
When applied to the namespace test, the following persistent volumes are found. Note that each pod has 500Mi of storage:
kubectl -n test get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-6e70c36a-f170-47b5-93a6-88175c62b8fe 500Mi RWO Delete Bound test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-1-0-0 standard 21s
pvc-ca002bc4-0ad2-4358-9546-0298eb8b2152 500Mi RWO Delete Bound test/cluster-storage-vc-template-chi-cluster-storage-cluster-storage-0-0-0 standard 39s
Cluster Wide Split Storage
Applying the dataVolumeClaimTemplate and logVolumeClaimTemplate template types to a ClickHouse cluster controlled by the Altinity Kubernetes Operator allows specific data from each ClickHouse pod to be stored in a particular persistent volume:
- dataVolumeClaimTemplate: Sets the storage volume for the ClickHouse node data. In a traditional ClickHouse server environment, this would be allocated to /var/lib/clickhouse.
- logVolumeClaimTemplate: Sets the storage volume for ClickHouse node log files. In a traditional ClickHouse server environment, this would be allocated to /var/log/clickhouse-server.
This allows different storage capacities for log data versus ClickHouse database data, as well as only capturing specific data rather than the entire pod.
In this example, two shards have different storage capacities for dataVolumeClaimTemplate and logVolumeClaimTemplate:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "cluster-split-storage"
spec:
  configuration:
    clusters:
      - name: "cluster-split"
        layout:
          shardsCount: 2
          replicasCount: 1
        templates:
          dataVolumeClaimTemplate: data-volume-template
          logVolumeClaimTemplate: log-volume-template
  templates:
    volumeClaimTemplates:
      - name: data-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
      - name: log-volume-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Mi
In this case, retrieving the PersistentVolume allocations shows two storage volumes per pod based on the specifications in the manifest:
kubectl -n test get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-0b02c5ba-7ca1-4578-b3d9-ff8bb67ad412 100Mi RWO Delete Bound test/log-volume-template-chi-cluster-split-storage-cluster-split-1-0-0 standard 21s
pvc-4095b3c0-f550-4213-aa53-a08bade7c62c 100Mi RWO Delete Bound test/log-volume-template-chi-cluster-split-storage-cluster-split-0-0-0 standard 40s
pvc-71384670-c9db-4249-ae7e-4c5f1c33e0fc 500Mi RWO Delete Bound test/data-volume-template-chi-cluster-split-storage-cluster-split-1-0-0 standard 21s
pvc-9e3fb3fa-faf3-4a0e-9465-8da556cb9eec 500Mi RWO Delete Bound test/data-volume-template-chi-cluster-split-storage-cluster-split-0-0-0 standard 40s
Pod Mount Based Storage
PersistentVolume objects can be mounted directly into the pod's mountPath. Any data outside those mount points is not stored when the container is stopped, unless it is covered by another PersistentVolumeClaim.
In the following example, each of the two shards in the ClickHouse cluster has volumes tied to specific mount points:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "pod-split-storage"
spec:
  configuration:
    clusters:
      - name: "pod-split"
        # Templates are specified for this cluster explicitly
        templates:
          podTemplate: pod-template-with-volumes
        layout:
          shardsCount: 2
          replicasCount: 1
  templates:
    podTemplates:
      - name: pod-template-with-volumes
        spec:
          containers:
            - name: clickhouse
              image: yandex/clickhouse-server:21.8
              volumeMounts:
                - name: data-storage-vc-template
                  mountPath: /var/lib/clickhouse
                - name: log-storage-vc-template
                  mountPath: /var/log/clickhouse-server
    volumeClaimTemplates:
      - name: data-storage-vc-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Mi
      - name: log-storage-vc-template
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 100Mi
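When applied to the namespace test, each pod receives one 500Mi data volume and one 100Mi log volume mounted at the paths above: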
kubectl -n test get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-37be9f84-7ba5-404e-8299-e95a291014a8 500Mi RWO Delete Bound test/data-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0 standard 24s
pvc-5b2f8694-326d-41cb-94ec-559725947b45 100Mi RWO Delete Bound test/log-storage-vc-template-chi-pod-split-storage-pod-split-1-0-0 standard 24s
pvc-84768e78-e44e-4295-8355-208b07330707 500Mi RWO Delete Bound test/data-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0 standard 43s
pvc-9e123af7-01ce-4ab8-9450-d8ca32b1e3a6 100Mi RWO Delete Bound test/log-storage-vc-template-chi-pod-split-storage-pod-split-0-0-0 standard 43s
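To confirm the volumes are mounted at the expected paths, the mounts can be checked from inside one of the pods. This is a sketch only; the pod name below is inferred from the claim names above and the container name comes from the pod template, so adjust both to match your environment:

kubectl -n test exec chi-pod-split-storage-pod-split-0-0-0 -c clickhouse -- df -h /var/lib/clickhouse /var/log/clickhouse-server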