Zookeeper and Replicas
If you have not already created the test namespace used throughout these tutorials, create it now:
> kubectl create namespace test
namespace/test created
Now we’ve seen how to set up a basic cluster and upgrade it. Time to step up our game: set up our cluster with Zookeeper, and then add persistent storage to it.
The Altinity Kubernetes Operator does not install or manage Zookeeper. Zookeeper must be provided and managed externally. The samples below are examples of establishing Zookeeper to provide replication support. For more information on running and configuring Zookeeper, see the Apache Zookeeper site.
This step cannot be skipped - your Zookeeper instance must be set up externally from your ClickHouse clusters. Whether your Zookeeper installation is hosted in other Docker images or on separate servers is up to you.
Install Zookeeper
Kubernetes Zookeeper Deployment
A really simple method of installing a single Zookeeper node is provided by the Altinity Kubernetes Operator deployment samples. These provide sample deployments of Grafana, Prometheus, Zookeeper, and other applications.
See the Altinity Kubernetes Operator deployment directory for a full list of sample scripts and Kubernetes deployment files.
The instructions below will create a new Kubernetes namespace zoo1ns, and create a Zookeeper node in that namespace. Kubernetes nodes will refer to that Zookeeper node by the hostname zookeeper.zoo1ns within the created Kubernetes networks.
To deploy a single Zookeeper node in Kubernetes from the Altinity Kubernetes Operator GitHub repository:
- Download the Altinity Kubernetes Operator GitHub repository, either with git clone https://github.com/Altinity/clickhouse-operator.git or by selecting Code->Download ZIP from the Altinity Kubernetes Operator GitHub repository.
- From a terminal, navigate to the deploy/zookeeper directory and run the following:
> cd clickhouse-operator/deploy/zookeeper
> ./quick-start-volume-emptyDir/zookeeper-1-node-create.sh
namespace/zoo1ns created
service/zookeeper created
service/zookeepers created
Warning: policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
poddisruptionbudget.policy/zookeeper-pod-disruption-budget created
statefulset.apps/zookeeper created
- Verify the Zookeeper node is running in Kubernetes:
> kubectl get all --namespace zoo1ns
NAME READY STATUS RESTARTS AGE
pod/zookeeper-0 0/1 Running 0 2s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/zookeeper ClusterIP 10.98.191.251 <none> 2181/TCP,7000/TCP 3s
service/zookeepers ClusterIP None <none> 2888/TCP,3888/TCP 3s
NAME READY AGE
statefulset.apps/zookeeper 0/1 3s
- Kubernetes nodes will be able to refer to the Zookeeper node by the hostname zookeeper.zoo1ns.
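To confirm that this hostname resolves from inside the cluster, you can run a short-lived pod and look it up. This is an optional check that is not part of the Altinity samples; the pod name and busybox image below are just examples:
> kubectl run zookeeper-dns-test --rm -it --restart=Never --image=busybox -- nslookup zookeeper.zoo1ns
The lookup should return the ClusterIP of the zookeeper service shown above.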
Configure Kubernetes with Zookeeper
Once we start replicating clusters, we need Zookeeper to coordinate them. Create a new file sample03.yaml and populate it with the following:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 2
          replicasCount: 2
        templates:
          podTemplate: clickhouse-stable
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:21.8.10.1.altinitystable
Notice that we’re increasing the number of replicas from the sample02.yaml file in the [First Clusters - No Storage]({{< ref "quickcluster" >}}) tutorial.
We’ll set up a minimal Zookeeper-connected cluster by applying our new configuration file:
> kubectl apply -f sample03.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 created
Verify it with the following:
> kubectl -n test get chi -o wide
NAME VERSION CLUSTERS SHARDS HOSTS TASKID STATUS UPDATED ADDED DELETED DELETE ENDPOINT
demo-01 0.18.1 1 2 4 3c2cc45a-3d38-4e51-9eb5-f84cf4fcd778 Completed 4 clickhouse-demo-01.test.svc.cluster.local
> kubectl get service -n test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
chi-demo-01-demo-01-0-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 110s
chi-demo-01-demo-01-0-1 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 93s
chi-demo-01-demo-01-1-0 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 48s
chi-demo-01-demo-01-1-1 ClusterIP None <none> 8123/TCP,9000/TCP,9009/TCP 17s
clickhouse-demo-01 LoadBalancer 10.103.31.139 <pending> 8123:30149/TCP,9000:30600/TCP 2m7s
If we log into our cluster and show the clusters, we can see the updated results: demo-01 now has a total of 4 hosts - two shards, each with two replicas.
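One way to open a client session, assuming the operator’s default pod naming (the pod name is the service name shown above plus a -0 ordinal), is to exec into one of the ClickHouse pods:
> kubectl exec -it chi-demo-01-demo-01-0-0-0 -n test -- clickhouse-client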
SELECT * FROM system.clusters
┌─cluster──────────────────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name───────────────┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─slowdowns_count─┬─estimated_recovery_time─┐
│ all-replicated │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ all-replicated │ 1 │ 1 │ 2 │ chi-demo-01-demo-01-0-1 │ 172.17.0.6 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-replicated │ 1 │ 1 │ 3 │ chi-demo-01-demo-01-1-0 │ 172.17.0.7 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-replicated │ 1 │ 1 │ 4 │ chi-demo-01-demo-01-1-1 │ 172.17.0.8 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 2 │ 1 │ 1 │ chi-demo-01-demo-01-0-1 │ 172.17.0.6 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 3 │ 1 │ 1 │ chi-demo-01-demo-01-1-0 │ 172.17.0.7 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ all-sharded │ 4 │ 1 │ 1 │ chi-demo-01-demo-01-1-1 │ 172.17.0.8 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 1 │ 1 │ 1 │ chi-demo-01-demo-01-0-0 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 1 │ 1 │ 2 │ chi-demo-01-demo-01-0-1 │ 172.17.0.6 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 2 │ 1 │ 1 │ chi-demo-01-demo-01-1-0 │ 172.17.0.7 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ demo-01 │ 2 │ 1 │ 2 │ chi-demo-01-demo-01-1-1 │ 172.17.0.8 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 1 │ 1 │ 1 │ 127.0.0.1 │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_internal_replication │ 2 │ 1 │ 1 │ 127.0.0.2 │ 127.0.0.2 │ 9000 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_cluster_two_shards_localhost │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_shard_localhost │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_shard_localhost_secure │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9440 │ 0 │ default │ │ 0 │ 0 │ 0 │
│ test_unavailable_shard │ 1 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 9000 │ 1 │ default │ │ 0 │ 0 │ 0 │
│ test_unavailable_shard │ 2 │ 1 │ 1 │ localhost │ 127.0.0.1 │ 1 │ 0 │ default │ │ 0 │ 0 │ 0 │
└──────────────────────────────────────────────┴───────────┴──────────────┴─────────────┴─────────────────────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────┴─────────────────────────┘
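If the full listing is hard to read, you can narrow it to just our cluster with an optional variation on the query above:
SELECT cluster, shard_num, replica_num, host_name
FROM system.clusters
WHERE cluster = 'demo-01'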
Distributed Tables
We have our cluster going - let’s test it out with a distributed table so we can see the sharding and replication in action.
Log in to your ClickHouse cluster and enter the following SQL statement, which creates a Distributed table over the demo-01 cluster that reads from the system.one table on each shard:
CREATE TABLE test AS system.one ENGINE = Distributed('demo-01', 'system', 'one')
Once our table is created, run a SELECT * FROM test command. Since the table just points at system.one, each shard returns its single dummy row of 0 - we haven’t added any data of our own yet, but that’s all right.
SELECT * FROM test
┌─dummy─┐
│ 0 │
└───────┘
┌─dummy─┐
│ 0 │
└───────┘
Now let’s see where the results are coming from. Run the following command - it tells us which host is returning the results. You may need to run it a few times, but you’ll start to notice the host name changes each time you run SELECT hostName() FROM test:
SELECT hostName() FROM test
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-0-0-0 │
└───────────────────────────┘
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-1-0-0 │
└───────────────────────────┘
SELECT hostName() FROM test
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-0-0-0 │
└───────────────────────────┘
┌─hostName()────────────────┐
│ chi-demo-01-demo-01-1-1-0 │
└───────────────────────────┘
This shows that the query is being distributed across different shards and replicas. The good news is that you can change your configuration files to adjust the sharding and replication however suits your needs.
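The Distributed table above only exercises query fan-out. If you also want to see ZooKeeper-backed replication itself, a sketch like the following can be used - the table name and schema are illustrative, and it assumes the {shard} and {replica} macros that the Altinity Kubernetes Operator normally defines for each host:
CREATE TABLE repl_test ON CLUSTER 'demo-01' (id UInt32)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/repl_test', '{replica}')
ORDER BY id
After inserting a few rows on one host, selecting from repl_test on its sibling replica (for example chi-demo-01-demo-01-0-1-0 if you inserted on chi-demo-01-demo-01-0-0-0) should return the same rows once replication catches up.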
One issue though: there’s no persistent storage. If these clusters stop running, your data vanishes. The next tutorial will cover how to add persistent storage to your ClickHouse clusters running on Kubernetes. In fact, we can demonstrate the problem by creating a new configuration file called sample04.yaml:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "demo-01"
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-stable
  templates:
    podTemplates:
      - name: clickhouse-stable
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:21.8.10.1.altinitystable
Make sure you have exited out of your ClickHouse cluster, then apply the new configuration file:
> kubectl apply -f sample04.yaml -n test
clickhouseinstallation.clickhouse.altinity.com/demo-01 configured
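If you want to watch the rollout as it happens, you can follow the pods in the test namespace with this optional command (not part of the original walkthrough):
> kubectl get pods -n test --watch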
Notice that during the update, four pods are deleted and then two new ones are added.
When your cluster has settled down to just 1 shard with 1 replica, log back into your ClickHouse database and select from the table test:
SELECT * FROM test
Received exception from server (version 21.8.10):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table default.test doesn't exist.
command terminated with exit code 60
No persistent storage means any time your clusters are changed over, everything you’ve done is gone. The next article will cover how to correct that by adding storage volumes to your cluster.