Creating your first ClickHouse® cluster
At this point, you’ve got the Altinity Kubernetes Operator for ClickHouse® installed. Now let’s give it something to work with. We’ll start with a simple ClickHouse cluster here: no persistent storage, one replica, and one shard. (We’ll cover those topics over the next couple of steps.)
Creating your first cluster
Now that we have our namespace, we’ll create a simple cluster: one shard, one replica. Copy the following text and save it as manifest01.yaml:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: cluster01
spec:
  templates:
    podTemplates:
      - name: clickhouse-pod-template
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:24.8.14.10501.altinitystable
  configuration:
    clusters:
      - name: cluster01
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-pod-template
When you installed the operator, it defined a custom resource type called a ClickHouseInstallation; that’s what we’re creating here. A ClickHouseInstallation contains a ClickHouse server and lots of other useful things. Here we’re creating a ClickHouseInstallation named cluster01, and that cluster has one shard and one replica.
NOTE: The YAML above would be simpler if we didn’t pin a particular version of the altinity/clickhouse-server container image. Being specific here, however, guarantees that what you see on your screen matches the examples throughout the rest of this tutorial. (Hopefully you’re just cutting and pasting anyway.)
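For comparison, here’s a sketch of the same cluster definition without the podTemplates section; in that case the operator falls back to its default clickhouse-server image, so the exact version you get may differ from the one used in this tutorial:

```yaml
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: cluster01
spec:
  configuration:
    clusters:
      - name: cluster01
        layout:
          shardsCount: 1
          replicasCount: 1
```

We’ll stick with the pinned image so your output matches ours.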
Use kubectl apply to create your ClickHouseInstallation:
kubectl apply -f manifest01.yaml -n operator
You’ll see this:
clickhouseinstallation.clickhouse.altinity.com/cluster01 created
Verify that your new cluster is running:
kubectl get clickhouseinstallation -n operator
The status of your cluster will be In Progress for a minute or two. (BTW, the operator defines chi as an abbreviation for clickhouseinstallation. We’ll use chi from now on.) When everything is ready, its status will be Completed:
NAME        CLUSTERS   HOSTS   STATUS      HOSTS-COMPLETED   AGE     SUSPEND
cluster01   1          1       Completed                     2m40s
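Since chi is registered as a short name for the resource, this command returns the same thing with less typing:

```
kubectl get chi -n operator
```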
PRO TIP: You can use the awesome kubens tool to set the default namespace for all kubectl commands. Type kubens operator, and kubectl will use the operator namespace until you tell it otherwise. See the kubens / kubectx repo to get started. You’re welcome.
Now that we’ve got a ClickHouse cluster up and running, we’ll connect to it and run some basic commands.
Connecting to your cluster with kubectl exec
Let’s talk to our cluster and run a simple ClickHouse query. We can hop in directly through Kubernetes and run the clickhouse-client that’s part of the image. First, we have to get the name of the pod:
kubectl get pods -n operator | grep cluster01
You’ll see this:
chi-cluster01-cluster01-0-0-0 1/1 Running 0 75s
So chi-cluster01-cluster01-0-0-0 is the name of the pod running ClickHouse. We’ll connect to it with kubectl exec and run clickhouse-client on it:
kubectl exec -it chi-cluster01-cluster01-0-0-0 -n operator -- clickhouse-client
The ClickHouse server will welcome you:
ClickHouse client version 24.8.14.10501.altinitystable (altinity build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 24.8.14.
chi-cluster01-cluster01-0-0-0.chi-cluster01-cluster01-0-0.quick.svc.cluster.local :)
Now that we’re in ClickHouse, let’s run a query and look at some system data:
SELECT
cluster,
host_name,
port
FROM system.clusters
You’ll see something very similar to this:
┌─cluster────────┬─host_name───────────────────┬─port─┐
1. │ all-clusters │ chi-cluster01-cluster01-0-0 │ 9000 │
2. │ all-replicated │ chi-cluster01-cluster01-0-0 │ 9000 │
3. │ all-sharded │ chi-cluster01-cluster01-0-0 │ 9000 │
4. │ cluster01 │ chi-cluster01-cluster01-0-0 │ 9000 │
5. │ default │ localhost │ 9000 │
└────────────────┴─────────────────────────────┴──────┘
5 rows in set. Elapsed: 0.002 sec.
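While you’re connected, another system table worth a look is system.macros; the operator fills in macros like shard and replica that come into play when we set up replication later. A quick peek (your values may differ):

```sql
SELECT * FROM system.macros
```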
👉 Type exit to end the clickhouse-client session.
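By the way, if you only need a single query, you don’t have to start an interactive session at all; clickhouse-client’s --query flag works through kubectl exec too. A sketch, using the same pod name as above:

```
kubectl exec chi-cluster01-cluster01-0-0-0 -n operator -- clickhouse-client --query "SELECT version()"
```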
Let’s get some data in here!
Not so fast! At this point, you’d expect a tutorial to show you how to create a database and put some data into it. However, we haven’t defined any persistent storage for our cluster. If a pod fails, any data it had will be gone when the pod restarts. So we’ll add persistent storage to our cluster next.
👉 Next: Adding persistent storage
Optional: Connecting to your cluster directly
Most of the time when you’re working with ClickHouse, you connect directly to the cluster with clickhouse-client. To do that, you’ll need to set up network access for your cluster. The easiest way to do that is with kubectl port-forward. First, look at the services in your namespace:
kubectl get svc -n operator
You’ll see the ports used by each service:
NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
chi-cluster01-cluster01-0-0   ClusterIP   None             <none>        9000/TCP,8123/TCP,9009/TCP   3m53s
clickhouse-cluster01          ClusterIP   None             <none>        8123/TCP,9000/TCP            3m42s
clickhouse-operator-metrics   ClusterIP   10.102.147.251   <none>        8888/TCP,9999/TCP            22m
So to connect directly to the first pod, use this command:
kubectl port-forward chi-cluster01-cluster01-0-0-0 9000:9000 8123:8123 9009:9009 -n operator &
Be sure to put the & at the end to keep this running in the background.
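Forwarding to the service instead of a specific pod should also work, and has the advantage that you don’t need to look up the pod’s name first. A sketch, using the clickhouse-cluster01 service from the listing above:

```
kubectl port-forward service/clickhouse-cluster01 9000:9000 8123:8123 -n operator &
```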
You’ll see the PID for the process and the ports you can access directly:
[1] 72202
Forwarding from 127.0.0.1:9000 -> 9000
Forwarding from [::1]:9000 -> 9000
Forwarding from 127.0.0.1:8123 -> 8123
Forwarding from [::1]:8123 -> 8123
Forwarding from 127.0.0.1:9009 -> 9009
Forwarding from [::1]:9009 -> 9009
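With port 8123 forwarded, you can also talk to ClickHouse over its HTTP interface; for example, with curl:

```
curl 'http://localhost:8123/' --data-binary 'SELECT 1'
```

That’s a handy health check if you don’t have clickhouse-client installed locally.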
Without a hostname or port, clickhouse-client uses localhost:9000. That makes it easy to connect to our ClickHouse cluster. Just type clickhouse-client at the command line:
> clickhouse-client
ClickHouse client version 23.5.3.1.
Connecting to localhost:9000 as user default.
Handling connection for 9000
Connected to ClickHouse server version 24.8.14 revision 54472.
ClickHouse client version is older than ClickHouse server. It may lack support for new features.
Handling connection for 9000
chi-cluster01-cluster01-0-0-0.chi-cluster01-cluster01-0-0.quick.svc.cluster.local :)
Now you can run SQL statements to your heart’s content. When you exit out of clickhouse-client, be sure to stop port forwarding (kill 72202, in this example).
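If you’ve lost track of the PID, your shell’s job control can stop the background port-forward as well (this sketch assumes it’s the only background job in this shell):

```
jobs      # list background jobs; the port-forward shows up as [1]
kill %1   # stop job 1
```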
Depending on your Kubernetes setup and provider, you may be able to use a LoadBalancer to access the cluster directly, but this method, clumsy as it is, will always work. See the documentation for your Kubernetes provider for details.
Also be aware that when we’re using the clickhouse-client that’s included in the clickhouse-server container, the version of the client and the server are in sync. If you install clickhouse-client directly on your machine, there’s no guarantee that the client and server will be the same version. (See the warning message above.) That’s unlikely to cause problems, but it’s something to be aware of if the system starts behaving strangely.
Now let’s add persistent storage to our cluster….