Creating your first ClickHouse® cluster
At this point, you’ve got the Altinity Kubernetes Operator for ClickHouse® installed. Now let’s give it something to work with. We’ll start with a simple ClickHouse cluster here: no persistent storage, one replica, and one shard. (We’ll cover those topics over the next couple of steps.)
Creating a namespace
To keep things organized, we’ll do everything in the quick namespace, which means we have to create it first:
kubectl create namespace quick
PRO TIP: You can use the awesome kubens tool to set the default namespace for all kubectl commands. Type kubens quick, and kubectl will use the quick namespace until you tell it otherwise. See the kubens / kubectx repo to get started. You’re welcome.
Creating your first cluster
Now that we have our namespace, we’ll create a simple cluster: one shard, one replica. Copy the following text and save it as manifest01.yaml:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: cluster01
spec:
  templates:
    podTemplates:
      - name: clickhouse-pod-template
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:24.8.14.10459.altinitystable
  configuration:
    clusters:
      - name: cluster01
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-pod-template
When you installed the operator, it defined a custom resource type called a ClickHouseInstallation; that’s what we’re creating here. A ClickHouseInstallation contains a ClickHouse server and lots of other useful things. Here we’re creating a cluster named cluster01, and that cluster has one shard and one replica.
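If you’d like to confirm that the custom resource type is registered in your Kubernetes cluster, an optional quick check is to ask Kubernetes which API resources it knows about; assuming the operator installed cleanly, the ClickHouse types should show up:
kubectl api-resources | grep clickhouse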
NOTE: The YAML above would be simpler if we didn’t specify a particular version of the altinity/clickhouse-server container image. Being specific here, however, will make things simpler as we work through the rest of the exercises in this tutorial. (Hopefully you’re just cutting and pasting anyway.)
Use kubectl apply to create your ClickHouseInstallation:
kubectl apply -f manifest01.yaml -n quick
You’ll see this:
clickhouseinstallation.clickhouse.altinity.com/cluster01 created
Verify that your new cluster is running:
kubectl get clickhouseinstallation -n quick
The status of your cluster will be In Progress for a minute or two. (BTW, the operator defines chi as an abbreviation for clickhouseinstallation. We’ll use chi from now on.) When everything is ready, its status will be Completed:
NAME        CLUSTERS   HOSTS   STATUS      HOSTS-COMPLETED   AGE     SUSPEND
cluster01   1          1       Completed                     2m40s
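Now that the cluster is up, this is also a good chance to try the abbreviation; the short form returns the same information:
kubectl get chi -n quick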
Now that we’ve got a ClickHouse cluster up and running, we’ll connect to it and run some basic commands.
NOTE: As you know, working with Kubernetes can get complicated. We’ve tested this tutorial in different environments, but it’s possible something will go wrong along the way. If that happens, things like kubectl describe and kubectl logs can help you determine what’s going wrong. We’ve run into errors because our cluster didn’t have enough nodes, autoscaling wasn’t turned on, etc. Hopefully you won’t hit any snags, but those commands can help you troubleshoot any obstacles you encounter.
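For example, once you know the name of your ClickHouse pod (we’ll look it up in the next step), commands along these lines show the installation’s status, the pod’s events, and the server logs:
kubectl describe chi cluster01 -n quick
kubectl describe pod chi-cluster01-cluster01-0-0-0 -n quick
kubectl logs chi-cluster01-cluster01-0-0-0 -n quick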
Connecting to your cluster with kubectl exec
Let’s talk to our cluster and run a simple ClickHouse query. We can hop in directly through Kubernetes and run the clickhouse-client that’s part of the image. First, we have to get the name of the pod:
kubectl get pods -n quick
You’ll see this:
NAME                            READY   STATUS    RESTARTS   AGE
chi-cluster01-cluster01-0-0-0   1/1     Running   0          2m36s
So chi-cluster01-cluster01-0-0-0 is the name of the pod running ClickHouse. We’ll connect to it with kubectl exec and run clickhouse-client on it:
kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick -- clickhouse-client
The ClickHouse server will welcome you:
ClickHouse client version 24.8.14.10459.altinitystable (altinity build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 24.8.14.
chi-cluster01-cluster01-0-0-0.chi-cluster01-cluster01-0-0.quick.svc.cluster.local :)
Now that we’re in ClickHouse, let’s run a query and look at some system data:
SELECT
cluster,
host_name,
port
FROM system.clusters
You’ll see something very similar to this:
┌─cluster────────┬─host_name───────────────────┬─port─┐
1. │ all-clusters │ chi-cluster01-cluster01-0-0 │ 9000 │
2. │ all-replicated │ chi-cluster01-cluster01-0-0 │ 9000 │
3. │ all-sharded │ chi-cluster01-cluster01-0-0 │ 9000 │
4. │ cluster01 │ chi-cluster01-cluster01-0-0 │ 9000 │
5. │ default │ localhost │ 9000 │
└────────────────┴─────────────────────────────┴──────┘
5 rows in set. Elapsed: 0.002 sec.
👉 Type exit to end the clickhouse-client session.
Let’s get some data in here!
Not so fast! At this point, you’d expect a tutorial to show you how to put some data into the database you just created. However, we haven’t defined any persistent storage for our cluster. If a pod fails, any data it had will be gone when the pod restarts. So we’ll add persistent storage to our cluster next.
👉 Next: Adding persistent storage
Optional: Connecting to your cluster directly
Most of the time when you’re working with ClickHouse, you connect directly to the cluster with a client such as clickhouse-client (or a driver like clickhouse-connect). To do that, you’ll need to set up network access for your cluster. The easiest way to do that is with kubectl port-forward. First, look at the services in your namespace:
kubectl get svc -n quick
You’ll see the ports used by each service:
NAME                          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
chi-cluster01-cluster01-0-0   ClusterIP   None         <none>        9000/TCP,8123/TCP,9009/TCP   35m
clickhouse-cluster01          ClusterIP   None         <none>        8123/TCP,9000/TCP            35m
So to connect directly to the first pod, use this command:
kubectl port-forward chi-cluster01-cluster01-0-0-0 9000:9000 8123:8123 9009:9009 -n quick &
Be sure to put the &
at the end to keep this running in the background.
You’ll see the PID for the process and the ports you can access directly:
[1] 72202
Forwarding from 127.0.0.1:9000 -> 9000
Forwarding from [::1]:9000 -> 9000
Forwarding from 127.0.0.1:8123 -> 8123
Forwarding from [::1]:8123 -> 8123
Forwarding from 127.0.0.1:9009 -> 9009
Forwarding from [::1]:9009 -> 9009
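With those ports forwarded, you can also reach ClickHouse’s HTTP interface on port 8123. As a quick sanity check (assuming you have curl installed locally), you can send a query over HTTP:
curl 'http://localhost:8123/' --data-binary 'SELECT version()'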
Without a hostname or port, clickhouse-client uses localhost:9000. That makes it easy to connect to our ClickHouse cluster:
> clickhouse-client
ClickHouse client version 23.5.3.1.
Connecting to localhost:9000 as user default.
Handling connection for 9000
Connected to ClickHouse server version 24.8.14 revision 54472.
ClickHouse client version is older than ClickHouse server. It may lack support for new features.
Handling connection for 9000
chi-cluster01-cluster01-0-0-0.chi-cluster01-cluster01-0-0.quick.svc.cluster.local :)
Now you can run SQL statements to your heart’s content. When you exit out of clickhouse-client, be sure to stop port forwarding (kill 72202, in this example).
Depending on your Kubernetes setup and provider, you may be able to use a LoadBalancer to access the cluster directly, but this method, clumsy as it is, should always work. See the documentation for your Kubernetes provider for details.
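If your provider does support LoadBalancer services, one approach is to add a serviceTemplate to your ClickHouseInstallation. Here’s a minimal sketch, not part of this tutorial’s manifests; the template name lb-service is just an illustration, and any provider-specific annotations are up to you:
# Excerpt from a ClickHouseInstallation spec (hypothetical example)
spec:
  defaults:
    templates:
      serviceTemplate: lb-service
  templates:
    serviceTemplates:
      - name: lb-service
        spec:
          type: LoadBalancer   # ask the provider for an externally reachable address
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000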
Also be aware that when we’re using the clickhouse-client that’s included in the clickhouse-server container, the versions of the client and the server are in sync. If you install clickhouse-client directly on your machine, there’s no guarantee that the client and server will be the same version. (See the warning message above.) That’s unlikely to cause problems, but it’s something to be aware of if the system starts behaving strangely.
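If you ever need to compare the two, clickhouse-client can report its own version, and a quick query reports the server’s:
clickhouse-client --version
clickhouse-client --query 'SELECT version()'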
Now let’s add persistent storage to our cluster….