Creating your first ClickHouse® cluster

How to create a cluster and make sure it’s running

At this point, you’ve got the Altinity Kubernetes Operator for ClickHouse® installed. Now let’s give it something to work with. We’ll start with a simple ClickHouse cluster here: no persistent storage, one replica, and one shard. (We’ll cover those topics over the next couple of steps.)

Creating a namespace

To keep things organized, we'll do everything in the quick namespace, which means we have to create it first:

kubectl create namespace quick

PRO TIP: You can use the awesome kubens tool to set the default namespace for all kubectl commands. Type kubens quick, and kubectl will use the quick namespace until you tell it otherwise. See the kubens / kubectx repo to get started. You’re welcome.
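For example, assuming you've installed kubens, a typical session looks something like this:

kubens quick          # make quick the default namespace
kubectl get pods      # now equivalent to kubectl get pods -n quick
kubens -              # switch back to your previous namespace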

Creating your first cluster

Now that we have our namespace, we’ll create a simple cluster: one shard, one replica. Copy the following text and save it as manifest01.yaml:

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: cluster01
spec:
  templates:
    podTemplates:
      - name: clickhouse-pod-template
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:24.8.14.10459.altinitystable
  configuration:
    clusters:
      - name: cluster01
        layout:
          shardsCount: 1
          replicasCount: 1
        templates:
          podTemplate: clickhouse-pod-template

When you installed the operator, it defined a custom resource type called a ClickHouseInstallation; that’s what we’re creating here. A ClickHouseInstallation contains a ClickHouse server and lots of other useful things. Here we’re creating a cluster named cluster01, and that cluster has one shard and one replica.
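If you're curious, you can see the new resource type the operator registered (along with its short name, which we'll use in a minute):

kubectl api-resources | grep clickhouse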

NOTE: The YAML above would be simpler if we didn't specify a particular version of the altinity/clickhouse-server container image, but pinning the version here will make things simpler as we go through the rest of the exercises in this tutorial. (Hopefully you're just cutting and pasting anyway.)
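For the record, here's roughly what the manifest would look like without the pinned image. This is just a sketch; without a podTemplate, the operator falls back to its default clickhouse-server image, whatever that happens to be. We'll stick with the pinned version.

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: cluster01
spec:
  configuration:
    clusters:
      - name: cluster01
        layout:
          shardsCount: 1
          replicasCount: 1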

Use kubectl apply to create your ClickHouseInstallation:

kubectl apply -f manifest01.yaml -n quick

You’ll see this:

clickhouseinstallation.clickhouse.altinity.com/cluster01 created

Verify that your new cluster is running:

kubectl get clickhouseinstallation -n quick

The status of your cluster will be In Progress for a minute or two. (BTW, the operator defines chi as an abbreviation for clickhouseinstallation. We’ll use chi from now on.) When everything is ready, its status will be Completed:

NAME        CLUSTERS   HOSTS   STATUS      HOSTS-COMPLETED   AGE     SUSPEND  
cluster01   1          1       Completed                     2m40s
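Because chi is a registered short name, you can also watch the status change in real time if you like (press Ctrl+C to stop watching):

kubectl get chi -n quick --watch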

Now that we’ve got a ClickHouse cluster up and running, we’ll connect to it and run some basic commands.

NOTE: As you know, working with Kubernetes can get complicated. We’ve tested this tutorial in different environments, but it’s possible something will go wrong along the way. If that happens, things like kubectl describe and kubectl logs can help you determine what’s going wrong. We’ve run into errors because our cluster didn’t have enough nodes, autoscaling wasn’t turned on, etc. Hopefully you won’t hit any snags, but those commands can help you troubleshoot any obstacles you encounter.
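For example, if your cluster seems stuck at In Progress, commands along these lines are a good place to start (the pod name here is the one we'll find in the next section):

kubectl describe chi cluster01 -n quick                       # operator status and events
kubectl describe pod chi-cluster01-cluster01-0-0-0 -n quick   # scheduling and image-pull problems
kubectl logs chi-cluster01-cluster01-0-0-0 -n quick           # the ClickHouse server log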

Connecting to your cluster with kubectl exec

Let’s talk to our cluster and run a simple ClickHouse query. We can hop in directly through Kubernetes and run the clickhouse-client that’s part of the image. First, we have to get the name of the pod:

kubectl get pods -n quick

You’ll see this:

NAME                            READY   STATUS    RESTARTS   AGE  
chi-cluster01-cluster01-0-0-0   1/1     Running   0          2m36s

So chi-cluster01-cluster01-0-0-0 is the name of the pod running ClickHouse. We’ll connect to it with kubectl exec and run clickhouse-client on it:

kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick -- clickhouse-client

The ClickHouse server will welcome you:

ClickHouse client version 24.8.14.10459.altinitystable (altinity build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 24.8.14.

chi-cluster01-cluster01-0-0-0.chi-cluster01-cluster01-0-0.quick.svc.cluster.local :)

Now that we’re in ClickHouse, let’s run a query and look at some system data:

SELECT
    cluster,
    host_name,
    port
FROM system.clusters

You’ll see something very similar to this:

   ┌─cluster────────┬─host_name───────────────────┬─port─┐
1. │ all-clusters   │ chi-cluster01-cluster01-0-0 │ 9000 │
2. │ all-replicated │ chi-cluster01-cluster01-0-0 │ 9000 │
3. │ all-sharded    │ chi-cluster01-cluster01-0-0 │ 9000 │
4. │ cluster01      │ chi-cluster01-cluster01-0-0 │ 9000 │
5. │ default        │ localhost                   │ 9000 │
   └────────────────┴─────────────────────────────┴──────┘

5 rows in set. Elapsed: 0.002 sec.
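If you'd like to poke around a bit more, the other system tables work the same way. For example, this lists the databases that exist out of the box (the exact list varies by version):

SELECT name, engine FROM system.databases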

👉 Type exit to end the clickhouse-client session.

Let’s get some data in here!

Not so fast! At this point, you’d expect a tutorial to show you how to put some data into the database you just created. However, we haven’t defined any persistent storage for our cluster. If a pod fails, any data it had will be gone when the pod restarts. So we’ll add persistent storage to our cluster next.

👉 Next: Adding persistent storage

Optional: Connecting to your cluster directly

Most of the time when you're working with ClickHouse, you connect directly to the cluster with clickhouse-client. To do that here, you'll need to set up network access to your cluster; the easiest way is with kubectl port-forward. First, look at the services in your namespace:

kubectl get svc -n quick

You’ll see the ports used by each service:

NAME                          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE  
chi-cluster01-cluster01-0-0   ClusterIP   None         <none>        9000/TCP,8123/TCP,9009/TCP   35m  
clickhouse-cluster01          ClusterIP   None         <none>        8123/TCP,9000/TCP            35m

So to connect directly to the first pod, use this command:

kubectl port-forward chi-cluster01-cluster01-0-0-0 9000:9000 8123:8123 9009:9009 -n quick &

Be sure to put the & at the end to keep this running in the background.

You’ll see the PID for the process and the ports you can access directly:

[1] 72202

Forwarding from 127.0.0.1:9000 -> 9000    
Forwarding from [::1]:9000 -> 9000  
Forwarding from 127.0.0.1:8123 -> 8123  
Forwarding from [::1]:8123 -> 8123  
Forwarding from 127.0.0.1:9009 -> 9009  
Forwarding from [::1]:9009 -> 9009
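As an aside, you don't have to forward to a specific pod. kubectl can also forward to a service, which saves you from tracking pod names. A sketch using the clickhouse-cluster01 service listed above (kubectl should pick the pod behind the service):

kubectl port-forward service/clickhouse-cluster01 9000:9000 8123:8123 -n quick &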

Without a hostname or port, clickhouse-client uses localhost:9000. That makes it easy to connect to our ClickHouse cluster:

> clickhouse-client
ClickHouse client version 23.5.3.1.
Connecting to localhost:9000 as user default.
Handling connection for 9000
Connected to ClickHouse server version 24.8.14 revision 54472.

ClickHouse client version is older than ClickHouse server. It may lack support for new features.

Handling connection for 9000  
chi-cluster01-cluster01-0-0-0.chi-cluster01-cluster01-0-0.quick.svc.cluster.local :)

Now you can run SQL statements to your heart’s content. When you exit out of clickhouse-client, be sure to stop port forwarding (kill 72202, in this example).
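If you didn't note the PID, a couple of shell-dependent alternatives (a sketch, assuming a bash-like shell):

kill %1                            # the job number printed when you started the port forward
pkill -f "kubectl port-forward"    # or match the command line instead (kills any port forward)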

Depending on your Kubernetes setup and provider, you may be able to use a LoadBalancer to access the cluster directly, but this method, clumsy as it is, should always work. See the documentation for your Kubernetes provider for details.
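As a sketch of that approach, the operator supports serviceTemplates in the ClickHouseInstallation spec. Adding something along these lines to the manifest we created earlier (the template name lb-service is our own, and the details depend heavily on your provider) would request an external load balancer:

spec:
  defaults:
    templates:
      serviceTemplate: lb-service
  templates:
    serviceTemplates:
      - name: lb-service
        spec:
          type: LoadBalancer
          ports:
            - name: http
              port: 8123
            - name: tcp
              port: 9000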

Also be aware that when you're using the clickhouse-client that's included in the clickhouse-server container, the client and server versions are always in sync. If you install clickhouse-client directly on your machine, there's no guarantee that the client and server will be the same version. (See the warning message above.) That's unlikely to cause problems, but it's something to keep in mind if the system starts behaving strangely.
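If you want to check for a mismatch, you can compare the two versions directly:

clickhouse-client --version

And once you're connected, this returns the server's version:

SELECT version()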

Now let's add persistent storage to our cluster…

👉 Next: Adding persistent storage