Working with Replication

Using the ReplicatedMergeTree engine

At this point our ClickHouse® cluster has replication enabled. Whether you used ZooKeeper or ClickHouse Keeper, we can now create ReplicatedMergeTree tables that automatically keep our replicas in sync.

First of all, let’s look at the infrastructure beneath our ClickHouse cluster:

kubectl get all -n quick

We’ve got two pods, three services, and two statefulsets:

NAME                                READY   STATUS    RESTARTS        AGE
pod/chi-cluster01-cluster01-0-0-0   1/1     Running   1 (8m23s ago)   19m
pod/chi-cluster01-cluster01-0-1-0   1/1     Running   0               7m39s

NAME                                  TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
service/chi-cluster01-cluster01-0-0   ClusterIP   None         <none>        9000/TCP,8123/TCP,9009/TCP   35m
service/chi-cluster01-cluster01-0-1   ClusterIP   None         <none>        9000/TCP,8123/TCP,9009/TCP   7m23s
service/clickhouse-cluster01          ClusterIP   None         <none>        8123/TCP,9000/TCP            35m

NAME                                           READY   AGE
statefulset.apps/chi-cluster01-cluster01-0-0   1/1     30m
statefulset.apps/chi-cluster01-cluster01-0-1   1/1     7m44s
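
If you also want to see the persistent volumes we attached to the cluster earlier, the PersistentVolumeClaims live in the same namespace. This is just an optional check; the names, sizes, and storage class in the output depend on the storage settings you used:

kubectl get pvc -n quick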

Current database structure

The analytics database currently has a single table on each of our two hosts. We can take a look from either of the pods:

kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick -- clickhouse-client
SELECT
    name,
    engine
FROM system.tables
WHERE database = 'analytics'

We’ve got our MergeTree table, as we’d expect:

   ┌─name───────┬─engine────┐
1. │ page_views │ MergeTree │
   └────────────┴───────────┘

1 row in set. Elapsed: 0.003 sec.

👉 Type exit to end the clickhouse-client session.

We set up page_views to hold our data. Let’s run SELECT * FROM analytics.page_views on both pods:

for pod in 0-0-0 0-1-0; do
  echo "Running query on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "SELECT * from analytics.page_views;"
done
Running query on chi-cluster01-cluster01-0-0-0...
2025-01-01 12:00:00	101	/home	google.com	mobile	USA
2025-01-01 12:05:00	102	/products	facebook.com	desktop	Canada
2025-01-01 12:10:00	103	/cart	twitter.com	tablet	UK
2025-01-02 14:00:00	101	/checkout	google.com	mobile	USA
2025-01-06 08:20:00	110	/blog	twitter.com	desktop	Australia
Running query on chi-cluster01-cluster01-0-1-0...

As you can see, host chi-cluster01-cluster01-0-0-0 has data, but its replica, chi-cluster01-cluster01-0-1-0, doesn’t have any.

(By the way, we’ll use this for loop technique to execute a single command on multiple nodes. If we need to do something more in-depth, we’ll connect to a node and use clickhouse-client interactively.)
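
If you find yourself running lots of these one-liners, you could wrap the pattern in a small shell function. This is just a convenience sketch; the chq name and its arguments are our own invention, not part of the tutorial:

# Hypothetical helper: run a single query against one pod.
# Usage: chq 0-0-0 "SELECT count() FROM analytics.page_views"
chq() {
  local pod="$1" query="$2"
  kubectl exec -it "chi-cluster01-cluster01-$pod" -n quick \
    -- clickhouse-client -q "$query"
}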

What we need is a new table that uses the ReplicatedMergeTree engine. When we insert data into that table, ClickHouse writes the new data to every replica. (Updates and deletes are replicated as well.) Having multiple synchronized copies of data is one of the basic principles of highly available systems, and ReplicatedMergeTree gives us that.
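
The table definition below uses two placeholders, {shard} and {replica}. These are macros that the operator defines on every host so each replica registers itself under its own path in ZooKeeper/Keeper. If you’re curious what they expand to on a given pod, you can peek at the system.macros table (an optional check; the exact values come from the operator’s generated configuration):

kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick \
  -- clickhouse-client -q "SELECT * FROM system.macros;"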

So let’s create the table. We’ll do this using ON CLUSTER to put a new table with that engine on both hosts:

kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick -- clickhouse-client
CREATE TABLE analytics.page_views_replicated ON CLUSTER cluster01
(
    event_time DateTime,
    user_id UInt32,
    page_url String,
    referrer_url String,
    device String,
    country String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/analytics/page_views', '{replica}')
ORDER BY event_time;

That creates the table on both hosts:

   ┌─host────────────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
1. │ chi-cluster01-cluster01-0-1 │ 9000 │      0 │       │                   1 │                0 │
2. │ chi-cluster01-cluster01-0-0 │ 9000 │      0 │       │                   0 │                0 │
   └─────────────────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘

2 rows in set. Elapsed: 0.067 sec.

👉 Type exit to end the clickhouse-client session.
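
Before we move on, we can confirm that both hosts registered the new table with ZooKeeper/Keeper by looking at the system.replicas table. This is an optional sanity check, and the exact counters will vary on your cluster; the important thing is that each replica reports itself as active and not read-only:

for pod in 0-0-0 0-1-0; do
  echo "Checking replica status on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "SELECT database, table, replica_name, is_readonly, total_replicas, active_replicas FROM system.replicas WHERE table = 'page_views_replicated';"
done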

So we’ve got our ReplicatedMergeTree tables; let’s query them.

for pod in 0-0-0 0-1-0; do
  echo "Running query on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "SELECT * from analytics.page_views_replicated;"
done

Our results are underwhelming:

Running query on chi-cluster01-cluster01-0-0-0...
Running query on chi-cluster01-cluster01-0-1-0...

Our ReplicatedMergeTree table has the same schema as our MergeTree table, but it isn’t initialized with any data. We’ll have to do that ourselves. It’s easy enough; we’ll just insert everything from analytics.page_views into analytics.page_views_replicated. Note that only chi-cluster01-cluster01-0-0-0 has any rows in page_views, so running the loop on both pods won’t duplicate anything:

for pod in 0-0-0 0-1-0; do
  echo "Running query on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "INSERT INTO analytics.page_views_replicated SELECT * FROM analytics.page_views;"
done

Now we should get the same results when we query both replicas:

for pod in 0-0-0 0-1-0; do
  echo "Running query on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "SELECT * from analytics.page_views_replicated;"
done
Running query on chi-cluster01-cluster01-0-0-0...
2025-01-01 12:00:00	101	/home	google.com	mobile	USA
2025-01-01 12:05:00	102	/products	facebook.com	desktop	Canada
2025-01-01 12:10:00	103	/cart	twitter.com	tablet	UK
2025-01-02 14:00:00	101	/checkout	google.com	mobile	USA
2025-01-06 08:20:00	110	/blog	twitter.com	desktop	Australia
Running query on chi-cluster01-cluster01-0-1-0...
2025-01-01 12:00:00	101	/home	google.com	mobile	USA
2025-01-01 12:05:00	102	/products	facebook.com	desktop	Canada
2025-01-01 12:10:00	103	/cart	twitter.com	tablet	UK
2025-01-02 14:00:00	101	/checkout	google.com	mobile	USA
2025-01-06 08:20:00	110	/blog	twitter.com	desktop	Australia

All the replicas are in sync. Now if we insert data into a host, it should be copied to its replica as well. Here goes:

kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick \
  -- clickhouse-client -q "INSERT INTO analytics.page_views_replicated (event_time, user_id, page_url, referrer_url, device, country)  VALUES('2025-01-01 12:00:00', 106, '/home', 'google.com', 'mobile', 'Nigeria')"

That’s the first record we’ve inserted with a user from Nigeria. Let’s query all our hosts to see where that data is:

for pod in 0-0-0 0-1-0; do
  echo "Running query on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "SELECT * FROM analytics.page_views_replicated where country = 'Nigeria';"
done
Running query on chi-cluster01-cluster01-0-0-0...
2025-01-01 12:00:00	101	/home	google.com	mobile	Nigeria
Running query on chi-cluster01-cluster01-0-1-0...
2025-01-01 12:00:00	101	/home	google.com	mobile	Nigeria

We added data to chi-cluster01-cluster01-0-0-0 but nowhere else. As you can see from the output, however, that value was also copied to its replica, chi-cluster01-cluster01-0-1-0. That’s what we want from replication: whenever we make a change to our data, it’s automatically synchronized with all of our replicas.
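
Under the hood, each replica pulls new parts from its peers based on a shared log in ZooKeeper/Keeper. If you ever want to watch that happen (or debug a replica that’s lagging), the system.replication_queue table lists the fetches and merges that are still pending; on a small, in-sync cluster like this one it should normally be empty:

for pod in 0-0-0 0-1-0; do
  echo "Checking replication queue on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "SELECT database, table, type, create_time, num_tries FROM system.replication_queue;"
done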

For housekeeping, now that we’ve got all of our data into the analytics.page_views_replicated table, we can drop the original analytics.page_views table:

for pod in 0-0-0 0-1-0; do
  echo "Running query on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "DROP TABLE analytics.page_views;"
done
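
Because both hosts belong to cluster01, an equivalent option is to issue the drop once with ON CLUSTER instead of looping over the pods. Use whichever form you prefer, but only one of them; a second attempt will fail because the table is already gone:

kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick \
  -- clickhouse-client -q "DROP TABLE analytics.page_views ON CLUSTER cluster01;"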

The last thing we’ll do with replication is create a new Distributed table that works with our replicated data. Whenever we query, update, or delete data, we’ll do it against the Distributed table. Connect to one of the nodes in our cluster:

kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick -- clickhouse-client
CREATE TABLE analytics.page_views_distributed ON CLUSTER cluster01 AS analytics.page_views_replicated
    ENGINE = Distributed(cluster01, analytics, page_views_replicated, rand());

   ┌─host────────────────────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
1. │ chi-cluster01-cluster01-0-0 │ 9000 │      0 │       │                   1 │                0 │
2. │ chi-cluster01-cluster01-0-1 │ 9000 │      0 │       │                   0 │                0 │
   └─────────────────────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘

2 rows in set. Elapsed: 0.066 sec.

👉 Type exit to end the clickhouse-client session.
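
The first argument to the Distributed engine, cluster01, refers to a cluster definition that the operator generates for us. If you want to see exactly which hosts that definition covers, the system.clusters table spells it out (an optional check):

kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick \
  -- clickhouse-client -q "SELECT cluster, shard_num, replica_num, host_name FROM system.clusters WHERE cluster = 'cluster01';"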

Now querying our new Distributed table should show the same results from every host:

for pod in 0-0-0 0-1-0; do
  echo "Running query on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "SELECT * FROM analytics.page_views_distributed where country = 'Nigeria';"
done

Sure enough, we get the same results everywhere:

Running query on chi-cluster01-cluster01-0-0-0...
2025-01-01 12:00:00	101	/home	google.com	mobile	Nigeria
Running query on chi-cluster01-cluster01-0-1-0...
2025-01-01 12:00:00	101	/home	google.com	mobile	Nigeria
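
The Distributed table handles writes as well as reads: an INSERT against it is forwarded to the underlying page_views_replicated table, and in this setup replication takes care of the other copy. Here’s a quick way to try it; the row below is just an example we made up for this check, and because distributed inserts are forwarded asynchronously by default, you may need to wait a moment before the row shows up everywhere:

kubectl exec -it chi-cluster01-cluster01-0-0-0 -n quick \
  -- clickhouse-client -q "INSERT INTO analytics.page_views_distributed VALUES ('2025-01-07 09:15:00', 120, '/pricing', 'bing.com', 'desktop', 'Germany');"

for pod in 0-0-0 0-1-0; do
  echo "Running query on chi-cluster01-cluster01-$pod..."
  kubectl exec -it chi-cluster01-cluster01-$pod -n quick \
  -- clickhouse-client -q "SELECT * FROM analytics.page_views_replicated WHERE country = 'Germany';"
done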

Where we are

That’s as far as we’ll go in this tutorial. Along the way, we:

  1. Installed the Altinity Kubernetes Operator for ClickHouse
  2. Created a ClickHouse cluster with a single shard and a single node
  3. Added persistent storage to the cluster
  4. Enabled replication with either ZooKeeper or ClickHouse Keeper
  5. Created a ReplicatedMergeTree table (plus a Distributed table on top of it) to take advantage of replication

Although this ClickHouse cluster isn’t quite ready for managing massive data sets, all the basics are in place.

What’s next

Feel free to go ahead to the Adding a second shard to your cluster page, but be aware that sharding is usually only worthwhile once you have more than 5 TB of data.

What you add to your running cluster is up to you, but there are lots of things the operator can do. Here are just a few; for the complete list, see the table of contents for the docs in the operator's repo: