ZooKeeper Recovery

How to recover when ZooKeeper has issues.

If there are issues with your ZooKeeper environment managing your ClickHouse clusters, the following steps can resolve them. If there are still issues, support is available to current Altinity customers.

Raising a Support Ticket

Altinity accepts support cases from its support partners via the Altinity Zendesk Portal, email, or Slack.

To log a ticket in the support portal:

  1. Log in to https://altinity.zendesk.com using your email address.
  2. Press the “Add +” button to open a case.
  3. Enter case topic and details. Please include relevant information such as:
    1. ClickHouse version
    2. Error messages
    3. Steps to reproduce the problem, if known

You can also log a ticket by sending the above information to support@altinity.com from your registered email address, or by posting it to the shared Slack channel if one is available.

Fault Diagnosis and Remediation

The following procedures address common ClickHouse and ZooKeeper failures.

  • IMPORTANT NOTE: Some procedures shown below may have a degree of risk depending on the underlying problem. These procedures are marked with Call Support and include raising a support ticket as the first step.

Restarting a crashed ClickHouse server

ClickHouse servers are managed by systemd and normally restart following a crash. If a server does not restart automatically, follow these steps:

  1. Access the ClickHouse error log for the failed server at /var/log/clickhouse-server/clickhouse-server.err.log.
  2. Examine the last log entry and look for a stack trace showing the cause of the failure.
  3. If there is a stack trace:
    1. If the problem is obvious, fix the problem and run systemctl restart clickhouse-server to restart. Confirm that the server restarts.
    2. If the problem is not obvious, open an Altinity Support Ticket and provide the error log message.
  4. If there is no stack trace, ClickHouse may have been terminated by the OOM-killer due to excessive memory usage:
    1. Open the most recent syslog file at /var/log/syslog.
    2. Look for OOM-killer messages.
    3. If found, see Handling out-of-memory errors below.
    4. If no OOM-killer messages are found or the cause is still not obvious, raise a support ticket and provide a description of the problem.
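
The log checks in the steps above can be sketched as two small shell helpers. This is a minimal sketch, not an Altinity-provided tool: the function names show_last_errors and find_oom_events are hypothetical, the error log location assumes the default package path under /var/log/clickhouse-server/, and the grep patterns assume the usual Linux kernel wording for OOM-killer messages.

```shell
#!/usr/bin/env bash
# Assumed default locations; override ERR_LOG or SYSLOG for your installation.
ERR_LOG="${ERR_LOG:-/var/log/clickhouse-server/clickhouse-server.err.log}"
SYSLOG="${SYSLOG:-/var/log/syslog}"

# Print the last N lines (default 50) of the error log so that a
# stack trace near the end of the file is visible.
show_last_errors() {
    tail -n "${1:-50}" "$ERR_LOG"
}

# Print syslog lines that suggest the kernel OOM-killer ended a
# ClickHouse process. Returns non-zero when no such line is found.
find_oom_events() {
    grep -iE 'out of memory|oom-killer' "$SYSLOG" | grep -i 'clickhouse'
}
```

For example, `show_last_errors 100` prints the last 100 lines of the error log, and `find_oom_events || echo "no OOM-killer messages found"` reports whether the kernel appears to have killed ClickHouse. Reading /var/log/syslog may require root privileges.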

Replacing a failed cluster node

  1. Ensure the old node is truly offline and will not return.

  2. Create a new node with the same macros.xml definitions as the previous node.

  3. If possible, use the same hostname as the failed node.

  4. Copy the metadata folder (by default /var/lib/clickhouse/metadata) from a healthy replica.

  5. Set the force_restore_data flag so that ClickHouse wipes out existing ZooKeeper information for the node and replicates all data:

    sudo -u clickhouse touch /var/lib/clickhouse/flags/force_restore_data

  6. Start ClickHouse.

  7. Wait until all tables are replicated. You can check progress using:

    SELECT count(*) FROM system.replication_queue
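
The progress check in the last step can be repeated automatically. The following is a minimal sketch that polls the same query until the queue is empty. The function names queue_size and wait_for_replication are hypothetical, and the sketch assumes clickhouse-client can connect to the local server without extra options.

```shell
#!/usr/bin/env bash
# Print the current number of entries in the replication queue.
queue_size() {
    clickhouse-client -q "SELECT count(*) FROM system.replication_queue"
}

# Poll the queue every N seconds (default 10) until it is empty.
# Returns non-zero if the queue size cannot be read.
wait_for_replication() {
    local interval="${1:-10}" size
    while true; do
        size=$(queue_size) || return 1
        echo "replication queue entries: $size"
        [ "$size" -eq 0 ] && return 0
        sleep "$interval"
    done
}
```

Calling `wait_for_replication 30` prints the queue length every 30 seconds and returns once it reaches zero. An empty queue only shows that no replication tasks are pending on this node, so it is still worth comparing row counts against a healthy replica afterwards.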

Replacing a failed ZooKeeper node

  1. Configure ZooKeeper on a new server.
  2. Use the same hostname and myid as the failed node if possible.
  3. Start ZooKeeper on the new node.
  4. Verify the new node can connect to the ensemble.
  5. If the ZooKeeper environment does not support dynamic configuration changes:
    1. If the new node has a different hostname or myid, modify zoo.cfg on the other nodes of the ensemble and restart them.
    2. ClickHouse’s sessions will be interrupted during this process.
  6. Make changes in ClickHouse configuration files if needed. A restart might be required for the changes to take effect.
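
One way to verify that the new node has joined the ensemble is to ask it for its role with ZooKeeper's srvr four-letter command. The sketch below assumes that nc (netcat) is installed, that the client port is the default 2181, and that srvr is permitted by the 4lw.commands.whitelist setting on newer ZooKeeper releases. The function names zk_srvr and zk_mode and the hostname in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# Send the srvr four-letter command to a ZooKeeper node.
# $1 is the host, $2 the client port (default 2181).
zk_srvr() {
    echo srvr | nc -w 2 "$1" "${2:-2181}"
}

# Read srvr output on stdin and print the node's role:
# leader, follower, or standalone.
zk_mode() {
    awk -F': ' '$1 == "Mode" { print $2 }'
}
```

For example, `zk_srvr zk-new.example.com | zk_mode` should print follower or leader once the node is part of the ensemble. An empty result or a reply of standalone suggests the node has not joined.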

Recovering from complete ZooKeeper loss (Call Support)

Complete loss of ZooKeeper is a serious event and should be avoided at all costs by proper ZooKeeper management. Follow this procedure only if you have lost all data in ZooKeeper as it is time-intensive and will cause affected tables to be unavailable.

  1. Raise a support ticket before taking any steps.
  2. Ensure that ZooKeeper is empty and working properly.
  3. Follow the instructions from ClickHouse.tech to convert ClickHouse replicated tables to non-replicated tables.
  4. Once you can see the data, pick one node whose non-replicated tables will serve as the source for resyncing the data to all other nodes.
  5. On that node, follow the instructions from ClickHouse.tech to change ClickHouse non-replicated tables to replicated tables.
  6. On the remaining nodes, drop the MergeTree tables and create them again as Replicated tables using the same definition as the healthy table.
  7. ClickHouse will sync from the healthy table to all other tables.

Read-only tables

Read-only tables occur when ClickHouse cannot access ZooKeeper to record inserts on a replicated table.

  1. Log in with clickhouse-client.

  2. Execute the following query to confirm that ClickHouse can connect to ZooKeeper:

    $ clickhouse-client -q "select * from system.zookeeper where path='/'"

  3. This query should return one or more ZNode directories.

  4. Execute the following query to check the state of the table:

    SELECT * FROM system.replicas WHERE table = 'table_name'

  5. If there are connectivity problems, check the following:

    1. Ensure the <zookeeper> tag in ClickHouse configuration has the correct ZooKeeper host names and ports.
    2. Ensure that ZooKeeper is running.
    3. Ensure that ZooKeeper is accepting connections. Log in to the ZooKeeper host and try to connect using zkCli.sh.
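
The two queries in this procedure can be combined into one check. This is a minimal sketch under stated assumptions: the function names readonly_query and check_replica and the CH_CLIENT variable are hypothetical, and the sketch selects only the is_readonly and is_session_expired columns of system.replicas rather than every column.

```shell
#!/usr/bin/env bash
# Client binary; override CH_CLIENT to add a path or use a wrapper script.
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

# Build the query that shows the read-only state of one table's replicas.
# $1 is the table name; it is inserted as-is, so pass trusted input only.
readonly_query() {
    printf "SELECT database, table, is_readonly, is_session_expired FROM system.replicas WHERE table = '%s'\n" "$1"
}

# First confirm that ClickHouse can query ZooKeeper, then show replica state.
check_replica() {
    "$CH_CLIENT" -q "SELECT name FROM system.zookeeper WHERE path = '/'" > /dev/null \
        || { echo "ClickHouse cannot query ZooKeeper; check connectivity" >&2; return 1; }
    "$CH_CLIENT" -q "$(readonly_query "$1")"
}
```

Running `check_replica table_name` prints one row per replicated table with that name. A value of 1 in is_readonly means inserts are currently rejected, and a value of 1 in is_session_expired points to a lost ZooKeeper session. If the first query fails, `zkCli.sh -server <host>:2181` run on the ZooKeeper host tests whether ZooKeeper itself accepts connections.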

Last modified 2021.03.22: Promoted Zookeeper higher. (bf18e46)