Monitoring a cluster

How to monitor and manage your clusters’ performance.

Altinity.Cloud integrates Grafana into its monitoring tools. From a cluster, you can quickly access the following monitoring views:

  • Cluster Metrics
  • Queries
  • Logs

How to Access Cluster Metrics

To access the metrics views for your cluster:

  1. From the Clusters view, select the cluster to monitor.
  2. From Monitoring, select the View in Grafana drop-down menu, then choose one of the following options:
    1. Cluster Metrics
    2. Queries
    3. Logs
  3. Each metric view opens in a separate tab.

Cluster Metrics

Cluster Metrics displays how the cluster is performing from a hardware and connection standpoint.

Cluster Monitoring View

Some of the metrics displayed here include:

  • DNS and Distributed Connection Errors: Displays the rate of any connection issues.
  • Select Queries: The number of select queries submitted to the cluster.
  • Zookeeper Transactions: The communications between the ZooKeeper nodes.
  • ClickHouse Data Size on Disk: The total amount of data the ClickHouse database is using.
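
ClickHouse exposes counters of this kind in its own system tables. As a rough sketch, and only as an assumption about where similar numbers can be read (these are not the queries the Grafana dashboard runs), the Python snippet below pairs each panel with a query against `system.events` or `system.parts`. The names `METRIC_QUERIES` and `rate_per_second` are illustrative.

```python
# Illustrative only: these are NOT the dashboard's own queries. They read
# standard ClickHouse system tables that expose comparable counters.
METRIC_QUERIES = {
    "DNS and Distributed Connection Errors": (
        "SELECT event, value FROM system.events "
        "WHERE event IN ('DNSError', 'DistributedConnectionFailTry', "
        "'DistributedConnectionFailAtAll')"
    ),
    "Select Queries": (
        "SELECT value FROM system.events WHERE event = 'SelectQuery'"
    ),
    "Zookeeper Transactions": (
        "SELECT value FROM system.events WHERE event = 'ZooKeeperTransactions'"
    ),
    "ClickHouse Data Size on Disk": (
        "SELECT sum(bytes_on_disk) FROM system.parts WHERE active"
    ),
}


def rate_per_second(previous: int, current: int, interval_seconds: float) -> float:
    """Turn two samples of a cumulative counter into a per-second rate."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # Counters in system.events restart from zero when the server restarts,
    # so a drop is treated as a restart and the current value is the delta.
    delta = current - previous if current >= previous else current
    return delta / interval_seconds
```

The values in `system.events` are cumulative since server start, which is why a rate panel needs two samples and the interval between them.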

Queries

The Queries monitoring page displays query performance for your clusters, including the top requests, the queries that require the most memory, and other benchmarks. Use it to identify queries that cause performance issues so you can refactor them to be more efficient.

Query Monitoring View
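
A similar ranking can be reproduced by hand from ClickHouse's `system.query_log` table. The sketch below is a hypothetical example, not the query behind the Queries view: it assumes query logging is enabled on the cluster, and the names `TOP_MEMORY_QUERIES` and `top_by_memory` are made up for illustration.

```python
from typing import Dict, List

# Hypothetical query: groups finished queries from the last hour by their
# normalized form and ranks them by peak memory use.
TOP_MEMORY_QUERIES = """
SELECT
    normalized_query_hash,
    any(query) AS sample_query,
    count() AS executions,
    max(memory_usage) AS peak_memory_bytes,
    avg(query_duration_ms) AS avg_duration_ms
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 1 HOUR
GROUP BY normalized_query_hash
ORDER BY peak_memory_bytes DESC
LIMIT 10
"""


def top_by_memory(rows: List[Dict], limit: int = 10) -> List[Dict]:
    """Rank already-fetched rows by peak memory, largest first."""
    return sorted(rows, key=lambda r: r["peak_memory_bytes"], reverse=True)[:limit]
```

`top_by_memory` does the same ordering client-side, which can be handy when the rows come from an exported result rather than a live connection.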

Log Metrics

The Log monitoring page displays the logs for your clusters and allows you to query them directly. If you’re trying to track down a specific issue, the logs are the most granular place to look.

Log Monitoring View

Cluster alerts

The Cluster Alerts module allows users to choose which events they are notified about. Alerts can be delivered as a popup, displayed while the user is logged into Altinity.Cloud, or by email, so users receive the alert even when they are not logged into Altinity.Cloud.

To set which alerts you receive:

  1. From the Clusters view, select the cluster to configure alerts for.

  2. Select Alerts.

    Cluster Alerts
  3. Add the Email address to send alerts to.

  4. Select whether to receive a Popup or Email alert for the following events:

    1. ClickHouse Version Upgrade: Alert triggered when the version of ClickHouse that is installed in the cluster has a new update.
    2. Cluster Rescale: Alert triggered when the cluster is rescaled, such as new shards added.
    3. Cluster Stop: Alert triggered when some event has caused the cluster to stop running.
    4. Cluster Resume: Alert triggered when a cluster that was stopped has resumed operations.

Cluster health check

From the Clusters View, you can see the health status of your cluster and its nodes at a glance.

How to Check Node Health

The quick health check of your cluster’s nodes is displayed from the Clusters View. Next to the cluster name is a summary of your nodes’ statuses, indicating the total number of nodes and how many nodes are available.

View the Access Point

How to Check Cluster Health

The overall health of the cluster is shown in the Health row of the cluster summary, showing the number of health checks passed.

View the Access Point

Click checks passed to view the details of the cluster’s health.

How to View a Cluster’s Health Checks

The cluster’s Health Check module displays the status of the following health checks:

  • Access point availability check
  • Distributed query check
  • Zookeeper availability check
  • Zookeeper contents check
  • Readonly replica check
  • Delayed inserts check

To view details on what queries are used to verify the health check, select the caret for each health check.

Cluster Health Details
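
To give a sense of what such verification queries can look like, the sketch below pairs three of the checks with queries against standard ClickHouse system tables. Both the queries and the pass conditions are assumptions made for illustration; the queries Altinity.Cloud actually runs are the ones shown under each caret. `HEALTH_CHECK_QUERIES`, `check_passes`, and `summarize` are hypothetical names.

```python
from typing import Dict

# Assumed queries for three of the checks; not taken from Altinity.Cloud.
HEALTH_CHECK_QUERIES: Dict[str, str] = {
    "Zookeeper availability check":
        "SELECT count() FROM system.zookeeper WHERE path = '/'",
    "Readonly replica check":
        "SELECT count() FROM system.replicas WHERE is_readonly",
    "Delayed inserts check":
        "SELECT value FROM system.metrics WHERE metric = 'DelayedInserts'",
}


def check_passes(name: str, value: int) -> bool:
    """Apply an assumed pass condition to the single number a query returns."""
    if name not in HEALTH_CHECK_QUERIES:
        raise KeyError(f"unknown health check: {name}")
    if name == "Zookeeper availability check":
        # Passing means the root path lists at least one child node. A query
        # that raises an error would also have to be counted as a failure.
        return value > 0
    # For the other two checks, any read-only replica or delayed insert fails.
    return value == 0


def summarize(results: Dict[str, bool]) -> str:
    """Format results as a short summary, e.g. '2/3 checks passed'."""
    passed = sum(1 for ok in results.values() if ok)
    return f"{passed}/{len(results)} checks passed"
```

For example, with one read-only replica, no delayed inserts, and a reachable ZooKeeper, `summarize` would report that two of the three checks passed.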

Accessing cluster logs

Altinity.Cloud provides the cluster log details so users can track down specific issues or performance bottlenecks.

To access a cluster’s logs:

  1. From the Clusters view, select the cluster whose logs you want to view.
  2. Select Logs.
  3. From the Log Page, you can set the number of rows to display, or filter the logs by specific text.
  4. To download the logs, select the download icon in the upper right corner (A).
  5. To refresh the logs page, select the refresh icon (B).
Cluster Logs Page
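
Once logs are downloaded, the same two controls (row count and text filter) are easy to approximate offline. The following is a minimal sketch that assumes the download is a plain-text file with one log entry per line; the file format, the case-insensitive matching, and the `filter_log_lines` name are all assumptions rather than documented behavior.

```python
from typing import Iterable, List


def filter_log_lines(lines: Iterable[str], text: str = "", max_rows: int = 100) -> List[str]:
    """Return the last max_rows lines that contain text (case-insensitive)."""
    if max_rows <= 0:
        return []
    needle = text.lower()
    # An empty needle matches every line, so only the row limit applies.
    matches = [line.rstrip("\n") for line in lines if needle in line.lower()]
    return matches[-max_rows:]
```

With a file handle, usage might look like `filter_log_lines(open("cluster.log"), text="error", max_rows=50)`, where `cluster.log` is a placeholder file name.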

The following logs are available:

  • ACM Logs: These logs are specific to Altinity.Cloud issues and include the following:
    • System Log: Details the system actions such as starting a cluster, updating endpoints, and other details.
    • API Log: Displays updates to the API and activities.
  • ClickHouse Logs: Displays the Common Log that stores ClickHouse-related events. From this view, a specific host can be selected from the dropdown box.
  • Backup Logs: Displays backup events from the clickhouse-backup service. Log details per cluster host can be selected from the dropdown box.
  • Operator Logs: Displays logs from the Altinity Kubernetes Operator service, which is used to manage cluster replication and communications in the Kubernetes environment.

Notifications

Notifications allow you to see any messages related to your Altinity.Cloud account, such as billing or service issues.

To access your notifications:

  1. From the upper right corner of the top navigation bar, select your user ID, then Notifications.

    Access notifications

Notifications History

The Notifications History page shows the notifications for your account, including the following:

  • Message: The notification message.
  • Level: The priority level which can be:
    • Danger: Critical notifications that can affect your clusters or account.
    • Warning: Notifications of possible issues that are less than critical.
    • News: Notifications of general news and updates in Altinity.Cloud.
    • Info: Updates for general information.