Monitoring a cluster
Altinity.Cloud integrates Grafana into its monitoring tools. From a cluster, you can quickly access the following monitoring views:
- Cluster Metrics
How to Access Cluster Metrics
To access the metrics views for your cluster:
- From the Clusters view, select the cluster to monitor.
- From Monitoring, open the View in Grafana drop-down menu and select one of the following options:
- Cluster Metrics
- Each metric view opens in a separate tab.
Cluster Metrics displays how the cluster is performing from a hardware and connection standpoint.
Some of the metrics displayed here include:
- DNS and Distributed Connection Errors: Displays the rate of any connection issues.
- Select Queries: The number of SELECT queries submitted to the cluster.
- Zookeeper Transactions: The communications between the ZooKeeper nodes.
- ClickHouse Data Size on Disk: The total amount of data the ClickHouse database is using.
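Rate-style panels such as the connection error rate are commonly derived from cumulative counters sampled at intervals. The following is a minimal sketch of that calculation, assuming the underlying metric is a monotonically increasing counter. This is an assumption for illustration, not a description of how the Altinity.Cloud dashboards are built, and the function name `per_second_rate` is hypothetical.

```python
def per_second_rate(previous, current, interval_seconds):
    """Return the average per-second increase between two counter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = current - previous
    if delta < 0:
        # Assumed behavior: a lower value means the counter was reset
        # (for example, after a restart), so count from zero.
        delta = current
    return delta / interval_seconds
```

For example, a counter that moves from 100 to 160 errors over 60 seconds corresponds to an average of one error per second.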
The Queries monitoring page displays the performance of clusters, including the top requests, the queries that require the most memory, and other benchmarks. Use it to identify queries that cause performance issues so they can be refactored to be more efficient.
The Log monitoring page displays the logs for your clusters and lets you query them directly. When you need to track down a specific issue, the logs provide the most granular detail.
The Cluster Alerts module lets users choose when they are notified about a set of events. Alerts can be delivered as a popup, shown while the user is logged into Altinity.Cloud, or by email, so that users receive alerts even when they are not logged into Altinity.Cloud.
To set which alerts you receive:
From the Clusters view, select the cluster to configure alerts for.
Add the Email address to send alerts to.
Select whether to receive a Popup or Email alert for the following events:
- ClickHouse Version Upgrade: Alert triggered when the version of ClickHouse that is installed in the cluster has a new update.
- Cluster Rescale: Alert triggered when the cluster is rescaled, such as new shards added.
- Cluster Stop: Alert triggered when some event has caused the cluster to stop running.
- Cluster Resume: Alert triggered when a cluster that was stopped has resumed operations.
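The alert settings above amount to a mapping from each event to the delivery channels enabled for it. The sketch below models that mapping in Python purely as an illustration. The class `AlertSettings`, its methods, and the lowercase channel names are hypothetical and do not correspond to any Altinity.Cloud API.

```python
from dataclasses import dataclass, field

# Event names are taken from the list above; channel names mirror the
# Popup and Email options. Both tuples are illustrative constants.
EVENTS = (
    "ClickHouse Version Upgrade",
    "Cluster Rescale",
    "Cluster Stop",
    "Cluster Resume",
)
CHANNELS = ("popup", "email")


@dataclass
class AlertSettings:
    email: str
    # Maps an event name to the set of channels enabled for it.
    channels: dict = field(default_factory=dict)

    def enable(self, event, channel):
        if event not in EVENTS:
            raise ValueError(f"unknown event: {event}")
        if channel not in CHANNELS:
            raise ValueError(f"unknown channel: {channel}")
        self.channels.setdefault(event, set()).add(channel)

    def channels_for(self, event):
        # Sorted so the result is stable regardless of insertion order.
        return sorted(self.channels.get(event, set()))
```

An event with no enabled channel returns an empty list, which corresponds to receiving no alert for that event.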
Cluster health check
From the Clusters View, you can see the health status of your cluster and its nodes at a glance.
How to Check Node Health
The quick health check of your cluster’s nodes is displayed from the Clusters View. Next to the cluster name is a summary of your nodes’ statuses, indicating the total number of nodes and how many nodes are available.
How to Check Cluster Health
The overall health of the cluster is shown in the Health row of the cluster summary, showing the number of health checks passed.
Select checks passed to see a detailed view of the cluster’s health.
How to View a Cluster’s Health Checks
The cluster’s Health Check module displays the status of the following health checks:
- Access point availability check
- Distributed query check
- Zookeeper availability check
- Zookeeper contents check
- Readonly replica check
- Delayed inserts check
To view details on what queries are used to verify the health check, select the caret for each health check.
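To give a sense of what such checks can look like, the sketch below pairs two of the check names with example ClickHouse queries and summarizes a set of results in the "checks passed" style. The queries are illustrative assumptions based on the standard `system.replicas` and `system.metrics` tables; they are not the queries Altinity.Cloud runs, which are shown in the Health Check module itself. The function name `summarize_checks` is hypothetical.

```python
# Illustrative queries only: NOT the queries used by Altinity.Cloud.
EXAMPLE_CHECK_QUERIES = {
    # Counts replicated tables that are currently in read-only mode.
    "Readonly replica check":
        "SELECT count() FROM system.replicas WHERE is_readonly",
    # Reads the current number of throttled (delayed) INSERT queries.
    "Delayed inserts check":
        "SELECT value FROM system.metrics WHERE metric = 'DelayedInserts'",
}


def summarize_checks(results):
    """Summarize results, a dict mapping check name to True (passed) or False."""
    passed = sum(1 for ok in results.values() if ok)
    failed = sorted(name for name, ok in results.items() if not ok)
    return f"{passed}/{len(results)} checks passed", failed
```

The second return value lists the failed checks by name, which is the information the detailed view exposes per check.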
Accessing cluster logs
Altinity.Cloud provides the cluster log details so users can track down specific issues or performance bottlenecks.
To access a cluster’s logs:
- From the Clusters view, select the cluster whose logs you want to view.
- Select Logs.
- From the Log Page, you can set the number of rows to display, or filter logs by specific text.
- To download the logs, select the download icon in the upper right corner (A).
- To refresh the logs page, select the refresh icon (B).
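The same row limit and text filter can be applied to a downloaded log file offline. The following is a minimal sketch, assuming the download is plain text with one entry per line and the newest entries last. The format and ordering are assumptions, and `filter_log_lines` is a hypothetical helper, not part of Altinity.Cloud.

```python
def filter_log_lines(lines, text=None, max_rows=100):
    """Return up to max_rows of the most recent lines that contain text."""
    if text:
        # Case-insensitive match is a choice made for this sketch.
        needle = text.lower()
        lines = [line for line in lines if needle in line.lower()]
    else:
        lines = list(lines)
    # Keep the tail, assuming the newest entries come last.
    return lines[-max_rows:] if max_rows > 0 else []
```

A typical use is to read the file with `open(path).read().splitlines()` and pass the result in with a search term such as an exception name.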
The following logs are available:
- ACM Logs: These logs are specific to Altinity.Cloud issues and include the following:
- System Log: Details the system actions such as starting a cluster, updating endpoints, and other details.
- API Log: Displays updates to the API and activities.
- ClickHouse Logs: Displays the Common Log that stores ClickHouse-related events. From this view, a specific host can be selected from the dropdown box.
- Backup Logs: Displays backup events from the clickhouse-backup service. Log details per cluster host can be selected from the dropdown box.
- Operator Logs: Displays logs from the Altinity Kubernetes Operator service, which manages cluster replication and communications in the Kubernetes environment.
Notifications allow you to see messages related to your Altinity.Cloud account, such as billing and service issues.
To access your notifications:
From the upper right corner of the top navigation bar, select your user ID, then Notifications.
The Notifications History page shows the notifications for your account, including the following:
- Message: The text of the notification.
- Level: The priority level which can be:
- Danger: Critical notifications that can affect your clusters or account.
- Warning: Notifications of possible issues that are less than critical.
- News: Notifications of general news and updates in Altinity.Cloud.
- Info: Updates for general information.
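When triaging many notifications, it can help to sort them by level. The sketch below assumes the levels are listed above in descending priority (Danger first, Info last). That ordering is an assumption, and the `(level, message)` tuple shape and the function `sort_by_level` are hypothetical.

```python
# Assumed priority order, most urgent first.
LEVEL_ORDER = {"Danger": 0, "Warning": 1, "News": 2, "Info": 3}


def sort_by_level(notifications):
    """Sort (level, message) tuples so the most urgent come first.

    sorted() is stable, so notifications with the same level keep
    their original relative order.
    """
    return sorted(notifications, key=lambda item: LEVEL_ORDER[item[0]])
```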