Apache Superset is a powerful and convenient way to create dashboards to display information for your organization. This presents information that allows for quick visual understand of trends, where events are happening, and keeps your team on top of the massive amounts of data coming into your organization.
When combined with ClickHouse and its capacity to analyze terabytes of information in milliseconds, the integration of Superset and ClickHouse can reduce the sea of seemingly disconnected data flowing through into an understandable whole.
Integrating Superset with ClickHouse is broken down into three major steps:
Connect Superset to ClickHouse.
Connect Superset dashboard elements to ClickHouse queries.
1 - Install Superset
Superset installation instructions
Three methods of installing Superset are provided here:
Direct installation: Directly download the Superset Python modules into an operating system.
Docker based installation: Use Docker containers to provide Superset, whether as stand alone containers or part of the Kubernetes environment.
Managed installation: Use managed services to provide Superset, and connect it to your ClickHouse.
The following instructions are based an Ubuntu 20.04 environment. For information on how to install Superset for other operating systems, see Installing Superset from Scratch.
Direct Installation Prerequisites
Before installing Superset on Ubuntu 20.04, the following packages must be installed, and will require administrative access on the machine.
Note that Superset requires Python, specifically Python 3.7.9 and above as of this document. These instructions are modified from the original to include installing python3-venv as part of the other prerequisites.
To install Superset directly into the Ubuntu 20.04 operating system:
A virtual environment is highly recommended to keep all of the Superset python requirements contained. Use the following commands to create your Superset virtual environment, activate it, then upgrade pip:
Install Superset with the following in your virtual environment. The command superset fab create-admin sets the admin user password, which will be used to login and update administrative level Superset settings:
pip install apache-superset
superset db upgrade
superset fab create-admin
Install the ClickHouse SQLAlchemy - this allows Superset to communicate with ClickHouse. There are two drivers currently in development. As of this time, we recommend the clickhouse-sqlalchemy driver. To install clickhouse-sqlalchemy, use the pip command:
pip install clickhouse-sqlalchemy
For those who want to enable TLS communications with services like Altinity.Cloud, verify the versions:
The clickhouse-driver version should be greater than 0.2.0.
The clickhouse-sqlalchemy driver should be 0.1.6 or greater.
Once Superset is installed, start it with the following command:
superset run -p 8088 --with-threads --reload --debugger
Access the Superset web interface with a browser at the host name, port 8088. For example: http://localhost:8088. The default administrative account will be admin, with the password set from the command superset fab create-admin as listed above.
For organizations that prefer Docker based installations, or want to add Superset to a Kubernetes environment along with ClickHouse on Kubernetes, Superset can be installed with a few Docker commands.
The following instructions use the existing Apache Superset docker-compose file from the Github repository.
docker 19 and above, with either docker-compose version 1.29 and above, or the built in docker compose command that’s part of Docker 20.10 and above.
Docker Based Installation Instructions
To install Superset with a Docker Container:
Download the Superset configuration files from the GitHub repository:
git clone https://github.com/apache/superset
Enter the superset directory and set the clickhouse-driver and the clickhouse-sqlalchemy version requirements:
echo"clickhouse-driver>=0.2.0" >> ./docker/requirements-local.txt
echo"clickhouse-sqlalchemy>=0.1.6" >> ./docker/requirements-local.txt
Run either docker-compose or docker compose to download the required images and start the Superset Docker containers:
docker-compose -f docker-compose-non-dev.yml up
Starting Superset From Docker
Installing and starting Superset from Docker is the same docker-compose or docker compose installation command:
docker-compose -f docker-compose-non-dev.yml up
Access the Superset web interface with a browser at the server’s host name or IP address on port 8088. For example: http://localhost:8088. The default administrative account will be admin, with the password admin.
For organizations that prefer managed services, Preset Cloud offers Superset with with clickhouse-sqlalchemy. This can then be connected to an Altinity.Cloud account.
Production Installation Tips
The examples provided above are useful for development and testing environments. For full production environments installation and deployments, refer to the Superset Documentation site.
Some recommendations for organizations that want to install Superset in a production environment:
Replace the default SQLite database used to store the Superset settings with something more robust like PostgreSQL or MySQL. For an example of the process, see the article Migrating Superset to Postgres.
After Superset has been setup and the ClickHouse SQLAlchemy drivers installed, the two systems can be connected together. The following details how to connect Superset to an existing ClickHouse database.
The SQLAlchemy format uses the following elements:
SQLAlchemy Connection Type: Sets the driver to use to connect.
username: The database account to use for authentication.
password: The password for the database account being used.
url: The full hostname to the database. This does not include tags such as HTTP or HTTPS - just the host name.
options: Any additional options. For example, if the database connection is encrypted, then the option secure=true will be required. For more information on setting up ClickHouse network encryption, see Network Hardening.
Select Test Connection to verify the connection.
If the connection works, then click Add.
Once finished, the new database connection to ClickHouse will be available for use.
verify: Default to true. If true, then the certificate used for encryption must be validated. Set to false if using a self-signed certificate. For more information on ClickHouse certificates, see Network Hardening.
What ports are used for the Superset to ClickHouse Connection?
The Superset connection to ClickHouse uses the TCP connection over the following ports:
If using the encrypted connection, then the option secure=yes is required.
3 - Create Charts from ClickHouse Data
Build stunning dashboards populated by ClickHouse data.
Once Superset has been installed and connected to ClickHouse, new charts can be created that draw their visualizations from the ClickHouse data.
To change the Virtual Dataset to use, select the … in the upper right under the Dataset heading, then select Change Dataset.
Select the Virtual Dataset to use, then click Proceed to acknowledge that changing the chart’s dataset may break the chart.
Set the following required options:
Visualization Type: How the data will be displayed in the chart (A).
Name: This is above the chart. Select it once, then set the name (B).
Depending on the Visualization Type selected, complete the other options.
When finished, select + Save to save the chart.
Provide the chart name.
Select a Dashboard to add the chart to, or enter a new Dashboard name and a title.
Select Save when finished, or Save and Go To Dashboard to view the new chart.
Adding Charts FAQ
When I Add a Deck.gl Chart, I Get the Message ‘No Token Warning’. How do I fix it?
The NO_TOKEN_WARNING when using charts such as deck.gl Arc occurs when there is no API token to a map service for generating the map used with the geographic data. In this example, the geographic information provides flight information, but there is no token for the mapbox service to retrieve the map data.
A token can be retrieved from mapbox after completing their registration process. After obtaining the token, place it in the file superset_config.py with the variable MAPBOX_API_KEY. For example:
For Docker based installations, this is in the directory docker/pythonpath_dev. Once added, restart Superset: