Configuring Data Lake Catalogs
The ACM makes it easy to work with data lakes in ClickHouse®. Clicking the Data Lake Catalogs menu item lets you create a data lake in Altinity.Cloud or connect to an AWS Glue data lake.
Defining a database with a DataLakeCatalog database engine
The first time you try to create a data lake, you’ll need to set the allow_experimental_database_iceberg property:
Figure 1 - Iceberg database support not enabled
Clicking the button enables the property, although it may take a short while:
Figure 2 - Iceberg database support being enabled
You can click the button until support is enabled. Then you’ll see this dialog:
Figure 3 - Creating a database with a DataLakeCatalog engine
In this example, we’re creating a new database named sales with a DataLakeCatalog engine. The text area at the bottom of the dialog changes to reflect your choices. Click the button to create your new database. The connection succeeds, but as Figure 4 shows, the database has no tables yet:
Figure 4 - Database is connected, but has no tables
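For readers who want to see roughly what the dialog builds, here is a minimal Python sketch that assembles a comparable pair of statements. It is an assumption-laden illustration, not the ACM's actual output: the catalog URL, the warehouse name, the choice of catalog_type = 'rest', and the use of a session-level SET to enable the property are all hypothetical stand-ins. The text area at the bottom of the dialog remains the authoritative source.

```python
def sql_quote(value: str) -> str:
    """Return value as a single-quoted SQL string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def iceberg_database_sql(db_name: str, catalog_url: str, warehouse: str) -> str:
    """Build statements that enable Iceberg support and create the database."""
    return "\n".join([
        # The property named in the text; a session-level SET is an assumption.
        "SET allow_experimental_database_iceberg = 1;",
        f"CREATE DATABASE {db_name}",
        f"ENGINE = DataLakeCatalog({sql_quote(catalog_url)})",
        # catalog_type and warehouse values are assumed for illustration.
        f"SETTINGS catalog_type = 'rest', warehouse = {sql_quote(warehouse)};",
    ])


# Hypothetical catalog URL and warehouse name.
print(iceberg_database_sql("sales", "https://catalog.example.com/v1", "sales_warehouse"))
```

Comparing the printed text with what the dialog shows is a quick way to see which settings the ACM fills in for you.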
Connecting to an AWS Glue data lake
The first time you try to work with an AWS Glue data lake, you’ll need to set the allow_experimental_database_glue_catalog property:
Figure 5 - Glue catalog support not enabled
Clicking the button enables the property, although it may take a short while:
Figure 6 - Glue catalog support being enabled
You can click the button until support is enabled. Then you’ll see this dialog:
Figure 7 - Connecting to an AWS Glue data lake
In Figure 7, we’re connecting to an AWS Glue data lake and creating a database named sales that will give us access to the Glue catalog. In addition to the ClickHouse database name, select an AWS region and enter your access key and secret key. The text area at the bottom of the dialog changes to reflect your choices.
Click the button to create the connection and database. The ACM uses the metadata in the Glue catalog to create new tables; querying those ClickHouse tables returns results from the Glue catalog. When the connection is complete, you’ll see a success message and a list of tables in your database:
Figure 8 - The tables created from the connection to the Glue catalog
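The region, access key, and secret key entered in the dialog map naturally onto settings of a DataLakeCatalog database. The Python sketch below assembles a statement of that shape. Treat it as an approximation under stated assumptions: the setting names (catalog_type, region, aws_access_key_id, aws_secret_access_key) follow the ClickHouse Glue catalog documentation as we understand it, the session-level SET is a stand-in for however the ACM enables the property, and the credential values are placeholders.

```python
def sql_quote(value: str) -> str:
    """Return value as a single-quoted SQL string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def glue_database_sql(db_name: str, region: str,
                      access_key_id: str, secret_access_key: str) -> str:
    """Build statements that enable Glue support and create the database."""
    return "\n".join([
        # The property named in the text; a session-level SET is an assumption.
        "SET allow_experimental_database_glue_catalog = 1;",
        f"CREATE DATABASE {db_name}",
        "ENGINE = DataLakeCatalog",
        "SETTINGS catalog_type = 'glue',",
        f"    region = {sql_quote(region)},",
        f"    aws_access_key_id = {sql_quote(access_key_id)},",
        f"    aws_secret_access_key = {sql_quote(secret_access_key)};",
    ])


# Placeholder region and credentials; never hard-code real keys.
print(glue_database_sql("sales", "us-east-1", "<access-key>", "<secret-key>"))
```

In practice you would read the keys from a secrets store or environment variables rather than pass literals.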
Going to the Schema tab of the Cluster Explorer shows the tables created in the sales database based on the metadata in the Glue catalog:
Figure 9 - Tables available through the Glue catalog
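Once the tables appear, you can query them like any other ClickHouse table. The sketch below builds a listing statement and a preview query in Python. One assumption to note: catalog-backed tables are often exposed with a dotted namespace.table name, which has to be wrapped in backticks. The table name retail.orders is hypothetical; substitute a name from your own Schema tab.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, rejecting characters we do not escape."""
    if "`" in name or "\\" in name:
        raise ValueError(f"unsupported character in identifier: {name!r}")
    return f"`{name}`"


def preview_query(database: str, table: str, limit: int = 5) -> str:
    """Build a SELECT that previews the first rows of a catalog table."""
    return (f"SELECT * FROM {quote_identifier(database)}."
            f"{quote_identifier(table)} LIMIT {int(limit)};")


# List the tables in the database, then preview one (hypothetical) table.
print(f"SHOW TABLES FROM {quote_identifier('sales')};")
print(preview_query("sales", "retail.orders"))
```

Quoting the whole dotted name as one identifier keeps ClickHouse from reading retail as a database name.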