Using OAuth with ClickHouse®
Altinity’s Project Antalya supports authenticating users via OAuth tokens issued by an external Identity Provider (IdP) such as Keycloak, Microsoft Entra ID, or any service that supports OpenID Connect (OIDC). This allows you to manage users at an organizational level instead of defining them in ClickHouse and other systems. OAuth support first shipped in Altinity Antalya 26.1, and is officially GA as of Altinity Antalya 26.3.
The power of OAuth
Managing access to ClickHouse with OAuth has substantial advantages over a static username and password:
- Tokens that expire - Even if a credential is compromised, its lifespan is short, so the potential for damage is limited.
- Fine-grained access control - Given the claims in an access token, ClickHouse can determine exactly what roles and grants apply.
- Audit trail - An OAuth server provides a full audit trail of every token issued to a service account.
When someone or something authenticates with an IdP, they get three tokens:
- An access token that tells ClickHouse the roles held by anyone who has this token
- An ID token that identifies a user or application
- A refresh token that ClickHouse can use to refresh an access token.
A ClickHouse server configured to use OAuth manages these tokens for you. Typically, a refresh token has a much longer lifecycle than an access token. If the access token has expired, ClickHouse can send the refresh token to the IdP to get a new one. If the refresh token has expired as well, the user or application has to reauthenticate with the IdP.
Most importantly, given the claims in an access token, ClickHouse can determine what roles and grants the holder of that access token should have. You have the simplicity of managing users with OAuth without giving up the full power of ClickHouse’s RBAC.
That’s as much depth as we’ll go into with OAuth. If you’d like to learn about the underlying technology, we recommend Aaron Parecki's excellent tutorial, OAuth 2 Simplified.
ClickHouse and LLMs
We’ll get into the details in a minute, but OAuth is a powerful tool for managing access permissions for an LLM. From a ClickHouse server’s perspective, an LLM querying ClickHouse looks like any other non-human request. With OAuth in the picture, the LLM authenticates with a client ID and secret, then receives an access token in return. That access token tells ClickHouse exactly what access the LLM has. No matter what someone might ask the LLM to do, it can only see what its access token allows. Nothing more.
Access control enforces itself automatically when the LLM asks a question, without any application-level logic. The LLM can’t work around it by asking a question a different way or trying a different query — if the access token doesn’t have the role, ClickHouse gives the LLM a firm NO.
Enabling token authentication
You don’t need to do anything here; token authentication is enabled by default. However, you can disable it in config.xml if you want:
<enable_token_auth>0</enable_token_auth>
When disabled, token processors are not parsed, TokenAccessStorage is not available, and authentication via tokens (--jwt option or Authorization: Bearer header) is rejected.
Defining token processors
To use token authentication, you need to add a token_processors section to config.xml and define at least one token processor in it. The parameters that define a token processor are different for different token processor types.
Common parameters
type- The type of token processor. Supported values arejwt_static_key,jwt_static_jwks,jwt_dynamic_jwks,azure, andopenid. Mandatory, case-insensitive.token_cache_lifetime- The maximum lifetime of a cached token in seconds. Optional, default:3600.username_claim- The name of the claim (field) that will be treated as the ClickHouse username. Optional, default:sub.groups_claim- The name of the claim (field) that contains list of groups user belongs to. This claim will be looked up in the token itself (in case token is a valid JWT, e.g. in Keycloak) or in the response from/userinfo. Optional, default:groups.
To reduce the number of requests to the IdP, tokens are cached internally for a maximum period of token_cache_lifetime seconds. If a token expires sooner than token_cache_lifetime, then the cache entry for this token will only be valid while the token is valid. If the token lifetime is longer than token_cache_lifetime, cache entry for this token will be valid for token_cache_lifetime.
There are additional specific parameters for each token processor type, some of which are mandatory. Any parameters that are not required for the current processor type are ignored.
JWT (JSON Web Token)
JWT itself is a source of information about a user. It is decoded locally and its integrity is verified using either a local static key or a JWKS (JSON Web Key Set) that can be local or remote.
JWT with a local static key
<clickhouse>
<token_processors>
<my_static_key_validator>
<type>jwt_static_key</type>
<algo>HS256</algo>
<static_key>my_static_secret</static_key>
</my_static_key_validator>
</token_processors>
</clickhouse>
Parameters:
algo- The algorithm for signature validation. Mandatory. Supported values:
| HMAC | RSA | ECDSA | PSS | EdDSA |
|---|---|---|---|---|
| HS256 | RS256 | ES256 | PS256 | Ed25519 |
| HS384 | RS384 | ES384 | PS384 | Ed448 |
| HS512 | RS512 | ES512 | PS512 | |
| ES256K |
None is also supported, but it is not recommended and must NEVER be used in production.
claims- A string containing a JSON object that should be contained in the token payload. If this parameter is defined, a token without a corresponding payload will be considered invalid. Optional.static_key- The key for symmetric algorithms. Mandatory forHS*family algorithms.static_key_in_base64- Indicates if thestatic_keykey is base64-encoded. Optional, default:False.public_key- The public key for asymmetric algorithms. Mandatory except forHS*family algorithms andNone.private_key- The private key for asymmetric algorithms. Optional.public_key_password- The public key password. Optional.private_key_password- The private key password. Optional.expected_issuer- The expected value of theiss(issuer) claim in the JWT. If specified, tokens with a different issuer will be rejected. Optional.expected_audience- The expected value of theaud(audience) claim in the JWT. If specified, tokens with a different audience will be rejected. Optional.allow_no_expiration- Iftrue, tokens without theexp(expiration) claim are accepted. Otherwise they are rejected. Optional, default:false.
JWT with a static JWKS
<clickhouse>
<token_processors>
<my_static_jwks_validator>
<type>jwt_static_jwks</type>
<static_jwks>{"keys": [{"kty": "RSA", "alg": "RS256", "kid": "mykid", "n": "_public_key_mod_", "e": "AQAB"}]}</static_jwks>
</my_static_jwks_validator>
</token_processors>
</clickhouse>
Parameters:
static_jwks- The content of the JWKS in JSONstatic_jwks_file- The path to a file with the JWKSclaims- A string containing a JSON object that should be contained in the token payload. If this parameter is defined, a token without the corresponding payload will be considered invalid. Optional.verifier_leeway- Specifies the clock skew tolerance in seconds. Useful for handling small differences in system clocks between ClickHouse and the token issuer. Optional.expected_issuer- The expected value of theiss(issuer) claim in the JWT. If specified, tokens with a different issuer will be rejected. Optional.expected_audience- The expected value of theaud(audience) claim in the JWT. If specified, tokens with a different audience will be rejected. Optional.allow_no_expiration- Iftrue, tokens without theexp(expiration) claim are accepted. Otherwise they are rejected. Optional, default:false.
NOTES:
- Only one of
static_jwksorstatic_jwks_filekeys must be present in a verifier - For JWKS-based validators (
jwt_static_jwksandjwt_dynamic_jwks), RS* and ES* family algorithms are supported.
JWT with remote JWKS
<clickhouse>
<token_processors>
<basic_auth_server>
<type>jwt_dynamic_jwks</type>
<jwks_uri>http://localhost:8000/.well-known/jwks.json</jwks_uri>
<jwks_cache_lifetime>3600</jwks_cache_lifetime>
</basic_auth_server>
</token_processors>
</clickhouse>
Parameters:
jwks_uri- The JWKS endpoint. Mandatory.jwks_cache_lifetime- The time period in seconds between resend requests for refreshing the JWKS. Optional, default:3600.claims- A string containing a JSON object that should be contained in the token payload. If this parameter is defined, a token without a corresponding payload will be considered invalid. Optional.verifier_leeway- Specifies the clock skew tolerance in seconds. Useful for handling small differences in system clocks between ClickHouse and the token issuer. Optional.expected_issuer- The expected value of theiss(issuer) claim in the JWT. If specified, tokens with a different issuer will be rejected. Optional.expected_audience- The expected value of theaud(audience) claim in the JWT. If specified, tokens with a different audience will be rejected. Optional.allow_no_expiration- Iftrue, tokens without theexp(expiration) claim are accepted. Otherwise they are rejected. Optional, default:false.
Processors with external providers
Some tokens cannot be decoded and validated locally; Antalya works with external service providers in that case. Currently supported provider types are azure and OpenID (a generic type).
Azure
<clickhouse>
<token_processors>
<azure>
<type>azure</type>
<token_cache_lifetime>60</token_cache_lifetime>
</azure>
</token_processors>
<user_directories>
<token>
<processor>azuure</processor>
<common_roles>
<token_test_role_1 />
</common_roles>
</token>
</user_directories>
</clickhouse>
No additional parameters are required.
OpenID
<clickhouse>
<token_processors>
<keycloak>
<type>OpenID</type>
<userinfo_endpoint>http://keycloak:8080/realms/grafana/protocol/openid-connect/userinfo</userinfo_endpoint>
<token_introspection_endpoint>http://keycloak:8080/realms/grafana/protocol/openid-connect/token/introspect</token_introspection_endpoint>
<token_cache_lifetime>60</token_cache_lifetime>
</keycloak>
</token_processors>
<user_directories>
<token>
<processor>keycloak</processor>
<common_roles>
<reader_role />
</common_roles>
<roles_transform>s/-/_/g</roles_transform>
</token>
</user_directories>
</clickhouse>
NOTE: Either configuration_endpoint or both userinfo_endpoint and token_introspection_endpoint (and, optionally, jwks_uri) shall be set. If none of them are set or all three are set, this is an invalid configuration that will not be parsed.
Parameters:
configuration_endpoint- The URI of the OpenID configuration (often ends with.well-known/openid-configuration).userinfo_endpoint- The URI of the endpoint that returns user information in exchange for a valid token.token_introspection_endpoint- The URI of the token introspection endpoint (returns information about a valid token).jwks_uri- The URI of the OpenID configuration (often ends with.well-known/jwks.json)jwks_cache_lifetime- The time period in seconds between resend requests for refreshing the JWKS. Optional, default:3600.verifier_leeway- Specifies the clock skew tolerance in seconds. Useful for handling small differences in system clocks between ClickHouse and the token issuer. Optional, default:60expected_issuer- The expected value of theiss(issuer) claim in the JWT. If specified, tokens with a different issuer will be rejected. Optional.expected_audience- The expected value of theaud(audience) claim in the JWT. If specified, tokens with a different audience will be rejected. Optional.allow_no_expiration- Iftrue, tokens without theexp(expiration) claim are accepted. Otherwise they are rejected. Optional, default:false.
Sometimes a token is a valid JWT. In that case, the token will be decoded and validated locally if the configuration endpoint returns a JWKS URI (or jwks_uri is specified alongside userinfo_endpoint and token_introspection_endpoint).
Enabling token authentication for a user
Enabling token authentication with SQL is simple:
CREATE USER my_user IDENTIFIED WITH jwt;
You can also enable token authentication for a user in users.xml. Specify jwt section instead of password or other similar sections in the user definition.
Parameters:
claims- An optional string containing a json object that should be contained in the token payload.
Example (goes into users.xml):
<clickhouse>
<my_user>
<jwt>
<claims>{"resource_access":{"account": {"roles": ["view-profile"]}}}</claims>
</jwt>
</my_user>
</clickhouse>
Here, the JWT payload must contain ["view-profile"] on path resource_access.account.roles, otherwise authentication will not succeed even with a valid JWT.
NOTE: Per-user claims are enforced only when the token is a JWT (validated by a JWT processor such as jwt_static_key or jwt_dynamic_jwks). When the user authenticates with an opaque (access) token (e.g. via Azure, OpenID, or Google token processors), claims are not checked and authentication succeeds if the token is otherwise valid.
{
. . .
"resource_access": {
"account": {
"roles": ["view-profile"]
}
},
. . .
}
NOTE: A user definition cannot have JWT authentication together with any other authentication method. The presence of any other sections such as password alongside jwt will force ClickHouse to shut down.
Using an IdP as an external user directory
If there is no suitable user pre-defined in ClickHouse, authentication is still possible: Identity Provider can be used as source of user information. To allow this, add token section to the user_directories section of the config.xml file.
At each login attempt, ClickHouse tries to find the user definition locally and authenticate it as usual. If a token is provided but the user is not defined, ClickHouse will treat the user as externally defined and will try to validate the token and get user information from the specified processor. If validated successfully, the user will be considered existing and authenticated. The user will be assigned roles from the list specified in the roles section. All this implies that the SQL-driven Access Control and Account Management is enabled and roles are created using the CREATE ROLE statement.
Example
<clickhouse>
<user_directories>
<token>
<processor>token_processor_name</processor>
<common_roles>
<token_test_role_1 />
</common_roles>
<default_profile>my_profile</default_profile>
<roles_filter>
\bclickhouse-[a-zA-Z0-9]+\b
</roles_filter>
<roles_transform>s/-/_/g</roles_transform>
</token>
</user_directories>
</clickhouse>
NOTE: For now, no more than one token section can be defined inside user_directories. This may change in the future.
Parameters
processor— The name of one of the processors defined intoken_processorsconfig section described above. This parameter is mandatory.common_roles— A list of locally defined roles that will be assigned to each user retrieved from the IdP. Optional.default_profile— The name of a locally defined settings profile that will be assigned to each user retrieved from the IdP. If the profile does not exist, a warning will be logged and the user will be created without a profile. Optional.roles_filter— A regex string for groups filtering. Only groups matching this regex will be mapped to roles. Optional.roles_transform— A sed-style transform pattern to apply to group names before mapping to roles. Format:s/pattern/replacement/flags. Thegflag applies the replacement globally (all occurrences). In the example above,s/-/_/gconvertsclickhouse-grp-dbatoclickhouse_grp_dba. Optional.