You are viewing documentation for Looker 24.8.

Databricks

Encrypting network traffic

It is a best practice to encrypt network traffic between the Looker application and your database. Consider one of the options described on the Enabling secure database access documentation page.

Create a Looker user

Looker authenticates to Databricks with a personal access token. Follow the Databricks documentation to create a personal access token for the Databricks user that Looker will use.

Add permissions to this user with GRANT.

At a minimum, the Looker user needs the SELECT and READ_METADATA privileges.

GRANT SELECT ON DATABASE <YOUR_DATABASE> TO `<looker>@<your.databricks.com>`
GRANT READ_METADATA ON DATABASE <YOUR_DATABASE> TO `<looker>@<your.databricks.com>`

Server information

Follow the Databricks documentation to find the HTTP Path for your Databricks cluster. This will be referred to as <YOUR_HTTP_PATH> on this page.

Setting up persistent derived tables

To use persistent derived tables, create a separate database.

CREATE DATABASE <YOUR_SCRATCH_DATABASE>

The Looker user also needs write privileges on this database.

GRANT SELECT, CREATE, MODIFY ON DATABASE <YOUR_SCRATCH_DATABASE> TO `<looker>@<your.databricks.com>`
GRANT READ_METADATA ON DATABASE <YOUR_SCRATCH_DATABASE> TO `<looker>@<your.databricks.com>`
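
The statements from the two sections above can be generated for a given user with a small script. The following is a minimal sketch, not part of Looker or Databricks: the function names grant_statement and looker_setup_statements are hypothetical, and the comma-separated privilege list is an assumption based on Databricks SQL GRANT syntax. The placeholders from this page are passed through unchanged.

```python
from typing import List, Sequence


def grant_statement(privileges: Sequence[str], database: str, principal: str) -> str:
    """Build one GRANT statement for a database-level privilege list."""
    if not privileges:
        raise ValueError("at least one privilege is required")
    if "`" in principal:
        raise ValueError("principal must not contain a backtick")
    # Assumption: multiple privileges are separated by commas.
    privilege_list = ", ".join(privileges)
    return f"GRANT {privilege_list} ON DATABASE {database} TO `{principal}`"


def looker_setup_statements(
    database: str, scratch_database: str, principal: str
) -> List[str]:
    """Return the statements described on this page, in order, for one user."""
    return [
        # Read access to the database that Looker queries.
        grant_statement(["SELECT"], database, principal),
        grant_statement(["READ_METADATA"], database, principal),
        # Separate database for persistent derived tables, with write access.
        f"CREATE DATABASE {scratch_database}",
        grant_statement(["SELECT", "CREATE", "MODIFY"], scratch_database, principal),
        grant_statement(["READ_METADATA"], scratch_database, principal),
    ]


if __name__ == "__main__":
    for statement in looker_setup_statements(
        "<YOUR_DATABASE>",
        "<YOUR_SCRATCH_DATABASE>",
        "<looker>@<your.databricks.com>",
    ):
        print(statement)
```

Review the printed statements before running them in Databricks. The script only builds strings and does not connect to any database.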

Creating the Looker connection to your database

In the Admin section of Looker, select Connections, and then click Add Connection.

Fill out the connection details. The majority of the settings are common to most database dialects. See the Connecting Looker to your database documentation page for information. Some of the settings are described next:

Name: Specify the name of the connection. This is how you will refer to the connection in LookML projects.
Dialect: Specify the dialect Databricks.
Host: Specify the hostname.
Port: Specify the database port. The default is 443.
Database: Specify the database name. The default is default.
Username: Enter the value token (do not enter the Databricks user email in this field).
Password: Enter the personal access token created earlier.
Enable PDTs: Use this toggle to enable persistent derived tables. When PDTs are enabled, the Connection window reveals additional PDT settings and the PDT Overrides section.
Temp Database: Enter the database you would like to use to store PDTs.
Max number of PDT builder connections: Specify the number of possible concurrent PDT builds on this connection. Setting this value too high could negatively impact query times. For more information, see the Connecting Looker to your database documentation page.
Additional JDBC parameters: Add any additional Spark JDBC parameters.

Note: The following parameters are required: transportMode=http;httpPath=<YOUR_HTTP_PATH>
Datagroup and PDT Maintenance Schedule: A cron expression that indicates when Looker should check datagroups and persistent derived tables. Read more about this setting in the Datagroup and PDT Maintenance Schedule documentation.

Warning: By default, the Datagroup and PDT Maintenance Schedule runs every five minutes. If you keep the default value, Looker will query your Databricks database every five minutes. This frequency can prevent Databricks clusters from shutting down, which may cause unexpected costs. Consider changing the schedule to a less frequent interval to avoid these costs.
SSL: Check to use SSL connections.
Verify SSL: Check to enforce strict SSL certificate verification.
Max connections per node: You can leave this setting at the default value initially. Read more about this setting in the Max connections per node section of the Connecting Looker to your database documentation page.
Connection Pool Timeout: You can leave this setting at the default value initially. Read more about this setting in the Connection Pool Timeout section of the Connecting Looker to your database documentation page.
SQL Runner Precache: To cause SQL Runner not to preload table information and to load table information only when a table is selected, uncheck this option. Read more about this setting in the SQL Runner Precache section of the Connecting Looker to your database documentation page.
Database Time Zone: Specify the time zone used in the database. Leave this field blank if you do not want time zone conversion. See the Using time zone settings documentation page for more information.
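
The required value for the Additional JDBC parameters field in the list above is a semicolon-separated list of key=value pairs. The following is a minimal sketch of assembling that string, assuming a hypothetical helper named build_jdbc_params (it is not a Looker or Databricks API). The extra parameter in the usage example is a made-up placeholder, not a real driver option.

```python
from typing import Dict, Optional


def build_jdbc_params(http_path: str, extra: Optional[Dict[str, str]] = None) -> str:
    """Join the required parameters and any extras into one ';'-separated string."""
    if not http_path:
        raise ValueError("http_path is required")
    # The two parameters this page lists as required.
    params = {"transportMode": "http", "httpPath": http_path}
    for key, value in (extra or {}).items():
        if key in params:
            raise ValueError(f"{key} is set automatically and cannot be overridden")
        if ";" in key or ";" in value or "=" in key:
            raise ValueError(f"invalid character in parameter {key!r}")
        params[key] = value
    # Dictionaries preserve insertion order, so required parameters come first.
    return ";".join(f"{key}={value}" for key, value in params.items())


if __name__ == "__main__":
    # Only the required parameters.
    print(build_jdbc_params("<YOUR_HTTP_PATH>"))
    # With one hypothetical extra parameter, for illustration only.
    print(build_jdbc_params("<YOUR_HTTP_PATH>", {"exampleParam": "1"}))
```

The first call prints transportMode=http;httpPath=<YOUR_HTTP_PATH>, which matches the required value noted in the list above.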

Click Test to test the connection and make sure that it is configured correctly. If you see Can Connect, then press Connect. This runs the rest of the connection tests to verify that the service account was set up correctly and with the proper roles. See the Testing database connectivity documentation page for troubleshooting information.
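
To make the Datagroup and PDT Maintenance Schedule warning in the settings list concrete, it can help to count how often a given cron expression would make Looker contact Databricks each day. The following is a rough sketch under stated assumptions: the helper checks_per_day is hypothetical, it understands only a small subset of cron syntax, and it assumes that a five-minute schedule is written as */5 * * * *.

```python
def _count_matches(field: str, size: int) -> int:
    """Count the values in range(size) matched by one simplified cron field."""
    if field == "*":
        return size
    if field.startswith("*/"):
        step = int(field[2:])
        if step <= 0:
            raise ValueError(f"invalid step in {field!r}")
        return len(range(0, size, step))
    # Otherwise expect a single number or a comma-separated list of numbers.
    values = {int(part) for part in field.split(",")}
    if any(value < 0 or value >= size for value in values):
        raise ValueError(f"value out of range in {field!r}")
    return len(values)


def checks_per_day(cron_expression: str) -> int:
    """Return how many times per day a simple five-field cron expression fires.

    Supported minute and hour fields: '*', '*/N', 'N', or 'N,M,...'.
    Day-of-month, month, and day-of-week must all be '*'.
    """
    fields = cron_expression.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated cron fields")
    minute, hour, day_of_month, month, day_of_week = fields
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("this sketch only handles schedules that run every day")
    return _count_matches(minute, 60) * _count_matches(hour, 24)


if __name__ == "__main__":
    for expression in ("*/5 * * * *", "0 * * * *", "0 */6 * * *"):
        print(expression, "->", checks_per_day(expression), "checks per day")
```

Under these assumptions, a five-minute schedule results in 288 checks per day, while an hourly schedule results in 24. A cluster that shuts down after a period of inactivity has far more opportunity to do so with the hourly schedule.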

Looker functionality with Databricks Unity Catalog

For Looker connections to a Databricks database with Unity Catalog enabled, most Looker functionality will access schemas from the default catalog only, such as in the following scenarios:

When generating a new LookML project from database schema, Looker will create the project files based on the tables in the Unity Catalog default catalog.
For existing projects, when using the Looker IDE to create a view from a table, Looker can create view files only from the tables in the Unity Catalog default catalog.
When using SQL Runner, you can select only schemas from the Unity Catalog default catalog.

Feature support

For Looker to support some features, your database dialect must also support them.

Databricks supports the following features as of Looker 24.8:

Feature	Supported?
Support Level	Supported
Looker (Google Cloud core)	Yes
Symmetric Aggregates	Yes
Derived Tables	Yes
Persistent SQL Derived Tables	Yes
Persistent Native Derived Tables	Yes
Stable Views	Yes
Query Killing	Yes
SQL-based Pivots	Yes
Timezones	Yes
SSL	Yes
Subtotals	Yes
JDBC Additional Params	Yes
Case Sensitive	Yes
Location Type	Yes
List Type	Yes
Percentile	Yes
Distinct Percentile	No
SQL Runner Show Processes	No
SQL Runner Describe Table	Yes
SQL Runner Show Indexes	No
SQL Runner Select 10	Yes
SQL Runner Count	Yes
SQL Explain	Yes
Oauth Credentials	No
Context Comments	Yes
Connection Pooling	No
HLL Sketches	No
Aggregate Awareness	Yes
Incremental PDTs	Yes
Milliseconds	Yes
Microseconds	Yes
Materialized Views	No
Approximate Count Distinct	No