Databricks

Encrypting network traffic

It is a best practice to encrypt network traffic between the Looker application and your database. Consider one of the options described on the Enabling secure database access documentation page.

Create a Looker user

Looker authenticates to Databricks by personal access tokens. Follow the Databricks documentation to create a personal access token for a Databricks user to use in Looker.

Add permissions to this user with GRANT.

At a minimum, the Looker user should have SELECT and READ_METADATA.

GRANT SELECT ON DATABASE <YOUR_DATABASE> TO `<looker>@<your.databricks.com>`
GRANT READ_METADATA ON DATABASE <YOUR_DATABASE> TO `<looker>@<your.databricks.com>`

Server information

Follow the Databricks documentation to find the HTTP Path for your Databricks cluster. This will be referred to as <YOUR_HTTP_PATH> on this page.

Setting up persistent derived tables

To use persistent derived tables, create a separate database.

CREATE DATABASE <YOUR_SCRATCH_DATABASE>

This will also require additional write-based user permissions to be granted.

GRANT SELECT CREATE MODIFY ON DATABASE <YOUR_SCRATCH_DATABASE> TO `<looker>@<your.databricks.com>`
GRANT READ_METADATA ON DATABASE <YOUR_SCRATCH_DATABASE> TO `<looker>@<your.databricks.com>`

Creating the Looker connection to your database

In the Admin section of Looker, select Connections, and then click Add Connection.

Fill out the connection details. The majority of the settings are common to most database dialects. See the Connecting Looker to your database documentation page for information. Some of the settings are described next:

  • Name: Specify the name of the connection. This is how you will refer to the connection in LookML projects.
  • Dialect: Specify the dialect Databricks.
  • Host: Specify the hostname.
  • Port: Specify the database port. The default is 443.
  • Database: Specify the database name. The default is default.
  • Username: Enter the value token (do not enter the Databricks user email in this field).
  • Password: Enter the personal access token created earlier.
  • Enable PDTs: Use this toggle to enable persistent derived tables. When PDTs are enabled, the Connection window reveals additional PDT settings and the PDT Overrides section.
  • Temp Database: Enter the database you would like to use to store PDTs.
  • Max number of PDT builder connections: Specify the number of possible concurrent PDT builds on this connection. Setting this value too high could negatively impact query times. For more information, see the Connecting Looker to your database documentation page.
  • Additional JDBC parameters: Add any additional Spark JDBC parameters.

  • Datagroup and PDT Maintenance Schedule: A cron expression that indicates when Looker should check datagroups and persistent derived tables. Read more about this setting in the Datagroup and PDT Maintenance Schedule documentation.

  • SSL: Check to use SSL connections.

  • Verify SSL: Check to enforce strict SSL certificate verification.

  • Max connections per node: You can leave this setting at the default value initially. Read more about this setting in the Max connections per node section of the Connecting Looker to your database documentation page.

  • Connection Pool Timeout: You can leave this setting at the default value initially. Read more about this setting in the Connection Pool Timeout section of the Connecting Looker to your database documentation page.

  • SQL Runner Precache: To cause SQL Runner not to preload table information and to load table information only when a table is selected, uncheck this option. Read more about this setting in the SQL Runner Precache section of the Connecting Looker to your database documentation page.

  • Database Time Zone: Specify the time zone used in the database. Leave this field blank if you do not want time zone conversion. See the Using time zone settings documentation page for more information.

Click Test to test the connection and make sure that it is configured correctly. If you see Can Connect, then press Connect. This runs the rest of the connection tests to verify that the service account was set up correctly and with the proper roles. See the Testing database connectivity documentation page for troubleshooting information.

Looker functionality with Databricks Unity Catalog

For Looker connections to a Databricks database with Unity Catalog enabled, most Looker functionality will access schemas from the default catalog only, such as in the following scenarios:

  • When generating a new LookML project from database schema, Looker will create the project files based on the tables in the Unity Catalog default catalog.
  • For existing projects, when using the Looker IDE to creating a view from a table, Looker can create view files only from the tables in the Unity Catalog default catalog.
  • When using SQL Runner, you can select only schemas from the Unity Catalog default catalog.

Feature support

For Looker to support some features, your database dialect must also support them.

Databricks supports the following features as of Looker 24.8:

Feature Supported?
Support Level
Supported
Looker (Google Cloud core)
Yes
Symmetric Aggregates
Yes
Derived Tables
Yes
Persistent SQL Derived Tables
Yes
Persistent Native Derived Tables
Yes
Stable Views
Yes
Query Killing
Yes
SQL-based Pivots
Yes
Timezones
Yes
SSL
Yes
Subtotals
Yes
JDBC Additional Params
Yes
Case Sensitive
Yes
Location Type
Yes
List Type
Yes
Percentile
Yes
Distinct Percentile
No
SQL Runner Show Processes
No
SQL Runner Describe Table
Yes
SQL Runner Show Indexes
No
SQL Runner Select 10
Yes
SQL Runner Count
Yes
SQL Explain
Yes
Oauth Credentials
No
Context Comments
Yes
Connection Pooling
No
HLL Sketches
No
Aggregate Awareness
Yes
Incremental PDTs
Yes
Milliseconds
Yes
Microseconds
Yes
Materialized Views
No
Approximate Count Distinct
No