Databricks

Create a Looker user

Looker authenticates to Databricks with a personal access token. Follow the Databricks documentation to create a personal access token for the Databricks user that Looker will use.

Add permissions to this user with GRANT.

At a minimum, the Looker user should have SELECT and READ_METADATA.

GRANT SELECT ON DATABASE <YOUR_DATABASE> TO `<looker>@<your.databricks.com>`
GRANT READ_METADATA ON DATABASE <YOUR_DATABASE> TO `<looker>@<your.databricks.com>`

Server information

Follow the Databricks documentation to find the HTTP Path for your Databricks cluster. This will be referred to as <YOUR_HTTP_PATH> on this page.

Setting up persistent derived tables

To use persistent derived tables, create a separate database.

CREATE DATABASE <YOUR_SCRATCH_DATABASE>

The Looker user also needs write permissions on this database.

GRANT SELECT, CREATE, MODIFY ON DATABASE <YOUR_SCRATCH_DATABASE> TO `<looker>@<your.databricks.com>`
GRANT READ_METADATA ON DATABASE <YOUR_SCRATCH_DATABASE> TO `<looker>@<your.databricks.com>`
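
When the same grants have to be applied to several databases, the statements above can be generated rather than typed by hand. The following is a minimal sketch in Python, not part of Looker or Databricks: the function name `looker_grants` and the sample database and user names are hypothetical. It emits only the privileges named on this page and writes the scratch-database privileges as a comma-separated list, which is the form Databricks SQL expects.

```python
from typing import List


def looker_grants(database: str, principal: str, scratch: bool = False) -> List[str]:
    """Return the GRANT statements described on this page for one database.

    scratch=False -> read-only grants (SELECT, READ_METADATA)
    scratch=True  -> grants for the PDT scratch database (adds CREATE, MODIFY)
    """
    privileges = "SELECT, CREATE, MODIFY" if scratch else "SELECT"
    target = f"`{principal}`"  # principals are quoted with backticks
    return [
        f"GRANT {privileges} ON DATABASE {database} TO {target}",
        f"GRANT READ_METADATA ON DATABASE {database} TO {target}",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    user = "looker@example.com"
    statements = looker_grants("sales", user) + looker_grants(
        "looker_scratch", user, scratch=True
    )
    for statement in statements:
        print(statement + ";")
```

The printed statements can then be run in a Databricks SQL editor or notebook by a user who is allowed to grant privileges on those databases.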

Setting up the Looker connection

Select Connections from the Database section in the Admin panel. On the Connections page, click the Add Connection button. Looker displays the Connection Settings page. The fields that the Connection Settings page displays depend on your dialect selection. The majority of the settings are common to most database dialects and are described on the Connecting Looker to your database documentation page.

  • Name: Specify the name of the connection. This is how you will refer to the connection in LookML projects.
  • Dialect: Specify the dialect Databricks.
  • Host: Specify the hostname.
  • Port: Specify the database port. The default is 443.
  • Database: Specify the database name. The default is default.
  • Username: Enter the value token (do not enter the Databricks user email in this field).
  • Password: Enter the personal access token created earlier.
  • Persistent Derived Tables: Check this box to enable persistent derived tables. This reveals the Temp Database field and the PDT Overrides column.
  • Temp Database: Enter the database you would like to use to store PDTs.
  • Max PDT Builder Connections: Specify the number of possible concurrent PDT builds on this connection. Setting this value too high could negatively impact query times. For more information, see the Connecting Looker to your database documentation page.
  • Additional Params: Add any additional Spark JDBC parameters.
  • PDT And Datagroup Maintenance Schedule: A cron expression that indicates when Looker should check datagroups and persistent derived tables. Read more about this setting in our PDT and datagroup maintenance schedule documentation.
  • SSL: Check to use SSL connections.
  • Verify SSL Cert: Check to enforce strict SSL certificate verification.
  • Max Connections: Can be left at the default value initially. Read more about this setting in the Max Connections section of the Connecting Looker to your database documentation page.
  • Connection Pool Timeout: Can be left at the default value initially. Read more about this setting in the Connection Pool Timeout section of the Connecting Looker to your database documentation page.
  • SQL Runner Precache: To cause SQL Runner not to preload table information and to load table information only when a table is selected, uncheck this option. Read more about this setting in the SQL Runner Precache section of the Connecting Looker to your database documentation page.
  • Database Time Zone: Specify the time zone used in the database. Leave this field blank if you do not want time zone conversion. See the Using time zone settings documentation page for more information.

Click Test These Settings to test the connection and make sure that it is configured correctly. If you see Can Connect, click Add Connection. This runs the rest of the connection tests to verify that the Looker user was set up correctly and has the required permissions.

For more information about connection settings, see the Connecting Looker to your database documentation page.
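
The Databricks-specific values from the settings list can be captured in a small checklist before they are typed into the Connection Settings page. The sketch below is only an illustration in Python. The class `DatabricksConnection` and the function `check_settings` are hypothetical names and do not correspond to any Looker API. The defaults (port 443, database `default`, username `token`) are the ones stated on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatabricksConnection:
    name: str                      # how LookML projects refer to the connection
    host: str                      # hostname of the Databricks server
    password: str                  # the personal access token created earlier
    port: int = 443                # default port stated on this page
    database: str = "default"      # default database stated on this page
    username: str = "token"        # literal value, not the user's email
    persistent_derived_tables: bool = False
    temp_database: Optional[str] = None  # scratch database used for PDTs


def check_settings(conn: DatabricksConnection) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if conn.username != "token":
        problems.append("Username must be the literal value 'token'.")
    if not conn.password:
        problems.append("Password must contain the personal access token.")
    if conn.persistent_derived_tables and not conn.temp_database:
        problems.append("Temp Database is required when PDTs are enabled.")
    return problems


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    conn = DatabricksConnection(
        name="databricks_prod",
        host="example.cloud.databricks.com",
        password="<personal-access-token>",
        persistent_derived_tables=True,
        temp_database="looker_scratch",
    )
    print(check_settings(conn) or "No problems found")
```

A check like this catches only the mistakes described on this page, such as entering the Databricks user email in the Username field. It does not replace the Test These Settings step.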

Feature support

For Looker to support some features, your database dialect must also support them.

Databricks supports the following features as of Looker 23.4:

Feature                             Supported?
Support Level                       Supported
Symmetric Aggregates                Yes
Derived Tables                      Yes
Persistent SQL Derived Tables       Yes
Persistent Native Derived Tables    Yes
Stable Views                        Yes
Query Killing                       Yes
Pivots                              Yes
Timezones                           Yes
SSL                                 Yes
Subtotals                           Yes
JDBC Additional Params              Yes
Case Sensitive                      Yes
Location Type                       Yes
List Type                           Yes
Percentile                          Yes
Distinct Percentile                 No
SQL Runner Show Processes           No
SQL Runner Describe Table           Yes
SQL Runner Show Indexes             No
SQL Runner Select 10                Yes
SQL Runner Count                    Yes
SQL Explain                         Yes
Oauth Credentials                   No
Context Comments                    Yes
Connection Pooling                  No
HLL Sketches                        No
Aggregate Awareness                 Yes
Incremental PDTs                    No
Milliseconds                        Yes
Microseconds                        Yes
Materialized Views                  No
Approximate Count Distinct          No

Next steps

After you have completed the database configuration, connect to the database from Looker.