Encrypting network traffic
It is a best practice to encrypt network traffic between the Looker application and your database. Consider one of the options described on the Enabling secure database access documentation page.
Create a Looker user
Looker authenticates to Databricks by personal access tokens. Follow the Databricks documentation to create a personal access token for a Databricks user to use in Looker.
Add permissions to this user with GRANT
.
At a minimum, the Looker user should have SELECT
and READ_METADATA
.
GRANT SELECT ON DATABASE <YOUR_DATABASE> TO `<looker>@<your.databricks.com>`
GRANT READ_METADATA ON DATABASE <YOUR_DATABASE> TO `<looker>@<your.databricks.com>`
Server information
Follow the Databricks documentation to find the HTTP Path for your Databricks cluster. This will be referred to as <YOUR_HTTP_PATH>
on this page.
Setting up persistent derived tables
To use persistent derived tables, create a separate database.
CREATE DATABASE <YOUR_SCRATCH_DATABASE>
This will also require additional write-based user permissions to be granted.
GRANT SELECT CREATE MODIFY ON DATABASE <YOUR_SCRATCH_DATABASE> TO `<looker>@<your.databricks.com>`
GRANT READ_METADATA ON DATABASE <YOUR_SCRATCH_DATABASE> TO `<looker>@<your.databricks.com>`
Creating the Looker connection to your database
In the Admin section of Looker, select Connections, and then click Add Connection.
Fill out the connection details. The majority of the settings are common to most database dialects. See the Connecting Looker to your database documentation page for information. Some of the settings are described next:
- Name: Specify the name of the connection. This is how you will refer to the connection in LookML projects.
- Dialect: Specify the dialect Databricks.
- Host: Specify the hostname.
- Port: Specify the database port. The default is 443.
- Database: Specify the database name. The default is
default
. - Username: Enter the value
token
(do not enter the Databricks user email in this field). - Password: Enter the personal access token created earlier.
- Enable PDTs: Use this toggle to enable persistent derived tables. When PDTs are enabled, the Connection window reveals additional PDT settings and the PDT Overrides section.
- Temp Database: Enter the database you would like to use to store PDTs.
- Max number of PDT builder connections: Specify the number of possible concurrent PDT builds on this connection. Setting this value too high could negatively impact query times. For more information, see the Connecting Looker to your database documentation page.
Additional JDBC parameters: Add any additional Spark JDBC parameters.
Datagroup and PDT Maintenance Schedule: A
cron
expression that indicates when Looker should check datagroups and persistent derived tables. Read more about this setting in the Datagroup and PDT Maintenance Schedule documentation.SSL: Check to use SSL connections.
Verify SSL: Check to enforce strict SSL certificate verification.
Max connections per node: You can leave this setting at the default value initially. Read more about this setting in the Max connections per node section of the Connecting Looker to your database documentation page.
Connection Pool Timeout: You can leave this setting at the default value initially. Read more about this setting in the Connection Pool Timeout section of the Connecting Looker to your database documentation page.
SQL Runner Precache: To cause SQL Runner not to preload table information and to load table information only when a table is selected, uncheck this option. Read more about this setting in the SQL Runner Precache section of the Connecting Looker to your database documentation page.
Database Time Zone: Specify the time zone used in the database. Leave this field blank if you do not want time zone conversion. See the Using time zone settings documentation page for more information.
Click Test to test the connection and make sure that it is configured correctly. If you see Can Connect, then press Connect. This runs the rest of the connection tests to verify that the service account was set up correctly and with the proper roles. See the Testing database connectivity documentation page for troubleshooting information.
Looker functionality with Databricks Unity Catalog
For Looker connections to a Databricks database with Unity Catalog enabled, most Looker functionality will access schemas from the default catalog only, such as in the following scenarios:
- When generating a new LookML project from database schema, Looker will create the project files based on the tables in the Unity Catalog default catalog.
- For existing projects, when using the Looker IDE to creating a view from a table, Looker can create view files only from the tables in the Unity Catalog default catalog.
- When using SQL Runner, you can select only schemas from the Unity Catalog default catalog.
Feature support
For Looker to support some features, your database dialect must also support them.
Databricks supports the following features as of Looker 24.8:
Feature | Supported? |
---|---|
Support Level | Supported |
Looker (Google Cloud core) | Yes |
Symmetric Aggregates | Yes |
Derived Tables | Yes |
Persistent SQL Derived Tables | Yes |
Persistent Native Derived Tables | Yes |
Stable Views | Yes |
Query Killing | Yes |
SQL-based Pivots | Yes |
Timezones | Yes |
SSL | Yes |
Subtotals | Yes |
JDBC Additional Params | Yes |
Case Sensitive | Yes |
Location Type | Yes |
List Type | Yes |
Percentile | Yes |
Distinct Percentile | No |
SQL Runner Show Processes | No |
SQL Runner Describe Table | Yes |
SQL Runner Show Indexes | No |
SQL Runner Select 10 | Yes |
SQL Runner Count | Yes |
SQL Explain | Yes |
Oauth Credentials | No |
Context Comments | Yes |
Connection Pooling | No |
HLL Sketches | No |
Aggregate Awareness | Yes |
Incremental PDTs | Yes |
Milliseconds | Yes |
Microseconds | Yes |
Materialized Views | No |
Approximate Count Distinct | No |