distribution_style

Usage

view: my_view {
  derived_table: {
    distribution_style: all
    ...
  }
}
Hierarchy
distribution_style

- or -

distribution_style
Default Value
None

Accepts
A distribution style (all or even)
Special Rules
distribution_style is supported only on specific dialects

Definition

distribution_style lets you specify how the query for a persistent derived table (PDT) or an aggregate table is distributed across the nodes in a database.

See the Dialect support for distribution_style section below for the list of dialects that support distribution_style.

The distribution_style parameter works only with tables that are persistent, such as PDTs and aggregate tables. distribution_style is not supported for derived tables without a persistence strategy.

In addition, the distribution_style parameter is not supported for derived tables that are defined using create_process or sql_create.

Lastly, distribution_style and distribution should not be used at the same time. If you want to distribute the rows of a table to different Redshift nodes based on a column value, use distribution. Otherwise, use distribution_style to choose a different distribution strategy.

Redshift offers four distribution styles, which are described in the Amazon Redshift documentation on distribution styles:

  • ALL Distribution: All rows are fully copied to each node. You can accomplish this type of distribution in Looker by using distribution_style: all.
  • EVEN Distribution: Rows are distributed to different nodes in a round-robin fashion. You can accomplish this type of distribution in Looker by using distribution_style: even.
  • KEY Distribution: Rows are distributed to different nodes based on unique values within a particular column. You can accomplish this type of distribution in Looker by using the distribution parameter.
  • AUTO Distribution Redshift assigns an optimal distribution style based on the size of the table data. Looker does not support this distribution type.

See the Amazon Redshift documentation on distribution styles for choosing the appropriate distribution strategy. If you don't specify a distribution_style, and don't use the distribution parameter instead, Looker will default to all.

Examples

Create a customer_order_facts derived table with a distribution style of all:

view: customer_order_facts {
  derived_table: {
    sql:
      SELECT
        customer_id,
        COUNT(*) AS lifetime_orders
      FROM
        order
      GROUP BY 1 ;;
    persist_for: "24 hours"
    distribution_style: all
  }
}

Dialect support for distribution_style

The ability to use distribution_style depends on the database dialect your Looker connection is using. In the latest release of Looker, the following dialects support distribution_style:

Dialect Supported?
Actian Avalanche
No
Amazon Athena
No
Amazon Aurora MySQL
No
Amazon Redshift
Yes
Apache Druid
No
Apache Druid 0.13+
No
Apache Druid 0.18+
No
Apache Hive 2.3+
No
Apache Hive 3.1.2+
No
Apache Spark 3+
No
ClickHouse
No
Cloudera Impala 3.1+
No
Cloudera Impala 3.1+ with Native Driver
No
Cloudera Impala with Native Driver
No
DataVirtuality
No
Databricks
No
Denodo 7
No
Denodo 8
No
Dremio
No
Dremio 11+
No
Exasol
No
Firebolt
No
Google BigQuery Legacy SQL
No
Google BigQuery Standard SQL
No
Google Cloud PostgreSQL
No
Google Cloud SQL
No
Google Spanner
No
Greenplum
No
HyperSQL
No
IBM Netezza
No
MariaDB
No
Microsoft Azure PostgreSQL
No
Microsoft Azure SQL Database
No
Microsoft Azure Synapse Analytics
No
Microsoft SQL Server 2008+
No
Microsoft SQL Server 2012+
No
Microsoft SQL Server 2016
No
Microsoft SQL Server 2017+
No
MongoBI
No
MySQL
No
MySQL 8.0.12+
No
Oracle
No
Oracle ADWC
No
PostgreSQL 9.5+
No
PostgreSQL pre-9.5
No
PrestoDB
No
PrestoSQL
No
SAP HANA 2+
No
SingleStore
No
SingleStore 7+
No
Snowflake
No
Teradata
No
Trino
No
Vector
No
Vertica
No