This document shows you how to use the open-source Dataform command-line interface (CLI) to develop workflows locally from the terminal.
With the open-source Dataform CLI, you can initialize, compile, test, and run Dataform core locally, outside of Google Cloud.
The Dataform CLI supports Application Default Credentials (ADC).
With ADC, you can make credentials available to your application in a variety of environments, such as local development or production, without modifying your application code. To use ADC, you must first provide your credentials to ADC.
Note: As of Dataform core 3.0.0, Dataform doesn't distribute a Docker image. You can build your own Docker image of Dataform and use it to run the equivalent of Dataform CLI commands. To build your own Docker image, see Containerize an application in the Docker documentation.
Before you begin
Before installing the Dataform CLI, install NPM.
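Before moving on, it can help to confirm that the prerequisite is actually on your PATH. The following is a minimal sketch, assuming a POSIX shell; `have_tool` is a hypothetical helper name introduced here for illustration, not part of NPM or Dataform.

```shell
#!/bin/sh
# have_tool NAME: hypothetical helper that succeeds when NAME is on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# NPM normally ships with Node.js, so check both.
for tool in node npm; do
  if have_tool "$tool"; then
    echo "$tool: found ($("$tool" --version))"
  else
    echo "$tool: not found; install Node.js, which usually bundles npm"
  fi
done
```

If either tool is reported as missing, install it before running the commands in the following sections.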
Install Dataform CLI
To install Dataform CLI, run the following command:
npm i -g @dataform/cli@^3.0.0-beta
Initialize a Dataform project
To initialize a new Dataform project, run the following command inside your project directory:
dataform init . PROJECT_NAME DEFAULT_LOCATION
Replace the following:
PROJECT_NAME: the name of your project.
DEFAULT_LOCATION (optional): the location where you want Dataform to write BigQuery data. If unset, Dataform determines the location based on the datasets that your SQL query references. This works as follows:
If your query references datasets from the same location, Dataform uses that location.
If your query references datasets from two or more different locations, an error occurs. For details about this limitation, see Cross-region dataset replication.
If your query doesn't reference any datasets, the default location for Dataform is the US multi-region. To choose a different location, set the default location. Alternatively, use the @@location system variable in your query. For more information, see Specify locations.
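The three location rules above can be sketched as a small shell function. This is only a model of the documented behavior when no default location is set, not Dataform's actual implementation; `resolve_location` is a hypothetical name, and its arguments stand for the locations of the datasets that a query references.

```shell
#!/bin/sh
# resolve_location [LOCATION...]: hypothetical model of the rules above.
# Prints the location a query would use, or fails when locations differ.
resolve_location() {
  # No referenced datasets: fall back to the US multi-region.
  if [ "$#" -eq 0 ]; then
    echo "US"
    return 0
  fi

  first=$1
  for loc in "$@"; do
    if [ "$loc" != "$first" ]; then
      echo "error: datasets span multiple locations ($first, $loc)" >&2
      return 1
    fi
  done

  # All referenced datasets share one location.
  echo "$first"
}

resolve_location                      # prints US
resolve_location EU EU                # prints EU
resolve_location US europe-west1 || echo "query rejected"
```

Setting DEFAULT_LOCATION explicitly avoids relying on this inference.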
Update Dataform core
To update the Dataform core framework, update the dataformCoreVersion in the workflow_settings.yaml file, then re-run NPM install:
npm i
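The version bump itself is a one-line change to the settings file. The sketch below shows one way to script it, assuming that dataformCoreVersion sits on its own top-level line; `set_core_version` is a hypothetical helper, and the project ID, dataset name, and version numbers are placeholders chosen for illustration. It works on a scratch copy so that no real project is touched.

```shell
#!/bin/sh
# set_core_version FILE VERSION: hypothetical helper that rewrites only the
# dataformCoreVersion line of FILE and leaves every other line unchanged.
set_core_version() {
  sed "s/^dataformCoreVersion:.*/dataformCoreVersion: $2/" "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# Sample settings file in a scratch directory; all values are placeholders.
scratch=$(mktemp -d)
cat > "$scratch/workflow_settings.yaml" <<'EOF'
defaultProject: my-project-id
defaultLocation: US
defaultDataset: dataform
dataformCoreVersion: 3.0.0-beta.0
EOF

set_core_version "$scratch/workflow_settings.yaml" "3.0.0"
grep '^dataformCoreVersion:' "$scratch/workflow_settings.yaml"
# In a real project, follow the edit with: npm i
```

Editing the file by hand works equally well; the point is that only the dataformCoreVersion value changes before you reinstall.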
Update Dataform CLI
To update the Dataform CLI tool, run the following command:
npm i -g @dataform/cli@^3.0.0-beta.2
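Create a credentials file
Dataform requires a credentials file to connect to remote services. The init-creds command creates this file, .df-credentials.json, on your disk.
Warning: If you use a source control system, don't commit the .df-credentials.json file to your repository, so that these access credentials stay protected. We recommend that you add the .df-credentials.json file to .gitignore.
To create the credentials file, follow these steps:
1. Run the following command:
dataform init-creds
2. Follow the init-creds wizard, which walks you through creating the credentials file.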
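Because .df-credentials.json holds access credentials, it is commonly kept out of source control. The following is a minimal sketch of adding it to .gitignore without creating duplicate entries; `ignore_credentials` is a hypothetical helper name, and the demonstration runs in a scratch directory rather than a real repository.

```shell
#!/bin/sh
# ignore_credentials DIR: hypothetical helper that adds the credentials file
# to DIR/.gitignore unless an identical line is already present.
ignore_credentials() {
  gitignore="$1/.gitignore"
  touch "$gitignore"
  # -x matches the whole line, -F treats the pattern as a fixed string.
  grep -qxF '.df-credentials.json' "$gitignore" ||
    echo '.df-credentials.json' >> "$gitignore"
}

# Demonstrate on a scratch directory; calling twice adds a single entry.
demo_dir=$(mktemp -d)
ignore_credentials "$demo_dir"
ignore_credentials "$demo_dir"
cat "$demo_dir/.gitignore"
```

In a real project, pass the repository root instead of the scratch directory.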
Create a project
An empty Dataform project in Dataform core 3.0.0-beta.0 or later has the following structure:
project-dir
├── definitions
├── includes
└── workflow_settings.yaml
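As a quick sanity check, a script can test whether a directory has the layout shown above. This is a sketch only: `looks_like_dataform_project` is a hypothetical helper, and it checks nothing beyond the two folders and the settings file in the tree.

```shell
#!/bin/sh
# looks_like_dataform_project DIR: hypothetical helper that succeeds when DIR
# contains definitions/, includes/, and workflow_settings.yaml.
looks_like_dataform_project() {
  [ -d "$1/definitions" ] &&
    [ -d "$1/includes" ] &&
    [ -f "$1/workflow_settings.yaml" ]
}

# Build a scratch directory with that layout and check it.
project_dir=$(mktemp -d)
mkdir "$project_dir/definitions" "$project_dir/includes"
: > "$project_dir/workflow_settings.yaml"

if looks_like_dataform_project "$project_dir"; then
  echo "layout OK"
else
  echo "layout incomplete"
fi
```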
To create a Dataform project to deploy assets to BigQuery, run the following command:
dataform init PROJECT_NAME --default-database YOUR_GOOGLE_CLOUD_PROJECT_ID --default-location DEFAULT_LOCATION
Replace the following:
PROJECT_NAME: the name of your project.
YOUR_GOOGLE_CLOUD_PROJECT_ID: your Google Cloud project ID.
DEFAULT_LOCATION (optional): the location where you want Dataform to write BigQuery data. If unset, Dataform determines the location based on the datasets that your SQL query references. This works as follows:
If your query references datasets from the same location, Dataform uses that location.
If your query references datasets from two or more different locations, an error occurs. For details about this limitation, see Cross-region dataset replication.
If your query doesn't reference any datasets, the default location for Dataform is the US multi-region. To choose a different location, set the default location. Alternatively, use the @@location system variable in your query. For more information, see Specify locations.
Clone a project
To clone an existing Dataform project from a third-party Git repository, follow the instructions from your Git provider.
After the repository is cloned, run the following command inside the cloned repository directory:
dataform install
Define a table
Store definitions in the definitions/ folder.
To define a table, run the following command:
echo "config { type: 'TABLE_TYPE' } SELECT_STATEMENT" > definitions/FILE.sqlx
Replace the following:
TABLE_TYPE: the type of the table: table, incremental, or view.
SELECT_STATEMENT: a SELECT statement that defines the table.
FILE: the name for the table definition file.
The following code sample defines a view in the example SQLX file:
echo "config { type: 'view' } SELECT 1 AS test" > definitions/example.sqlx
Define a manual assertion
Store definitions in the definitions/ folder.
To define a manual assertion, run the following command:
echo "config { type: 'assertion' } SELECT_STATEMENT" > definitions/FILE.sqlx
Replace the following:
SELECT_STATEMENT: a SELECT statement that defines the assertion.
FILE: the name for the assertion definition file.
Define a custom SQL operation
Store definitions in the definitions/ folder.
To define a custom SQL operation, run the following command:
echo "config { type: 'operations' } SQL_QUERY" > definitions/FILE.sqlx
Replace the following:
SQL_QUERY: your custom SQL operation.
FILE: the name for the custom SQL operation definition file.
View compilation output
Dataform compiles your code in real time.
To view the output of the compilation process in the terminal, run the following command:
dataform compile
To view the output of the compilation process as a JSON object, run the following command:
dataform compile --json
To view the output of the compilation with custom compilation variables, run the following command:
dataform compile --vars=SAMPLE_VAR=SAMPLE_VALUE,foo=bar
Replace the following:
SAMPLE_VAR: your custom compilation variable.
SAMPLE_VALUE: the value of your custom compilation variable.
Run code
To run your code, Dataform accesses BigQuery to determine its current state and tailor the resulting SQL accordingly.
To run the code of your Dataform project, run the following command:
dataform run
To run the code of your Dataform project in BigQuery with custom compilation variables, run the following command:
dataform run --vars=SAMPLE_VAR=SAMPLE_VALUE,sampleVar2=sampleValue2
Replace the following:
SAMPLE_VAR: your custom compilation variable.
SAMPLE_VALUE: the value of your custom compilation variable.
To run the code of your Dataform project in BigQuery and rebuild all tables from scratch, run the following command:
dataform run --full-refresh
Without --full-refresh, Dataform updates incremental tables without rebuilding them from scratch.
To perform a dry run of your code against BigQuery, run the following command:
dataform run --dry-run
Note: In Dataform core 3.0.2 or later, Dataform performs a dry run of each action in your compilation output against BigQuery, and requires the corresponding IAM permissions. To verify compilation without a dry run, view the compilation output instead.
Get help
To list all of the available commands and options, run the following command:
dataform help
To view a description of a specific command, run the following command:
dataform help COMMAND
Replace COMMAND with the command you want to learn about.
What's next
To learn more about Dataform CLI, see Dataform CLI reference.
To learn more about Dataform, see Dataform overview.
Last updated 2025-09-04 UTC.