Gravar no BigQuery usando um esquema de tabela
Mantenha tudo organizado com as coleções
Salve e categorize o conteúdo com base nas suas preferências.
Gravar do Dataflow em uma tabela nova ou atual do BigQuery, fornecendo um esquema de tabela
Mais informações
Para ver a documentação detalhada que inclui este exemplo de código, consulte:
Exemplo de código
Exceto em caso de indicação contrária, o conteúdo desta página é licenciado de acordo com a Licença de atribuição 4.0 do Creative Commons, e as amostras de código são licenciadas de acordo com a Licença Apache 2.0. Para mais detalhes, consulte as políticas do site do Google Developers. Java é uma marca registrada da Oracle e/ou afiliadas.
[[["Fácil de entender","easyToUnderstand","thumb-up"],["Meu problema foi resolvido","solvedMyProblem","thumb-up"],["Outro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Informações incorretas ou exemplo de código","incorrectInformationOrSampleCode","thumb-down"],["Não contém as informações/amostras de que eu preciso","missingTheInformationSamplesINeed","thumb-down"],["Problema na tradução","translationIssue","thumb-down"],["Outro","otherDown","thumb-down"]],[],[[["\u003cp\u003eThis code sample demonstrates how to write data from Dataflow to a new or existing BigQuery table by providing a table schema.\u003c/p\u003e\n"],["\u003cp\u003eThe example uses a custom data type \u003ccode\u003eMyData\u003c/code\u003e with fields for \u003ccode\u003ename\u003c/code\u003e and \u003ccode\u003eage\u003c/code\u003e, and defines a corresponding BigQuery table schema.\u003c/p\u003e\n"],["\u003cp\u003eThe code utilizes \u003ccode\u003eBigQueryIO.write()\u003c/code\u003e to write data to BigQuery, including options to specify the table destination, format the data, set the create disposition, and provide the schema.\u003c/p\u003e\n"],["\u003cp\u003eThe provided code shows how to use Application Default Credentials for authenticating to Dataflow, a required step for executing the pipeline.\u003c/p\u003e\n"],["\u003cp\u003eThe pipeline uses the Storage Write API method to write to the BigQuery table for improved performance, setting up the pipeline options via command-line arguments.\u003c/p\u003e\n"]]],[],null,["# Write to BigQuery using a table schema\n\nWrite from Dataflow to a new or existing BigQuery table, by providing a table schema\n\nExplore further\n---------------\n\n\nFor detailed documentation that includes this code sample, see the following:\n\n- [Write from Dataflow to BigQuery](/dataflow/docs/guides/write-to-bigquery)\n\nCode sample\n-----------\n\n### Java\n\n\nTo authenticate to Dataflow, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n import com.google.api.services.bigquery.model.TableFieldSchema;\n import com.google.api.services.bigquery.model.TableRow;\n import com.google.api.services.bigquery.model.TableSchema;\n import java.util.Arrays;\n import java.util.List;\n import org.apache.beam.sdk.Pipeline;\n import org.apache.beam.sdk.coders.DefaultCoder;\n import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;\n import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;\n import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write;\n import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;\n import org.apache.beam.sdk.options.PipelineOptionsFactory;\n import org.apache.beam.sdk.transforms.Create;\n\n public class BigQueryWriteWithSchema {\n // A custom datatype for the source data.\n @DefaultCoder(AvroCoder.class)\n public static class MyData {\n public String name;\n public Long age;\n\n public MyData() {}\n\n public MyData(String name, Long age) {\n this.name = name;\n this.age = age;\n }\n }\n\n public static void main(String[] args) {\n // Example source data.\n final List\u003cMyData\u003e data = Arrays.asList(\n new MyData(\"Alice\", 40L),\n new MyData(\"Bob\", 30L),\n new MyData(\"Charlie\", 20L)\n );\n\n // Define a table schema. A schema is required for write disposition CREATE_IF_NEEDED.\n TableSchema schema = new TableSchema()\n .setFields(\n Arrays.asList(\n new TableFieldSchema()\n .setName(\"user_name\")\n .setType(\"STRING\")\n .setMode(\"REQUIRED\"),\n new TableFieldSchema()\n .setName(\"age\")\n .setType(\"INT64\") // Defaults to NULLABLE\n )\n );\n\n // Parse the pipeline options passed into the application. Example:\n // --projectId=$PROJECT_ID --datasetName=$DATASET_NAME --tableName=$TABLE_NAME\n // For more information, see https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options\n PipelineOptionsFactory.register(ExamplePipelineOptions.class);\n ExamplePipelineOptions options = PipelineOptionsFactory.fromArgs(args)\n .withValidation()\n .as(ExamplePipelineOptions.class);\n\n // Create a pipeline and apply transforms.\n Pipeline pipeline = Pipeline.create(options);\n pipeline\n // Create an in-memory PCollection of MyData objects.\n .apply(Create.of(data))\n // Write the data to a new or existing BigQuery table.\n .apply(BigQueryIO.\u003cMyData\u003ewrite()\n .to(String.format(\"%s:%s.%s\",\n options.getProjectId(),\n options.getDatasetName(),\n options.getTableName()))\n .withFormatFunction(\n (MyData x) -\u003e new TableRow().set(\"user_name\", x.name).set(\"age\", x.age))\n .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)\n .withSchema(schema)\n .withMethod(Write.Method.STORAGE_WRITE_API)\n );\n pipeline.run().waitUntilFinish();\n }\n }\n\nWhat's next\n-----------\n\n\nTo search and filter code samples for other Google Cloud products, see the\n[Google Cloud sample browser](/docs/samples?product=dataflow)."]]