Write to BigQuery using a table schema
Write from Dataflow to a new or existing BigQuery table by providing a table schema.
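As a rough illustration of what providing a table schema implies, the following plain-Java sketch builds a table spec in the `project:dataset.table` form, maps a record to a row keyed by column name, and checks that every REQUIRED column has a value. It does not use Beam or the BigQuery client. `SchemaSketch`, `Field`, `tableSpec`, `toRow`, and `missingRequired` are hypothetical names invented for this illustration, and the column names `user_name` and `age` are assumed from the Java sample on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Beam or BigQuery.
final class SchemaSketch {

  // Minimal stand-in for a column definition: name, type, and mode.
  static final class Field {
    final String name;
    final String type;
    final String mode;

    Field(String name, String type, String mode) {
      this.name = name;
      this.type = type;
      this.mode = mode;
    }
  }

  // Builds a table spec in the "project:dataset.table" form.
  static String tableSpec(String project, String dataset, String table) {
    return String.format("%s:%s.%s", project, dataset, table);
  }

  // Builds a row keyed by column name, similar to what a format function returns.
  static Map<String, Object> toRow(String name, Long age) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("user_name", name);
    row.put("age", age);
    return row;
  }

  // Returns the names of REQUIRED columns that are absent or null in the row.
  static List<String> missingRequired(List<Field> schema, Map<String, Object> row) {
    List<String> missing = new ArrayList<>();
    for (Field field : schema) {
      if ("REQUIRED".equals(field.mode) && row.get(field.name) == null) {
        missing.add(field.name);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    List<Field> schema = Arrays.asList(
        new Field("user_name", "STRING", "REQUIRED"),
        new Field("age", "INT64", "NULLABLE"));

    System.out.println(tableSpec("my-project", "my_dataset", "my_table"));
    System.out.println(missingRequired(schema, toRow("Alice", 40L)));
    System.out.println(missingRequired(schema, toRow(null, 30L)));
  }
}
```

The point of the check is that the keys produced for each row have to line up with the column names declared in the schema, and a REQUIRED column cannot be left null. In a real pipeline this validation is performed by BigQuery itself; the sketch only makes the relationship visible.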
Explore further
For detailed documentation that includes this code sample, see the following:
- Write from Dataflow to BigQuery (/dataflow/docs/guides/write-to-bigquery)
Code sample

Java

To authenticate to Dataflow, set up Application Default Credentials. For more information, see Set up authentication for a local development environment (/docs/authentication/set-up-adc-local-dev-environment).

    import com.google.api.services.bigquery.model.TableFieldSchema;
    import com.google.api.services.bigquery.model.TableRow;
    import com.google.api.services.bigquery.model.TableSchema;
    import java.util.Arrays;
    import java.util.List;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.coders.DefaultCoder;
    import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.Create;

    public class BigQueryWriteWithSchema {
      // A custom datatype for the source data.
      @DefaultCoder(AvroCoder.class)
      public static class MyData {
        public String name;
        public Long age;

        // No-argument constructor, used when the coder decodes elements.
        public MyData() {}

        public MyData(String name, Long age) {
          this.name = name;
          this.age = age;
        }
      }

      public static void main(String[] args) {
        // Example source data.
        final List<MyData> data = Arrays.asList(
            new MyData("Alice", 40L),
            new MyData("Bob", 30L),
            new MyData("Charlie", 20L)
        );

        // Define a table schema. A schema is required for create disposition CREATE_IF_NEEDED.
        TableSchema schema = new TableSchema()
            .setFields(
                Arrays.asList(
                    new TableFieldSchema()
                        .setName("user_name")
                        .setType("STRING")
                        .setMode("REQUIRED"),
                    new TableFieldSchema()
                        .setName("age")
                        .setType("INT64") // Mode defaults to NULLABLE
                )
            );

        // Parse the pipeline options passed into the application. Example:
        //   --projectId=$PROJECT_ID --datasetName=$DATASET_NAME --tableName=$TABLE_NAME
        // For more information, see https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options
        // ExamplePipelineOptions is a custom PipelineOptions interface (not shown here) that
        // exposes getProjectId(), getDatasetName(), and getTableName().
        PipelineOptionsFactory.register(ExamplePipelineOptions.class);
        ExamplePipelineOptions options = PipelineOptionsFactory.fromArgs(args)
            .withValidation()
            .as(ExamplePipelineOptions.class);

        // Create a pipeline and apply transforms.
        Pipeline pipeline = Pipeline.create(options);
        pipeline
            // Create an in-memory PCollection of MyData objects.
            .apply(Create.of(data))
            // Write the data to a new or existing BigQuery table.
            .apply(BigQueryIO.<MyData>write()
                .to(String.format("%s:%s.%s",
                    options.getProjectId(),
                    options.getDatasetName(),
                    options.getTableName()))
                // Map each MyData element to a row keyed by the schema's column names.
                .withFormatFunction(
                    (MyData x) -> new TableRow().set("user_name", x.name).set("age", x.age))
                .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
                .withSchema(schema)
                // Write using the BigQuery Storage Write API.
                .withMethod(Write.Method.STORAGE_WRITE_API)
            );
        pipeline.run().waitUntilFinish();
      }
    }

What's next
To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser (/docs/samples?product=dataflow).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.