# Write to BigQuery using a table schema
Write from Dataflow to a new or existing BigQuery table, by providing a table schema.
Explore further
---------------
For detailed documentation that includes this code sample, see the following:

- [Write from Dataflow to BigQuery](/dataflow/docs/guides/write-to-bigquery)
Code sample
-----------
### Java
To authenticate to Dataflow, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).
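If you are developing locally, one common way to create Application Default Credentials is with the gcloud CLI. This is the standard ADC flow rather than anything specific to this sample, and it assumes the gcloud CLI is installed and initialized:

```sh
# Opens a browser sign-in flow and stores local Application Default Credentials.
gcloud auth application-default login
```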
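To launch the sample pipeline shown below on Dataflow, pass its custom options together with the standard Dataflow runner options. The following is a hypothetical invocation, assuming a Maven project with the Beam SDK and the Dataflow runner on the classpath; every project, dataset, table, and bucket name is a placeholder:

```sh
mvn compile exec:java \
  -Dexec.mainClass=BigQueryWriteWithSchema \
  -Dexec.args="--projectId=my-project \
    --datasetName=my_dataset \
    --tableName=my_table \
    --runner=DataflowRunner \
    --project=my-project \
    --region=us-central1 \
    --tempLocation=gs://my-bucket/temp"
```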
[[["이해하기 쉬움","easyToUnderstand","thumb-up"],["문제가 해결됨","solvedMyProblem","thumb-up"],["기타","otherUp","thumb-up"]],[["이해하기 어려움","hardToUnderstand","thumb-down"],["잘못된 정보 또는 샘플 코드","incorrectInformationOrSampleCode","thumb-down"],["필요한 정보/샘플이 없음","missingTheInformationSamplesINeed","thumb-down"],["번역 문제","translationIssue","thumb-down"],["기타","otherDown","thumb-down"]],[],[[["\u003cp\u003eThis code sample demonstrates how to write data from Dataflow to a new or existing BigQuery table by providing a table schema.\u003c/p\u003e\n"],["\u003cp\u003eThe example uses a custom data type \u003ccode\u003eMyData\u003c/code\u003e with fields for \u003ccode\u003ename\u003c/code\u003e and \u003ccode\u003eage\u003c/code\u003e, and defines a corresponding BigQuery table schema.\u003c/p\u003e\n"],["\u003cp\u003eThe code utilizes \u003ccode\u003eBigQueryIO.write()\u003c/code\u003e to write data to BigQuery, including options to specify the table destination, format the data, set the create disposition, and provide the schema.\u003c/p\u003e\n"],["\u003cp\u003eThe provided code shows how to use Application Default Credentials for authenticating to Dataflow, a required step for executing the pipeline.\u003c/p\u003e\n"],["\u003cp\u003eThe pipeline uses the Storage Write API method to write to the BigQuery table for improved performance, setting up the pipeline options via command-line arguments.\u003c/p\u003e\n"]]],[],null,["# Write to BigQuery using a table schema\n\nWrite from Dataflow to a new or existing BigQuery table, by providing a table schema\n\nExplore further\n---------------\n\n\nFor detailed documentation that includes this code sample, see the following:\n\n- [Write from Dataflow to BigQuery](/dataflow/docs/guides/write-to-bigquery)\n\nCode sample\n-----------\n\n### Java\n\n\nTo authenticate to Dataflow, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n import com.google.api.services.bigquery.model.TableFieldSchema;\n import com.google.api.services.bigquery.model.TableRow;\n import com.google.api.services.bigquery.model.TableSchema;\n import java.util.Arrays;\n import java.util.List;\n import org.apache.beam.sdk.Pipeline;\n import org.apache.beam.sdk.coders.DefaultCoder;\n import org.apache.beam.sdk.extensions.avro.coders.AvroCoder;\n import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;\n import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write;\n import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;\n import org.apache.beam.sdk.options.PipelineOptionsFactory;\n import org.apache.beam.sdk.transforms.Create;\n\n public class BigQueryWriteWithSchema {\n // A custom datatype for the source data.\n @DefaultCoder(AvroCoder.class)\n public static class MyData {\n public String name;\n public Long age;\n\n public MyData() {}\n\n public MyData(String name, Long age) {\n this.name = name;\n this.age = age;\n }\n }\n\n public static void main(String[] args) {\n // Example source data.\n final List\u003cMyData\u003e data = Arrays.asList(\n new MyData(\"Alice\", 40L),\n new MyData(\"Bob\", 30L),\n new MyData(\"Charlie\", 20L)\n );\n\n // Define a table schema. 
A schema is required for write disposition CREATE_IF_NEEDED.\n TableSchema schema = new TableSchema()\n .setFields(\n Arrays.asList(\n new TableFieldSchema()\n .setName(\"user_name\")\n .setType(\"STRING\")\n .setMode(\"REQUIRED\"),\n new TableFieldSchema()\n .setName(\"age\")\n .setType(\"INT64\") // Defaults to NULLABLE\n )\n );\n\n // Parse the pipeline options passed into the application. Example:\n // --projectId=$PROJECT_ID --datasetName=$DATASET_NAME --tableName=$TABLE_NAME\n // For more information, see https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options\n PipelineOptionsFactory.register(ExamplePipelineOptions.class);\n ExamplePipelineOptions options = PipelineOptionsFactory.fromArgs(args)\n .withValidation()\n .as(ExamplePipelineOptions.class);\n\n // Create a pipeline and apply transforms.\n Pipeline pipeline = Pipeline.create(options);\n pipeline\n // Create an in-memory PCollection of MyData objects.\n .apply(Create.of(data))\n // Write the data to a new or existing BigQuery table.\n .apply(BigQueryIO.\u003cMyData\u003ewrite()\n .to(String.format(\"%s:%s.%s\",\n options.getProjectId(),\n options.getDatasetName(),\n options.getTableName()))\n .withFormatFunction(\n (MyData x) -\u003e new TableRow().set(\"user_name\", x.name).set(\"age\", x.age))\n .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)\n .withSchema(schema)\n .withMethod(Write.Method.STORAGE_WRITE_API)\n );\n pipeline.run().waitUntilFinish();\n }\n }\n\nWhat's next\n-----------\n\n\nTo search and filter code samples for other Google Cloud products, see the\n[Google Cloud sample browser](/docs/samples?product=dataflow)."]]