Read data with projection and filtering
Use the BigQueryIO connector in DIRECT_READ mode with column projection and filtering.
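To make the two terms concrete, here is a minimal plain-Java sketch of what projection and filtering mean for a set of rows. It is an illustration only and does not use Beam or BigQuery: the class name `ProjectionFilterSketch`, the sample rows, and the extra "city" column are all hypothetical, and in a real direct read BigQuery applies the row restriction and field selection itself instead of the client looping over rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Illustration only: mimics the effect of a row restriction and a field list
// on in-memory rows. This is not how the connector is implemented.
final class ProjectionFilterSketch {

  // Keeps rows that match the restriction, then keeps only the named columns.
  static List<Map<String, Object>> projectAndFilter(
      List<Map<String, Object>> rows,
      List<String> fields,
      Predicate<Map<String, Object>> restriction) {
    List<Map<String, Object>> result = new ArrayList<>();
    for (Map<String, Object> row : rows) {
      if (!restriction.test(row)) {
        continue; // Filtering: drop rows that fail the restriction.
      }
      Map<String, Object> projected = new LinkedHashMap<>();
      for (String field : fields) {
        projected.put(field, row.get(field)); // Projection: copy selected columns only.
      }
      result.add(projected);
    }
    return result;
  }

  // Hypothetical sample data; the "city" column exists only to show it being dropped.
  static List<Map<String, Object>> sampleRows() {
    return List.of(
        Map.<String, Object>of("user_name", "Ana", "age", 34L, "city", "Lima"),
        Map.<String, Object>of("user_name", "Ben", "age", 17L, "city", "Oslo"),
        Map.<String, Object>of("user_name", "Chen", "age", 52L, "city", "Taipei"));
  }

  public static void main(String[] args) {
    List<Map<String, Object>> out = projectAndFilter(
        sampleRows(),
        List.of("user_name", "age"),
        row -> (Long) row.get("age") > 18L);
    // Prints:
    // {user_name=Ana, age=34}
    // {user_name=Chen, age=52}
    out.forEach(System.out::println);
  }
}
```

The row for Ben is dropped by the filter, and the "city" column is dropped by the projection. Doing both as part of the read means less data has to leave BigQuery than if the pipeline read whole rows and discarded values afterwards.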
Explore further

For detailed documentation that includes this code sample, see the following:

- [Read from BigQuery to Dataflow](/dataflow/docs/guides/read-from-bigquery)

Code sample

Java

To authenticate to Dataflow, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

The sample reads from the BigQuery table named by the pipeline options, keeps only rows that satisfy the condition "age > 18", and selects only the "user_name" and "age" fields. The filter and the projection are both part of the read configuration, so they are applied within the BigQuery read operation. The pipeline then runs on Dataflow and formats each resulting row as a string.

```java
import com.google.common.collect.ImmutableMap;
import java.util.List;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.managed.Managed;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.Row;
import org.apache.beam.sdk.values.TypeDescriptors;

public class BigQueryReadWithProjectionAndFiltering {

  // The sample references ExamplePipelineOptions without showing its definition.
  // This interface is an assumption, inferred from the getters used in main().
  public interface ExamplePipelineOptions extends PipelineOptions {
    String getProjectId();
    void setProjectId(String value);

    String getDatasetName();
    void setDatasetName(String value);

    String getTableName();
    void setTableName(String value);
  }

  public static void main(String[] args) {
    // Parse the pipeline options passed into the application. Example:
    //   --projectId=$PROJECT_ID --datasetName=$DATASET_NAME --tableName=$TABLE_NAME
    // For more information, see
    // https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options
    PipelineOptionsFactory.register(ExamplePipelineOptions.class);
    ExamplePipelineOptions options = PipelineOptionsFactory.fromArgs(args)
        .withValidation()
        .as(ExamplePipelineOptions.class);

    // Build the table reference in the form project:dataset.table.
    String tableSpec = String.format("%s:%s.%s",
        options.getProjectId(),
        options.getDatasetName(),
        options.getTableName());

    // "row_restriction" filters rows; "fields" selects the columns to read.
    ImmutableMap<String, Object> config = ImmutableMap.<String, Object>builder()
        .put("table", tableSpec)
        .put("row_restriction", "age > 18")
        .put("fields", List.of("user_name", "age"))
        .build();

    // Create a pipeline and apply transforms.
    Pipeline pipeline = Pipeline.create(options);
    pipeline
        .apply(Managed.read(Managed.BIGQUERY).withConfig(config)).getSinglePCollection()
        .apply(MapElements
            .into(TypeDescriptors.strings())
            // Access individual fields in the row.
            .via((Row row) -> {
              String output = String.format("Name: %s, Age: %s%n",
                  row.getString("user_name"),
                  row.getInt64("age"));
              System.out.println(output);
              return output;
            }));
    pipeline.run().waitUntilFinish();
  }
}
```

What's next

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=dataflow).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.