Read from multiple Kafka topics to Dataflow
Shows how to create a Dataflow pipeline that reads from multiple Kafka topics and performs different business logic based on the topic name.
Explore further
For detailed documentation that includes this code sample, see the following:

- Read from Apache Kafka to Dataflow (/dataflow/docs/guides/read-from-kafka)
Code sample
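Both samples below authenticate to Dataflow with Application Default Credentials (ADC). As a minimal local-setup sketch, assuming the Google Cloud CLI is installed and initialized against your project:

    gcloud auth application-default login

See the authentication link in each section for other environments and full setup details.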
[[["Fácil de comprender","easyToUnderstand","thumb-up"],["Resolvió mi problema","solvedMyProblem","thumb-up"],["Otro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Información o código de muestra incorrectos","incorrectInformationOrSampleCode","thumb-down"],["Faltan la información o los ejemplos que necesito","missingTheInformationSamplesINeed","thumb-down"],["Problema de traducción","translationIssue","thumb-down"],["Otro","otherDown","thumb-down"]],[],[],[],null,["# Read from multiple Kafka topics to Dataflow\n\nShows how to create a Dataflow pipeline that reads from multiple Kafka topics and performs different business logic based on the topic name.\n\nExplore further\n---------------\n\n\nFor detailed documentation that includes this code sample, see the following:\n\n- [Read from Apache Kafka to Dataflow](/dataflow/docs/guides/read-from-kafka)\n\nCode sample\n-----------\n\n### Java\n\n\nTo authenticate to Dataflow, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n import java.util.List;\n import org.apache.beam.sdk.Pipeline;\n import org.apache.beam.sdk.PipelineResult;\n import org.apache.beam.sdk.io.TextIO;\n import org.apache.beam.sdk.io.kafka.KafkaIO;\n import org.apache.beam.sdk.options.Description;\n import org.apache.beam.sdk.options.PipelineOptionsFactory;\n import org.apache.beam.sdk.options.StreamingOptions;\n import org.apache.beam.sdk.transforms.Filter;\n import org.apache.beam.sdk.transforms.MapElements;\n import org.apache.beam.sdk.values.TypeDescriptors;\n import org.apache.kafka.common.serialization.LongDeserializer;\n import org.apache.kafka.common.serialization.StringDeserializer;\n import org.joda.time.Duration;\n import org.joda.time.Instant;\n\n public class KafkaReadTopics {\n\n public static Pipeline createPipeline(Options options) {\n String topic1 = options.getTopic1();\n String topic2 = options.getTopic2();\n\n // Build the pipeline.\n var pipeline = Pipeline.create(options);\n var allTopics = pipeline\n .apply(KafkaIO.\u003cLong, String\u003eread()\n .withTopics(List.of(topic1, topic2))\n .withBootstrapServers(options.getBootstrapServer())\n .withKeyDeserializer(LongDeserializer.class)\n .withValueDeserializer(StringDeserializer.class)\n .withMaxReadTime(Duration.standardSeconds(10))\n .withStartReadTime(Instant.EPOCH)\n );\n\n // Create separate pipeline branches for each topic.\n // The first branch filters on topic1.\n allTopics\n .apply(Filter.by(record -\u003e record.getTopic().equals(topic1)))\n .apply(MapElements\n .into(TypeDescriptors.strings())\n .via(record -\u003e record.getKV().getValue()))\n .apply(TextIO.write()\n .to(topic1)\n .withSuffix(\".txt\")\n .withNumShards(1)\n );\n\n // The second branch filters on topic2.\n allTopics\n .apply(Filter.by(record -\u003e record.getTopic().equals(topic2)))\n .apply(MapElements\n .into(TypeDescriptors.strings())\n .via(record -\u003e record.getKV().getValue()))\n .apply(TextIO.write()\n .to(topic2)\n .withSuffix(\".txt\")\n .withNumShards(1)\n );\n return pipeline;\n }\n }\n\n### Python\n\n\nTo authenticate to Dataflow, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n import argparse\n\n import apache_beam as beam\n\n from apache_beam.io.kafka import ReadFromKafka\n from apache_beam.io.textio 
Python

To authenticate to Dataflow, set up Application Default Credentials. For more information, see Set up authentication for a local development environment (/docs/authentication/set-up-adc-local-dev-environment).

    import argparse

    import apache_beam as beam

    from apache_beam.io.kafka import ReadFromKafka
    from apache_beam.io.textio import WriteToText
    from apache_beam.options.pipeline_options import PipelineOptions


    def read_from_kafka() -> None:
        # Parse the pipeline options passed into the application. Example:
        #   --bootstrap_server=$BOOTSTRAP_SERVER --output=$STORAGE_BUCKET --streaming
        # For more information, see
        # https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options
        class MyOptions(PipelineOptions):
            @staticmethod
            def _add_argparse_args(parser: argparse.ArgumentParser) -> None:
                parser.add_argument('--bootstrap_server')
                parser.add_argument('--output')

        options = MyOptions()
        with beam.Pipeline(options=options) as pipeline:
            # Read from two Kafka topics.
            all_topics = pipeline | ReadFromKafka(
                consumer_config={"bootstrap.servers": options.bootstrap_server},
                topics=["topic1", "topic2"],
                with_metadata=True,
                max_num_records=10,
                start_read_time=0
            )

            # Filter messages from one topic into one branch of the pipeline.
            (all_topics
                | beam.Filter(lambda message: message.topic == 'topic1')
                | beam.Map(lambda message: message.value.decode('utf-8'))
                | "Write topic1" >> WriteToText(
                    file_path_prefix=options.output + '/topic1/output',
                    file_name_suffix='.txt',
                    num_shards=1))

            # Filter messages from the other topic.
            (all_topics
                | beam.Filter(lambda message: message.topic == 'topic2')
                | beam.Map(lambda message: message.value.decode('utf-8'))
                | "Write topic2" >> WriteToText(
                    file_path_prefix=options.output + '/topic2/output',
                    file_name_suffix='.txt',
                    num_shards=1))

What's next

To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser (/docs/samples?product=dataflow).
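As a usage sketch for the Python sample, assuming the code is saved as read_kafka_topics.py and calls read_from_kafka() under an `if __name__ == '__main__'` guard (the file name, project, region, bucket, and broker values below are placeholders, not part of the published sample), the pipeline could be launched on the Dataflow service with the standard Beam runner options plus the sample's own flags:

    python read_kafka_topics.py \
        --runner=DataflowRunner \
        --project=PROJECT_ID \
        --region=REGION \
        --temp_location=gs://BUCKET_NAME/temp \
        --streaming \
        --bootstrap_server=BROKER_HOST:9092 \
        --output=gs://BUCKET_NAME/kafka-output

Note that max_num_records=10 in the sample bounds the read so the job finishes after a few records; for a continuously running pipeline you would typically omit that argument.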