本頁面由 Cloud Translation API 翻譯而成。

使用 Dataflow 將 Pub/Sub 內容傳送至 Cloud Storage

使用 Dataflow 將 Pub/Sub 訊息串流至 Cloud Storage。

深入探索

如需包含這個程式碼範例的詳細說明文件，請參閱下列內容：

使用 Dataflow 和 Cloud Storage 從 Pub/Sub 串流訊息

程式碼範例

Java

在試用這個範例之前，請先按照Java「Pub/Sub 快速入門導覽課程：使用用戶端程式庫」中的操作說明進行設定。詳情請參閱 Pub/Sub Java API 參考說明文件。

如要驗證 Pub/Sub，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證」。


import java.io.IOException;
import org.apache.beam.examples.common.WriteOneFilePerWindow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.Default;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.StreamingOptions;
import org.apache.beam.sdk.options.Validation.Required;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.joda.time.Duration;

public class PubSubToGcs {
  /*
   * Define your own configuration options. Add your own arguments to be processed
   * by the command-line parser, and specify default values for them.
   */
  public interface PubSubToGcsOptions extends StreamingOptions {
    @Description("The Cloud Pub/Sub topic to read from.")
    @Required
    String getInputTopic();

    void setInputTopic(String value);

    @Description("Output file's window size in number of minutes.")
    @Default.Integer(1)
    Integer getWindowSize();

    void setWindowSize(Integer value);

    @Description("Path of the output file including its filename prefix.")
    @Required
    String getOutput();

    void setOutput(String value);
  }

  public static void main(String[] args) throws IOException {
    // The maximum number of shards when writing output.
    int numShards = 1;

    PubSubToGcsOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(PubSubToGcsOptions.class);

    options.setStreaming(true);

    Pipeline pipeline = Pipeline.create(options);

    pipeline
        // 1) Read string messages from a Pub/Sub topic.
        .apply("Read PubSub Messages", PubsubIO.readStrings().fromTopic(options.getInputTopic()))
        // 2) Group the messages into fixed-sized minute intervals.
        .apply(Window.into(FixedWindows.of(Duration.standardMinutes(options.getWindowSize()))))
        // 3) Write one file to GCS for every window of messages.
        .apply("Write Files to GCS", new WriteOneFilePerWindow(options.getOutput(), numShards));

    // Execute the pipeline and wait until it finishes running.
    pipeline.run().waitUntilFinish();
  }
}

Python

在試用這個範例之前，請先按照Python「Pub/Sub 快速入門導覽課程：使用用戶端程式庫」中的操作說明進行設定。詳情請參閱 Pub/Sub Python API 參考說明文件。