Dataflow ML combines the power of Dataflow with Apache Beam's RunInference API.
With the RunInference API, you define the model's characteristics and properties and pass that configuration to the RunInference transform. This feature lets you run a model within a Dataflow pipeline without needing to know the model's implementation details. You can choose the framework that best suits your data, such as TensorFlow or PyTorch.
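For illustration, the following is a minimal sketch of this pattern using the Python SDK's PyTorch model handler. The LinearRegression class, the Cloud Storage path, and the input values are hypothetical placeholders; substitute the model handler that matches your framework.

```python
# Minimal sketch: configure a model handler, then pass it to RunInference.
# The model class and state_dict_path value are placeholders.
import apache_beam as beam
import torch
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor


class LinearRegression(torch.nn.Module):
    """Stand-in model; replace with your own torch.nn.Module."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)


# The model handler captures the model's characteristics and properties.
model_handler = PytorchModelHandlerTensor(
    state_dict_path='gs://your-bucket/linear_regression.pth',  # placeholder
    model_class=LinearRegression,
    model_params={})

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | 'CreateExamples' >> beam.Create([torch.Tensor([1.0]), torch.Tensor([2.0])])
        | 'RunInference' >> RunInference(model_handler)
        | 'PrintPredictions' >> beam.Map(print))
```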
Run multiple models in a pipeline
Use the RunInference transform to add multiple inference models to your Dataflow pipeline. For more information, including code details, see Multi-model pipelines in the Apache Beam documentation.
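As a rough sketch of one multi-model pattern (a fan-out), a single input PCollection can be sent to two RunInference transforms. Here, model_handler_a and model_handler_b are assumed to be model handlers configured as in the previous sketch, and input_tensors is a placeholder for your input elements; the linked Apache Beam page covers this and other patterns in detail.

```python
# Sketch: one input PCollection fanned out to two RunInference transforms.
# model_handler_a, model_handler_b, and input_tensors are assumed to be defined
# elsewhere (for example, two configured PytorchModelHandlerTensor instances
# and a list of torch.Tensor inputs).
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference

with beam.Pipeline() as pipeline:
    examples = pipeline | 'CreateExamples' >> beam.Create(input_tensors)
    predictions_a = examples | 'RunModelA' >> RunInference(model_handler_a)
    predictions_b = examples | 'RunModelB' >> RunInference(model_handler_b)
```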
Use GPUs with Dataflow
For batch or streaming pipelines that require accelerators, you can run Dataflow pipelines on NVIDIA GPU devices. For more information, see Run a Dataflow pipeline with GPUs.
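As an illustrative sketch only, GPU workers are typically requested through pipeline options when launching the job. The project, region, bucket, and accelerator values below are placeholders, and the exact worker_accelerator syntax and supported accelerator types should be confirmed in the linked GPU guide for your SDK version and region.

```python
# Sketch: requesting NVIDIA GPU workers through Dataflow pipeline options.
# All values below are placeholders; confirm supported worker_accelerator
# settings in the GPU guide before running.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=DataflowRunner',
    '--project=your-project',                # placeholder
    '--region=us-central1',                  # placeholder
    '--temp_location=gs://your-bucket/tmp',  # placeholder
    '--dataflow_service_options='
    'worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver',
])
```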
Troubleshoot Dataflow ML
This section provides troubleshooting strategies and links that you might find helpful when using Dataflow ML.
Stack expects each tensor to be equal size
If you provide images of different sizes or word embeddings of different lengths when using the RunInference API, the following error might occur:
File "/beam/sdks/python/apache_beam/ml/inference/pytorch_inference.py", line 232, in run_inference batched_tensors = torch.stack(key_to_tensor_list[key]) RuntimeError: stack expects each tensor to be equal size, but got [12] at entry 0 and [10] at entry 1 [while running 'PyTorchRunInference/ParDo(_RunInferenceDoFn)']
This error occurs because the RunInference API can't batch tensor elements of different sizes. For workarounds, see Unable to batch tensor elements in the Apache Beam documentation.
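One workaround described on that page is to stop RunInference from batching elements together. A sketch of that approach for the PyTorch handler follows; the subclass name is hypothetical.

```python
# Sketch: limit batches to a single element so tensors of different sizes are
# never stacked together. Overriding batch_elements_kwargs on the model handler
# is one of the workarounds described in the Apache Beam documentation.
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor


class SingleElementModelHandler(PytorchModelHandlerTensor):
    def batch_elements_kwargs(self):
        # These kwargs are passed to BatchElements; a max batch size of 1
        # avoids torch.stack over tensors with mismatched shapes.
        return {'max_batch_size': 1}
```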
Avoid out-of-memory errors with large models
When you load a medium or large ML model, your machine might run out of memory.
Dataflow provides tools to help you avoid out-of-memory (OOM) errors when loading ML models. Use the following table to determine the appropriate approach for your scenario; an illustrative sketch follows the table.
Scenario: Your model is small enough to fit in memory.
Solution: Use the RunInference transform without any additional configuration. The RunInference transform shares the model across threads. If your machine can fit one model per CPU core, your pipeline can use the default configuration.
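For models that don't fit that description, recent Apache Beam releases add a large_model flag to the built-in model handlers that loads a single copy of the model per worker and shares it across worker processes. Whether this flag is available depends on your SDK version, so treat the following as a hedged sketch rather than a definitive configuration; the path is a placeholder and LinearRegression refers to the placeholder class from the first sketch.

```python
# Hedged sketch: large_model asks Beam to load one copy of the model per worker
# and share it across processes instead of loading a copy per process.
# Availability depends on the Apache Beam SDK version; values are placeholders.
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor

large_model_handler = PytorchModelHandlerTensor(
    state_dict_path='gs://your-bucket/large_model.pth',  # placeholder
    model_class=LinearRegression,  # placeholder class from the first sketch
    model_params={},
    large_model=True)
```

For more information about memory management with Dataflow, see Troubleshoot Dataflow out of memory errors.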
[[["이해하기 쉬움","easyToUnderstand","thumb-up"],["문제가 해결됨","solvedMyProblem","thumb-up"],["기타","otherUp","thumb-up"]],[["이해하기 어려움","hardToUnderstand","thumb-down"],["잘못된 정보 또는 샘플 코드","incorrectInformationOrSampleCode","thumb-down"],["필요한 정보/샘플이 없음","missingTheInformationSamplesINeed","thumb-down"],["번역 문제","translationIssue","thumb-down"],["기타","otherDown","thumb-down"]],["최종 업데이트: 2025-09-04(UTC)"],[[["\u003cp\u003eDataflow ML facilitates both prediction and inference pipelines, as well as data preparation for training ML models.\u003c/p\u003e\n"],["\u003cp\u003eDataflow ML supports both batch and streaming data pipelines, utilizing the \u003ccode\u003eRunInference\u003c/code\u003e API (from Apache Beam 2.40.0) and \u003ccode\u003eMLTransform\u003c/code\u003e API (from Apache Beam 2.53.0).\u003c/p\u003e\n"],["\u003cp\u003eThe system is compatible with model handlers for popular frameworks like PyTorch, scikit-learn, TensorFlow, ONNX, and TensorRT, with options for custom handlers for other frameworks.\u003c/p\u003e\n"],["\u003cp\u003eDataflow ML enables the use of multiple inference models within a single pipeline via the \u003ccode\u003eRunInference\u003c/code\u003e transform and supports the use of GPUs for pipelines that need them.\u003c/p\u003e\n"],["\u003cp\u003eDataflow ML also provides troubleshooting guidance for common issues, including tensor size mismatch errors and out-of-memory errors when dealing with large models.\u003c/p\u003e\n"]]],[],null,["# About Dataflow ML\n\nYou can use Dataflow ML's scale data processing abilities for\n[prediction and inference pipelines](#prediction) and for\n[data preparation for training](#data-prep).\n\n**Figure 1.** The complete Dataflow ML workflow.\n\nRequirements and limitations\n----------------------------\n\n- Dataflow ML supports batch and streaming pipelines.\n- The `RunInference` API is supported in Apache Beam 2.40.0 and later versions.\n- The `MLTransform` API is supported in Apache Beam 2.53.0 and later versions.\n- Model handlers are available for PyTorch, scikit-learn, TensorFlow, ONNX, and TensorRT. For unsupported frameworks, you can use a custom model handler.\n\nData preparation for training\n-----------------------------\n\n- Use the `MLTransform` feature to prepare your data for training ML models. For\n more information, see\n [Preprocess data with `MLTransform`](/dataflow/docs/machine-learning/ml-preprocess-data).\n\n- Use Dataflow with ML-OPS frameworks, such as\n [Kubeflow Pipelines](https://www.kubeflow.org/docs/components/pipelines/v1/introduction/)\n (KFP) or [TensorFlow Extended](https://www.tensorflow.org/tfx) (TFX).\n To learn more, see [Dataflow ML in ML workflows](/dataflow/docs/machine-learning/ml-data).\n\nPrediction and inference pipelines\n----------------------------------\n\nDataflow ML combines the power of Dataflow with\nApache Beam's\n[`RunInference` API](https://beam.apache.org/documentation/ml/about-ml/).\nWith the `RunInference` API, you define the model's characteristics and properties\nand pass that configuration to the `RunInference` transform. This feature\nallows users to run the model within their\nDataflow pipelines without needing to know\nthe model's implementation details. You can choose the framework that best\nsuits your data, such as TensorFlow and PyTorch.\n\nRun multiple models in a pipeline\n---------------------------------\n\nUse the `RunInference` transform to add multiple inference models to\nyour Dataflow pipeline. 
For more information, including code details,\nsee [Multi-model pipelines](https://beam.apache.org/documentation/ml/about-ml/#multi-model-pipelines)\nin the Apache Beam documentation.\n\nBuild a cross-language pipeline\n-------------------------------\n\nTo use RunInference with a Java pipeline,\n[create a cross-language Python transform](https://beam.apache.org/documentation/programming-guide/#1312-creating-cross-language-python-transforms). The pipeline calls the\ntransform, which does the preprocessing, postprocessing, and inference.\n\nFor detailed instructions and a sample pipeline, see\n[Using RunInference from the Java SDK](https://beam.apache.org/documentation/ml/multi-language-inference/).\n\nUse GPUs with Dataflow\n----------------------\n\nFor batch or streaming pipelines that require the use of accelerators, you can\nrun Dataflow pipelines on NVIDIA GPU devices. For more information, see\n[Run a Dataflow pipeline with GPUs](/dataflow/docs/gpu/use-gpus).\n\nTroubleshoot Dataflow ML\n------------------------\n\nThis section provides troubleshooting strategies and links that you might find\nhelpful when using Dataflow ML.\n\n### Stack expects each tensor to be equal size\n\nIf you provide images of different sizes or word embeddings of different lengths\nwhen using the `RunInference` API, the following error might occur: \n\n File \"/beam/sdks/python/apache_beam/ml/inference/pytorch_inference.py\", line 232, in run_inference batched_tensors = torch.stack(key_to_tensor_list[key]) RuntimeError: stack expects each tensor to be equal size, but got [12] at entry 0 and [10] at entry 1 [while running 'PyTorchRunInference/ParDo(_RunInferenceDoFn)']\n\nThis error occurs because the `RunInference` API can't batch tensor elements of\ndifferent sizes. For workarounds, see\n[Unable to batch tensor elements](https://beam.apache.org/documentation/ml/about-ml/#unable-to-batch-tensor-elements)\nin the Apache Beam documentation.\n\n### Avoid out-of-memory errors with large models\n\nWhen you load a medium or large ML model, your machine might run out of memory.\nDataflow provides tools to help avoid out-of-memory (OOM) errors\nwhen loading ML models. Use the following table to determine the appropriate\napproach for your scenario.\n\nFor more information about memory management with Dataflow, see\n[Troubleshoot Dataflow out of memory errors](/dataflow/docs/guides/troubleshoot-oom).\n\nWhat's next\n-----------\n\n- Explore the [Dataflow ML notebooks](https://github.com/apache/beam/tree/master/examples/notebooks/beam-ml) in GitHub.\n- Get in-depth information about using ML with Apache Beam in Apache Beam's [AI/ML pipelines](https://beam.apache.org/documentation/ml/overview/) documentation.\n- Learn more about the [`RunInference` API](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.html#apache_beam.ml.inference.RunInference).\n- Learn about the [metrics](https://beam.apache.org/documentation/ml/runinference-metrics/) that you can use to monitor your `RunInference` transform."]]