How to run inference workloads from a Dataflow Java pipeline
Danny McCormick
Software Engineer, Google Cloud
Many data engineers desire the ability to more easily run their python-based machine learning workloads from Java or Go pipelines. There are a number of reasons why users want to do this: they might already have a pipeline written in a language that they want to enrich with ML, they might have access to special sources or transforms written in that language, or they might just prefer working in Java or Go. Whatever the reason, with Dataflow's multi-language capabilities, running Python ML workloads is now easy.
There are 2 main ways of running multi-language inference in Dataflow:
1) Run using the Java or Go RunInference transforms. If you have a Python model developed with frameworks like TensorFlow, PyTorch, or Sklearn and you want to run inference with that model while keeping most of your pre/post-processing in Java or Go, you can use the Java or Go RunInference transforms. Under the covers, Dataflow will automatically spin up a Python worker to run your inference.
The RunInference transform will automatically handle most of your boilerplate code like batching your inputs, efficiently sharing the model, and loading your model from a remote filesystem or model hub. As of Beam 2.47, RunInference has full support for TensorFlow, PyTorch, Sklearn, XGBoost, ONNX, and TensorRT models. You can also build your own custom model handler to support other frameworks. For a full example pipeline using the Java RunInference transform, see https://github.com/apache/beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/multilanguage/SklearnMnistClassification.java.
2) Create a custom Python transform to do all of your pre/postprocessing and inference (this can still leverage the RunInference transform). This can be invoked using an External Transform and allows you to use your favorite Python libraries for pre/post-processing.
For a full example walkthrough using an External Transform, see https://beam.apache.org/documentation/ml/multi-language-inference/.
Running inference from Java or Go at scale is easy with Dataflow. Pipelines that make use of this can unlock the full potential of python machine learning frameworks without overhauling existing logic or losing access to Java APIs.
Learn More
If you're interested in learning more, you can try walking through an end to end example of performing inference from a Java pipeline in the apache/beam repository.