The Apache Beam Programming Guide, available on the Apache Beam website, walks you through the basic concepts of building pipelines with the Apache Beam SDKs. These concepts include:
- PCollections - the PCollection abstraction represents a potentially distributed, multi-element data set that acts as the pipeline's data. Beam transforms use PCollection objects as inputs and outputs.
- Transforms - these are the operations in your pipeline. A transform takes one or more PCollections as input, performs an operation that you specify on the elements of that input, and produces a new output PCollection.
- Pipeline I/O - Beam provides read and write transforms for a number of common data storage types, and also lets you create your own.