Variant Transforms tool

Variant Transforms is an open-source tool used with Cloud Life Sciences. It is built on Apache Beam and runs its pipelines on Dataflow.

You can use Variant Transforms to transform and load hundreds of thousands of files, millions of samples, and billions of records in a scalable manner. You can use the Variant Transforms preprocessor to validate VCF files and identify inconsistencies.

The typical workflow for using the tool consists of the following steps:

  1. Storing raw VCF files in Cloud Storage.
  2. Using the Variant Transforms tool to load the VCF files from Cloud Storage into BigQuery.
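
The second step can be sketched as a small Python helper that assembles the command line for a load job. This is a minimal sketch, not the tool's official interface. The `vcf_to_bq` command and its flags are written as they appear in the Variant Transforms project documentation, so verify them against the version you run. The bucket, project, dataset, and table names are hypothetical placeholders. The helper only builds and prints the arguments; it does not start a Dataflow job.

```python
import shlex


def build_vcf_to_bq_args(input_pattern, output_table, temp_location,
                         job_name="vcf-to-bigquery"):
    """Return the argument list for a vcf_to_bq load job on Dataflow.

    Assumption: flag names follow the Variant Transforms documentation;
    check them against the release you actually use.
    """
    # Both the VCF input and the temporary files are expected in Cloud Storage.
    for label, path in (("input_pattern", input_pattern),
                        ("temp_location", temp_location)):
        if not path.startswith("gs://"):
            raise ValueError(f"{label} must be a Cloud Storage path (gs://...)")

    return [
        "vcf_to_bq",
        "--input_pattern", input_pattern,
        "--output_table", output_table,
        "--temp_location", temp_location,
        "--job_name", job_name,
        "--runner", "DataflowRunner",
    ]


# Hypothetical bucket, project, dataset, and table names.
args = build_vcf_to_bq_args(
    input_pattern="gs://example-bucket/vcf/*.vcf",
    output_table="example-project:example_dataset.example_variants",
    temp_location="gs://example-bucket/temp",
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(args))
```

Building the arguments as a list rather than a single string keeps the wildcard in the input pattern from being expanded by a local shell before the tool sees it.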

You can then use BigQuery to analyze the variants.

To understand how the tool loads VCF files into BigQuery tables, see the BigQuery variants schema.
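
As an illustration of analyzing the loaded variants, the following sketch builds a query that counts rows per reference sequence. It assumes the output table has a `reference_name` column, as described in the BigQuery variants schema. The table ID is a hypothetical placeholder. The helper only produces the SQL text, which you could run in the BigQuery console or through a client library.

```python
import re

# Accepts IDs of the form project.dataset.table (standard SQL style).
_TABLE_ID = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_]+$")


def variants_per_reference_query(table_id):
    """Return SQL that counts variant rows for each reference_name.

    Assumption: the table was produced by Variant Transforms and has a
    reference_name column, as in the BigQuery variants schema.
    """
    # Validate the ID before interpolating it into the query text.
    if not _TABLE_ID.match(table_id):
        raise ValueError("table_id must look like project.dataset.table")

    return (
        "SELECT reference_name, COUNT(*) AS variant_count\n"
        f"FROM `{table_id}`\n"
        "GROUP BY reference_name\n"
        "ORDER BY variant_count DESC"
    )


# Hypothetical table ID.
query = variants_per_reference_query(
    "example-project.example_dataset.example_variants"
)
print(query)
```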