Variant Transforms Tool

Variant Transforms is an open-source tool for use with Cloud Genomics. It is built on Apache Beam and runs its pipelines on Cloud Dataflow.

The tool can transform and load hundreds of thousands of files, millions of samples, and billions of records in a scalable manner. It also includes a preprocessor that you can use to validate VCF files and identify inconsistencies.

A typical workflow with the tool consists of the following steps:

  1. Storing raw VCF files in Cloud Storage.
  2. Using the Variant Transforms tool to load the VCF files from Cloud Storage into BigQuery.
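
The load step above can be sketched as a small Python helper that assembles the command line for the tool's vcf_to_bq pipeline. This is a minimal sketch, not the tool's official interface: the flag names (--input_pattern, --output_table, --temp_location, --job_name, --runner) are assumed from the tool's public README and should be checked against the version you run, and the bucket, project, dataset, and table names are hypothetical placeholders.

```python
import shlex


def build_vcf_to_bq_command(input_pattern, output_table, temp_location,
                            job_name="vcf-to-bigquery"):
    """Return an argument list for a vcf_to_bq run (flag names are assumed)."""
    # The workflow stores raw VCF files in Cloud Storage, so reject local paths.
    if not input_pattern.startswith("gs://"):
        raise ValueError("input_pattern must be a Cloud Storage path (gs://...)")
    return [
        "vcf_to_bq",
        "--input_pattern", input_pattern,   # VCF files in Cloud Storage
        "--output_table", output_table,     # destination BigQuery table
        "--temp_location", temp_location,   # scratch space for Dataflow
        "--job_name", job_name,
        "--runner", "DataflowRunner",       # run the Beam pipeline on Dataflow
    ]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    command = build_vcf_to_bq_command(
        input_pattern="gs://my-bucket/vcfs/*.vcf",
        output_table="my-project:genomics.variants",
        temp_location="gs://my-bucket/temp",
    )
    # Print a shell-quoted command; nothing is executed here.
    print(shlex.join(command))
```

The helper only builds and prints the command, so it can be reviewed before it is run in whatever environment hosts the tool.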

You can then use BigQuery to analyze the variants.
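
As an illustration of such an analysis, the following sketch builds a BigQuery standard SQL query that counts loaded records per reference sequence. It assumes the output table has a reference_name column, which should be confirmed against the BigQuery variants schema; the table ID is a hypothetical placeholder, and the helper name is invented for this example.

```python
import re

# Accepts IDs of the form project.dataset.table (a deliberately strict check).
_TABLE_ID = re.compile(r"^[A-Za-z0-9_\-]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_]+$")


def build_record_count_query(table_id):
    """Return SQL that counts rows per reference_name (assumed column)."""
    # Validate the ID before interpolating it into the query text.
    if not _TABLE_ID.match(table_id):
        raise ValueError("table_id must look like project.dataset.table")
    return (
        "SELECT reference_name, COUNT(*) AS record_count\n"
        f"FROM `{table_id}`\n"
        "GROUP BY reference_name\n"
        "ORDER BY reference_name"
    )


if __name__ == "__main__":
    # Hypothetical table ID; replace with your own project, dataset, and table.
    print(build_record_count_query("my-project.genomics.variants"))
```

The resulting query text can be pasted into the BigQuery console or passed to a BigQuery client library.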

For information on how the tool loads VCF files into BigQuery tables, see the BigQuery variants schema.

