Running the Automated Speech Recognition (ASR) model

This tutorial shows you how to train an Automated Speech Recognition (ASR) model on a Cloud TPU, using Tensor2Tensor and the publicly available LibriSpeech ASR corpus.

Model description

The speech recognition model, which converts speech to text, is one of the models in the Tensor2Tensor library. Tensor2Tensor (T2T) is a library of deep learning models and datasets, together with a set of scripts that let you train the models and download and prepare the data.

Before you begin

Before starting this tutorial, follow the steps below to check that your Google Cloud Platform project is correctly set up.

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Select or create a GCP project.

    Go to the project selector page

  3. Make sure that billing is enabled for your Google Cloud Platform project. Learn how to enable billing.

  4. This walkthrough uses billable components of Google Cloud Platform. Check the Cloud TPU pricing page to estimate your costs. Be sure to clean up resources you create when you've finished with them to avoid unnecessary charges.

Set up your resources

This section explains how to set up the Cloud Storage, VM, and Cloud TPU resources for this tutorial.

Create a Cloud Storage bucket

You need a Cloud Storage bucket to store the data you use to train your model and the training results. The ctpu up tool used in this tutorial sets up default permissions for the Cloud TPU service account. If you want finer-grained permissions, review the access level permissions.

The bucket location must be in the same region as your virtual machine (VM) and your TPU node. VMs and TPU nodes are located in specific zones, which are subdivisions within a region.

  1. Go to the Cloud Storage page on the GCP Console.

  2. Create a new bucket, specifying the following options:

    • A unique name of your choosing.
    • Default storage class: Standard
    • Location: Specify a bucket location in the same region where you plan to create your TPU node. See TPU types and zones to learn where various TPU types are available.
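
If you prefer the command line, a bucket with the same options can be created with gsutil. The sketch below only assembles and prints the command so that you can review it first. The region us-central1 is an assumed example value, not something this tutorial prescribes; it must match the region of your VM and TPU node.

```shell
# Assumed example values; replace them with your own.
BUCKET_NAME="YOUR-BUCKET-NAME"
BUCKET_REGION="us-central1"   # assumption: must match your VM and TPU region

# gsutil mb: -l sets the bucket location, -c sets the default storage class.
CREATE_BUCKET_CMD="gsutil mb -l ${BUCKET_REGION} -c standard gs://${BUCKET_NAME}"

# Print the command for review before running it.
echo "${CREATE_BUCKET_CMD}"
```

After checking the printed command, paste it into Cloud Shell to create the bucket.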

Use the ctpu tool

This section demonstrates how to use the Cloud TPU provisioning tool (ctpu) to create and manage Cloud TPU project resources. The resources consist of a virtual machine (VM) and a Cloud TPU resource that share the same name. These resources must reside in the same region/zone as the bucket you just created.

You can also set up your VM and TPU resources using gcloud commands or through the Cloud Console. See the creating and deleting TPUs page to learn all the ways you can set up and manage your Compute Engine VM and Cloud TPU resources.

Run ctpu up to create resources

  1. Open a Cloud Shell window.

  2. Run gcloud config set project <Your-Project> to use the project where you want to create Cloud TPU.

  3. Run ctpu up, specifying the flags shown for either a Cloud TPU device or a Pod slice. Refer to the CTPU Reference for flag options and descriptions.

  4. Set up a Cloud TPU device:

    $ ctpu up 

    The following configuration message appears:

    ctpu will use the following configuration:
    Name: [your TPU's name]
    Zone: [your project's zone]
    GCP Project: [your project's name]
    TensorFlow Version: 1.14
     Machine Type: [your machine type]
     Disk Size: [your disk size]
     Preemptible: [true or false]
    Cloud TPU:
     Size: [your TPU size]
     Preemptible: [true or false]
    OK to create your Cloud TPU resources with the above configuration? [Yn]:

    Press y to create your Cloud TPU resources.

The ctpu up command creates a virtual machine (VM) and Cloud TPU services.

From this point on, a prefix of (vm)$ means you should run the command on the Compute Engine VM instance.

Verify your Compute Engine VM

When the ctpu up command has finished executing, verify that your shell prompt has changed from username@project to username@tpuname. This change shows that you are now logged into your Compute Engine VM.
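
Besides reading the prompt, you can confirm where a shell is running by printing the host name. The sketch below also prints TPU_NAME, on the assumption that this variable is exported in the VM session; if it is not set in your environment, the function reports that instead. The function name describe_session is hypothetical.

```shell
# Print the host name and the TPU name visible to this shell.
describe_session() {
  echo "host: $(uname -n)"
  # Assumption: TPU_NAME is exported on the VM; it may be unset elsewhere.
  echo "tpu_name: ${TPU_NAME:-not set}"
}

describe_session
```

On the VM, the host line should show the name of your TPU resources rather than a Cloud Shell host name.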

Add disk space to your VM

T2T conveniently packages data generation for many common open-source datasets in its t2t-datagen script. The script downloads the data, preprocesses it, and makes it ready for training. To do so, it needs local disk space.

You can skip this step if you used ctpu up to create your Compute Engine VM since it provides 250 GB of disk space for your VM. If you set up your Compute Engine VM using gcloud commands or the Cloud Console, and did not specify the VM disk size to be at least 200 GB, follow the instructions below.

  • Follow the Compute Engine guide to add a disk to your Compute Engine VM.
  • Set the disk size to 200 GB (the recommended minimum size).
  • Set When deleting instance to Delete disk to ensure that the disk is removed when you remove the VM.

Make a note of the path to your new disk. For example: /mnt/disks/mnt-dir.
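
To see how much space is actually free before generating data, you can query the file system that holds your temporary directory. This is a minimal sketch: the function name free_gb and the CHECK_PATH variable are hypothetical, and the 200 GB figure in this tutorial is a recommended disk size, so the comparison is only a rough guide.

```shell
# Print the free space, in whole gigabytes, of the file system holding a path.
free_gb() {
  # df -P prints one POSIX-format line per file system; -k reports 1024-byte blocks.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

REQUIRED_GB=200                    # recommended minimum disk size from this tutorial
CHECK_PATH="${CHECK_PATH:-/tmp}"   # assumption: use /mnt/disks/mnt-dir if you added a disk

AVAILABLE_GB="$(free_gb "${CHECK_PATH}")"
if [ "${AVAILABLE_GB}" -ge "${REQUIRED_GB}" ]; then
  echo "OK: ${AVAILABLE_GB} GB free on ${CHECK_PATH}"
else
  echo "Only ${AVAILABLE_GB} GB free on ${CHECK_PATH}; ${REQUIRED_GB} GB is the recommended disk size"
fi
```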

Generate the training and evaluation datasets

On your Compute Engine VM:

  1. Create the following environment variables for directories:

    (vm)$ STORAGE_BUCKET=gs://YOUR-BUCKET-NAME
    (vm)$ DATA_DIR=$STORAGE_BUCKET/data/
    (vm)$ OUT_DIR=$STORAGE_BUCKET/training/transformer_asr/
    (vm)$ TMP_DIR=YOUR-TMP_DIRECTORY

    • YOUR-BUCKET-NAME is the name of your Cloud Storage bucket.
    • DATA_DIR is a location on Cloud Storage that holds the training and evaluation data.
    • OUT_DIR specifies the directory where checkpoints and summaries are stored during model training. The training/transformer_asr/ path shown here is only an example name; any folder in your bucket works. If the folder is missing, the program creates one. When using a Cloud TPU, the output_dir must be a Cloud Storage path (gs://...). You can reuse an existing folder to load current checkpoint data and to store additional checkpoints.
    • YOUR-TMP_DIRECTORY is a location used to store temporary data. If you added a disk to your Compute Engine VM, this is a location on the added disk (for example, /mnt/disks/mnt-dir/t2t_tmp). Otherwise, it is a temporary directory on your VM (for example, /tmp/t2t_tmp).
  2. If you added a new disk to your Compute Engine VM, create a temporary directory on the added disk.

    (vm)$ mkdir -p $TMP_DIR

  3. Use the t2t-datagen script to generate both the full dataset and the small clean version, which you will use for evaluation.

    The audio import in t2t-datagen uses sox to generate normalized waveforms, so first install sox on the machine where you run t2t-datagen (for example, apt-get install sox), and then run the following commands.

    (vm)$ t2t-datagen --problem=librispeech --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR
    (vm)$ t2t-datagen --problem=librispeech_clean --data_dir=$DATA_DIR --tmp_dir=$TMP_DIR
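
Because data generation is long-running, it can help to confirm the prerequisites from the steps above before starting it. The following is a minimal sketch, not part of T2T; the function name check_datagen_prereqs is hypothetical. It checks only that sox and t2t-datagen are on the PATH and that DATA_DIR and TMP_DIR are non-empty.

```shell
# Report missing tools and unset variables; return non-zero if anything is missing.
check_datagen_prereqs() {
  status=0
  for tool in sox t2t-datagen; do
    if ! command -v "${tool}" >/dev/null 2>&1; then
      echo "missing tool: ${tool}"
      status=1
    fi
  done
  for name in DATA_DIR TMP_DIR; do
    # Look up the variable whose name is stored in ${name}.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "unset variable: ${name}"
      status=1
    fi
  done
  return "${status}"
}

if check_datagen_prereqs; then
  echo "Ready to run t2t-datagen"
fi
```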

The problem librispeech_train_full_test_clean trains on the full dataset but evaluates on the clean dataset.

You can also use librispeech_clean_small, which is a small version of the clean dataset.

You can view the data on Cloud Storage by going to the Google Cloud Platform Console and choosing Storage from the left-hand menu. Click the name of the bucket that you created for this tutorial.

Training the model

To train a model on Cloud TPU, first run the trainer with large batches and truncated sequences.

(vm)$ t2t-trainer \
  --model=transformer \
  --hparams_set=transformer_librispeech_tpu \
  --problem=librispeech_train_full_test_clean \
  --train_steps=210000 \
  --eval_steps=3 \
  --local_eval_frequency=100 \
  --data_dir=$DATA_DIR \
  --output_dir=$OUT_DIR \
  --use_tpu \
  --cloud_tpu_name=$TPU_NAME

After this step completes, run the training again for more steps with a smaller batch size and full sequences. This training takes approximately 11 hours on a v3-8 TPU node.

(vm)$ t2t-trainer \
  --model=transformer \
  --hparams_set=transformer_librispeech_tpu \
  --hparams=max_length=295650,max_input_seq_length=3650,max_target_seq_length=650,batch_size=6 \
  --problem=librispeech_train_full_test_clean \
  --train_steps=230000 \
  --eval_steps=3 \
  --local_eval_frequency=100 \
  --data_dir=$DATA_DIR \
  --output_dir=$OUT_DIR \
  --use_tpu \
  --cloud_tpu_name=$TPU_NAME
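
The two training phases differ only in --train_steps and the extra --hparams override. The sketch below uses a hypothetical helper, trainer_command, that prints the command for each phase so the difference is easy to see; it does not start training. If --train_steps counts total steps from the start of training, as the increase from 210000 to 230000 suggests, the second phase adds 20000 steps on top of the checkpoints already in OUT_DIR. The default values for DATA_DIR and OUT_DIR are placeholders so that the sketch runs on its own.

```shell
# Placeholder defaults (assumptions) used only when the variables are not already set.
DATA_DIR="${DATA_DIR:-gs://YOUR-BUCKET-NAME/data/}"
OUT_DIR="${OUT_DIR:-gs://YOUR-BUCKET-NAME/YOUR-OUTPUT-PATH/}"

# Print the t2t-trainer command for one training phase.
# $1 is the value for --train_steps; any further arguments are appended as extra flags.
trainer_command() {
  steps="$1"
  shift
  echo t2t-trainer \
    --model=transformer \
    --hparams_set=transformer_librispeech_tpu \
    --problem=librispeech_train_full_test_clean \
    --train_steps="${steps}" \
    --eval_steps=3 \
    --local_eval_frequency=100 \
    --data_dir="${DATA_DIR}" \
    --output_dir="${OUT_DIR}" \
    --use_tpu \
    "$@"
}

# Phase 1: large batches, truncated sequences.
trainer_command 210000
# Phase 2: smaller batches, full sequences.
trainer_command 230000 \
  --hparams=max_length=295650,max_input_seq_length=3650,max_target_seq_length=650,batch_size=6
```

Append any further flags that your setup requires as additional arguments to trainer_command.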

Clean up

To avoid incurring charges to your GCP account for the resources used in this topic:

  1. Disconnect from the Compute Engine VM:

    (vm)$ exit

    Your prompt should now be user@projectname, showing you are in the Cloud Shell.

  2. In your Cloud Shell, run ctpu delete with the --zone flag you used when you set up the Cloud TPU to delete your Compute Engine VM and your Cloud TPU:

    $ ctpu delete [optional: --zone]
  3. Run ctpu status to make sure you have no instances allocated to avoid unnecessary charges for TPU usage. The deletion might take several minutes. A response like the one below indicates there are no more allocated instances:

    2018/04/28 16:16:23 WARNING: Setting zone to "us-central1-b"
    No instances currently exist.
            Compute Engine VM:     --
            Cloud TPU:             --
  4. Run gsutil as shown, replacing YOUR-BUCKET-NAME with the name of the Cloud Storage bucket you created for this tutorial:

    $ gsutil rm -r gs://YOUR-BUCKET-NAME
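
Because gsutil rm -r removes the bucket and everything in it, it is worth guarding against an empty or mistyped bucket name. The following is a minimal sketch with a hypothetical helper, bucket_delete_command, that prints the removal command only when its argument looks like a gs:// URL; it does not delete anything itself.

```shell
# Print the removal command only if the argument looks like a bucket URL.
bucket_delete_command() {
  case "$1" in
    gs://?*)
      echo "gsutil rm -r $1"
      ;;
    *)
      echo "not a gs:// bucket URL: '$1'" >&2
      return 1
      ;;
  esac
}

bucket_delete_command "gs://YOUR-BUCKET-NAME"
```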
