
Exploring TensorFlow samples in Google Cloud Datalab

Many people think designing deep learning models and training neural networks is complex and time-consuming, taking days or even weeks of work. But it doesn’t have to be. There are a number of tools you can use right now to help you quickly develop and iterate on machine learning models.

One such tool is Cloud Datalab. Although it’s more commonly associated with BigQuery, Datalab increasingly serves as a workspace for machine learning. With it, you can rapidly test out new ideas for hyper-parameters, layer order, layer type, and different datasets.

Once you start Datalab, you get a complete interactive environment in which both our samples and your code modifications are ready to run without any setup or configuration. You can simply “run all” to generate each sample’s intended output while viewing both the source code and the accompanying instructional text.

Please explore and optionally fork our Datalab notebooks repository on GitHub. To run these samples:

  1. Set up Google Cloud Datalab.

  2. Once it is running, browse to the docs/samples/TensorFlow directory.

  3. Open a TensorFlow notebook that you find interesting and click “Run” on any code block.

In the samples provided, we aim to keep each notebook’s execution time below an hour. We do so by reducing the amount of training data, lowering the number of training steps, and simplifying the prediction targets. This makes it very easy to run your own experiments by modifying the code, parameters, or training data. These examples, along with your modifications, can also serve as the starting point for your own deployment.

The following TensorFlow-oriented Datalab samples (the section headers are also links to directories in a GitHub repository containing the sample code) show you how you might build a few different models to solve real-world problems.

Basic TensorFlow RNN model trained on simulated data

This notebook demonstrates how to use TensorFlow to build a basic RNN model that can predict a future trend curve, given a set of historical time-series data. It uses simulated data: sine waves with varying frequencies and amplitudes.
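
To give a flavor of the setup, simulated waves like these take only a few lines of NumPy to produce. Here is a minimal sketch (the sequence length, frequency range, and amplitude range are illustrative values, not the notebook’s actual settings):

    import numpy as np

    def make_sine_batch(batch_size=64, seq_len=100, horizon=10):
        """Generate sine waves with random frequency, amplitude, and phase.

        Returns (history, target): the first seq_len points of each wave,
        and the horizon points that follow, which the model should predict.
        """
        t = np.arange(seq_len + horizon)
        freqs = np.random.uniform(0.02, 0.2, size=(batch_size, 1))
        amps = np.random.uniform(0.5, 2.0, size=(batch_size, 1))
        phases = np.random.uniform(0.0, 2 * np.pi, size=(batch_size, 1))
        waves = amps * np.sin(2 * np.pi * freqs * t + phases)
        return waves[:, :seq_len], waves[:, seq_len:]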

The model we train will learn the pattern and predict future data based on prior data. For example, here is a prediction result (blue is the history curve, green is the target curve, and red is the predicted curve):

[Figure: prediction result showing history, target, and predicted curves]

TensorFlow’s Estimator API is used to build the training and test graphs. The inference graph consists of an LSTM connected to a few hidden layers.
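
As a rough sketch of what such a graph can look like with the TensorFlow 1.x Estimator API (the layer sizes, prediction horizon, and optimizer settings below are illustrative assumptions, not the notebook’s actual values):

    import tensorflow as tf  # TensorFlow 1.x, matching the era of these notebooks

    def model_fn(features, labels, mode):
        # features['x']: [batch, seq_len, 1] history; labels: [batch, horizon].
        cell = tf.nn.rnn_cell.LSTMCell(num_units=64)
        outputs, _ = tf.nn.dynamic_rnn(cell, features['x'], dtype=tf.float32)
        last_output = outputs[:, -1, :]              # summary of the history
        hidden = tf.layers.dense(last_output, 32, activation=tf.nn.relu)
        predictions = tf.layers.dense(hidden, 10)    # next 10 points of the curve

        if mode == tf.estimator.ModeKeys.PREDICT:
            return tf.estimator.EstimatorSpec(mode, predictions=predictions)

        loss = tf.losses.mean_squared_error(labels, predictions)
        train_op = tf.train.AdamOptimizer(1e-3).minimize(
            loss, global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

    estimator = tf.estimator.Estimator(model_fn=model_fn)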

LSTM Punctuation Model with TensorFlow

This notebook shows how to use TensorFlow to build a model that predicts basic punctuation, given input text that intentionally omits it. For example, given:

last december the european commission proposed updating the existing customs union with turkey and extending bilateral trade relations once negotiations have been completed the agreement would still have to be approved by the parliament before it could enter into force

It produces:

last december , the european commission proposed updating the existing customs union with turkey and extending bilateral trade relations once negotiations have been completed . the agreement would still have to be approved by the parliament before it could enter into force .

Training a full punctuation model to good accuracy requires much more text data and additional training time. This sample demonstrates what you can do with a limited dataset and minimal compute resources (a single Datalab virtual machine).
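
Training pairs for a model like this can be derived from any punctuated corpus: strip the punctuation to form the input, and keep it as a per-word target. Here is a minimal sketch of that preprocessing, assuming a simplified punctuation set and word-level tokenization (not the notebook’s actual pipeline):

    import re

    PUNCTUATION = {',', '.', '?'}

    def make_example(text):
        """Split punctuated text into an unpunctuated word sequence plus,
        for each word, the punctuation mark (if any) that follows it."""
        tokens = re.findall(r"[\w']+|[,.?]", text.lower())
        words, labels = [], []
        for token in tokens:
            if token in PUNCTUATION:
                if labels:
                    labels[-1] = token   # attach the mark to the preceding word
            else:
                words.append(token)
                labels.append('')        # default: no punctuation after this word
        return words, labels

    words, labels = make_example(
        "Once negotiations have been completed, the agreement would "
        "still have to be approved.")
    # labels line up with words: 'completed' -> ',' and 'approved' -> '.'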

Image-to-Captions Model with TensorFlow

This notebook demonstrates how to build an image-to-text model with TensorFlow. We call it our Show and Tell Model: given an image, the model generates captions.

[Image: sample photo of cats on a rug]

Generated captions:

  • a cat laying on top of a rug next to a cat
  • a cat laying on the floor next to a cat
  • a cat laying on top of a rug next to a cat

To significantly reduce training time, we train our model on images of dogs and cats only, roughly 8% of the full dataset. The image embeddings are generated with the pre-trained Inception-v3 model (comparatively very fast) and provide the initial state of the LSTM word-sequence model.
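
Schematically, the wiring follows the Show and Tell pattern: the image embedding is fed to the LSTM once to seed its state, and the caption words are unrolled from that state. Below is a simplified TensorFlow 1.x sketch; the vocabulary size and layer dimensions are illustrative assumptions, not the notebook’s actual values:

    import tensorflow as tf  # TensorFlow 1.x, matching the era of these notebooks

    vocab_size, embed_dim, lstm_units = 10000, 512, 512

    # Precomputed Inception-v3 features for each image: [batch, 2048].
    image_features = tf.placeholder(tf.float32, [None, 2048])
    caption_ids = tf.placeholder(tf.int32, [None, None])  # [batch, time] word ids

    image_input = tf.layers.dense(image_features, embed_dim)  # project to LSTM input size
    word_table = tf.get_variable('word_embeddings', [vocab_size, embed_dim])
    caption_inputs = tf.nn.embedding_lookup(word_table, caption_ids)

    cell = tf.nn.rnn_cell.LSTMCell(lstm_units)
    with tf.variable_scope('lstm') as lstm_scope:
        # Feed the image embedding once; its output is discarded, but the
        # resulting LSTM state becomes the initial state for the word sequence.
        zero_state = cell.zero_state(tf.shape(image_features)[0], tf.float32)
        _, initial_state = cell(image_input, zero_state)
        lstm_scope.reuse_variables()
        outputs, _ = tf.nn.dynamic_rnn(cell, caption_inputs,
                                       initial_state=initial_state, scope=lstm_scope)

    logits = tf.layers.dense(outputs, vocab_size)  # per-step next-word scores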

We hope you have found these sample scenarios useful as prototypes for your own deployments on TensorFlow, Machine Learning Engine, and Google Cloud Platform. Please let us know if there are other machine learning use cases you’d like to deploy on Google Cloud but don’t know exactly where to start.