Can I use a Compute Engine for inference?
You can perform inference on a model trained on Compute Engine via the predict mode in TPUEstimator. See the TPUEstimator predict method.
Are there built-in TensorFlow ops that are not available on Compute Engine?
There are a few built-in TensorFlow ops that are not currently available on Compute Engine. See available TensorFlow Ops, which details the current workarounds.
How can I write a custom op for Compute Engine?
TensorFlow ops that run on Compute Engine are implemented in XLA HLO, a language for defining high-level tensor ops using a small set of low-level functions. XLA is included in TensorFlow's open source release, so it is technically possible to write your op in HLO. The majority of existing implementations can be found in the tf2xla directory.
XLA only allows for execution of a limited set of tensor ops on the TPU, not arbitrary C++ or Python code. Most common tensor ops that can be implemented in HLO have already been written.
Can I use placeholders and feed dictionaries with Compute Engine?
This usage pattern is technically available on Compute Engine; however, we strongly recommend against it. Using placeholders and feed dictionaries limits you to a single Compute Engine core and creates excessive overhead.
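For comparison, here is a minimal sketch of the discouraged pattern next to a tf.data-based input function of the kind TPUEstimator expects (the shapes, the stand-in data, and the input_fn name are illustrative; drop_remainder requires TensorFlow 1.10 or later):

    import numpy as np
    import tensorflow as tf

    # Discouraged on Compute Engine: placeholders fed through feed_dict.
    x = tf.placeholder(tf.float32, shape=[None, 28, 28])
    # sess.run(..., feed_dict={x: batch}) confines execution to one core.

    # Preferred: a tf.data-based input_fn, which TPUEstimator can shard
    # across all Compute Engine cores. TPUEstimator supplies the per-core
    # batch size through params["batch_size"].
    def input_fn(params):
        batch_size = params["batch_size"]
        images = np.random.rand(1024, 28, 28).astype(np.float32)  # stand-in data
        dataset = tf.data.Dataset.from_tensor_slices(images)
        # drop_remainder=True keeps every batch at the same static shape.
        return dataset.repeat().batch(batch_size, drop_remainder=True)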
Can I train a reinforcement learning (RL) model with a Compute Engine?
Reinforcement learning covers a wide array of techniques, some of which are not currently compatible with the software abstractions for TPUs. Some reinforcement learning configurations require executing a black-box "simulation environment" using a CPU as part of the training loop. We have found that these cannot keep up with the Compute Engine and result in significant inefficiencies.
Can I use word embeddings with a Compute Engine?
Yes, Compute Engine supports tf.nn.embedding_lookup(), since it is just a wrapper around tf.gather(), which has an implementation on Compute Engine. However, Compute Engine does not support tf.nn.embedding_lookup_sparse(). Note that the input id tensor to tf.nn.embedding_lookup() must have a static shape during training (that is, the batch size and sequence length must be the same for every batch). This is a more general restriction on all tensors when using Compute Engine.
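As a minimal sketch, the following lookup satisfies the static-shape requirement (the vocabulary size, embedding dimension, and batch dimensions are arbitrary):

    import tensorflow as tf

    VOCAB_SIZE = 10000
    EMBED_DIM = 128
    BATCH_SIZE = 32  # identical for every batch
    SEQ_LEN = 64     # identical for every batch

    embeddings = tf.get_variable(
        "embeddings", shape=[VOCAB_SIZE, EMBED_DIM], dtype=tf.float32)

    # The id tensor has a fully static shape: no None dimensions.
    ids = tf.random_uniform(
        [BATCH_SIZE, SEQ_LEN], maxval=VOCAB_SIZE, dtype=tf.int32)

    # Result shape [32, 64, 128], known at graph-construction time.
    looked_up = tf.nn.embedding_lookup(embeddings, ids)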
Can I use variable-length sequences with Compute Engine?
There are several methods for representing variable-length sequences in TensorFlow, including padding, tf.while_loop(), inferred tensor dimensions, and bucketing. Unfortunately, the current Compute Engine execution engine supports only a subset of these. Variable-length sequences must be implemented using tf.dynamic_rnn(), bucketing, padding, or sequence concatenation.
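As one example of the padding approach, here is a minimal sketch that pads every sequence to a fixed maximum length so each batch has the same static shape (MAX_LEN, PAD_ID, and the sample data are arbitrary):

    import tensorflow as tf

    MAX_LEN = 64
    PAD_ID = 0

    def pad_to_max_len(ids):
        """Pads (or truncates) a 1-D id tensor to exactly MAX_LEN."""
        ids = ids[:MAX_LEN]
        pad_amount = MAX_LEN - tf.shape(ids)[0]
        ids = tf.pad(ids, [[0, pad_amount]], constant_values=PAD_ID)
        ids.set_shape([MAX_LEN])  # record the now-static shape
        return ids

    # Three sequences of different lengths become one [3, MAX_LEN] batch.
    dataset = tf.data.Dataset.from_generator(
        lambda: ([1, 2, 3], [4, 5], [6]), output_types=tf.int32)
    dataset = dataset.map(pad_to_max_len).batch(3, drop_remainder=True)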
Can I train a Recurrent Neural Network (RNN) on Compute Engine?
In certain configurations, tf.static_rnn() and tf.dynamic_rnn() are compatible with the current TPU execution engine. More generally, the TPU execution engine supports tf.while_loop() and tf.TensorArray, which are used to implement tf.dynamic_rnn(). Specialized toolkits such as CuDNN are not supported on the TPU, as they contain GPU-specific code. Using tf.while_loop() on the TPU does require specifying an upper bound on the number of loop iterations so that the TPU execution engine can statically determine the memory usage.
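As a minimal sketch, tf.while_loop's maximum_iterations argument (available in TensorFlow 1.5 and later) supplies such an upper bound; the bound of 100 here is arbitrary:

    import tensorflow as tf

    i = tf.constant(0)
    acc = tf.constant(0.0)

    # maximum_iterations gives the compiler a static upper bound on the
    # loop length, so memory usage can be determined at compile time.
    final_i, final_acc = tf.while_loop(
        cond=lambda i, acc: i < 10,
        body=lambda i, acc: (i + 1, acc + tf.cast(i, tf.float32)),
        loop_vars=(i, acc),
        maximum_iterations=100)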
Can I train a generative adversarial network (GAN) with Compute Engine?
Training GANs typically requires frequently alternating between training the generator and training the discriminator. The current TPU execution engine only supports a single execution graph. Alternating between graphs requires a complete re-compilation, which can take 30 seconds or more.
One potential workaround is to always compute the sum of losses for both the generator and discriminator, but multiply these losses by two input tensors, g_w and d_w. In batches where the generator should be trained, you can pass g_w=1.0 and d_w=0.0, and vice-versa for batches where the discriminator should be trained.
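A minimal sketch of that weighted loss (the stand-in scalar losses replace whatever the generator and discriminator actually produce):

    import tensorflow as tf

    def combined_loss(g_loss, d_loss, g_w, d_w):
        """Single-graph GAN loss: the weights select which network trains."""
        return g_w * g_loss + d_w * d_loss

    # Stand-in scalars; in a real model these come from the generator and
    # discriminator sub-networks.
    g_loss = tf.constant(0.7)
    d_loss = tf.constant(1.3)

    # The weights arrive as input tensors with each batch. Here, a batch
    # intended to train the generator:
    loss = combined_loss(g_loss, d_loss,
                         g_w=tf.constant(1.0), d_w=tf.constant(0.0))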
Can I train a multi-task learning model with Compute Engine?
If the tasks can be represented as one large graph with an aggregate loss function, then no special support is needed for multi-task learning. However, the TPU execution engine currently only supports a single execution graph. Therefore, it is not possible to quickly alternate between multiple execution graphs which share variables but have different structure. Changing execution graphs requires re-running the graph compilation step, which can take 30 seconds or more.
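A minimal sketch of that single-graph pattern for two tasks sharing an encoder (every layer size, head, and task weight here is illustrative):

    import tensorflow as tf

    inputs = tf.random_normal([32, 100])
    labels_a = tf.random_uniform([32], maxval=10, dtype=tf.int32)
    labels_b = tf.random_normal([32, 1])

    shared = tf.layers.dense(inputs, 64, activation=tf.nn.relu)  # shared encoder
    logits_a = tf.layers.dense(shared, 10)  # classification head (task A)
    pred_b = tf.layers.dense(shared, 1)     # regression head (task B)

    loss_a = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels_a, logits=logits_a))
    loss_b = tf.losses.mean_squared_error(labels_b, pred_b)

    # One graph, one aggregate loss: no graph switching is required.
    total_loss = loss_a + 0.5 * loss_b  # the task weighting is a design choice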
Does Compute Engine support eager mode?
No, eager mode uses a new dynamic execution engine, while Compute Engine uses XLA, which performs static compilation of the execution graph.
Does Compute Engine support model parallelism?
Model parallelism (or executing non-identical programs on the multiple cores within a single Compute Engine device) is not currently supported.
How can I inspect the actual value of intermediate tensors on Compute Engine, as with tf.Print() or tfdbg?
This capability is currently not supported on Compute Engine. The suggested procedure for developing on Compute Engine is to implement the model using the TPUEstimator framework, which allows for effortless transitions between the TPU and CPU/GPU (via the use_tpu flag). A good practice is to debug your models on the CPU/GPU using standard TensorFlow tools, and then switch to Compute Engine when your model is ready for full-scale training.
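A minimal sketch of that workflow using the tf.contrib.tpu APIs from TensorFlow 1.x (model_fn here is a trivial stand-in for a real model):

    import tensorflow as tf
    from tensorflow.contrib import tpu

    tf.flags.DEFINE_bool("use_tpu", False, "Train on TPU if true, else CPU/GPU.")
    FLAGS = tf.flags.FLAGS

    def model_fn(features, labels, mode, params):
        del labels, params  # unused in this stand-in
        loss = tf.reduce_mean(tf.square(features - 0.5))
        optimizer = tf.train.GradientDescentOptimizer(0.01)
        if FLAGS.use_tpu:
            # CrossShardOptimizer aggregates gradients across TPU cores.
            optimizer = tpu.CrossShardOptimizer(optimizer)
        train_op = optimizer.minimize(
            loss, global_step=tf.train.get_global_step())
        return tpu.TPUEstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # The same Estimator runs on CPU/GPU (use_tpu=False) or TPU (use_tpu=True).
    estimator = tpu.TPUEstimator(
        model_fn=model_fn,
        use_tpu=FLAGS.use_tpu,
        train_batch_size=128,
        config=tpu.RunConfig())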
My training scheme is too complex or specialized for the TPUEstimator API; is there a lower-level API that I can use?
TPUEstimator is the primary framework for Cloud TPU training.
TPUEstimator wraps the
tpu API, which is part of open source
TensorFlow, so it is technically possible (but unsupported) to use the low-level
tpu API directly. If your training pipeline requires frequent communication
between the Compute Engine and CPU, or requires frequently changing the
execution graph, your computation cannot run efficiently on Compute Engine.
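For illustration, a minimal sketch of calling the low-level API through tf.contrib.tpu in TensorFlow 1.x (the gRPC worker address is a placeholder):

    import tensorflow as tf
    from tensorflow.contrib import tpu

    def computation(x):
        # Any XLA-compilable TensorFlow computation.
        return x * x + 1.0

    inputs = [tf.constant([1.0, 2.0, 3.0])]
    tpu_computation = tpu.rewrite(computation, inputs)  # compile for the TPU

    # "grpc://<tpu-worker>:8470" is a placeholder for a real TPU address.
    with tf.Session("grpc://<tpu-worker>:8470") as sess:
        sess.run(tpu.initialize_system())
        print(sess.run(tpu_computation))
        sess.run(tpu.shutdown_system())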