This legacy version of AutoML Vision is deprecated and will no longer be available on Google Cloud after January 23, 2024. All the functionality of legacy AutoML Vision and new features are available on the Vertex AI platform. See Migrate to Vertex AI to learn how to migrate your resources.

Deploy Edge to iOS tutorial

What you will build

In this tutorial you will download an exported custom TensorFlow Lite model from AutoML Vision Edge. You will then run a pre-made iOS app that uses the model to identify images of flowers.

End product mobile screenshot
Image credit: Felipe Venâncio, "from my mother's garden" (CC BY 2.0, image shown in app).


In this introductory, end-to-end walkthrough you will use code to:

  • Run a pre-trained model in an iOS app using the TFLite interpreter.

Before you begin

Install TensorFlow

Before you begin the tutorial, you need to install TensorFlow and the Pillow imaging library.

If you have a working Python installation, run the following commands to install this software:

pip install --upgrade "tensorflow==1.7.*"
pip install PILLOW

Clone the Git repository

Using the command line, clone the Git repository with the following command:

git clone

Navigate to the directory of the local clone of the repository (tensorflow-for-poets-2 directory). You will run all following code samples from this directory:

cd tensorflow-for-poets-2

Set up the iOS app

The demo iOS app requires several additional tools:

  1. Xcode
  2. Xcode command line tools
  3. CocoaPods

Download Xcode

Download and install Xcode on your machine, for example from the Mac App Store.

Install Xcode command line tools

Install the Xcode command line tools by running the following command:

xcode-select --install

Install CocoaPods

CocoaPods uses Ruby, which is installed by default on macOS.

To install CocoaPods, run this command:

sudo gem install cocoapods

Install TFLite CocoaPod

The rest of this codelab needs to run directly in macOS, so if you are working inside a Docker container, close it now (Ctrl-D exits Docker).

Use the following command to install TensorFlow Lite and create the .xcworkspace file using CocoaPods:

pod install --project-directory=ios/tflite/

Open the project with Xcode. You can open the project either through the command line or via the UI.

To open the project via the command line run the following command:

open ios/tflite/tflite_photos_example.xcworkspace

To open the project via the UI, launch Xcode and select the "Open another Project" button.

Xcode UI

In the file browser that opens, navigate to the .xcworkspace file and open it (not the .xcodeproj file).

Run the original app

The app is a simple example that runs an image recognition model in the iOS Simulator. The app reads from the photo library, as the Simulator does not support camera input.

Before inserting your customized model, test the baseline version of the app, which uses the base MobileNet model trained on the 1000 ImageNet categories.

To launch the app in the Simulator, select the play button in the toolbar at the top of the Xcode window.

The "Next Photo" button advances through the photos on the device.

You can add photos to the device's photo library by dragging-and-dropping them onto the Simulator window.

The result should display annotations similar to this image:

test run app screenshot

Run the customized app

The original app setup classifies images into one of the 1000 ImageNet classes, using the standard MobileNet.

Modify the app so that it will use your retrained model with custom image categories.

Add your model files to the project

The demo project is configured to search for a graph.lite file and a labels.txt file in the ios/tflite/data/ directory.

To replace those two files with your versions, run the following commands:

cp tf_files/optimized_graph.lite ios/tflite/data/graph.lite
cp tf_files/retrained_labels.txt ios/tflite/data/labels.txt
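
To make the role of labels.txt concrete, here is a minimal C++ sketch of reading such a file. It assumes the file holds one label per line, in the same order as the model's output classes. The function name ReadLabels is hypothetical and is not the app's own LoadLabels helper.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the demo app): reads one label per line.
// The position of each label in the returned vector is assumed to match the
// index of the corresponding score in the model's output array.
std::vector<std::string> ReadLabels(std::istream& in) {
  std::vector<std::string> labels;
  std::string line;
  while (std::getline(in, line)) {
    // Tolerate Windows line endings.
    if (!line.empty() && line.back() == '\r') line.pop_back();
    // Skip blank lines so they do not shift the class indices.
    if (!line.empty()) labels.push_back(line);
  }
  return labels;
}
```

In practice you would pass a std::ifstream opened on labels.txt. A std::istringstream works just as well for a quick check.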

Run your app

To relaunch the app in the Simulator, select the play button in the toolbar at the top of the Xcode window.

To test the modifications, add image files from the flower_photos/ directory and get predictions.

Results should look similar to this:

End product mobile screenshot
Image credit: Felipe Venâncio, "from my mother's garden" (CC BY 2.0, image shown in app).

Note that the default images aren't of flowers.

To really try out the model, either add some of the training data images you downloaded earlier, or download some images from a Google search to use for prediction.

How does it work?

Now that you have the app running, look at the TensorFlow Lite-specific code.

TensorFlowLite Pod

This app uses a pre-compiled TFLite CocoaPod. The Podfile includes the CocoaPod in the project:


platform :ios, '8.0'

target 'tflite_photos_example'
       pod 'TensorFlowLite'

The code that interfaces with TFLite is all contained in a single file.


The first block of interest (after the necessary imports) is the viewDidLoad method:

#include "tensorflow/contrib/lite/kernels/register.h"
#include "tensorflow/contrib/lite/model.h"
#include "tensorflow/contrib/lite/string_util.h"
#include "tensorflow/contrib/lite/tools/mutable_op_resolver.h"


- (void)viewDidLoad {
  [super viewDidLoad];
  labelLayers = [[NSMutableArray alloc] init];

  NSString* graph_path = FilePathForResourceName(model_file_name, model_file_type);
  model = tflite::FlatBufferModel::BuildFromFile([graph_path UTF8String]);
  if (!model) {
    LOG(FATAL) << "Failed to mmap model " << graph_path;
  }
  LOG(INFO) << "Loaded model " << graph_path;
  LOG(INFO) << "resolved reporter";

  // ... (the second half of the method is shown below)
}


The key line in this first half of the method is the model = tflite::FlatBufferModel::BuildFromFile([graph_path UTF8String]); line. This code creates a FlatBufferModel from the graph file.

A FlatBuffer is a memory-mappable data structure. FlatBuffers are a key feature of TFLite because they allow the system to better manage the memory used by the model. The system can transparently swap parts of the model in or out of memory as needed.

The second part of the method builds an interpreter for the model, attaching Op implementations to the graph data structure we loaded earlier:

- (void)viewDidLoad {
  // ... (first half shown above)

  // Register the built-in op implementations and load the label list.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  LoadLabels(labels_file_name, labels_file_type, &labels);

  // Build an interpreter for the loaded model and allocate its tensors.
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  if (!interpreter) {
    LOG(FATAL) << "Failed to construct interpreter";
  }
  if (interpreter->AllocateTensors() != kTfLiteOk) {
    LOG(FATAL) << "Failed to allocate tensors!";
  }

  [self attachPreviewLayer];
}

If you're familiar with TensorFlow in Python, this is roughly equivalent to building a tf.Session().

Run the model

The UpdatePhoto method handles all the details of fetching the next photo, updating the preview window, and running the model on the photo.

- (void)UpdatePhoto{
  PHAsset* asset;
  if (photos==nil || photos_index >= photos.count){
    [self updatePhotosLibrary];
  }
  if (photos.count){
    asset = photos[photos_index];
    photos_index += 1;
    input_image = [self convertImageFromAsset:asset
                                   targetSize:CGSizeMake(wanted_input_width, wanted_input_height)
                                   /* ... remaining arguments elided ... */];
    display_image = [self convertImageFromAsset:asset
                                   /* ... remaining arguments elided ... */];
    [self DrawImage];
  }

  if (input_image != nil){
    // Convert the photo to raw pixels, copy it into the model input, and run inference.
    image_data image = [self CGImageToPixels:input_image.CGImage];
    [self inputImageToModel:image];
    [self runModel];
  }
}

It's the last three lines that we are interested in.

The CGImageToPixels method converts the CGImage returned by the iOS Photos library to a simple structure containing the width, height, channels, and pixel data.


typedef struct {
  int width;
  int height;
  int channels;
  std::vector<uint8_t> data;
} image_data;

The inputImageToModel method handles inserting the image into the interpreter memory. This includes resizing the image and adjusting the pixel values to match what's expected by the model.

- (void)inputImageToModel:(image_data)image{
  float* out = interpreter->typed_input_tensor<float>(0);

  const float input_mean = 127.5f;
  const float input_std = 127.5f;
  assert(image.channels >= wanted_input_channels);
  // Pointer to the source pixels stored in image_data's std::vector<uint8_t>.
  uint8_t* in = image.data.data();

  // Nearest-neighbor resize: each output pixel samples one input pixel.
  for (int y = 0; y < wanted_input_height; ++y) {
    const int in_y = (y * image.height) / wanted_input_height;
    uint8_t* in_row = in + (in_y * image.width * image.channels);
    float* out_row = out + (y * wanted_input_width * wanted_input_channels);
    for (int x = 0; x < wanted_input_width; ++x) {
      const int in_x = (x * image.width) / wanted_input_width;
      uint8_t* in_pixel = in_row + (in_x * image.channels);
      float* out_pixel = out_row + (x * wanted_input_channels);
      for (int c = 0; c < wanted_input_channels; ++c) {
        // Scale each byte from [0, 255] to [-1, 1].
        out_pixel[c] = (in_pixel[c] - input_mean) / input_std;
      }
    }
  }
}

We know the model has only one input, so the float* out = interpreter->typed_input_tensor<float>(0); line asks the interpreter for a pointer to the memory for input 0. The rest of the method handles the pointer arithmetic and pixel scaling to copy the data into that input array.
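
The resize-and-scale logic can be tried outside the app. The following C++ sketch reproduces the same nearest-neighbor sampling and the (value - 127.5) / 127.5 scaling, but writes into a std::vector instead of the interpreter's input tensor. The names ImageData and ResizeAndNormalize are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the fields of the tutorial's image_data struct.
struct ImageData {
  int width;
  int height;
  int channels;
  std::vector<uint8_t> data;
};

// Hypothetical helper: nearest-neighbor resize to out_width x out_height,
// keeping the first out_channels channels and scaling bytes to [-1, 1].
std::vector<float> ResizeAndNormalize(const ImageData& image, int out_width,
                                      int out_height, int out_channels) {
  assert(image.channels >= out_channels);
  const float input_mean = 127.5f;
  const float input_std = 127.5f;
  std::vector<float> out(static_cast<std::size_t>(out_width) * out_height *
                         out_channels);
  for (int y = 0; y < out_height; ++y) {
    const int in_y = (y * image.height) / out_height;  // source row
    for (int x = 0; x < out_width; ++x) {
      const int in_x = (x * image.width) / out_width;  // source column
      const uint8_t* in_pixel =
          image.data.data() + (in_y * image.width + in_x) * image.channels;
      float* out_pixel = out.data() + (y * out_width + x) * out_channels;
      for (int c = 0; c < out_channels; ++c) {
        out_pixel[c] = (in_pixel[c] - input_mean) / input_std;
      }
    }
  }
  return out;
}
```

With this scaling, a byte value of 0 maps to -1.0 and a byte value of 255 maps to 1.0, which matches the input range the float model in this tutorial expects.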

Finally, the runModel method executes the model:

- (void)runModel {
  double startTimestamp = [[NSDate new] timeIntervalSince1970];
  if (interpreter->Invoke() != kTfLiteOk) {
    LOG(FATAL) << "Failed to invoke!";
  }
  double endTimestamp = [[NSDate new] timeIntervalSince1970];
  total_latency += (endTimestamp - startTimestamp);
  total_count += 1;
  NSLog(@"Time: %.4lf, avg: %.4lf, count: %d", endTimestamp - startTimestamp,
        total_latency / total_count,  total_count);

  // ... (continued in the next excerpt)
}



Next, runModel reads back the results. To do this, it asks the interpreter for a pointer to the output array's data. The output is a simple array of floats. The GetTopN method handles the extraction of the top 5 results (using a priority queue).

- (void)runModel {
  // ... (inference and timing shown above)

  const int output_size = (int)labels.size();
  const int kNumResults = 5;
  const float kThreshold = 0.1f;

  std::vector<std::pair<float, int>> top_results;

  float* output = interpreter->typed_output_tensor<float>(0);
  GetTopN(output, output_size, kNumResults, kThreshold, &top_results);

  // ... (continued in the next excerpt)
}
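
The body of GetTopN is not shown in this tutorial. As a rough sketch of the priority-queue technique it is described as using, the following C++ function keeps a min-heap of at most num_results candidates and discards scores below the threshold. The name GetTopNSketch and the details are assumptions for illustration; the app's actual implementation may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical re-implementation of a top-N selection over a score array.
// Results are (score, class_id) pairs sorted from highest to lowest score.
void GetTopNSketch(const float* prediction, int prediction_size,
                   int num_results, float threshold,
                   std::vector<std::pair<float, int>>* top_results) {
  assert(top_results != nullptr);
  // Min-heap: the weakest candidate kept so far is always on top.
  std::priority_queue<std::pair<float, int>,
                      std::vector<std::pair<float, int>>,
                      std::greater<std::pair<float, int>>>
      top_queue;
  for (int i = 0; i < prediction_size; ++i) {
    const float value = prediction[i];
    if (value < threshold) continue;  // ignore low-confidence classes
    top_queue.push(std::make_pair(value, i));
    // Once more than num_results are held, evict the weakest one.
    if (static_cast<int>(top_queue.size()) > num_results) top_queue.pop();
  }
  // The heap pops in ascending order, so collect and then reverse.
  top_results->clear();
  while (!top_queue.empty()) {
    top_results->push_back(top_queue.top());
    top_queue.pop();
  }
  std::reverse(top_results->begin(), top_results->end());
}
```

A bounded min-heap keeps the work proportional to the number of classes times the logarithm of num_results, which avoids sorting the full output array when only a handful of results are displayed.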


The next few lines convert those top 5 (probability, class_id) pairs into (probability, label) pairs, and then pass that result, asynchronously, to the setPredictionValues method, which updates the on-screen report:

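- (void)runModel {
  // ... (top results computed above)

  // Replace each class index with its human-readable label.
  std::vector<std::pair<float, std::string>> newValues;
  for (const auto& result : top_results) {
    std::pair<float, std::string> item;
    item.first = result.first;
    item.second = labels[result.second];
    newValues.push_back(item);
  }

  // UI updates must happen on the main queue.
  dispatch_async(dispatch_get_main_queue(), ^(void) {
    [self setPredictionValues:newValues];
  });
}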
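
The label lookup can also be written as a small standalone function. This C++ sketch does the same (probability, class_id) to (probability, label) conversion, with one added assumption: it skips any class index that falls outside the label list instead of reading past the end of the vector. The name AttachLabels is hypothetical.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: pairs each score with the label for its class index.
// Indices outside the label list are skipped (an added safety check that the
// excerpt from the app does not perform).
std::vector<std::pair<float, std::string>> AttachLabels(
    const std::vector<std::pair<float, int>>& top_results,
    const std::vector<std::string>& labels) {
  std::vector<std::pair<float, std::string>> labeled;
  labeled.reserve(top_results.size());
  for (const auto& result : top_results) {
    const int class_id = result.second;
    if (class_id < 0 || class_id >= static_cast<int>(labels.size())) continue;
    labeled.emplace_back(result.first, labels[class_id]);
  }
  return labeled;
}
```

A mismatch between the label file and the model's output size is a common cause of out-of-range indices, so a check like this can make such a mismatch easier to spot during testing.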

What's next

You've now completed a walkthrough of an iOS flower classification app using an Edge model. You used a trained Edge TensorFlow Lite model to test an image classification app before making modifications to it and getting sample annotations. You then examined TensorFlow Lite-specific code to understand the underlying functionality.

The following resources can help you continue to learn about TensorFlow models and AutoML Vision Edge: