
Filtering inappropriate content with the Cloud Vision API

August 17, 2016
Sara Robinson

Developer Advocate, Google Cloud Platform

You may know the Cloud Vision API for its face, object, and landmark detection, but you might not know that the Vision API can also detect inappropriate content in images using the same machine learning models that power Google SafeSearch. Since the Google Cloud Vision API became generally available in April, we’ve seen over 100 million requests for SafeSearch detection.

Any application with user-uploaded images can make use of inappropriate content detection. To get started with this feature, just add SAFE_SEARCH_DETECTION to your API request:

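For example, a minimal request in Python might look like the following, using the third-party requests library and an API key (the key and file name below are placeholders you’d replace with your own):

    import base64
    import json
    import requests  # third-party HTTP library: pip install requests

    API_KEY = "YOUR_API_KEY"            # placeholder: use your own API key
    IMAGE_PATH = "avocado-toast.png"    # placeholder: any local image file

    # Base64-encode the image, as required by the images:annotate endpoint.
    with open(IMAGE_PATH, "rb") as f:
        encoded_image = base64.b64encode(f.read()).decode("utf-8")

    # Adding SAFE_SEARCH_DETECTION to the features list enables this detection type.
    request_body = {
        "requests": [{
            "image": {"content": encoded_image},
            "features": [{"type": "SAFE_SEARCH_DETECTION"}]
        }]
    }

    response = requests.post(
        "https://vision.googleapis.com/v1/images:annotate",
        params={"key": API_KEY},
        json=request_body,
    )
    print(json.dumps(response.json(), indent=2))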

Instead of reviewing all user-uploaded images manually, the Vision API’s SafeSearch detection feature flags inappropriate images automatically and returns very few “false positives” (images flagged as inappropriate even though they contain no explicit content).

Let’s see how this API method looks for and categorizes inappropriate content (using appropriate images, of course). For example, let’s run SafeSearch detection on this picture of avocado toast:

[Image: avocado toast (https://storage.googleapis.com/gweb-cloudblog-publish/images/inappropriate-content-detection-3yuuq.max-500x500.PNG)]

I get the following response:

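The SafeSearch result comes back as a safeSearchAnnotation object with one likelihood per category; the values shown here are representative rather than the exact original output:

    {
      "responses": [
        {
          "safeSearchAnnotation": {
            "adult": "VERY_UNLIKELY",
            "spoof": "VERY_UNLIKELY",
            "medical": "UNLIKELY",
            "violence": "VERY_UNLIKELY"
          }
        }
      ]
    }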

For each of the four inappropriate content types (spoof, medical, adult, and violence), the API returns one of five possible likelihood values: “VERY_UNLIKELY”, “UNLIKELY”, “POSSIBLE”, “LIKELY”, or “VERY_LIKELY”. Using these likelihood values, you can decide which subset of images requires manual review for your app. For example, you might want to manually review only images that the API identifies as “POSSIBLE” and above.
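
Because the likelihood values are ordered, one simple way to implement such a policy is to rank them and compare against a cutoff. Here is a minimal sketch; the helper name and default threshold are illustrative, not part of the API:

    # Likelihoods ordered from least to most likely.
    LIKELIHOOD_ORDER = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]

    def needs_manual_review(annotation, threshold="POSSIBLE"):
        """Return True if any SafeSearch category is at or above the given threshold."""
        cutoff = LIKELIHOOD_ORDER.index(threshold)
        return any(
            LIKELIHOOD_ORDER.index(annotation[category]) >= cutoff
            for category in ("adult", "spoof", "medical", "violence")
        )

    # Using the representative annotation shown above:
    annotation = {"adult": "VERY_UNLIKELY", "spoof": "VERY_UNLIKELY",
                  "medical": "UNLIKELY", "violence": "VERY_UNLIKELY"}
    print(needs_manual_review(annotation))  # False: nothing at POSSIBLE or above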

Unsurprisingly, the API response does not give my avocado toast high chances of containing spoof, medical, adult, or violent content. Among the four labels, medical has the highest score due to the close-up nature of the image. If any of these four categories indicated a higher likelihood, it would be easy to flag the image.

The Vision API SafeSearch detection feature uses a deep neural network model trained specifically to classify inappropriate content in images. It computes a score on a 0 to 1 scale for each category and, based on that score, returns the corresponding likelihood as a string.
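
The raw score isn’t exposed by the API, and the real cutoffs aren’t public, but conceptually the mapping is a simple bucketing step along these lines (the threshold values below are invented for illustration):

    # Illustrative only: the real thresholds used by SafeSearch are internal to Google.
    def score_to_likelihood(score):
        """Bucket a 0-1 classifier score into one of the five likelihood strings."""
        if score < 0.2:
            return "VERY_UNLIKELY"
        if score < 0.4:
            return "UNLIKELY"
        if score < 0.6:
            return "POSSIBLE"
        if score < 0.8:
            return "LIKELY"
        return "VERY_LIKELY"

    print(score_to_likelihood(0.15))  # VERY_UNLIKELY
    print(score_to_likelihood(0.95))  # VERY_LIKELY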

What exactly is the API looking for in each SafeSearch category? I chatted with Nikola Todorovic, an engineer on the SafeSearch models, to find out the details. Let’s take a closer look at each part of the API response:

Adult

The adult content classifier is trained to separate pornographic from non-pornographic images. For example, photos of people in swimsuits without nudity are labeled by the API as UNLIKELY or VERY_UNLIKELY. On the other hand, sexual intercourse, nudity, and adult content in cartoon images (like hentai) typically get LIKELY or VERY_LIKELY labels. This classification can be useful in the following scenarios:

  • When you want to be absolutely certain you’ve removed all inappropriate content (high recall)
  • When you want to be sure to only remove offensive images (high precision)

In situations where you want to avoid false positives and remove only images you’re very sure contain pornographic content, you would remove only images that the API identifies as VERY_LIKELY. If you aren’t as concerned with false positives and want to be on the safe side, you would flag and review all images that return POSSIBLE and above. Ultimately, your use case will determine how you handle the classifier’s response.
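
In code, those two policies differ only in the threshold you choose (the function names here are for illustration, not part of the API):

    def remove_high_precision(adult_likelihood):
        """Remove only images the API is very sure about, minimizing false positives."""
        return adult_likelihood == "VERY_LIKELY"

    def flag_high_recall(adult_likelihood):
        """Flag everything POSSIBLE and above for review, minimizing missed images."""
        return adult_likelihood in ("POSSIBLE", "LIKELY", "VERY_LIKELY")

    print(remove_high_precision("LIKELY"))  # False: not certain enough to auto-remove
    print(flag_high_recall("LIKELY"))       # True: worth a manual review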

The classifier is trained not to flag pictures that contain nudity in a medical, scientific, educational, or artistic context. In addition, the classifier is quite effective in correctly classifying swimsuit and lingerie photos as non-pornographic; despite the skin showing, Cloud Vision is able to detect that the “strategic” parts are covered.

The SafeSearch engineering team has built a debug tool to analyze image classifications and better understand the deep neural networks powering the detection model. Using the internal neural network activations and backpropagation gradients, this tool superimposes a heatmap on the image, indicating the parts that the classifier identified as possibly explicit. In the following heatmap, the final classification (VERY_UNLIKELY) is mostly influenced by the highlighted swimsuit area:
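
The debug tool itself is internal, but the underlying idea, treating the gradient of the classifier’s score with respect to each pixel as a saliency heatmap, can be sketched with a toy model. Everything below (the tiny stand-in classifier and the finite-difference gradients) is illustrative and unrelated to the real SafeSearch network:

    import numpy as np

    def toy_classifier(image):
        """Stand-in 'explicitness' score: a weighted mean of pixels squashed to (0, 1)."""
        weights = np.linspace(0.0, 1.0, image.size).reshape(image.shape)
        return 1.0 / (1.0 + np.exp(-(image * weights).mean()))

    def saliency_heatmap(image, eps=1e-3):
        """Approximate d(score)/d(pixel) with finite differences; larger values mark
        pixels that influence the classification more, which is the heatmap idea."""
        base = toy_classifier(image)
        heatmap = np.zeros_like(image)
        for idx in np.ndindex(image.shape):
            bumped = image.copy()
            bumped[idx] += eps
            heatmap[idx] = (toy_classifier(bumped) - base) / eps
        # Normalize so the most influential pixel is 1.0, just for readability.
        return np.abs(heatmap) / np.abs(heatmap).max()

    image = np.random.rand(8, 8)  # tiny fake grayscale image
    print(saliency_heatmap(image).round(2))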

Violence

Images flagged as violent include pictures depicting killing, shooting, or blood and gore. Let’s take a look at a few examples.

Simple images of knives or guns won’t be labeled as violent. For the following image, the API returns UNLIKELY:

This image, on the other hand, is flagged as POSSIBLE:

Running the image above through the heatmap debugging tool, the following heatmap shows us which parts of the image cause the API to return POSSIBLE for violence:

[Image: heatmap of the violence classification (https://storage.googleapis.com/gweb-cloudblog-publish/images/inappropriate-content-detection-120fi8.max-400x400.PNG)]

This indicates that the model has detected soldiers in the photo. The pointed rifle also impacts the classification.

Medical

Images flagged as medical content contain explicit images of surgery, diseases, or body parts. A before-and-after picture of hand or face surgery, or a picture depicting an operating room with no close-ups of the patient, would not be flagged by the API. The medical classifier looks primarily for graphic photographs of open wounds, genital close-ups, and egregious disease symptoms.

Although the following image clearly depicts a medical procedure, it is labeled as UNLIKELY by the Vision API since there’s nothing inappropriate or shocking about it:

Spoof

The spoof detection classifier primarily looks for memes, which are indicated by the presence of text, typical meme faces, and backgrounds.

The following meme returns a spoof likelihood of VERY_LIKELY:

Looking at the heatmap, we can see that the spoof classifier clearly identified the combination of the face and text as the spoofed part of this image:

[Image: heatmap of the spoof classification (https://storage.googleapis.com/gweb-cloudblog-publish/images/inappropriate-content-detection-176b6c.max-400x400.PNG)]

Use cases for SafeSearch detection

Since the Vision API became generally available, we’ve seen SafeSearch detection used across a variety of industries for applications with user-generated content. Messaging apps are a primary use case for safe content detection, as they often monitor content as it comes in. Instead of reviewing all content manually, you can use the Vision API to automate the content approval process and significantly reduce the number of images requiring manual review. We’ve also seen SafeSearch detection used in crowdsourced digital marketing campaigns where brands ask consumers to upload content.

What’s next?

Try SafeSearch detection directly in the browser by uploading a picture to the Vision API demo here. Dive into the API documentation for SafeSearch detection or use the google-cloud-vision tag on StackOverflow to ask questions. To start building your own apps with the Vision API, check out this GitHub repo for samples in your favorite language.
