Google Cloud Platform
Classifying text content with the Natural Language API
If you work in the media industry, chances are you’ve spent more hours than you’d like manually tagging text content like blog posts, news articles, or marketing copy. With the Natural Language API, you can now tag all of this content with a single API call.
With the new classify_text endpoint, the Natural Language API returns a content category for your text. The content categories include a set of high-level Tier 1 categories (like “Arts & Entertainment”) along with a set of Tier 2 categories that provide more granularity (like “Visual Art & Design”), for around 700 categories in total.
To try it out, I wrote a Python script that uses data provided by the New York Times API to get the top stories for each section. Then, I combined the title and abstract for each article and sent the combined text to the classify_text endpoint for categorization. For example, the following title and abstract from this article:
Rafael Montero Shines in Mets’ Victory Over the Reds. Montero, who was demoted at midseason, took a one-hitter into the ninth inning as the Mets continued to dominate Cincinnati with a win at Great American Ball Park.
Results in this JSON response from the NL API:
name: '/Sports/Team Sports/Baseball',
Each response includes a Tier 1 and Tier 2 category, and we can look at the original article to confirm that these categories are correct.
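Since each category name comes back as a slash-separated path, it can be handy to break it into its individual levels before storing it as tags. Here’s a minimal sketch of that idea; the split_category helper is a hypothetical name of my own, not part of the API:

```python
def split_category(name):
    """Split a category path like '/Sports/Team Sports/Baseball' into its levels."""
    # The leading slash produces an empty first element, so drop empty parts.
    return [part for part in name.split('/') if part]


levels = split_category('/Sports/Team Sports/Baseball')
print('broadest category:', levels[0])
print('most specific category:', levels[-1])
```

Keeping the levels as a list makes it easy to tag an article with only the broadest category, only the most specific one, or all of them.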
Once I get the article title and abstract text from the NYT API, calling the Natural Language API is just a few lines of code. Here’s an example using Python:
from google.cloud import language_v1beta2
from google.cloud.language_v1beta2 import enums
from google.cloud.language_v1beta2 import types

language_client = language_v1beta2.LanguageServiceClient()

# Wrap the text to classify in a plain-text Document.
document = types.Document(
    content="Your text to classify here",
    type=enums.Document.Type.PLAIN_TEXT
)

result = language_client.classify_text(document)

# Each returned category has a name and a confidence score.
for category in result.categories:
    print('category name: ', category.name)
    print('category confidence: ', category.confidence, '\n')
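The step of combining each article’s title and abstract before classification could look something like the sketch below. The build_text helper and the 'title' and 'abstract' dictionary keys are assumptions for illustration, not taken from my original script:

```python
def build_text(article):
    """Join an article's title and abstract into one string to classify."""
    # The 'title' and 'abstract' keys are assumed field names for this sketch.
    title = article.get('title', '').strip()
    abstract = article.get('abstract', '').strip()
    # End the title with punctuation so the two parts read as separate sentences.
    if title and title[-1] not in '.!?':
        title += '.'
    return ' '.join(part for part in (title, abstract) if part)


article = {
    'title': 'Rafael Montero Shines in Mets’ Victory Over the Reds',
    'abstract': 'Montero, who was demoted at midseason, took a one-hitter '
                'into the ninth inning as the Mets continued to dominate '
                'Cincinnati with a win at Great American Ball Park.',
}
print(build_text(article))
```

The resulting string is what would be passed as the content of the Document in the snippet above.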
Here’s another example, using the title and abstract from a different article:
A Smoky Lobster Salad With a Tapa Twist. This spin on the Spanish pulpo a la gallega skips the octopus, but keeps the sea salt, olive oil, pimentón and boiled potatoes.
And here’s the NL API’s response:
name: '/Food & Drink/Cooking & Recipes',
name: '/Food & Drink/Food/Meat & Seafood',
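When a response contains more than one category, as it does here, one way to turn it into tags is to keep only the categories above a confidence cutoff. The following is a rough sketch under a few assumptions: select_tags is a hypothetical helper, the 0.5 default cutoff is arbitrary, and the confidence values in the example are made-up placeholders rather than real API output:

```python
def select_tags(categories, min_confidence=0.5):
    """Return category names at or above the cutoff, highest confidence first."""
    kept = [(name, conf) for name, conf in categories if conf >= min_confidence]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in kept]


# Placeholder (name, confidence) pairs; the scores are invented for illustration.
example = [
    ('/Food & Drink/Food/Meat & Seafood', 0.6),
    ('/Food & Drink/Cooking & Recipes', 0.9),
]
print(select_tags(example, min_confidence=0.7))
```

With the client code shown earlier, the input pairs could be built with [(c.name, c.confidence) for c in result.categories].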