
Learn how Notified accelerated discovery and classification of journalists at scale with Google Cloud AI

November 30, 2021
BK Arashanapalli

Technology Leader, Google

Earl Hooper

Enterprise Data Architect, Notified

Notified is a leading communications cloud for events, public relations, and investor relations to drive meaningful insights and outcomes. They provide communications solutions to effectively reach and engage customers, investors, employees, and the media.

One of Notified’s Public Relations solutions is the ‘Media Contact Database’, which lets customers discover media contacts and influencers in a media database powered by AI and human-curated research.

The goal of the initiative is to expand the pool of AI-driven, dynamically discovered influencers and to analyze online news articles with AI/ML technologies to extract entities and classify content. The prior process for extracting insights from news articles delivered only 30-40% of the desired results, and its accuracy and stability issues required a lot of manual intervention.

Journalist Beat

A key outcome of the AI-driven process is identifying the ‘Journalist Beat’. A Journalist Beat summarizes an individual’s area of focus, for example sports writing or financial journalism.

Three options were evaluated for the AI/ML process that generates Journalist Beats:

Option 1: Topic ML

Unsupervised ML approach to determine the commonly used terms.

  • Pro: Common approach to grouping documents and determining similar text
  • Con: Produces an unbounded list of terms

Option 2: ML Classification

Build supervised classification models that map reference articles to ‘Beats’.

  • Pro: Aligns to ‘Research Analytics’ existing processes
  • Con: Time to build and maintain ML models for hundreds of beats

Option 3: GCP Context Classification

Leverage GCP’s Natural Language API for initial classification and as input to Notified’s single model.

  • Pro: Aligns to ‘Research Analytics’ without building ML models

Ultimately, the GCP Natural Language API solution was chosen because of its speed of execution and the high level of accuracy of its pretrained models. The Notified team was able to launch the product feature within a few weeks, without needing to collect extensive data or train models.

Here is the high-level process that was implemented for Journalist Beats.

https://storage.googleapis.com/gweb-cloudblog-publish/images/1_Notified.max-1000x1000.jpg

Because Notified supports curated media contacts globally, news articles were instantly translated to English using the GCP Translation API. The Natural Language API’s text classification feature then analyzed the translated text and generated a list of content categories.
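
The translate-then-classify flow described above can be sketched in a few lines of Python. This is a minimal illustration, not Notified’s implementation: it assumes the `google-cloud-translate` and `google-cloud-language` client libraries and default application credentials, and the helper names (`translate_to_english`, `classify_english_text`, `categories_for_article`, `summarize_beat`) are hypothetical. The final step, which sums category confidence per top-level category to suggest a beat, is an assumed stand-in for Notified’s own model, which this post does not describe.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Category = Tuple[str, float]  # (category path such as "/Sports/Team Sports", confidence)


def translate_to_english(text: str) -> str:
    """Translate text to English with the Cloud Translation API (Basic client)."""
    # Imported lazily so the rest of the sketch works without the library installed.
    from google.cloud import translate_v2 as translate

    client = translate.Client()
    # format_="text" asks for plain text back rather than HTML-escaped output.
    result = client.translate(text, target_language="en", format_="text")
    return result["translatedText"]


def classify_english_text(text: str) -> List[Category]:
    """Return content categories for English text from the Natural Language API."""
    from google.cloud import language_v1

    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.classify_text(request={"document": document})
    return [(c.name, c.confidence) for c in response.categories]


def categories_for_article(
    text: str,
    translate: Callable[[str], str] = translate_to_english,
    classify: Callable[[str], List[Category]] = classify_english_text,
) -> List[Category]:
    """Translate one article to English, then classify the translated text."""
    return classify(translate(text))


def summarize_beat(
    per_article_categories: Iterable[List[Category]], top_n: int = 3
) -> List[str]:
    """Assumed aggregation: rank top-level categories by summed confidence."""
    totals: Dict[str, float] = defaultdict(float)
    for categories in per_article_categories:
        for name, confidence in categories:
            top_level = name.strip("/").split("/")[0]
            totals[top_level] += confidence
    # Highest total first; ties are broken alphabetically for stable output.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]
```

Passing the translation and classification steps in as callables keeps the flow easy to exercise with stubs, so the aggregation logic can be checked without calling either API.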

Solution Architecture

Here is a sample solution architecture for the ‘Discovered Journalist’ process.

https://storage.googleapis.com/gweb-cloudblog-publish/images/2_Notified.max-800x800.jpg

Three core principles guided the above architecture: serverless and fully managed services; scalability and elasticity, for flexibility and cost optimization; and API-led, real-time processing.

In addition to the GCP Natural Language API and Translation API, a few serverless GCP products were part of the automated solution:

  • BigQuery is Google Cloud's fully managed, petabyte-scale, and cost-effective analytics data warehouse that lets you run analytics over vast amounts of data in near real time.

  • Cloud Run is a fully managed serverless platform that can be used to develop and deploy highly scalable containerized applications.

  • Cloud Tasks is a fully managed service that allows you to manage the execution, dispatch, and delivery of a large number of distributed tasks.
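
As a rough illustration of how two of these services can fit together, the sketch below enqueues one Cloud Tasks HTTP task per article, targeting a worker endpoint such as a Cloud Run service. The wiring is an assumption made for illustration; this post does not describe how Notified connects the services. The function names, the `article_url` payload field, and all parameters are hypothetical, and the code assumes the `google-cloud-tasks` client library.

```python
import json
from typing import Any, Dict


def build_article_task(service_url: str, article_url: str) -> Dict[str, Any]:
    """Build a Cloud Tasks HTTP task (as a plain dict) for one article.

    The JSON payload shape is hypothetical; a real worker defines its own contract.
    """
    payload = json.dumps({"article_url": article_url}).encode("utf-8")
    return {
        "http_request": {
            "url": service_url,
            "headers": {"Content-Type": "application/json"},
            "body": payload,
        }
    }


def enqueue_article(
    project: str, location: str, queue: str, service_url: str, article_url: str
) -> str:
    """Enqueue the task on an existing queue and return the created task's name."""
    # Imported lazily so build_article_task can be used without the library.
    from google.cloud import tasks_v2

    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path(project, location, queue)
    task = build_article_task(service_url, article_url)
    task["http_request"]["http_method"] = tasks_v2.HttpMethod.POST
    response = client.create_task(request={"parent": parent, "task": task})
    return response.name
```

A worker that requires authentication would also need an OIDC token configured on the HTTP request, which is omitted here for brevity.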

The powerful pre-trained models of the Natural Language API provide a comprehensive set of natural language understanding features for applications, including sentiment analysis, entity analysis, entity sentiment analysis, content classification, and syntax analysis.

Notified looks ahead to super-scaling

To further improve its best-in-class ‘Media Contact Database’, Notified plans to scale the above AI-driven Influencer Discovery process to more than 100 million news articles per month. It also plans to expand the scope of entities extracted from the news articles and to provide a news exploration service that lets customers perform intelligent entity-based searches.

To watch your markets evolve, see how competitors add AI insights. To actually stay in the market, make AI the main driver of your product road maps. GCP Natural Language API accelerated our ability to adopt AI at scale.

Thomas Squeo, CTO, Notified

Acknowledgments

We’d like to thank our collaborators at Google and Notified for making this blog post possible. Thanks to Arpit Agrawal at MediaAgility for contributing to this blog post.

To learn more about how Google Cloud Natural Language AI can help your enterprise, try out the interactive demo or visit the product overview page.
