Semantic Similarity for Natural Language

For a given piece of text, find the most similar or related text items among a list of candidates. Built from correlations in natural language usage, this experiment helps to connect items based on meaning and usage rather than simple keywords.

Intended use

Problem types: This experiment lets users match natural language queries to answers based on meaning and common relations rather than standard keywords. We expect the most valuable use cases to place the experiment within a larger system rather than use it as a standalone service. Applications may include product search and recommendations; text querying and understanding; and analysis of comments, feedback, or social media streams.

Inputs and outputs:

  • Users provide: List of "input" and "candidate" text pairs.
  • Users receive: List of similarity scores.
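
To make the shape of this exchange concrete, here is a minimal sketch in Python: a list of (input, candidate) pairs goes in, and a list of scores of the same length comes out, which can then be used to rank candidates for a query. The scorer below is only a stand-in (cosine similarity over word counts) so that the example runs on its own; it is not the experiment's model, which is built from correlations in language usage rather than keyword overlap. The function names `toy_similarity`, `score_pairs`, and `rank_candidates` are hypothetical.

```python
import math
import re
from collections import Counter


def toy_similarity(a: str, b: str) -> float:
    """Stand-in scorer: cosine similarity over word counts.

    ASSUMPTION: this is NOT the experiment's model; it only illustrates
    the "text pair in, score out" contract.
    """
    va = Counter(re.findall(r"[a-z0-9']+", a.lower()))
    vb = Counter(re.findall(r"[a-z0-9']+", b.lower()))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_pairs(pairs):
    """Take a list of (input, candidate) pairs; return one score per pair."""
    return [toy_similarity(inp, cand) for inp, cand in pairs]


def rank_candidates(query, candidates):
    """Pair one query with every candidate and sort by descending score."""
    scores = score_pairs([(query, cand) for cand in candidates])
    return sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for cand, score in rank_candidates(
        "running shoes", ["blue hat", "shoes for running"]
    ):
        print(f"{score:.3f}  {cand}")
```

In a larger system, `score_pairs` would be the place to swap in a call to the actual service, while the ranking logic around it stays the same.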

What data do I need?

Data specifications: This experiment requires text items as both the query and the list of candidates.

  • Length - While the API technically places no limit on text length, we find that applications perform best when queries and candidates are between a sentence and a short paragraph long.
  • Languages - Our current model is optimized for English. We plan to support 30+ non-English languages in the future.
  • Users provide input and candidate text pairs to be scored for similarity.
  • Formats - Text pairs will be in simple JSON format.
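
As an illustration of what a simple JSON encoding of text pairs might look like, the sketch below serializes pairs and parses a list of scores. The field names `pairs`, `input`, `candidate`, and `scores` are assumptions made for this example, not the documented schema, and the helper names are hypothetical; consult the experiment's documentation for the actual format.

```python
import json


def build_request(pairs):
    """Serialize (input, candidate) pairs to a JSON string.

    ASSUMPTION: the "pairs", "input", and "candidate" keys are
    illustrative only and may not match the real request schema.
    """
    payload = {
        "pairs": [{"input": inp, "candidate": cand} for inp, cand in pairs]
    }
    return json.dumps(payload, indent=2)


def parse_response(body):
    """Read a list of similarity scores from a JSON response string.

    ASSUMPTION: the "scores" key is illustrative only.
    """
    return [float(score) for score in json.loads(body)["scores"]]


if __name__ == "__main__":
    print(build_request([
        ("How do I reset my password?", "Steps to recover account access"),
        ("How do I reset my password?", "Store opening hours"),
    ]))
```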

What skills do I need?

As with all AI Workshop experiments, successful users are likely to be well versed in core AI concepts and skills, both to deploy the experiment technology and to interact with our AI researchers and engineers.

In particular, users of this experiment should:

  • Be thoughtful about how to integrate a fundamental technology into a larger solution
  • Be familiar with accessing Google APIs, particularly using a command line interface