Control how terms are translated

When domain-specific terms and named entities must be translated consistently, create a glossary. A glossary is a custom dictionary that contains corresponding terms in two or more languages. During machine translation, Translation Hub automatically replaces matching source-language terms with their associated target-language terms.

Use glossaries to fine-tune how certain terms or entities are translated. For example, you can include a glossary entry to prevent a product name (like "Google Home") from being translated.

After you create a glossary, add it to portals to make it available to portal users.

Glossary compared to other translation resources

In addition to glossaries, you can provide other resources to assist portal users with their translations. The following list describes how each of these resources differs from a glossary.

  • Translation memories match on segments, whereas glossaries match on terms. Use translation memories to reuse human-reviewed translations that were imported, captured during post-editing, or both. Translation memories can include as many segment pairs in as many languages as you require.
  • Custom models are trained with your sentence pairs and can help you improve machine translations when you don't have a comprehensive glossary or translation memory available. In cases where you want to tune machine translations for a specific domain and writing style, use AutoML Translation to build custom models that produce better-fitting predictions.

Portal users can use a combination of these resources to help improve the quality of their translations. For more information about how Translation Hub applies resources during translations, see Translate documents.

Stopwords

Translation Hub ignores some terms that are included in a glossary; these terms are known as stopwords. Translation Hub still translates stopwords but ignores any matching glossary entries. For a list of all stopwords, see Glossary stopwords.
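
Before you upload a glossary, it can help to flag entries whose source term is a stopword, because those entries have no effect. The following is a minimal sketch of such a check, not part of Translation Hub. The function name `split_by_stopwords` is hypothetical, the stopword set in the usage lines is a placeholder rather than the published list, and the case-insensitive comparison is an assumption; substitute the real list from Glossary stopwords.

```python
def split_by_stopwords(entries, stopwords):
    """Split (source, target) entries into kept entries and flagged entries.

    An entry is flagged when its source term appears in the stopword set.
    The comparison ignores case, which is an assumption of this sketch.
    """
    normalized = {word.casefold() for word in stopwords}
    kept, flagged = [], []
    for source, target in entries:
        if source.casefold() in normalized:
            flagged.append((source, target))
        else:
            kept.append((source, target))
    return kept, flagged


# Placeholder stopword set for illustration only; use the published list.
sample_stopwords = {"the"}
entries = [("the", "el"), ("Google Home", "Google Home")]
kept, flagged = split_by_stopwords(entries, sample_stopwords)
print("kept:", kept)
print("flagged:", flagged)
```

Flagged entries can be removed or reworded before you create the glossary.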

Before you begin

To populate your glossary, you need a file with terms in their corresponding languages. The format of your source file depends on the type of glossary that you create: unidirectional or equivalent term set.

Glossary entries are case sensitive. To match a term in more than one casing, include an entry for each form in your glossary (for example, both the capitalized and the lowercase form).
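
Because matching is case sensitive, one way to cover several casings is to generate the variants before you build the source file. The following is a rough sketch; the helper `with_case_variants` is hypothetical, and the choice of variants (original, lowercase, uppercase) and of reusing the same target term for each variant are assumptions to adjust for your content.

```python
def with_case_variants(entries):
    """Return entries expanded with lowercase and uppercase source variants.

    Each variant maps to the unchanged target term. If two entries produce
    the same source variant, the first one encountered is kept.
    """
    seen = set()
    expanded = []
    for source, target in entries:
        for variant in (source, source.lower(), source.upper()):
            if variant not in seen:
                seen.add(variant)
                expanded.append((variant, target))
    return expanded


for source, target in with_case_variants([("Router", "enrutador")]):
    print(source, "->", target)
```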

Unidirectional

A unidirectional glossary specifies the desired translation for terms as source and target language pairs. These glossaries work one way. For example, an English to Spanish unidirectional glossary doesn't apply to Spanish to English translations.

You can provide a TSV (tab-separated values), CSV (comma-separated values), or TMX (Translation Memory eXchange) file. For TSV and CSV files, don't include a header row to identify the source and target languages. You specify them when you create the glossary. For TMX files, use the TMX version 1.4 standard.
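
As an illustration of the CSV and TSV layout described above, the following sketch writes source and target pairs with no header row. The function name `unidirectional_rows` and the sample terms are hypothetical; the "Google Home" row maps the term to itself so that the product name is left untranslated.

```python
import csv
import io


def unidirectional_rows(pairs, delimiter=","):
    """Serialize (source, target) pairs without a header row.

    Pass delimiter="\t" to produce TSV instead of CSV.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for source_term, target_term in pairs:
        writer.writerow([source_term, target_term])
    return buffer.getvalue()


pairs = [
    ("Google Home", "Google Home"),  # keep the product name as-is
    ("account", "cuenta"),           # English to Spanish
]
print(unidirectional_rows(pairs), end="")
# Google Home,Google Home
# account,cuenta
```

The source and target languages aren't stored in the file; you specify them when you create the glossary.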

For details and examples of glossaries, see Creating and using glossaries in the Cloud Translation documentation.

Equivalent term set

An equivalent term set contains equivalent terms in multiple languages and must be defined in a CSV file. Each row contains corresponding terms in different languages. These glossaries are bi-directional. The header row must identify the language for each column by its corresponding language code.
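
To make the layout concrete, the following sketch builds an equivalent term set CSV whose header row lists one language code per column. The function name `equivalent_term_set_csv` and the sample terms are hypothetical, and the row-length check is a convenience of this sketch rather than a documented requirement.

```python
import csv
import io


def equivalent_term_set_csv(language_codes, rows):
    """Serialize rows of equivalent terms under a language-code header row."""
    for row in rows:
        if len(row) != len(language_codes):
            raise ValueError("each row needs one term per language code")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(language_codes)  # header row: one language code per column
    writer.writerows(rows)
    return buffer.getvalue()


print(equivalent_term_set_csv(
    ["en", "es", "fr"],
    [["account", "cuenta", "compte"]],
), end="")
# en,es,fr
# account,cuenta,compte
```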

For details and examples of glossaries, see Creating and using glossaries in the Cloud Translation documentation.

Glossary limits

Translation Hub sets limits on the size of the source file and on the size of each glossary entry. For more information, see Quotas and limits.

Create a glossary

You create glossaries by using the Google Cloud console. If you have previously created resources through the Cloud Translation API, Translation Hub makes them available to you. You can assign those resources to portals.

  1. In the Translation Hub section of the Google Cloud console, go to the Resources page.

  2. Click Add resource.

  3. In the Add resource pane, select the Glossaries tab.

  4. Specify a name for the glossary.

  5. Select the glossary type.

  6. Upload a local glossary file to Cloud Storage or select an existing glossary file from Cloud Storage.

  7. Specify the glossary languages.

  8. Click Add to create the glossary.

Add glossaries to portals

After you create glossaries, add them to portals to let portal users use them when they request translations.

  1. In the Translation Hub section of the Google Cloud console, go to the Resources page.

  2. From the list of resources, select one or more glossaries to add to one or more portals.

  3. Click Assign to portals, which opens the Assign resource to portal pane.

  4. From the portals field, select one or more portals to add the glossaries to.

  5. Click Assign.

    On the Resources page, you can confirm the addition by viewing the Portal names column for each resource.

What's next