Speech-to-Text On-Prem documentation

Overview

Speech-to-Text On-Prem lets you integrate Google speech recognition technologies into your on-premises solution. The STT On-Prem solution gives you full control over your infrastructure and protected speech data so that you can meet data residency and compliance requirements. Its speech recognition models are more accurate and smaller than existing solutions, and they require fewer computing resources to run.

Speech-to-Text On-Prem is a Google Cloud Marketplace application that can be deployed as a container to any GKE cluster. You choose where it runs: on Google Cloud with GKE, or on-premises with Anthos. In either case, you can take advantage of the simplicity, agility, and cost-effectiveness of Google’s container hosting and management across hybrid environments.

Key capabilities
High-quality transcription: Apply Google’s advanced deep learning neural network algorithms to automatic speech recognition.
Deployable anywhere: Run in any GKE or Anthos cluster.
Efficient models: Deploy efficiently with models that are less than 1 GB in size and consume minimal resources.
API compatible: Full compatibility with the Speech-to-Text API and its client libraries.
Istio service mesh: Use our pre-built Istio objects to seamlessly scale up to thousands of connections.
Stackdriver integration: Export metadata logs to one centralized location.
Supported languages: Support your global user base with support for English, French, Spanish, Cantonese, and Japanese.

Reference architecture

Deployment and installation

  1. See the Speech-to-Text On-Prem pricing page for an outline of how cost is calculated.
  2. Contact your seller to get access to the solution.
  3. Deploy the application to your cluster.
  4. Configure your chosen client library to access your deployment.
  5. Start transcribing your audio files.
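
Steps 4 and 5 can be illustrated with a minimal Python sketch. Because the solution is described as compatible with the Speech-to-Text API, a transcription request would be expected to carry the same fields as the public v1 recognize method: a config with the audio encoding, sample rate, and language code, plus the audio content itself. The helper name build_recognize_request and its default values are illustrative assumptions, not part of the product. How the request is sent (which client library, and the address of your cluster's endpoint) depends on your deployment and is not shown here.

```python
import base64
import json


def build_recognize_request(audio_bytes: bytes,
                            language_code: str = "en-US",
                            sample_rate_hertz: int = 16000,
                            encoding: str = "LINEAR16") -> str:
    """Return a JSON body shaped like a Speech-to-Text v1 recognize request.

    The function name and defaults are hypothetical; only the field names
    (config.encoding, config.sampleRateHertz, config.languageCode,
    audio.content) follow the public Speech-to-Text v1 API.
    """
    body = {
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {
            # The JSON form of the API carries raw audio as base64 text.
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }
    return json.dumps(body)
```

A caller would read an audio file as bytes, pass it to build_recognize_request together with the language code of the recording, and hand the resulting body to whichever client or transport the deployment exposes.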