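# Try out Optical Character Recognition (OCR)

This guide walks you through running an Optical Character Recognition (OCR) test using Google's Vertex AI Vision service.

Before trying this sample, follow the Python setup instructions in the Vertex AI quickstart using client libraries. For more information, see the Vertex AI Python API reference documentation.

Create a Python file named ocr_test.py. Replace the image_uri_to_test value
with the URI of a source image, as shown: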

    import os

    import requests


    def detect_text_rest(image_uri):
        """Performs Optical Character Recognition (OCR) on an image by invoking the Vertex AI REST API."""

        # Fetch the API key from an environment variable
        api_key = os.environ.get("GCP_API_KEY")
        if not api_key:
            raise ValueError("GCP_API_KEY environment variable must be defined.")

        # Construct the Vision API endpoint URL
        vision_api_url = f"https://vision.googleapis.com/v1/images:annotate?key={api_key}"

        print(f"Initiating OCR process for image: {image_uri}")

        # Define the request payload for text detection
        request_payload = {
            "requests": [
                {
                    "image": {"source": {"imageUri": image_uri}},
                    "features": [{"type": "TEXT_DETECTION"}],
                }
            ]
        }

        # Send a POST request to the Vision API
        response = requests.post(vision_api_url, json=request_payload)
        response.raise_for_status()  # Check for HTTP errors

        response_json = response.json()

        print("\n--- OCR Results ---")

        # "responses" is a list with one entry per request; the first
        # text annotation holds the full detected text.
        annotations = response_json["responses"][0].get("textAnnotations", [])
        if annotations:
            full_text = annotations[0]["description"]
            print(f"Detected Text:\n{full_text}")
        else:
            print("No text was detected in the image.")

        print("--- End of Results ---\n")


    if __name__ == "__main__":
        # URI of a publicly available image, or a Cloud Storage URI
        image_uri_to_test = "IMAGE_URI"

        detect_text_rest(image_uri_to_test)

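On the response-handling step of the script: an images:annotate response carries a `responses` list with one entry per request, and for `TEXT_DETECTION` the first element of `textAnnotations` holds the full detected text. The following is a minimal sketch of that extraction, written so it can be tried without a network call. The helper name `extract_full_text` and the sample response are illustrative assumptions and are not part of ocr_test.py.

```python
def extract_full_text(response_json):
    """Return the full detected text from an images:annotate response, or None.

    Hypothetical helper for illustration; not part of ocr_test.py.
    """
    responses = response_json.get("responses", [])
    if not responses:
        return None

    first = responses[0]  # one entry per image in the request

    # A per-image failure can be reported inside the response body.
    if "error" in first:
        raise RuntimeError(first["error"].get("message", "unknown error"))

    annotations = first.get("textAnnotations", [])
    if not annotations:
        return None

    # The first annotation covers the whole image; later ones are single words.
    return annotations[0]["description"]


# Hand-written sample shaped like a TEXT_DETECTION response (not real output).
sample = {
    "responses": [
        {
            "textAnnotations": [
                {"description": "WAITING?\nPLEASE TURN OFF"},
                {"description": "WAITING?"},
            ]
        }
    ]
}

print(extract_full_text(sample))
print(extract_full_text({"responses": [{}]}))
```

Keeping the parsing in its own function makes the empty-result and error cases easy to check separately from the HTTP call.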
Replace the following:
IMAGE_URI: the URI of a publicly available image that contains text, for example, "https://cloud.google.com/vision/docs/images/sign.jpg". Alternatively, you can specify a Cloud Storage URI, for example, "gs://your-bucket/your-image.png".
Create a Dockerfile:

    FROM python:3.9-slim

    WORKDIR /app

    COPY ocr_test.py /app/

    # Install 'requests' for HTTP calls
    RUN pip install --no-cache-dir requests

    CMD ["python", "ocr_test.py"]

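Build the Docker image for the OCR application:

    docker build -t ocr-app .

Follow the instructions at Configure Docker to configure Docker, create a secret, and upload the image to HaaS.

Sign in to the user cluster and generate its kubeconfig file with a user identity. Make sure you set the kubeconfig path as an environment variable:

    export KUBECONFIG=${CLUSTER_KUBECONFIG_PATH}

Create a Kubernetes secret by running the following command in your terminal, pasting your API key:

    kubectl create secret generic gcp-api-key-secret \
      --from-literal=GCP_API_KEY='PASTE_YOUR_API_KEY_HERE'

This command creates a secret named gcp-api-key-secret with a key
GCP_API_KEY.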
Apply the Kubernetes manifest:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: ocr-test-job-apikey
    spec:
      template:
        spec:
          containers:
          - name: ocr-test-container
            image: HARBOR_INSTANCE_URL/HARBOR_PROJECT/ocr-app:latest # Your image path
            # Expose the API key from the secret to the container
            # as an environment variable named GCP_API_KEY.
            envFrom:
            - secretRef:
                name: gcp-api-key-secret
          imagePullSecrets:
          - name: SECRET
          restartPolicy: Never
      backoffLimit: 4

Replace the following:
HARBOR_INSTANCE_URL: the Harbor instance URL.
HARBOR_PROJECT: the Harbor project.
SECRET: the name of the secret created to store Docker credentials.
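In a Job's pod template, `envFrom` belongs to a container, while `imagePullSecrets` belongs to the pod spec alongside `containers`. A minimal sketch that builds the same Job as a Python dictionary makes that nesting explicit. The helper name `build_job_manifest` and the example image path and secret name are placeholders chosen for illustration, not values from this guide.

```python
import json


def build_job_manifest(image, pull_secret, api_key_secret="gcp-api-key-secret"):
    """Return the OCR test Job as a dict (hypothetical helper for illustration)."""
    container = {
        "name": "ocr-test-container",
        "image": image,
        # envFrom is a container-level field: each key in the secret
        # becomes an environment variable inside this container.
        "envFrom": [{"secretRef": {"name": api_key_secret}}],
    }
    pod_spec = {
        "containers": [container],
        # imagePullSecrets is a pod-level field, a sibling of containers.
        "imagePullSecrets": [{"name": pull_secret}],
        "restartPolicy": "Never",
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "ocr-test-job-apikey"},
        "spec": {"template": {"spec": pod_spec}, "backoffLimit": 4},
    }


manifest = build_job_manifest(
    image="harbor.example.com/my-project/ocr-app:latest",  # placeholder path
    pull_secret="my-docker-credentials",  # placeholder secret name
)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be saved to a file and passed to `kubectl apply -f`.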
Check the job status:

    kubectl get jobs/ocr-test-job-apikey
    # It will show 0/1 completions, then 1/1 after it succeeds

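The completions count can also be read programmatically from the Job object that `kubectl get job ocr-test-job-apikey -o json` prints. The sketch below only parses that JSON. The helper name `summarize_job`, the state labels, and the hand-written sample are assumptions for illustration, and the sample is trimmed to the fields used here.

```python
import json


def summarize_job(job_json_text):
    """Summarize a Job from `kubectl get job NAME -o json` output.

    Hypothetical helper for illustration.
    """
    job = json.loads(job_json_text)
    status = job.get("status", {})
    # Treat an unset completions value as 1, the usual single-run case.
    wanted = job.get("spec", {}).get("completions") or 1
    succeeded = status.get("succeeded", 0)
    failed = status.get("failed", 0)

    if succeeded >= wanted:
        state = "complete"
    elif status.get("active", 0) > 0:
        state = "running"
    elif failed > 0:
        state = "failing"
    else:
        state = "pending"
    return f"{succeeded}/{wanted} completions ({state})"


# Hand-written sample trimmed to the fields used here (not real cluster output).
sample = '{"spec": {"completions": 1}, "status": {"succeeded": 1}}'
print(summarize_job(sample))  # prints: 1/1 completions (complete)
```

In practice the JSON text could come from running the kubectl command through Python's `subprocess` module and reading its standard output.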
After the job has completed, you can view the OCR output in the pod logs:

    kubectl logs -l job-name=ocr-test-job-apikey

Last updated 2025-09-03 UTC.