Try optical character recognition (OCR)

This guide walks you through testing optical character recognition (OCR) with Google's Vertex AI Vision service.

Before you try this sample, follow the Python setup instructions in the Vertex AI quickstart. For more information, see the Vertex AI Python API reference documentation.

Create a Python file named ocr_test.py. Replace the image_uri_to_test value with the URI of your source image, as shown:

import os
import requests
import json

def detect_text_rest(image_uri):
    """Performs Optical Character Recognition (OCR) on an image by invoking the Vertex AI REST API."""

    # Securely fetch the API key from environment variables
    api_key = os.environ.get("GCP_API_KEY")
    if not api_key:
        raise ValueError("GCP_API_KEY environment variable must be defined.")

    # Construct the Vision API endpoint URL
    vision_api_url = f"https://vision.googleapis.com/v1/images:annotate?key={api_key}"

    print(f"Initiating OCR process for image: {image_uri}")

    # Define the request payload for text detection
    request_payload = {
        "requests": [
            {
                "image": {
                    "source": {
                        "imageUri": image_uri
                    }
                },
                "features": [
                    {
                        "type": "TEXT_DETECTION"
                    }
                ]
            }
        ]
    }

    # Send a POST request to the Vision API
    response = requests.post(vision_api_url, json=request_payload)
    response.raise_for_status()  # Check for HTTP errors

    response_json = response.json()

    print("\n--- OCR Results ---")

    # Extract and print the detected text. "responses" is a list with one
    # entry per request, and the first text annotation holds the full text.
    annotations = response_json["responses"][0].get("textAnnotations", [])
    if annotations:
        full_text = annotations[0]["description"]
        print(f"Detected Text:\n{full_text}")
    else:
        print("No text was detected in the image.")

    print("--- End of Results ---\n")

if __name__ == "__main__":
    # URI of a publicly available image, or a storage bucket
    image_uri_to_test = "IMAGE_URI"

    detect_text_rest(image_uri_to_test)

Replace the following:

IMAGE_URI: the URI of a publicly available image that contains text, for example https://cloud.google.com/vision/docs/images/sign.jpg. Alternatively, you can specify a Cloud Storage URI, for example gs://your-bucket/your-image.png.
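
The request and response handling in ocr_test.py can also be factored into two small helpers, which makes the parsing logic easy to check without calling the API. The following is a minimal sketch, not part of the original sample: the function names build_ocr_request and extract_full_text are hypothetical, and the sample response is hand-written for illustration rather than captured from a real call.

```python
from urllib.parse import urlparse


def build_ocr_request(image_uri):
    """Build an images:annotate request body for a single image URI."""
    scheme = urlparse(image_uri).scheme
    # Accept public HTTP(S) URLs and Cloud Storage URIs only.
    if scheme not in ("http", "https", "gs"):
        raise ValueError(f"Unsupported image URI: {image_uri!r}")
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_full_text(response_json):
    """Return the full detected text, or None if no text was found."""
    responses = response_json.get("responses", [])
    if not responses:
        return None
    first = responses[0]
    # Per-image failures are reported inside the response body.
    if "error" in first:
        raise RuntimeError(first["error"].get("message", "Unknown error"))
    annotations = first.get("textAnnotations", [])
    # The first annotation holds the complete text; later ones are fragments.
    return annotations[0]["description"] if annotations else None


# Hand-written sample response (illustrative only).
sample = {
    "responses": [
        {
            "textAnnotations": [
                {"description": "WAITING?\nPLEASE\nTURN OFF"},
                {"description": "WAITING?"},
            ]
        }
    ]
}
print(extract_full_text(sample))
```

Validating the URI scheme up front catches the case where the IMAGE_URI placeholder was never replaced, before any request is sent.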

Create a Dockerfile:

FROM python:3.9-slim

WORKDIR /app

COPY ocr_test.py /app/

# Install 'requests' for HTTP calls
RUN pip install --no-cache-dir requests

CMD ["python", "ocr_test.py"]

Build the Docker image for the OCR application:
```
docker build -t ocr-app .
```
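Follow the instructions in "Configure Docker" to do the following:
1. Configure Docker.
2. Create a secret.
3. Upload the image to HaaS.
Sign in to the user cluster and generate its kubeconfig file with a user identity. Make sure that you set the kubeconfig path as an environment variable: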
```
export KUBECONFIG=${CLUSTER_KUBECONFIG_PATH}
```
To create a Kubernetes secret, run the following command in your terminal and paste in your API key:
```
kubectl create secret generic gcp-api-key-secret \
  --from-literal=GCP_API_KEY='PASTE_YOUR_API_KEY_HERE'
```
This command creates a secret named gcp-api-key-secret that contains the key GCP_API_KEY.
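
Kubernetes stores secret values base64-encoded in the secret's data field, so inspecting the secret (for example with kubectl get secret gcp-api-key-secret -o yaml) shows an encoded string rather than the key you pasted. The following minimal sketch shows the round trip. The helper names are hypothetical, and the placeholder string stands in for a real key.

```python
import base64


def encode_secret_value(value):
    """Encode a string the way it appears in a Secret's data field."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded):
    """Decode a base64 value read from a Secret's data field."""
    return base64.b64decode(encoded).decode("utf-8")


# Placeholder only; never hard-code a real API key in source files.
placeholder = "PASTE_YOUR_API_KEY_HERE"
encoded = encode_secret_value(placeholder)
print(encoded)
print(decode_secret_value(encoded))
```

Base64 is an encoding, not encryption, so anyone who can read the secret can recover the API key.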

Apply the Kubernetes manifest:

apiVersion: batch/v1
kind: Job
metadata:
  name: ocr-test-job-apikey
spec:
  template:
    spec:
      containers:
      - name: ocr-test-container
        image: HARBOR_INSTANCE_URL/HARBOR_PROJECT/ocr-app:latest # Your image path
        # Expose the API key from the secret to the container
        # as an environment variable named GCP_API_KEY.
        envFrom:
        - secretRef:
            name: gcp-api-key-secret
      # Credentials for pulling the image from Harbor. This field belongs
      # to the Pod spec, not to an individual container.
      imagePullSecrets:
      - name: SECRET
      restartPolicy: Never
  backoffLimit: 4

Replace the following:

HARBOR_INSTANCE_URL: the URL of the Harbor instance.
HARBOR_PROJECT: the Harbor project.
SECRET: the name of the secret that you created to store the Docker credentials.

Check the job status:

kubectl get jobs/ocr-test-job-apikey
# It will show 0/1 completions, then 1/1 after it succeeds

After the job completes, you can view the OCR output in the Pod logs:
```
kubectl logs -l job-name=ocr-test-job-apikey
```
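
If you want to script the status check instead of reading the completions column by eye, one option is to request the job as JSON (kubectl get jobs/ocr-test-job-apikey -o json) and inspect its status conditions. The following is a minimal sketch that assumes the JSON has already been captured. The function name job_state is hypothetical, and the sample document is hand-written and trimmed to the fields the function reads.

```python
import json


def job_state(job):
    """Classify a parsed Job object as 'succeeded', 'failed', or 'running'."""
    status = job.get("status", {})
    for condition in status.get("conditions", []):
        # Only conditions whose status is "True" currently apply.
        if condition.get("status") != "True":
            continue
        if condition.get("type") == "Complete":
            return "succeeded"
        if condition.get("type") == "Failed":
            return "failed"
    # No terminal condition yet: the job is pending or still running.
    return "running"


# Hand-written sample (illustrative only), not real cluster output.
sample_job = json.loads(
    '{"status": {"succeeded": 1,'
    ' "conditions": [{"type": "Complete", "status": "True"}]}}'
)
print(job_state(sample_job))
```

A job that reports "failed" has exhausted its backoffLimit, in which case the Pod logs are the first place to look for the cause.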
