This quickstart guides the Application Operator (AO) through the process of using the Vertex AI Optical Character Recognition (OCR) pre-trained API on Google Distributed Cloud (GDC) air-gapped.
Before you begin
Follow these steps before trying OCR:
Set up a project using the GDC console to group the Vertex AI services. For information about creating and using projects, see Create a project.
Ask your Project IAM Admin to grant you the AI OCR Developer (ai-ocr-developer) role in your project namespace.
Download the gdcloud command-line interface (CLI).
Set up your service account
Create a service account and a service key. In the following commands, replace SERVICE_ACCOUNT with the name of your service account, SERVICE_KEY with the name of your service key, and PROJECT_ID with your project ID.
${HOME}/gdcloud init # set URI and project
${HOME}/gdcloud auth login
${HOME}/gdcloud iam service-accounts create SERVICE_ACCOUNT --project=PROJECT_ID
${HOME}/gdcloud iam service-accounts keys create "SERVICE_KEY".json --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT
Grant access to project resources
Grant your service account access to the OCR service by providing your project ID, the name of your service account, and the role ai-ocr-developer.
${HOME}/gdcloud iam service-accounts add-iam-policy-binding --project=PROJECT_ID --iam-account=SERVICE_ACCOUNT --role=role/ai-ocr-developer
Set your environment variables
Before running the OCR pre-trained service, set the GOOGLE_APPLICATION_CREDENTIALS environment variable to your service key file.
export GOOGLE_APPLICATION_CREDENTIALS="SERVICE_KEY".json
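A missing or malformed key file usually surfaces later as an authentication error, so it can help to check the variable first. The following is a minimal sketch using only the Python standard library; the helper name check_credentials_file is hypothetical and is not part of gdcloud or any Google library.

```python
import json
import os


def check_credentials_file(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return the parsed key file, or raise ValueError with a clear message."""
    path = os.environ.get(env_var)
    if not path:
        raise ValueError(f"{env_var} is not set")
    if not os.path.isfile(path):
        raise ValueError(f"{env_var} points to a missing file: {path}")
    with open(path, encoding="utf-8") as handle:
        try:
            # The service key is expected to be a JSON document.
            return json.load(handle)
        except json.JSONDecodeError as err:
            raise ValueError(f"{path} is not valid JSON: {err}") from err
```

Calling check_credentials_file() before any request gives an early, readable failure instead of an opaque one from the client library.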
Authenticate the request
You must get a token to authenticate the requests to the OCR pre-trained service. Follow these steps:
Export the identity token for the specified account to an environment variable:
export TOKEN="$($HOME/gdcloud auth print-identity-token --audiences=https://ENDPOINT)"
Replace ENDPOINT with the OCR endpoint. For more information, see View service statuses and endpoints.
Install the google-auth client library.
pip install google-auth
Save the following code to a Python script, and update ENDPOINT to the OCR endpoint. For more information, see View service statuses and endpoints.
import google.auth
from google.auth.transport import requests

api_endpoint = "https://ENDPOINT"

creds, project_id = google.auth.default()
creds = creds.with_gdch_audience(api_endpoint)

def test_get_token():
    req = requests.Request()
    creds.refresh(req)
    print(creds.token)

if __name__=="__main__":
    test_get_token()
Run the script to fetch the token.
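Tokens expire, so a long-running script may want to refresh only when needed rather than on every call. The following is a minimal sketch under that assumption; get_valid_token is a hypothetical helper, and it relies only on the valid, token, and refresh members that google-auth credentials objects expose.

```python
def get_valid_token(creds, request):
    """Return a bearer token, refreshing the credentials only when needed."""
    # `valid` is False when no token has been fetched yet or it has expired.
    if not creds.valid:
        creds.refresh(request)
    return creds.token
```

In the script above, this would be called as get_valid_token(creds, requests.Request()), where creds is the object returned by with_gdch_audience.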
You must add the fetched token to the header of the curl requests as in the following example:
-H "Authorization: Bearer TOKEN"
Make the curl request:
echo '{"requests": [{"image": {"content": "'iVBORw0KGgoAAAANSUhEUgAAAMgAAAArCAMAAAAKVjeAAAAAA3NCSVQICAjb4U/gAAAADFBMVEX///8AAABnZ2cMDAzMh6MLAAAAX3pUWHRSYXcgcHJvZmlsZSB0eXBlIEFQUDEAAAiZ40pPzUstykxWKCjKT8vMSeVSAANjEy4TSxNLo0QDAwMLAwgwNDAwNgSSRkC2OVQo0QAFmJibpQGhuVmymSmIzwUAT7oVaBst2IwAAAEjSURBVGiB7ZRBFsMgCEShvf+d+9o0VmAwxpCuZjZGkYGfaEQoiqIoiqIoiqKoG6Sqg6lbTqK1LfwWTpUjSJ0IMnIhyAXdDaL6mwSQPpg5hgeT9H7c5sG1FES/wiA2OgkSLUPfW7wSRNWUdSAuih19drTUFnCuiyBO+6ob7WBGTPJ5tZYDJ4NAJYgvEoesUgoC+8bntgikczALSXQGJLMcuj7nOfAduQbStkm3fQnkUQACP9EZkB3mCsgZ3QEiDkRQ0r9A4K55kHaswlUmyApIVsVH04oGxO1NSoDfbw2IujmI5hX7fNeeDkDaWAbSX/cIIjY4B+KTAoj5xaDelkAEWobooW2/xyZFkH0DTF4GsZ84HIejg4x7UWuAnlSzZIqiJvUCFxYEUadKypwAAAAASUVORK5CYII='" }, "features": [ { "type": "DOCUMENT_TEXT_DETECTION" } ] }] }' | curl --cacert CERTIFICATE_NAME --data-binary @- -H "Content-Type: application/json" -H "Authorization: Bearer TOKEN" -H "x-goog-user-project: projects/PROJECT_ID" https://ENDPOINT/v1/images:annotate
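The curl example embeds a Base64-encoded image directly in the JSON body. For your own images, the same request can be assembled in Python. The following is a minimal sketch using only the standard library; build_annotate_request is a hypothetical helper that mirrors the URL, headers, and body of the curl command, and it builds the request without sending it.

```python
import base64
import json
import urllib.request


def build_annotate_request(endpoint, token, project_id, image_bytes):
    """Build (but do not send) the same POST request as the curl example."""
    body = {
        "requests": [
            {
                # The image content must be Base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
            }
        ]
    }
    return urllib.request.Request(
        url=f"https://{endpoint}/v1/images:annotate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
            "x-goog-user-project": f"projects/{project_id}",
        },
        method="POST",
    )
```

One way to send the request is urllib.request.urlopen with an ssl.create_default_context(cafile=...) context that points at the same certificate file passed to curl with --cacert.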
Run the OCR pre-trained API sample script
This example shows you how to interact with an OCR pre-trained API.
Check whether the client library for OCR is installed.
pip freeze | grep vision
# output example: google-cloud-vision==3.0.0
If the installed version doesn't match the client library in https://CONSOLE_ENDPOINT/.well-known/static/client-libraries, uninstall the client library.
pip uninstall google-cloud-vision
Download the client library for OCR from your console endpoint, as in the following example.
wget https://CONSOLE_ENDPOINT/.well-known/static/client-libraries/google-cloud-vision
Extract the tar file, and install it using pip. If errors are generated because something isn't found, install any missing dependencies.
tar -xvzf CLIENT_LIBRARY
pip install -r FOLDER/requirements.txt --no-index --find-links FOLDER
Use the OCR client library script to generate the token, and make requests to the OCR service.
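The installed-version check from the first step can also be done from Python, which is convenient inside a setup script. The following is a minimal sketch using importlib.metadata from the standard library; installed_version and needs_reinstall are hypothetical helper names, and the expected version is whatever your console endpoint provides.

```python
from importlib import metadata


def installed_version(package="google-cloud-vision"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def needs_reinstall(expected_version, package="google-cloud-vision"):
    """True when the package is missing or its version differs from expected."""
    return installed_version(package) != expected_version
```

If needs_reinstall returns True, uninstall the existing package and install the library downloaded from the console endpoint as described above.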
Set up your environment variable.
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "SERVICE_KEY.json"
Run the OCR sample
Replace ENDPOINT with the OCR endpoint that you use for your organization.
from google.cloud import vision
import google.auth
from google.auth.transport import requests
from google.api_core.client_options import ClientOptions

audience = "https://ENDPOINT:443"
api_endpoint = "ENDPOINT:443"

def vision_client(creds):
    """Create vision client."""
    opts = ClientOptions(api_endpoint=api_endpoint)
    return vision.ImageAnnotatorClient(credentials=creds, client_options=opts)

def main():
    """Load the default credentials and fetch a token for the OCR audience."""
    creds = None
    try:
        creds, project_id = google.auth.default()
        creds = creds.with_gdch_audience(audience)
        req = requests.Request()
        creds.refresh(req)
        print("Got token: ")
        print(creds.token)
    except Exception as e:
        print("Caught exception: " + str(e))
        raise e
    return creds

def vision_func(creds):
    """Send one image to the OCR service and print the response."""
    vc = vision_client(creds)
    image = {"content": "iVBORw0KGgoAAAANSUhEUgAAAMgAAAArCAMAAAAKVjeAAAAAA3NCSVQICAjb4U/gAAAADFBMVEX///8AAABnZ2cMDAzMh6MLAAAAX3pUWHRSYXcgcHJvZmlsZSB0eXBlIEFQUDEAAAiZ40pPzUstykxWKCjKT8vMSeVSAANjEy4TSxNLo0QDAwMLAwgwNDAwNgSSRkC2OVQo0QAFmJibpQGhuVmymSmIzwUAT7oVaBst2IwAAAEjSURBVGiB7ZRBFsMgCEShvf+d+9o0VmAwxpCuZjZGkYGfaEQoiqIoiqIoiqKoG6Sqg6lbTqK1LfwWTpUjSJ0IMnIhyAXdDaL6mwSQPpg5hgeT9H7c5sG1FES/wiA2OgkSLUPfW7wSRNWUdSAuih19drTUFnCuiyBO+6ob7WBGTPJ5tZYDJ4NAJYgvEoesUgoC+8bntgikczALSXQGJLMcuj7nOfAduQbStkm3fQnkUQACP9EZkB3mCsgZ3QEiDkRQ0r9A4K55kHaswlUmyApIVsVH04oGxO1NSoDfbw2IujmI5hX7fNeeDkDaWAbSX/cIIjY4B+KTAoj5xaDelkAEWobooW2/xyZFkH0DTF4GsZ84HIejg4x7UWuAnlSzZIqiJvUCFxYEUadKypwAAAAASUVORK5CYII="}
    features = [{"type_": vision.Feature.Type.DOCUMENT_TEXT_DETECTION}]
    # Each requests element corresponds to a single image. To annotate more
    # images, create a request element for each image and add it to
    # the array of requests
    req = {"image": image, "features": features}
    metadata = [("x-goog-user-project", "projects/PROJECT_ID")]
    resp = vc.annotate_image(req, metadata=metadata)
    print(resp)

if __name__=="__main__":
    creds = main()
    vision_func(creds)
Replace PROJECT_ID with the ID of the project that you want to use.
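The sample prints the whole response object. In practice you often want only the recognized text, and you want to fail loudly if the service reported a per-image error. The following is a minimal sketch; extract_text is a hypothetical helper, and it assumes the response exposes the error.message and full_text_annotation.text fields of the Vision client library's AnnotateImageResponse.

```python
def extract_text(response):
    """Return the detected text, raising if the service reported an error."""
    # The response carries a per-image status in `error`;
    # an empty message means the annotation succeeded.
    if response.error.message:
        raise RuntimeError(f"OCR request failed: {response.error.message}")
    return response.full_text_annotation.text
```

In the sample above, print(resp) could be replaced with print(extract_text(resp)).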
What's next
- Learn more about how to Detect text in images.
- Learn more about how to Detect text in images offline.