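Detect text in images

Optical Character Recognition (OCR)

The Vision API can detect and extract text from images. Two annotation features support optical character recognition (OCR):

TEXT_DETECTION detects and extracts text from any image. For example, a photograph might contain a street sign or traffic sign. The JSON includes the entire extracted string, as well as individual words and their bounding boxes.
DOCUMENT_TEXT_DETECTION also extracts text from an image, but the response is optimized for dense text and documents. The JSON includes page, block, paragraph, word, and break information.

Learn more about DOCUMENT_TEXT_DETECTION for handwriting extraction and text extraction from files (PDF/TIFF).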

Try it for yourself

If you're new to Google Cloud, you can create an account to evaluate how Cloud Vision performs in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

Text detection requests

Set up your Google Cloud project and authentication

If you haven't created a Google Cloud project yet, create one now by following these steps.

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Verify that billing is enabled for your Google Cloud project.

Enable the Vision API.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Install the Google Cloud CLI.

If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

To initialize the gcloud CLI, run the following command:

gcloud init

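Detect text in a local image

You can use the Vision API to perform feature detection on a local image file.

For REST requests, send the contents of the image file as a base64-encoded string in the body of your request.

For gcloud and client library requests, specify the path to a local image in your request.

gcloud

To perform text detection, use the gcloud ml vision detect-text command as shown in the following example: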

gcloud ml vision detect-text ./path/to/local/file.jpg

REST

Before using any of the request data, make the following replacements:

BASE64_ENCODED_IMAGE: The base64 representation (ASCII string) of your binary image data. This string should look similar to the following string:
- /9j/4QAYRXhpZgAA...9tAVx/zDQDlGxn//2Q==
For more information, see the base64 encoding topic.
PROJECT_ID: Your Google Cloud project ID.

HTTP method and URL:

POST https://vision.googleapis.com/v1/images:annotate

Request JSON body:

{
  "requests": [
    {
      "image": {
        "content": "BASE64_ENCODED_IMAGE"
      },
      "features": [
        {
          "type": "TEXT_DETECTION"
        }
      ]
    }
  ]
}
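
As a rough sketch of how the request body above could be produced from a local file, the following uses only the Python standard library. The helper names build_text_request and write_request_file, and the default output filename, are illustrative assumptions and not part of the Vision API.

```python
import base64
import json


def build_text_request(image_bytes: bytes, feature_type: str = "TEXT_DETECTION") -> dict:
    """Return an images:annotate request body for raw image bytes."""
    # The REST API expects the image as a base64-encoded ASCII string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [{"type": feature_type}],
            }
        ]
    }


def write_request_file(image_path: str, out_path: str = "request.json") -> None:
    """Read a local image and save the request body as JSON (hypothetical helper)."""
    with open(image_path, "rb") as image_file:
        body = build_text_request(image_file.read())
    with open(out_path, "w", encoding="utf-8") as out_file:
        json.dump(body, out_file)
```

Calling write_request_file("./path/to/local/file.jpg") would write a request.json file suitable for the curl and PowerShell commands in this section.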

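To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: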

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://vision.googleapis.com/v1/images:annotate"

PowerShell

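Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: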

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content

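If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

A TEXT_DETECTION response includes the detected phrase and its bounding box, followed by the individual words and their bounding boxes.

Response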

{
  "responses": [
    {
      "textAnnotations": [
        {
          "locale": "en",
          "description": "WAITING?\nPLEASE\nTURN OFF\nYOUR\nENGINE\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 341,
                "y": 828
              },
              {
                "x": 2249,
                "y": 828
              },
              {
                "x": 2249,
                "y": 1993
              },
              {
                "x": 341,
                "y": 1993
              }
            ]
          }
        },
        {
          "description": "WAITING?",
          "boundingPoly": {
            "vertices": [
              {
                "x": 352,
                "y": 828
              },
              {
                "x": 2248,
                "y": 911
              },
              {
                "x": 2238,
                "y": 1148
              },
              {
                "x": 342,
                "y": 1065
              }
            ]
          }
        },
        {
          "description": "PLEASE",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1210,
                "y": 1233
              },
              {
                "x": 1907,
                "y": 1263
              },
              {
                "x": 1902,
                "y": 1383
              },
              {
                "x": 1205,
                "y": 1353
              }
            ]
          }
        },
        {
          "description": "TURN",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1210,
                "y": 1418
              },
              {
                "x": 1730,
                "y": 1441
              },
              {
                "x": 1724,
                "y": 1564
              },
              {
                "x": 1205,
                "y": 1541
              }
            ]
          }
        },
        {
          "description": "OFF",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1792,
                "y": 1443
              },
              {
                "x": 2128,
                "y": 1458
              },
              {
                "x": 2122,
                "y": 1581
              },
              {
                "x": 1787,
                "y": 1566
              }
            ]
          }
        },
        {
          "description": "YOUR",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1219,
                "y": 1603
              },
              {
                "x": 1746,
                "y": 1629
              },
              {
                "x": 1740,
                "y": 1759
              },
              {
                "x": 1213,
                "y": 1733
              }
            ]
          }
        },
        {
          "description": "ENGINE",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1222,
                "y": 1771
              },
              {
                "x": 1944,
                "y": 1834
              },
              {
                "x": 1930,
                "y": 1992
              },
              {
                "x": 1208,
                "y": 1928
              }
            ]
          }
        }
      ],
      "fullTextAnnotation": {
        "pages": [
                  ...
                  ]
                },
                "paragraphs": [
                      ...
                      ]
                    },
                    "words": [
                        ...
                        },
                        "symbols": [
                        ...
                      }
                    ]
                  }
                ],
                "blockType": "TEXT"
              },
              ...
            ]
          }
        ],
        "text": "WAITING?\nPLEASE\nTURN OFF\nYOUR\nENGINE\n"
      }
    }
  ]
}
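
In the sample response above, the first textAnnotations entry carries the whole detected string and the remaining entries carry one word each. A minimal sketch of separating the two, assuming the response has already been parsed into a Python dict (the function name and the shortened sample data are illustrative only):

```python
from typing import List, Tuple


def split_text_annotations(response: dict) -> Tuple[str, List[str]]:
    """Return (full_text, words) from one entry of the "responses" array."""
    annotations = response.get("textAnnotations", [])
    if not annotations:
        # No text was detected in the image.
        return "", []
    # Assumption based on the sample response: entry 0 is the full text,
    # and every later entry is a single word.
    full_text = annotations[0]["description"]
    words = [annotation["description"] for annotation in annotations[1:]]
    return full_text, words


# Shortened, hand-written sample in the same shape as the response above.
sample = {
    "textAnnotations": [
        {"locale": "en", "description": "TURN OFF\nYOUR\n"},
        {"description": "TURN"},
        {"description": "OFF"},
        {"description": "YOUR"},
    ]
}
full_text, words = split_text_annotations(sample)
print(repr(full_text))
print(words)
```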

Go

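Before trying this sample, follow the Go setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Go API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.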


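import (
	"context"
	"fmt"
	"io"
	"os"

	vision "cloud.google.com/go/vision/apiv1"
)

// detectText gets text from the Vision API for an image at the given file path.
func detectText(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}
	// Release the client's resources when the function returns.
	defer client.Close()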

	f, err := os.Open(file)
	if err != nil {
		return err
	}
	defer f.Close()

	image, err := vision.NewImageFromReader(f)
	if err != nil {
		return err
	}
	annotations, err := client.DetectTexts(ctx, image, nil, 10)
	if err != nil {
		return err
	}

	if len(annotations) == 0 {
		fmt.Fprintln(w, "No text found.")
	} else {
		fmt.Fprintln(w, "Text:")
		for _, annotation := range annotations {
			fmt.Fprintf(w, "%q\n", annotation.Description)
		}
	}

	return nil
}

Java

Before trying this sample, follow the Java setup instructions in the Vision API quickstart using client libraries. For more information, see the Vision API Java reference documentation.


import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.EntityAnnotation;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.protobuf.ByteString;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class DetectText {
  public static void detectText() throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String filePath = "path/to/your/image/file.jpg";
    detectText(filePath);
  }

  // Detects text in the specified image.
  public static void detectText(String filePath) throws IOException {
    List<AnnotateImageRequest> requests = new ArrayList<>();

    ByteString imgBytes = ByteString.readFrom(new FileInputStream(filePath));

    Image img = Image.newBuilder().setContent(imgBytes).build();
    Feature feat = Feature.newBuilder().setType(Feature.Type.TEXT_DETECTION).build();
    AnnotateImageRequest request =
        AnnotateImageRequest.newBuilder().addFeatures(feat).setImage(img).build();
    requests.add(request);

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
      BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
      List<AnnotateImageResponse> responses = response.getResponsesList();

      for (AnnotateImageResponse res : responses) {
        if (res.hasError()) {
          System.out.format("Error: %s%n", res.getError().getMessage());
          return;
        }

        // For full list of available annotations, see http://g.co/cloud/vision/docs
        for (EntityAnnotation annotation : res.getTextAnnotationsList()) {
          System.out.format("Text: %s%n", annotation.getDescription());
          System.out.format("Position : %s%n", annotation.getBoundingPoly());
        }
      }
    }
  }
}

Node.js

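Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.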

const vision = require('@google-cloud/vision');

// Creates a client
const client = new vision.ImageAnnotatorClient();

/**
 * TODO(developer): Uncomment the following line before running the sample.
 */
// const fileName = 'Local image file, e.g. /path/to/image.png';

// Performs text detection on the local file
const [result] = await client.textDetection(fileName);
const detections = result.textAnnotations;
console.log('Text:');
detections.forEach(text => console.log(text));

Python

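Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.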

def detect_text(path):
    """Detects text in the file."""
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    with open(path, "rb") as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print("Texts:")

    for text in texts:
        print(f'\n"{text.description}"')

        vertices = [
            f"({vertex.x},{vertex.y})" for vertex in text.bounding_poly.vertices
        ]

        print("bounds: {}".format(",".join(vertices)))

    if response.error.message:
        raise Exception(
            "{}\nFor more info on error messages, check: "
            "https://cloud.google.com/apis/design/errors".format(response.error.message)
        )

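Additional languages

C#: Follow the C# setup instructions on the client libraries page, and then see the Vision reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, and then see the Vision reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, and then see the Vision reference documentation for Ruby.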

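Detect text in a remote image

You can use the Vision API to perform feature detection on a remote image file that is located in Cloud Storage or on the web. To send a remote file request, specify the file's web URL or Cloud Storage URI in the request body.

gcloud

To perform text detection, use the gcloud ml vision detect-text command as shown in the following example: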

gcloud ml vision detect-text gs://cloud-samples-data/vision/ocr/sign.jpg

REST

Before using any of the request data, make the following replacements:

CLOUD_STORAGE_IMAGE_URI: The path to a valid image file in a Cloud Storage bucket. You must have at least read privileges to the file. Example:
- gs://cloud-samples-data/vision/ocr/sign.jpg
PROJECT_ID: Your Google Cloud project ID.

HTTP method and URL:

POST https://vision.googleapis.com/v1/images:annotate

Request JSON body:

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "CLOUD_STORAGE_IMAGE_URI"
        }
       },
       "features": [
         {
           "type": "TEXT_DETECTION"
         }
       ]
    }
  ]
}
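
The remote request body differs from the local one only in that it points at the image through image.source.imageUri instead of embedding base64 content. The following sketch builds such a body and rejects values that are neither a Cloud Storage URI nor a web URL. The function name and the scheme check are assumptions added for illustration; the API performs its own validation.

```python
from urllib.parse import urlparse

# Schemes accepted by this sketch: Cloud Storage URIs and web URLs.
ALLOWED_SCHEMES = {"gs", "http", "https"}


def build_remote_text_request(image_uri: str, feature_type: str = "TEXT_DETECTION") -> dict:
    """Return an images:annotate request body that references a remote image."""
    scheme = urlparse(image_uri).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Expected a gs:// or http(s):// URI, got: {image_uri!r}")
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": feature_type}],
            }
        ]
    }


body = build_remote_text_request("gs://cloud-samples-data/vision/ocr/sign.jpg")
print(body["requests"][0]["image"]["source"]["imageUri"])
```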

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://vision.googleapis.com/v1/images:annotate"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://vision.googleapis.com/v1/images:annotate" | Select-Object -Expand Content

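If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

A TEXT_DETECTION response includes the detected phrase and its bounding box, followed by the individual words and their bounding boxes.

Response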

{
  "responses": [
    {
      "textAnnotations": [
        {
          "locale": "en",
          "description": "WAITING?\nPLEASE\nTURN OFF\nYOUR\nENGINE\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 341,
                "y": 828
              },
              {
                "x": 2249,
                "y": 828
              },
              {
                "x": 2249,
                "y": 1993
              },
              {
                "x": 341,
                "y": 1993
              }
            ]
          }
        },
        {
          "description": "WAITING?",
          "boundingPoly": {
            "vertices": [
              {
                "x": 352,
                "y": 828
              },
              {
                "x": 2248,
                "y": 911
              },
              {
                "x": 2238,
                "y": 1148
              },
              {
                "x": 342,
                "y": 1065
              }
            ]
          }
        },
        {
          "description": "PLEASE",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1210,
                "y": 1233
              },
              {
                "x": 1907,
                "y": 1263
              },
              {
                "x": 1902,
                "y": 1383
              },
              {
                "x": 1205,
                "y": 1353
              }
            ]
          }
        },
        {
          "description": "TURN",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1210,
                "y": 1418
              },
              {
                "x": 1730,
                "y": 1441
              },
              {
                "x": 1724,
                "y": 1564
              },
              {
                "x": 1205,
                "y": 1541
              }
            ]
          }
        },
        {
          "description": "OFF",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1792,
                "y": 1443
              },
              {
                "x": 2128,
                "y": 1458
              },
              {
                "x": 2122,
                "y": 1581
              },
              {
                "x": 1787,
                "y": 1566
              }
            ]
          }
        },
        {
          "description": "YOUR",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1219,
                "y": 1603
              },
              {
                "x": 1746,
                "y": 1629
              },
              {
                "x": 1740,
                "y": 1759
              },
              {
                "x": 1213,
                "y": 1733
              }
            ]
          }
        },
        {
          "description": "ENGINE",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1222,
                "y": 1771
              },
              {
                "x": 1944,
                "y": 1834
              },
              {
                "x": 1930,
                "y": 1992
              },
              {
                "x": 1208,
                "y": 1928
              }
            ]
          }
        }
      ],
      "fullTextAnnotation": {
        "pages": [
                  ...
                  ]
                },
                "paragraphs": [
                      ...
                      ]
                    },
                    "words": [
                        ...
                        },
                        "symbols": [
                        ...
                      }
                    ]
                  }
                ],
                "blockType": "TEXT"
              },
              ...
            ]
          }
        ],
        "text": "WAITING?\nPLEASE\nTURN OFF\nYOUR\nENGINE\n"
      }
    }
  ]
}
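
Each boundingPoly in the response lists four vertices, and for tilted text (such as "WAITING?" above) they do not form an axis-aligned rectangle. A small sketch of deriving an enclosing axis-aligned rectangle from those vertices follows. The function name is illustrative, and treating a missing x or y as 0 is an assumption about how zero-valued coordinates may be serialized.

```python
from typing import Dict, List, Tuple


def bounding_rectangle(vertices: List[Dict[str, int]]) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of the rectangle enclosing the vertices."""
    # Assumption: a coordinate that is absent from the JSON is treated as 0.
    xs = [vertex.get("x", 0) for vertex in vertices]
    ys = [vertex.get("y", 0) for vertex in vertices]
    left, top = min(xs), min(ys)
    return left, top, max(xs) - left, max(ys) - top


# Vertices of the "WAITING?" word from the sample response.
waiting = [
    {"x": 352, "y": 828},
    {"x": 2248, "y": 911},
    {"x": 2238, "y": 1148},
    {"x": 342, "y": 1065},
]
print(bounding_rectangle(waiting))  # (342, 828, 1906, 320)
```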

Go

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


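import (
	"context"
	"fmt"
	"io"

	vision "cloud.google.com/go/vision/apiv1"
)

// detectTextURI gets text from the Vision API for an image at the given URI.
func detectTextURI(w io.Writer, file string) error {
	ctx := context.Background()

	client, err := vision.NewImageAnnotatorClient(ctx)
	if err != nil {
		return err
	}
	// Release the client's resources when the function returns.
	defer client.Close()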

	image := vision.NewImageFromURI(file)
	annotations, err := client.DetectTexts(ctx, image, nil, 10)
	if err != nil {
		return err
	}

	if len(annotations) == 0 {
		fmt.Fprintln(w, "No text found.")
	} else {
		fmt.Fprintln(w, "Text:")
		for _, annotation := range annotations {
			fmt.Fprintf(w, "%q\n", annotation.Description)
		}
	}

	return nil
}

Java


import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.EntityAnnotation;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.ImageSource;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class DetectTextGcs {

  public static void detectTextGcs() throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String filePath = "gs://your-gcs-bucket/path/to/image/file.jpg";
    detectTextGcs(filePath);
  }

  // Detects text in the specified remote image on Google Cloud Storage.
  public static void detectTextGcs(String gcsPath) throws IOException {
    List<AnnotateImageRequest> requests = new ArrayList<>();

    ImageSource imgSource = ImageSource.newBuilder().setGcsImageUri(gcsPath).build();
    Image img = Image.newBuilder().setSource(imgSource).build();
    Feature feat = Feature.newBuilder().setType(Feature.Type.TEXT_DETECTION).build();
    AnnotateImageRequest request =
        AnnotateImageRequest.newBuilder().addFeatures(feat).setImage(img).build();
    requests.add(request);

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient client = ImageAnnotatorClient.create()) {
      BatchAnnotateImagesResponse response = client.batchAnnotateImages(requests);
      List<AnnotateImageResponse> responses = response.getResponsesList();

      for (AnnotateImageResponse res : responses) {
        if (res.hasError()) {
          System.out.format("Error: %s%n", res.getError().getMessage());
          return;
        }

        // For full list of available annotations, see http://g.co/cloud/vision/docs
        for (EntityAnnotation annotation : res.getTextAnnotationsList()) {
          System.out.format("Text: %s%n", annotation.getDescription());
          System.out.format("Position : %s%n", annotation.getBoundingPoly());
        }
      }
    }
  }
}

Node.js

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client libraries
const vision = require('@google-cloud/vision');

// Creates a client
const client = new vision.ImageAnnotatorClient();

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const bucketName = 'Bucket where the file resides, e.g. my-bucket';
// const fileName = 'Path to file within bucket, e.g. path/to/image.png';

// Performs text detection on the gcs file
const [result] = await client.textDetection(`gs://${bucketName}/${fileName}`);
const detections = result.textAnnotations;
console.log('Text:');
detections.forEach(text => console.log(text));

Python

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

def detect_text_uri(uri):
    """Detects text in the file located in Google Cloud Storage or on the Web."""
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    image = vision.Image()
    image.source.image_uri = uri

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print("Texts:")

    for text in texts:
        print(f'\n"{text.description}"')

        vertices = [
            f"({vertex.x},{vertex.y})" for vertex in text.bounding_poly.vertices
        ]

        print("bounds: {}".format(",".join(vertices)))

    if response.error.message:
        raise Exception(
            "{}\nFor more info on error messages, check: "
            "https://cloud.google.com/apis/design/errors".format(response.error.message)
        )

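Additional languages

C#: Follow the C# setup instructions on the client libraries page, and then see the Vision reference documentation for .NET.

PHP: Follow the PHP setup instructions on the client libraries page, and then see the Vision reference documentation for PHP.

Ruby: Follow the Ruby setup instructions on the client libraries page, and then see the Vision reference documentation for Ruby.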

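Specify the language (optional)

Both types of OCR requests support one or more languageHints that specify the language of any text in the image. However, an empty value usually yields the best results, because omitting a value enables automatic language detection. For languages based on the Latin alphabet, setting languageHints is not needed. In rare cases, when the language of the text in the image is known, setting a hint helps get better results, although a wrong hint can be a significant hindrance. Text detection returns an error if one or more of the specified languages is not a supported language.

If you choose to provide a language hint, modify the body of your request (the request.json file) to provide the string of one of the supported languages in the imageContext.languageHints field, as shown in the following sample: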

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "IMAGE_URL"
        }
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "imageContext": {
        "languageHints": ["en-t-i0-handwrit"]
      }
    }
  ]
}
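
Because hints are optional and usually best left empty, one way to handle them is to add imageContext.languageHints only when the caller supplies at least one hint. The following is a minimal sketch of that idea. The helper name with_language_hints is hypothetical, and the sketch does not check whether a hint is a supported language.

```python
import copy
from typing import List


def with_language_hints(request_body: dict, hints: List[str]) -> dict:
    """Return a copy of the request body with languageHints set on every request."""
    body = copy.deepcopy(request_body)
    if not hints:
        # No hints: leave the body unchanged so automatic detection is used.
        return body
    for request in body.get("requests", []):
        request.setdefault("imageContext", {})["languageHints"] = list(hints)
    return body


base = {
    "requests": [
        {
            "image": {"source": {"imageUri": "IMAGE_URL"}},
            "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
        }
    ]
}
hinted = with_language_hints(base, ["en-t-i0-handwrit"])
print(hinted["requests"][0]["imageContext"])
```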

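Multi-regional support

Multi-regional support currently applies only to the OCR features (the TEXT_DETECTION and DOCUMENT_TEXT_DETECTION types).

You can now specify continent-level data storage and OCR processing. The following regions are currently supported:

us: USA country only
eu: The European Union

Locations

Cloud Vision gives you some control over where the resources for your project are stored and processed. In particular, you can configure Cloud Vision to store and process your data only in the European Union.

By default, Cloud Vision stores and processes resources in a global location, which means that it doesn't guarantee that your resources remain within a particular location or region. If you choose the European Union location, Google stores and processes your data only in the European Union. You and your users can access the data from any location.

Setting the location using the API

The Vision API supports a global API endpoint (vision.googleapis.com) and two region-based endpoints: a European Union endpoint (eu-vision.googleapis.com) and a United States endpoint (us-vision.googleapis.com). Use these endpoints for region-specific processing. For example, to store and process your data in the European Union only, use the URI eu-vision.googleapis.com in place of vision.googleapis.com for your REST API calls: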

https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/images:annotate
https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/images:asyncBatchAnnotate
https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/files:annotate
https://eu-vision.googleapis.com/v1/projects/PROJECT_ID/locations/eu/files:asyncBatchAnnotate

To store and process your data in the United States only, use the US endpoint (us-vision.googleapis.com) with the preceding methods.
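
The URL patterns above can be assembled programmatically. The sketch below maps a region identifier to its host and builds the method URL, falling back to the global endpoint when no region is given. The function name annotate_url and its parameters are assumptions made for illustration.

```python
from typing import Optional

GLOBAL_HOST = "vision.googleapis.com"
# Region identifiers and hosts as listed in this section.
REGIONAL_HOSTS = {
    "us": "us-vision.googleapis.com",
    "eu": "eu-vision.googleapis.com",
}


def annotate_url(project_id: str, region: Optional[str] = None,
                 method: str = "images:annotate") -> str:
    """Build the REST URL for a Vision method, optionally pinned to a region."""
    if region is None:
        # The global endpoint does not carry a project or location in the path.
        return f"https://{GLOBAL_HOST}/v1/{method}"
    if region not in REGIONAL_HOSTS:
        raise ValueError(f"Unsupported region: {region!r}")
    host = REGIONAL_HOSTS[region]
    return f"https://{host}/v1/projects/{project_id}/locations/{region}/{method}"


print(annotate_url("PROJECT_ID", "eu"))
print(annotate_url("PROJECT_ID", "us", "files:asyncBatchAnnotate"))
```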

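Setting the location using the client libraries

The Vision API client libraries access the global API endpoint (vision.googleapis.com) by default. To store and process your data in the European Union only, you must explicitly set the endpoint (eu-vision.googleapis.com). The following code samples show how to configure this setting.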

REST

Before using any of the request data, make the following replacements:

REGION_ID: One of the valid regional location identifiers:
- us: USA country only
- eu: The European Union
CLOUD_STORAGE_IMAGE_URI: The path to a valid image file in a Cloud Storage bucket. You must have at least read privileges to the file. Example:
- gs://cloud-samples-data/vision/ocr/sign.jpg
PROJECT_ID: Your Google Cloud project ID.

HTTP method and URL:

POST https://REGION_ID-vision.googleapis.com/v1/projects/PROJECT_ID/locations/REGION_ID/images:annotate

Request JSON body:

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "CLOUD_STORAGE_IMAGE_URI"
        }
       },
       "features": [
         {
           "type": "TEXT_DETECTION"
         }
       ]
    }
  ]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://REGION_ID-vision.googleapis.com/v1/projects/PROJECT_ID/locations/REGION_ID/images:annotate"

PowerShell

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://REGION_ID-vision.googleapis.com/v1/projects/PROJECT_ID/locations/REGION_ID/images:annotate" | Select-Object -Expand Content

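If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format.

A TEXT_DETECTION response includes the detected phrase and its bounding box, followed by the individual words and their bounding boxes.

Response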

{
  "responses": [
    {
      "textAnnotations": [
        {
          "locale": "en",
          "description": "WAITING?\nPLEASE\nTURN OFF\nYOUR\nENGINE\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 341,
                "y": 828
              },
              {
                "x": 2249,
                "y": 828
              },
              {
                "x": 2249,
                "y": 1993
              },
              {
                "x": 341,
                "y": 1993
              }
            ]
          }
        },
        {
          "description": "WAITING?",
          "boundingPoly": {
            "vertices": [
              {
                "x": 352,
                "y": 828
              },
              {
                "x": 2248,
                "y": 911
              },
              {
                "x": 2238,
                "y": 1148
              },
              {
                "x": 342,
                "y": 1065
              }
            ]
          }
        },
        {
          "description": "PLEASE",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1210,
                "y": 1233
              },
              {
                "x": 1907,
                "y": 1263
              },
              {
                "x": 1902,
                "y": 1383
              },
              {
                "x": 1205,
                "y": 1353
              }
            ]
          }
        },
        {
          "description": "TURN",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1210,
                "y": 1418
              },
              {
                "x": 1730,
                "y": 1441
              },
              {
                "x": 1724,
                "y": 1564
              },
              {
                "x": 1205,
                "y": 1541
              }
            ]
          }
        },
        {
          "description": "OFF",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1792,
                "y": 1443
              },
              {
                "x": 2128,
                "y": 1458
              },
              {
                "x": 2122,
                "y": 1581
              },
              {
                "x": 1787,
                "y": 1566
              }
            ]
          }
        },
        {
          "description": "YOUR",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1219,
                "y": 1603
              },
              {
                "x": 1746,
                "y": 1629
              },
              {
                "x": 1740,
                "y": 1759
              },
              {
                "x": 1213,
                "y": 1733
              }
            ]
          }
        },
        {
          "description": "ENGINE",
          "boundingPoly": {
            "vertices": [
              {
                "x": 1222,
                "y": 1771
              },
              {
                "x": 1944,
                "y": 1834
              },
              {
                "x": 1930,
                "y": 1992
              },
              {
                "x": 1208,
                "y": 1928
              }
            ]
          }
        }
      ],
      "fullTextAnnotation": {
        "pages": [
                  ...
                  ]
                },
                "paragraphs": [
                      ...
                      ]
                    },
                    "words": [
                        ...
                        },
                        "symbols": [
                        ...
                      }
                    ]
                  }
                ],
                "blockType": "TEXT"
              },
              ...
            ]
          }
        ],
        "text": "WAITING?\nPLEASE\nTURN OFF\nYOUR\nENGINE\n"
      }
    }
  ]
}

Go

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"

	vision "cloud.google.com/go/vision/apiv1"
	"google.golang.org/api/option"
)

// setEndpoint changes your endpoint.
func setEndpoint(endpoint string) error {
	// endpoint := "eu-vision.googleapis.com:443"

	ctx := context.Background()
	client, err := vision.NewImageAnnotatorClient(ctx, option.WithEndpoint(endpoint))
	if err != nil {
		return fmt.Errorf("NewImageAnnotatorClient: %w", err)
	}
	defer client.Close()

	return nil
}

Java

ImageAnnotatorSettings settings =
    ImageAnnotatorSettings.newBuilder().setEndpoint("eu-vision.googleapis.com:443").build();

// Initialize client that will be used to send requests. This client only needs to be created
// once, and can be reused for multiple requests. After completing all of your requests, call
// the "close" method on the client to safely clean up any remaining background resources.
ImageAnnotatorClient client = ImageAnnotatorClient.create(settings);

Node.js

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

// Imports the Google Cloud client library
const vision = require('@google-cloud/vision');

async function setEndpoint() {
  // Specifies the location of the api endpoint
  const clientOptions = {apiEndpoint: 'eu-vision.googleapis.com'};

  // Creates a client
  const client = new vision.ImageAnnotatorClient(clientOptions);

  // Performs text detection on the image file
  const [result] = await client.textDetection('./resources/wakeupcat.jpg');
  const labels = result.textAnnotations;
  console.log('Text:');
  labels.forEach(label => console.log(label.description));
}
setEndpoint();

Python

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import vision

client_options = {"api_endpoint": "eu-vision.googleapis.com"}

client = vision.ImageAnnotatorClient(client_options=client_options)

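Try it

Try text detection and document text detection with the following request. You can use the image that is already specified (gs://cloud-samples-data/vision/ocr/sign.jpg), or specify your own image in its place.

To try document text detection, update the value of type to DOCUMENT_TEXT_DETECTION.

Image of a road sign

Request body: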

{
  "requests": [
    {
      "features": [
        {
          "type": "TEXT_DETECTION"
        }
      ],
      "image": {
        "source": {
          "imageUri": "gs://cloud-samples-data/vision/ocr/sign.jpg"
        }
      }
    }
  ]
}