
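Offline batch image annotation

The Vision API can run offline (asynchronous) detection and annotation on a large batch of image files using any Vision feature type. For example, you can specify one or more Vision API features (such as TEXT_DETECTION, LABEL_DETECTION, and LANDMARK_DETECTION) for a single batch of images.

Output from an offline batch request is written to JSON files created in the Cloud Storage bucket that you specify.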

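Online (synchronous) requests - An online annotation request (images:annotate or files:annotate) immediately returns inline annotations to the user. Online annotation requests limit the number of files you can annotate in a single request. With an images:annotate request you can specify only a small number of images (16 or fewer) to annotate. With a files:annotate request you can specify only a single file and a small number of pages (5 or fewer) in that file to annotate.
Offline (asynchronous) requests - An offline annotation request (images:asyncBatchAnnotate or files:asyncBatchAnnotate) starts a long-running operation (LRO) and does not immediately return a response to the caller. When the LRO completes, the annotations are stored as files in the Cloud Storage bucket that you specify. An images:asyncBatchAnnotate request lets you specify up to 2,000 images per request. A files:asyncBatchAnnotate request lets you specify larger batches of files than online requests allow, and lets you annotate more pages (2,000 or fewer) per file at a time.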
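The image-count limits above can be turned into a simple pre-flight check. The following is a minimal sketch in plain Python, not part of the Vision client library; the function name `choose_image_method` and the two constants are hypothetical, and the check assumes batch size is the only deciding factor (in practice, latency needs also matter).

```python
# Limits taken from the text above; the names are illustrative only.
ONLINE_MAX_IMAGES = 16      # images:annotate
OFFLINE_MAX_IMAGES = 2000   # images:asyncBatchAnnotate


def choose_image_method(image_count: int) -> str:
    """Return the method name whose per-request limit fits `image_count` images."""
    if image_count < 1:
        raise ValueError("image_count must be at least 1")
    if image_count <= ONLINE_MAX_IMAGES:
        return "images:annotate"
    if image_count <= OFFLINE_MAX_IMAGES:
        return "images:asyncBatchAnnotate"
    raise ValueError(
        f"{image_count} images exceed the {OFFLINE_MAX_IMAGES}-image limit; "
        "split the work into several requests"
    )
```

A batch of 16 images fits an online request, while 17 images already require the offline method.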

Limitations

The Vision API accepts up to 2,000 image files per batch. A batch with more image files than this returns an error.
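One way to stay under this limit is to split a long list of image URIs into groups of at most 2,000 and send one request per group. The sketch below is plain Python under that assumption; `chunk_image_uris` and `MAX_IMAGES_PER_BATCH` are hypothetical names, not Vision API identifiers.

```python
from typing import List, Sequence

MAX_IMAGES_PER_BATCH = 2000  # limit stated above


def chunk_image_uris(
    uris: Sequence[str], size: int = MAX_IMAGES_PER_BATCH
) -> List[List[str]]:
    """Split `uris` into consecutive lists of at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [list(uris[start:start + size]) for start in range(0, len(uris), size)]


# Example: 4,500 hypothetical URIs become batches of 2000, 2000, and 500.
example_uris = [f"gs://YOUR_BUCKET_ID/images/{i}.jpg" for i in range(4500)]
batch_sizes = [len(batch) for batch in chunk_image_uris(example_uris)]
```

Each resulting batch can then be submitted as its own offline request.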

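Currently supported feature types

Feature type	Description
`CROP_HINTS`	Determines suggested vertices for a crop region on an image.
`DOCUMENT_TEXT_DETECTION`	Performs OCR on dense text images, such as documents (PDF/TIFF), and on images with handwriting. `TEXT_DETECTION` can be used for sparse text images. Takes precedence when both `DOCUMENT_TEXT_DETECTION` and `TEXT_DETECTION` are present.
`FACE_DETECTION`	Detects faces within the image.
`IMAGE_PROPERTIES`	Computes a set of image properties, such as the image's dominant colors.
`LABEL_DETECTION`	Adds labels based on image content.
`LANDMARK_DETECTION`	Detects geographic landmarks within the image.
`LOGO_DETECTION`	Detects company logos within the image.
`OBJECT_LOCALIZATION`	Detects and extracts multiple objects in an image.
`SAFE_SEARCH_DETECTION`	Runs SafeSearch to detect potentially unsafe or undesirable content.
`TEXT_DETECTION`	Performs optical character recognition (OCR) on text within the image. Text detection is optimized for areas of sparse text within a larger image. If the image is a document (PDF/TIFF), has dense text, or contains handwriting, use `DOCUMENT_TEXT_DETECTION` instead.
`WEB_DETECTION`	Uses Google Image Search to detect topical entities such as news, events, or celebrities within the image, and to find similar images on the web.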

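Sample code

Use the following code samples to run offline annotation on a batch of image files in Cloud Storage.

Note: In the following code samples, each request element (requests_element/requestsElement) corresponds to a single image. To annotate more images, create a request element for each image and add it to the array of requests (requests).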
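To illustrate the note above, the following sketch builds one request element per image and collects them in a single `requests` array. It produces a plain dictionary and does not call the Vision API; the helper name `build_batch_request` is hypothetical, and the camelCase field names (`imageUri`, `outputConfig`, `gcsDestination`, `batchSize`) are the ones used in the Node.js sample on this page.

```python
from typing import Any, Dict, List, Sequence


def build_batch_request(
    image_uris: Sequence[str],
    feature_types: Sequence[str],
    output_uri: str,
    batch_size: int = 2,
) -> Dict[str, Any]:
    """Build a request body with one request element per image URI."""
    requests: List[Dict[str, Any]] = []
    for uri in image_uris:
        requests.append(
            {
                "image": {"source": {"imageUri": uri}},
                # Every image gets its own copy of the feature list.
                "features": [{"type": name} for name in feature_types],
            }
        )
    return {
        "requests": requests,
        "outputConfig": {
            "gcsDestination": {"uri": output_uri},
            "batchSize": batch_size,  # max responses per output JSON file
        },
    }


body = build_batch_request(
    ["gs://YOUR_BUCKET_ID/a.jpg", "gs://YOUR_BUCKET_ID/b.jpg"],
    ["LABEL_DETECTION", "IMAGE_PROPERTIES"],
    "gs://YOUR_BUCKET_ID/path/to/save/results/",
)
```

Here `body["requests"]` holds two elements, one per image, each requesting the same two features.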

Java

Before trying this sample, follow the Java setup instructions in the Vision API quickstart using client libraries. For more information, see the Vision API Java reference documentation.

import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AsyncBatchAnnotateImagesRequest;
import com.google.cloud.vision.v1.AsyncBatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.GcsDestination;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.cloud.vision.v1.ImageSource;
import com.google.cloud.vision.v1.OutputConfig;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

public class AsyncBatchAnnotateImages {

  public static void asyncBatchAnnotateImages()
      throws InterruptedException, ExecutionException, IOException {
    String inputImageUri = "gs://cloud-samples-data/vision/label/wakeupcat.jpg";
    String outputUri = "gs://YOUR_BUCKET_ID/path/to/save/results/";
    asyncBatchAnnotateImages(inputImageUri, outputUri);
  }

  public static void asyncBatchAnnotateImages(String inputImageUri, String outputUri)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ImageAnnotatorClient imageAnnotatorClient = ImageAnnotatorClient.create()) {

      // You can send multiple images to be annotated, this sample demonstrates how to do this with
      // one image. If you want to use multiple images, you have to create a `AnnotateImageRequest`
      // object for each image that you want annotated.
      // First specify where the vision api can find the image
      ImageSource source = ImageSource.newBuilder().setImageUri(inputImageUri).build();
      Image image = Image.newBuilder().setSource(source).build();

      // Set the type of annotation you want to perform on the image
      // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
      Feature feature = Feature.newBuilder().setType(Feature.Type.LABEL_DETECTION).build();

      // Build the request object for that one image. Note: for additional images you have to create
      // additional `AnnotateImageRequest` objects and store them in a list to be used below.
      AnnotateImageRequest imageRequest =
          AnnotateImageRequest.newBuilder().setImage(image).addFeatures(feature).build();

      // Set where to store the results for the images that will be annotated.
      GcsDestination gcsDestination = GcsDestination.newBuilder().setUri(outputUri).build();
      OutputConfig outputConfig =
          OutputConfig.newBuilder()
              .setGcsDestination(gcsDestination)
              .setBatchSize(2) // The max number of responses to output in each JSON file
              .build();

      // Add each `AnnotateImageRequest` object to the batch request and add the output config.
      AsyncBatchAnnotateImagesRequest request =
          AsyncBatchAnnotateImagesRequest.newBuilder()
              .addRequests(imageRequest)
              .setOutputConfig(outputConfig)
              .build();

      // Make the asynchronous batch request.
      AsyncBatchAnnotateImagesResponse response =
          imageAnnotatorClient.asyncBatchAnnotateImagesAsync(request).get();

      // The output is written to GCS with the provided output_uri as prefix
      String gcsOutputUri = response.getOutputConfig().getGcsDestination().getUri();
      System.out.format("Output written to GCS with prefix: %s%n", gcsOutputUri);
    }
  }
}

Node.js

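Before trying this sample, follow the Node.js setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Node.js API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.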

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const inputImageUri = 'gs://cloud-samples-data/vision/label/wakeupcat.jpg';
// const outputUri = 'gs://YOUR_BUCKET_ID/path/to/save/results/';

// Imports the Google Cloud client libraries
const {ImageAnnotatorClient} = require('@google-cloud/vision').v1;

// Instantiates a client
const client = new ImageAnnotatorClient();

// You can send multiple images to be annotated, this sample demonstrates how to do this with
// one image. If you want to use multiple images, you have to create a request object for each image that you want annotated.
async function asyncBatchAnnotateImages() {
  // Set the type of annotation you want to perform on the image
  // https://cloud.google.com/vision/docs/reference/rpc/google.cloud.vision.v1#google.cloud.vision.v1.Feature.Type
  const features = [{type: 'LABEL_DETECTION'}];

  // Build the image request object for that one image. Note: for additional images you have to create
  // additional image request objects and store them in a list to be used below.
  const imageRequest = {
    image: {
      source: {
        imageUri: inputImageUri,
      },
    },
    features: features,
  };

  // Set where to store the results for the images that will be annotated.
  const outputConfig = {
    gcsDestination: {
      uri: outputUri,
    },
    batchSize: 2, // The max number of responses to output in each JSON file
  };

  // Add each image request object to the batch request and add the output config.
  const request = {
    requests: [
      imageRequest, // add additional request objects here
    ],
    outputConfig,
  };

  // Make the asynchronous batch request.
  const [operation] = await client.asyncBatchAnnotateImages(request);

  // Wait for the operation to complete
  const [filesResponse] = await operation.promise();

  // The output is written to GCS with the provided output_uri as prefix
  const destinationUri = filesResponse.outputConfig.gcsDestination.uri;
  console.log(`Output written to GCS with prefix: ${destinationUri}`);
}

asyncBatchAnnotateImages();

Python

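Before trying this sample, follow the Python setup instructions in the Vision quickstart using client libraries. For more information, see the Vision Python API reference documentation.

To authenticate to Vision, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.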


from google.cloud import vision_v1

def sample_async_batch_annotate_images(
    input_image_uri="gs://cloud-samples-data/vision/label/wakeupcat.jpg",
    output_uri="gs://your-bucket/prefix/",
):
    """Perform async batch image annotation."""
    client = vision_v1.ImageAnnotatorClient()

    source = {"image_uri": input_image_uri}
    image = {"source": source}
    features = [
        {"type_": vision_v1.Feature.Type.LABEL_DETECTION},
        {"type_": vision_v1.Feature.Type.IMAGE_PROPERTIES},
    ]

    # Each requests element corresponds to a single image.  To annotate more
    # images, create a request element for each image and add it to
    # the array of requests
    requests = [{"image": image, "features": features}]
    gcs_destination = {"uri": output_uri}

    # The max number of responses to output in each JSON file
    batch_size = 2
    output_config = {"gcs_destination": gcs_destination, "batch_size": batch_size}

    operation = client.async_batch_annotate_images(
        requests=requests, output_config=output_config
    )

    print("Waiting for operation to complete...")
    response = operation.result(90)

    # The output is written to GCS with the provided output_uri as prefix
    gcs_output_uri = response.output_config.gcs_destination.uri
    print(f"Output written to GCS with prefix: {gcs_output_uri}")

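Response

A successful request writes response JSON files to the Cloud Storage bucket that you specified in the code sample. The number of responses per JSON file is determined by batch_size in the code sample.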
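The effect of batch_size on the output can be sketched as follows. The file-name pattern `output-<first>-to-<last>.json` is an assumption inferred from the sample file names shown on this page (`output-1-to-2.json`, `output-3-to-3.json`), not a documented guarantee, and `expected_output_names` is a hypothetical helper.

```python
from typing import List


def expected_output_names(response_count: int, batch_size: int) -> List[str]:
    """List the output file names expected for `response_count` responses.

    Assumes the 1-based `output-<first>-to-<last>.json` pattern seen in
    this page's samples.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    names = []
    for first in range(1, response_count + 1, batch_size):
        last = min(first + batch_size - 1, response_count)
        names.append(f"output-{first}-to-{last}.json")
    return names


names = expected_output_names(response_count=3, batch_size=2)
# names == ["output-1-to-2.json", "output-3-to-3.json"]
```

With three images and a batch_size of 2, the first file holds two responses and the second holds the remaining one.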

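The returned response is similar to a regular Vision API feature response, depending on which features you request for an image.

The following responses show LABEL_DETECTION and TEXT_DETECTION annotations for image1.png, IMAGE_PROPERTIES annotations for image2.jpg, and OBJECT_LOCALIZATION annotations for image3.jpg.

Each response also includes a context field that shows the URI of the file.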

`offline_batch_output/output-1-to-2.json`

{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/07s6nbt",
          "description": "Text",
          "score": 0.93413997,
          "topicality": 0.93413997
        },
        {
          "mid": "/m/0dwx7",
          "description": "Logo",
          "score": 0.8733531,
          "topicality": 0.8733531
        },
        ...
        {
          "mid": "/m/03bxgrp",
          "description": "Company",
          "score": 0.5682425,
          "topicality": 0.5682425
        }
      ],
      "textAnnotations": [
        {
          "locale": "en",
          "description": "Google\n",
          "boundingPoly": {
            "vertices": [
              {
                "x": 72,
                "y": 40
              },
              {
                "x": 613,
                "y": 40
              },
              {
                "x": 613,
                "y": 233
              },
              {
                "x": 72,
                "y": 233
              }
            ]
          }
        },
        ...
                ],
                "blockType": "TEXT"
              }
            ]
          }
        ],
        "text": "Google\n"
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image1.png"
      }
    },
    {
      "imagePropertiesAnnotation": {
        "dominantColors": {
          "colors": [
            {
              "color": {
                "red": 229,
                "green": 230,
                "blue": 238
              },
              "score": 0.2744754,
              "pixelFraction": 0.075339235
            },
            ...
            {
              "color": {
                "red": 86,
                "green": 87,
                "blue": 95
              },
              "score": 0.025770646,
              "pixelFraction": 0.13109145
            }
          ]
        }
      },
      "cropHintsAnnotation": {
        "cropHints": [
          {
            "boundingPoly": {
              "vertices": [
                {},
                {
                  "x": 1599
                },
                {
                  "x": 1599,
                  "y": 1199
                },
                {
                  "y": 1199
                }
              ]
            },
            "confidence": 0.79999995,
            "importanceFraction": 1
          }
        ]
      },
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image2.jpg"
      }
    }
  ]
}

`offline_batch_output/output-3-to-3.json`

{
  "responses": [
    {
      "context": {
        "uri": "gs://cloud-samples-data/vision/document_understanding/image3.jpg"
      },
      "localizedObjectAnnotations": [
        {
          "mid": "/m/0bt9lr",
          "name": "Dog",
          "score": 0.9669734,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.6035543,
                "y": 0.1357359
              },
              {
                "x": 0.98546547,
                "y": 0.1357359
              },
              {
                "x": 0.98546547,
                "y": 0.98426414
              },
              {
                "x": 0.6035543,
                "y": 0.98426414
              }
            ]
          }
        },
        ...
        {
          "mid": "/m/0jbk",
          "name": "Animal",
          "score": 0.58003056,
          "boundingPoly": {
            "normalizedVertices": [
              {
                "x": 0.014534635,
                "y": 0.1357359
              },
              {
                "x": 0.37197515,
                "y": 0.1357359
              },
              {
                "x": 0.37197515,
                "y": 0.98426414
              },
              {
                "x": 0.014534635,
                "y": 0.98426414
              }
            ]
          }
        }
      ]
    }
  ]
}
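Once an output file has been downloaded, the context field makes it possible to match each response to its source image. The sketch below is a minimal example using only the standard library; `summarize_responses` is a hypothetical helper, and the input is a trimmed stand-in for a real output file, because the samples above contain `...` elisions and are not valid JSON as printed.

```python
import json
from typing import Dict, List


def summarize_responses(output_json: str) -> Dict[str, List[str]]:
    """Map each image URI to the sorted annotation field names in its response."""
    data = json.loads(output_json)
    summary: Dict[str, List[str]] = {}
    for response in data.get("responses", []):
        uri = response.get("context", {}).get("uri", "<unknown>")
        summary[uri] = sorted(key for key in response if key != "context")
    return summary


# Trimmed stand-in for the content of an output file.
sample = """
{
  "responses": [
    {
      "context": {"uri": "gs://cloud-samples-data/vision/document_understanding/image3.jpg"},
      "localizedObjectAnnotations": [{"mid": "/m/0bt9lr", "name": "Dog", "score": 0.9669734}]
    }
  ]
}
"""
summary = summarize_responses(sample)
```

For this input, `summary` maps the image3.jpg URI to `["localizedObjectAnnotations"]`.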