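Web detection tutorial

Audience

This tutorial shows how to develop an application that uses the Vision API web detection feature. It assumes familiarity with basic programming constructs and techniques, but it is designed to be easy to follow even for beginning programmers. By working through it alongside the Cloud Vision API reference documentation, you should be able to build a basic application.

The tutorial steps through a Vision API application and shows how to call the Vision API to use its web detection feature.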

Prerequisites

Set up a Vision API project in the Google Cloud Console.
Set up your environment to use Application Default Credentials.

Python

Install Python.
Install pip.
Install the Google Cloud client library.

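Overview

This tutorial walks step by step through a basic Vision API application that uses a Web detection request. A Web detection response annotates the image sent in the request with:

Labels obtained from the web
URLs of sites that have matching images
URLs of web images that partially or fully match the image in the request
URLs of visually similar images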
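
The kinds of annotation listed above correspond to fields on the WebDetection object that the Python client returns. As a minimal sketch, the hypothetical helper below (count_annotations is not part of the library) counts what a response contains. It relies only on attribute access, so a stand-in object is used in place of a real response. The field names are the ones used by the code later in this tutorial, plus visually_similar_images, which that code does not print.

```python
from types import SimpleNamespace

# WebDetection fields that hold each kind of annotation.
ANNOTATION_FIELDS = {
    "web_entities": "labels obtained from the web",
    "pages_with_matching_images": "pages that contain matching images",
    "full_matching_images": "fully matching images",
    "partial_matching_images": "partially matching images",
    "visually_similar_images": "visually similar images",
}


def count_annotations(detection) -> dict:
    """Return the number of items in each annotation field (hypothetical helper)."""
    return {
        field: len(getattr(detection, field, []))
        for field in ANNOTATION_FIELDS
    }


# Stand-in for a real WebDetection object, for illustration only.
fake = SimpleNamespace(
    web_entities=[SimpleNamespace(description="Car", score=0.69)],
    pages_with_matching_images=[],
    full_matching_images=[SimpleNamespace(url="http://example.com/a.jpg")],
    partial_matching_images=[],
    visually_similar_images=[],
)
counts = count_annotations(fake)
print(counts)
```

With a real response, the same function could be called on the value returned by annotate().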

Code listing

We recommend consulting the Vision API Python reference as you read through the code.

import argparse

from google.cloud import vision

def annotate(path: str) -> vision.WebDetection:
    """Returns web annotations given the path to an image.

    Args:
        path: path to the input image.

    Returns:
        A WebDetection object with relevant information of the
        image from the internet (i.e., the annotations).
    """
    client = vision.ImageAnnotatorClient()

    if path.startswith("http") or path.startswith("gs:"):
        image = vision.Image()
        image.source.image_uri = path

    else:
        with open(path, "rb") as image_file:
            content = image_file.read()

        image = vision.Image(content=content)

    web_detection = client.web_detection(image=image).web_detection

    return web_detection

def report(annotations: vision.WebDetection) -> None:
    """Prints detected features in the provided web annotations.

    Args:
        annotations: The web annotations (WebDetection object) from which
        the features should be parsed and printed.
    """
    if annotations.pages_with_matching_images:
        print(
            f"\n{len(annotations.pages_with_matching_images)} Pages with matching images retrieved"
        )

        for page in annotations.pages_with_matching_images:
            print(f"Url   : {page.url}")

    if annotations.full_matching_images:
        print(f"\n{len(annotations.full_matching_images)} Full Matches found: ")

        for image in annotations.full_matching_images:
            print(f"Url  : {image.url}")

    if annotations.partial_matching_images:
        print(f"\n{len(annotations.partial_matching_images)} Partial Matches found: ")

        for image in annotations.partial_matching_images:
            print(f"Url  : {image.url}")

    if annotations.web_entities:
        print(f"\n{len(annotations.web_entities)} Web entities found: ")

        for entity in annotations.web_entities:
            print(f"Score      : {entity.score}")
            print(f"Description: {entity.description}")

if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    path_help = str(
        "The image to detect, can be web URI, "
        "Google Cloud Storage, or path to local file."
    )
    parser.add_argument("image_url", help=path_help)
    args = parser.parse_args()

    report(annotate(args.image_url))

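This simple application performs the following tasks:

Imports the libraries needed to run the application
Passes the image path as an argument to the annotate() function
Uses the Google Cloud API client to perform web detection
Loops over the response and prints the results
Prints a list of web entities with their descriptions and scores
Prints a list of matching pages
Prints a list of partially matching images
Prints a list of fully matching images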

A closer look

Importing libraries

import argparse

from google.cloud import vision

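The application imports one standard library module:

argparse: lets the application accept the input image path as a command-line argument

The local image file is read with the built-in open() function, so no separate io import is needed.

Other imports:

The ImageAnnotatorClient class in the google.cloud.vision library: provides access to the Vision API
The Image type in the google.cloud.vision library: used to construct the request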

Running the application

parser = argparse.ArgumentParser(
    description=__doc__,
    formatter_class=argparse.RawDescriptionHelpFormatter,
)
path_help = str(
    "The image to detect, can be web URI, "
    "Google Cloud Storage, or path to local file."
)
parser.add_argument("image_url", help=path_help)
args = parser.parse_args()

report(annotate(args.image_url))

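Here, the application simply parses the passed-in argument, which specifies the image (a web URL, a Cloud Storage URI, or a local file path), passes it to annotate(), and hands the result to report().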
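
To see what the parser produces without running the script from a shell, parse_args can be given an explicit argument list. The sketch below rebuilds the same single-argument parser inside a hypothetical build_parser function. The description string and the gs:// URI are made-up example values.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a single-argument parser like the one in the tutorial script."""
    parser = argparse.ArgumentParser(
        description="Run Vision API web detection on an image.",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "image_url",
        help="The image to detect, can be web URI, "
        "Google Cloud Storage, or path to local file.",
    )
    return parser


# Passing a list to parse_args bypasses sys.argv, which is handy for testing.
args = build_parser().parse_args(["gs://my-bucket/car.jpg"])
print(args.image_url)
```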

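Authenticating to the API

Before communicating with the Vision API service, you must authenticate to it using previously acquired credentials. The simplest way to obtain credentials within an application is to use Application Default Credentials (ADC). The client library obtains the credentials automatically. By default, it does so by reading them from the GOOGLE_APPLICATION_CREDENTIALS environment variable, which should point to your service account's JSON key file (see Setting up a service account for details).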
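
A missing or mistyped key path is an easy mistake to make, so it can help to check the environment variable before creating the client. The following is a minimal sketch under that assumption. find_key_file is a hypothetical helper, not part of the client library, and it only inspects the variable described above. ADC can also find credentials in other ways, in which case the variable may legitimately be unset.

```python
import os
from typing import Mapping, Optional


def find_key_file(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the key file path from GOOGLE_APPLICATION_CREDENTIALS, if usable.

    Returns None when the variable is unset or does not point to a file.
    """
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if path and os.path.isfile(path):
        return path
    return None


key_file = find_key_file()
if key_file is None:
    print("GOOGLE_APPLICATION_CREDENTIALS is unset or not a file.")
else:
    print(f"Using service account key: {key_file}")
```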

Constructing the request

client = vision.ImageAnnotatorClient()

if path.startswith("http") or path.startswith("gs:"):
    image = vision.Image()
    image.source.image_uri = path

else:
    with open(path, "rb") as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

web_detection = client.web_detection(image=image).web_detection

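Now that the Vision API service is ready, you can construct a request to it.

This code snippet performs the following tasks:

Creates an ImageAnnotatorClient instance as the client.
Constructs an Image object from either a local file or a URI.
Passes the Image object to the client's web_detection method.
Returns the annotations.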
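
The snippet reads the web_detection field directly, so an error reported by the service would go unnoticed. As a hedged sketch, the hypothetical helper below checks the response's error.message field before returning the annotations. It is shown with stand-in objects so that it runs without credentials. With the real client, response would be the object returned by client.web_detection(image=image).

```python
from types import SimpleNamespace


def extract_web_detection(response):
    """Return response.web_detection, raising if the API reported an error."""
    if response.error.message:
        raise RuntimeError(f"Vision API error: {response.error.message}")
    return response.web_detection


# Stand-ins for AnnotateImageResponse objects, for illustration only.
ok = SimpleNamespace(
    error=SimpleNamespace(message=""),
    web_detection=SimpleNamespace(web_entities=[]),
)
failed = SimpleNamespace(
    error=SimpleNamespace(message="Bad image data."),
    web_detection=None,
)

detection = extract_web_detection(ok)
try:
    extract_web_detection(failed)
except RuntimeError as exc:
    error_text = str(exc)
    print(error_text)
```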

Printing the response

if annotations.pages_with_matching_images:
    print(
        f"\n{len(annotations.pages_with_matching_images)} Pages with matching images retrieved"
    )

    for page in annotations.pages_with_matching_images:
        print(f"Url   : {page.url}")

if annotations.full_matching_images:
    print(f"\n{len(annotations.full_matching_images)} Full Matches found: ")

    for image in annotations.full_matching_images:
        print(f"Url  : {image.url}")

if annotations.partial_matching_images:
    print(f"\n{len(annotations.partial_matching_images)} Partial Matches found: ")

    for image in annotations.partial_matching_images:
        print(f"Url  : {image.url}")

if annotations.web_entities:
    print(f"\n{len(annotations.web_entities)} Web entities found: ")

    for entity in annotations.web_entities:
        print(f"Score      : {entity.score}")
        print(f"Description: {entity.description}")

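Once the operation completes, the application walks through the WebDetection object and prints the entities and URLs contained in the annotations (the next section shows the first two URLs from each list of matches, along with the web entities).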
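
report() prints every item in each list. If only the first few results per annotation are wanted, the lists can be sliced before printing. A minimal sketch, assuming a hypothetical first_urls helper and a stand-in object in place of a real WebDetection:

```python
from types import SimpleNamespace


def first_urls(items, limit: int = 2) -> list:
    """Return the url attribute of the first `limit` items."""
    return [item.url for item in list(items)[:limit]]


# Stand-in data, for illustration only.
detection = SimpleNamespace(
    partial_matching_images=[
        SimpleNamespace(url=f"http://example.com/{n}.jpg") for n in range(4)
    ]
)

top_two = first_urls(detection.partial_matching_images)
for url in top_two:
    print(f"Url  : {url}")
```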

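Running the application

To run the application, pass in the web URL of a car image (http://www.photos-public-domain.com/wp-content/uploads/2011/01/old-vw-bug-and-van.jpg).

The Python command, with the web URL of the car image passed to it, and the console output are shown below. Each listed entity is accompanied by a relevance score. Scores are not normalized, so they cannot be used to compare results across different image queries.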

python web_detect.py "http://www.photos-public-domain.com/wp-content/uploads/2011/01/old-vw-bug-and-van.jpg"

5 Pages with matching images retrieved
Url   : http://www.photos-public-domain.com/2011/01/07/old-volkswagen-bug-and-van/
Url   : http://pix-hd.com/old+volkswagen+van+for+sale
...

2 Full Matches found:
Url  : http://www.photos-public-domain.com/wp-content/uploads/2011/01/old-vw-bug-and-van.jpg
Url  : http://www.wbwagen.com/media/old-volkswagen-bug-and-van-picture-free-photograph-photos-public_s_66f487042adad5a6.jpg

4 Partial Matches found:
Url  : http://www.photos-public-domain.com/wp-content/uploads/2011/01/old-vw-bug-and-van.jpg
Url  : http://www.wbwagen.com/media/old-vw-bug-and-vanjpg_s_ac343d7f041b5f8d.jpg
...

5 Web entities found:
Score      : 5.35028934479
Description: Volkswagen Beetle
Score      : 1.43998003006
Description: Volkswagen
Score      : 0.828279972076
Description: Volkswagen Type 2
Score      : 0.75271999836
Description: Van
Score      : 0.690039992332
Description: Car
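
Since the scores are not comparable across queries, one reasonable use is ranking entities within a single response. The sketch below sorts entities by score in descending order. rank_entities is a hypothetical helper, and the stand-in values are rounded from the sample output above.

```python
from types import SimpleNamespace


def rank_entities(entities) -> list:
    """Return (description, score) pairs sorted by score, highest first."""
    ranked = sorted(entities, key=lambda entity: entity.score, reverse=True)
    return [(entity.description, entity.score) for entity in ranked]


# Stand-in entities with scores rounded from the sample output above.
entities = [
    SimpleNamespace(description="Van", score=0.75),
    SimpleNamespace(description="Volkswagen Beetle", score=5.35),
    SimpleNamespace(description="Volkswagen", score=1.44),
]

ranking = rank_entities(entities)
for description, score in ranking:
    print(f"{score:5.2f}  {description}")
```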

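As expected, Cloud Vision web detection found the URL of the web image passed as input to the application and returned it among the fully matching images (fullMatchingImages in the JSON response, full_matching_images in Python; it is the first URL listed under "Full Matches found" above).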
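
That observation can also be checked in code rather than by eye. The following sketch assumes a hypothetical is_full_match helper and uses a stand-in object holding the two full-match URLs from the sample output. With a real response, detection would be the value returned by annotate().

```python
from types import SimpleNamespace


def is_full_match(detection, image_url: str) -> bool:
    """Return True if image_url is among the fully matching image URLs."""
    return any(image.url == image_url for image in detection.full_matching_images)


INPUT_URL = (
    "http://www.photos-public-domain.com/wp-content/uploads/2011/01/"
    "old-vw-bug-and-van.jpg"
)

# Stand-in for the WebDetection object, for illustration only.
detection = SimpleNamespace(
    full_matching_images=[
        SimpleNamespace(url=INPUT_URL),
        SimpleNamespace(
            url="http://www.wbwagen.com/media/old-volkswagen-bug-and-van-"
            "picture-free-photograph-photos-public_s_66f487042adad5a6.jpg"
        ),
    ]
)

found = is_full_match(detection, INPUT_URL)
print(found)
```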

That's it. You have performed web detection using the Vision API.