# Build an ML vision analytics solution with Dataflow and Cloud Vision API

Last updated 2024-05-23 UTC.

In this reference architecture, you'll learn about the use cases, design alternatives, and design considerations for deploying a [Dataflow](/dataflow) pipeline to process image files with [Cloud Vision](/vision/docs) and store the processed results in BigQuery. You can use those stored results for large-scale data analysis and to train [BigQuery ML pre-built models](/bigquery-ml/docs/introduction).

This reference architecture document is intended for data engineers and data scientists.

Architecture
------------

The following diagram illustrates the system flow for this reference architecture.

As shown in the preceding diagram, information flows as follows:

1. **Ingest and trigger**: This is the first stage of the system flow, where images first enter the system. During this stage, the following actions occur:

    1. Clients upload image files to a Cloud Storage bucket.
    2. For each file upload, Cloud Storage automatically sends an input notification by publishing a message to Pub/Sub.

2. **Process**: This stage immediately follows the ingest and trigger stage. For each new input notification, the following actions occur:

    1. The Dataflow pipeline listens for these file input notifications, extracts file metadata from the Pub/Sub message, and sends the file reference to Vision API for processing.
    2. Vision API reads the image and creates annotations.
    3. The Dataflow pipeline stores the annotations produced by Vision API in BigQuery tables.

3. **Store and analyze**: This is the final stage in the flow. At this stage, you can do the following with the saved results:

    1. Query BigQuery tables and analyze the stored annotations.
    2. Use BigQuery ML or Vertex AI to build models and execute predictions based on the stored annotations.
    3. Perform additional analysis in the Dataflow pipeline (not shown in the diagram).

Products used
-------------

This reference architecture uses the following Google Cloud products:

- [BigQuery](/bigquery/docs)
- [Cloud Storage](/storage/docs)
- [Vision API](/vision/docs)
- [Dataflow](/dataflow/docs)
- [Pub/Sub](/pubsub/docs)

Use cases
---------

Vision API supports [multiple processing features](/vision/docs/features-list), including image labeling, face and landmark detection, optical character recognition, explicit content tagging, and others. Each of these features enables several use cases that apply across different industries. This document contains some simple examples of what's possible when using Vision API, but the spectrum of possible applications is very broad.

Vision API also offers powerful pre-trained machine learning models through REST and RPC APIs. You can assign labels to images and classify them into millions of predefined categories. It helps you detect objects, read printed and handwritten text, and build valuable metadata into your image catalog.

This architecture doesn't require any model training before you can use it. If you need a custom model trained on your specific data, Vertex AI lets you train an AutoML or custom model for computer vision objectives, such as image classification and object detection.
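The pre-trained features described above are selected per request. As a minimal sketch, the following builds the JSON body for the Vision API `images:annotate` REST method; the `gs://` URI and the helper name are illustrative, and the feature list shows requesting only the features you need.

```python
# Build an images:annotate request body for Vision API that references an
# image already stored in Cloud Storage. The gs:// URI is a placeholder.
def build_annotate_request(gcs_uri, features, max_results=10):
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": gcs_uri}},
                "features": [
                    {"type": f, "maxResults": max_results} for f in features
                ],
            }
        ]
    }

request = build_annotate_request(
    "gs://example-bucket/image.jpg",
    ["LABEL_DETECTION", "TEXT_DETECTION"],
)
```

Because pricing is per image per feature, the `features` list is also the main cost lever for each call.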
Alternatively, you can use [Vertex AI Vision](/vertex-ai-vision) for an end-to-end application development environment that lets you build, deploy, and manage computer vision applications.

Design alternatives
-------------------

Instead of storing images in a Cloud Storage bucket, the process that produces the images can publish them directly to a messaging system, such as Pub/Sub, and the Dataflow pipeline can send the images directly to Vision API.

This design alternative can be a good solution for latency-sensitive use cases where you need to analyze images of relatively small sizes. Pub/Sub limits the maximum message size to 10 MB.

If you need to batch process a large number of images, you can use the purpose-built [`asyncBatchAnnotate`](/vision/docs/reference/rest/v1/images/asyncBatchAnnotate) API.

Design considerations
---------------------

This section describes the design considerations for this reference architecture:

- [Security, privacy, and compliance](#security,_privacy,_and_compliance)
- [Cost optimization](#cost_optimization)
- [Performance optimization](#performance_optimization)

### Security, privacy, and compliance

Images received from untrusted sources can contain malware. Because Vision API doesn't execute anything based on the images it analyzes, image-based malware wouldn't affect the API. If you need to scan images, change the Dataflow pipeline to add a scanning step. To achieve the same result, you can also use a separate subscription to the Pub/Sub topic and scan images in a separate process.

For more information, see [Automate malware scanning for files uploaded to Cloud Storage](/architecture/automate-malware-scanning-for-documents-uploaded-to-cloud-storage).

Vision API uses [Identity and Access Management (IAM)](/iam/docs/overview) for authentication.
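A producer that publishes image bytes directly to Pub/Sub, as in the design alternative above, needs to respect the 10 MB message limit. A minimal sketch of such a guard (the function name is illustrative; a real producer would fall back to the Cloud Storage flow for oversized images):

```python
# Pub/Sub rejects messages larger than 10 MB, so a process that publishes
# image bytes directly (instead of Cloud Storage references) should check
# the payload size before publishing.
PUBSUB_MAX_MESSAGE_BYTES = 10 * 1024 * 1024  # 10 MB Pub/Sub message limit

def fits_in_pubsub_message(image_bytes: bytes) -> bool:
    """Return True if the image can be published as a single message."""
    return len(image_bytes) <= PUBSUB_MAX_MESSAGE_BYTES
```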
To access Vision API, the security principal needs **Cloud Storage > Storage Object Viewer** (`roles/storage.objectViewer`) access to the bucket that contains the files that you want to analyze.

For security principles and recommendations that are specific to AI and ML workloads, see [AI and ML perspective: Security](/architecture/framework/perspectives/ai-ml/security) in the Well-Architected Framework.

### Cost optimization

Compared to the other options discussed, such as low-latency processing and asynchronous batch processing, this reference architecture uses a cost-efficient way to process the images in streaming pipelines by batching the API requests. The lower-latency direct image streaming mentioned in the [Design alternatives](#design_alternatives) section could be more expensive because of the additional Pub/Sub and Dataflow costs. For image processing that doesn't need to happen within seconds or minutes, you can run the Dataflow pipeline in batch mode. Running the pipeline in batch mode can provide some savings compared to the cost of running the streaming pipeline.

Vision API supports offline [asynchronous batch](/vision/docs/batch) image annotation for all features. An asynchronous request supports up to 2,000 images per batch. In response, Vision API returns JSON files that are stored in a Cloud Storage bucket.

Vision API also provides a set of features for analyzing images. [The pricing](/vision/pricing) is per image per feature.
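The 2,000-image limit on asynchronous batch requests means that a larger image catalog must be split into batches before submission. A minimal sketch, assuming a flat list of Cloud Storage URIs (the helper name and URIs are illustrative):

```python
# asyncBatchAnnotate accepts at most 2,000 images per request, so a larger
# catalog has to be chunked into conforming batches before submission.
ASYNC_BATCH_LIMIT = 2000

def chunk_image_uris(uris, batch_size=ASYNC_BATCH_LIMIT):
    """Yield lists of at most batch_size URIs, preserving order."""
    for start in range(0, len(uris), batch_size):
        yield uris[start:start + batch_size]

# 4,500 images become three batches of 2,000, 2,000, and 500.
batches = list(
    chunk_image_uris([f"gs://bucket/img_{i}.jpg" for i in range(4500)])
)
```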
To reduce costs, request only the specific features that you need for your solution.

To generate a cost estimate based on your projected usage, use the [pricing calculator](/products/calculator).

For cost optimization principles and recommendations that are specific to AI and ML workloads, see [AI and ML perspective: Cost optimization](/architecture/framework/perspectives/ai-ml/cost-optimization) in the Well-Architected Framework.

### Performance optimization

Vision API is a resource-intensive API. Because of that, processing images at scale requires careful orchestration of the API calls. The Dataflow pipeline takes care of batching the API requests, gracefully handling the exceptions related to reaching quotas, and producing custom metrics of API usage. These metrics can help you decide whether an API quota increase is warranted, or whether the Dataflow pipeline parameters should be adjusted to reduce the frequency of requests. For more information about requesting a quota increase for Vision API, see [Quotas and limits](/vision/quotas).

The Dataflow pipeline has several parameters that can affect processing latencies.
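Gracefully handling quota-related exceptions, as described above, typically means retrying with exponential backoff. A minimal sketch under stated assumptions: the exception class is a stand-in for the quota error a real Vision API client raises (such as a resource-exhausted error), and the delay values are illustrative.

```python
import time

class QuotaExceededError(Exception):
    """Stand-in for the quota-exhausted error a Vision API client raises."""

def call_with_backoff(api_call, max_attempts=5, base_delay=1.0):
    """Retry api_call with exponential backoff when the quota is exhausted."""
    for attempt in range(max_attempts):
        try:
            return api_call()
        except QuotaExceededError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts; surface the error to the pipeline.
            # Double the wait after each failed attempt: 1s, 2s, 4s, ...
            time.sleep(base_delay * (2 ** attempt))
```

If retries still exhaust the quota, that is the signal, per the guidance above, to either request a quota increase or reduce the request rate through the pipeline parameters.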
For more information about these parameters, see [Deploy an ML vision analytics solution with Dataflow and Vision API](/architecture/building-a-vision-analytics-solution/deployment).

For performance optimization principles and recommendations that are specific to AI and ML workloads, see [AI and ML perspective: Performance optimization](/architecture/framework/perspectives/ai-ml/performance-optimization) in the Well-Architected Framework.

Deployment
----------

To deploy this architecture, see [Deploy an ML vision analytics solution with Dataflow and Vision API](/architecture/building-a-vision-analytics-solution/deployment).

What's next
-----------

- Learn more about [Dataflow](/dataflow/docs/overview).
- Learn more about [BigQuery ML](/bigquery/docs/bqml-introduction).
- Learn more about BigQuery reliability in the [Understand BigQuery reliability](/bigquery/docs/reliability-intro) guide.
- Learn about storing data in [Jump Start Solution: Data warehouse with BigQuery](/architecture/big-data-analytics/data-warehouse).
- Review the [Vision API features list](/vision/docs/features-list).
- Learn how to [deploy an ML vision analytics solution with Dataflow and Vision API](/architecture/building-a-vision-analytics-solution/deployment).
- For an overview of architectural principles and recommendations that are specific to AI and ML workloads in Google Cloud, see the [AI and ML perspective](/architecture/framework/perspectives/ai-ml) in the Well-Architected Framework.
- For more reference architectures, diagrams, and best practices, explore the [Cloud Architecture Center](/architecture).

Contributors
------------

Authors:

- [Masud Hasan](https://www.linkedin.com/in/masudhasan1480) | Site Reliability Engineering Manager
- [Sergei Lilichenko](https://www.linkedin.com/in/sergei-lilichenko) | Solutions Architect
- [Lakshmanan Sethu](https://www.linkedin.com/in/lakshmanansethu) | Technical Account Manager

Other contributors:

- [Jiyeon Kang](https://www.linkedin.com/in/jiyeon-kang) | Customer Engineer
- [Sunil Kumar Jang Bahadur](https://www.linkedin.com/in/sunilkumar88) | Customer Engineer