In this experimental launch, we are providing developers with a tool for object detection and localization within images and video. The model identifies objects and delineates each one with a bounding box, which developers can use as a building block for their own applications.
Key Benefits:
- Simple: Integrate object detection capabilities into your applications with ease, regardless of your computer vision expertise.
- Customizable: Produce bounding boxes based on custom instructions (e.g. "I want to see bounding boxes of all the green objects in this image"), without having to train a custom model.
Technical Details:
- Input: Your prompt and associated images or video frames.
- Output: Bounding boxes in the [y_min, x_min, y_max, x_max] format. The top-left corner is the origin. The x and y axes run horizontally and vertically, respectively. Coordinate values are normalized to 0-1000 for every image.
- Visualization: AI Studio users will see bounding boxes plotted within the UI. Vertex AI users should visualize their bounding boxes through custom visualization code.
Gen AI SDK for Python
Learn how to install or update the Google Gen AI SDK for Python.
For more information, see the Gen AI SDK for Python API reference documentation or the python-genai GitHub repository.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
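Before creating a client, it can help to confirm that these variables are actually visible to the Python process. The sketch below uses only the standard library; the helper name `missing_vertex_env` is hypothetical, and the check only tests that each variable is non-empty, not that its value is valid for your project.

```python
import os

# The three variables named in the shell snippet above.
REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_GENAI_USE_VERTEXAI",
)


def missing_vertex_env(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_vertex_env()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Vertex AI environment variables are set.")
```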