Cette page a été traduite par l'API Cloud Translation.

Utiliser Vertex AI Vector Search avec le moteur de RAG Vertex AI

Cette page explique comment connecter votre moteur Vertex AI RAG à Vertex AI Vector Search.

Vous pouvez également suivre ce notebook Moteur de RAG Vertex AI avec Vertex AI Vector Search.

Le moteur RAG Vertex AI est un outil puissant qui utilise une base de données vectorielle intégrée alimentée par Spanner pour stocker et gérer les représentations vectorielles de documents textuels. La base de données vectorielle permet de récupérer efficacement les documents pertinents en fonction de leur similarité sémantique avec une requête donnée. En intégrant Vertex AI Vector Search en tant que base de données vectorielle supplémentaire avec Vertex AI RAG Engine, vous pouvez utiliser les fonctionnalités de Vector Search pour gérer des volumes de données avec une faible latence afin d'améliorer les performances et l'évolutivité de vos applications RAG.

Configuration de Vertex AI Vector Search

Vertex AI Vector Search est basé sur la technologie de recherche vectorielle développée par Google Research. Vector Search vous permet d'utiliser l'infrastructure qui est à la base des produits Google, tels que la recherche Google, YouTube et Google Play.

Pour l'intégration au moteur de RAG Vertex AI, un indice de recherche vectorielle vide est requis.

Configurer le SDK Vertex AI

Pour configurer le SDK Vertex AI, consultez la section Configuration.

Créer un index Vector Search

Pour créer un index de recherche vectorielle compatible avec votre corpus RAG, l'index doit respecter les critères suivants:

IndexUpdateMethod doit être STREAM_UPDATE. Consultez Créer un index de flux.
Le type de mesure de la distance doit être défini explicitement sur l'un des éléments suivants:
- DOT_PRODUCT_DISTANCE
- COSINE_DISTANCE
La dimension du vecteur doit être cohérente avec le modèle d'embedding que vous prévoyez d'utiliser dans le corpus RAG. D'autres paramètres peuvent être affinés en fonction de vos choix, qui déterminent si les paramètres supplémentaires peuvent être affinés.

Python

Pour savoir comment installer ou mettre à jour le SDK Vertex AI pour Python, consultez la section Installer le SDK Vertex AI pour Python. Pour en savoir plus, consultez la documentation de référence de l'API Python.

def vector_search_create_streaming_index(
    project: str, location: str, display_name: str, gcs_uri: Optional[str] = None
) -> aiplatform.MatchingEngineIndex:
    """Create a vector search index.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index display name
        gcs_uri (str): Optional. The Google Cloud Storage uri for index content

    Returns:
        The created MatchingEngineIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index
    index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
        display_name=display_name,
        contents_delta_uri=gcs_uri,
        description="Matching Engine Index",
        dimensions=100,
        approximate_neighbors_count=150,
        leaf_node_embedding_count=500,
        leaf_nodes_to_search_percent=7,
        index_update_method="STREAM_UPDATE",  # Options: STREAM_UPDATE, BATCH_UPDATE
        distance_measure_type=aiplatform.matching_engine.matching_engine_index_config.DistanceMeasureType.DOT_PRODUCT_DISTANCE,
    )

    return index

Créer un point de terminaison d'index Vector Search

Le moteur RAG Vertex AI est compatible avec les points de terminaison publics.

Python

def vector_search_create_index_endpoint(
    project: str, location: str, display_name: str
) -> None:
    """Create a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        display_name (str): Required. The index endpoint display name
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create Index Endpoint
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
        display_name=display_name,
        public_endpoint_enabled=True,
        description="Matching Engine Index Endpoint",
    )

    print(index_endpoint.name)

Déployer un index sur un point de terminaison d'index

Avant de pouvoir effectuer la recherche des plus proches voisins, l'index doit être déployé sur un point de terminaison d'index.

Python

def vector_search_deploy_index(
    project: str,
    location: str,
    index_name: str,
    index_endpoint_name: str,
    deployed_index_id: str,
) -> None:
    """Deploy a vector search index to a vector search index endpoint.

    Args:
        project (str): Required. Project ID
        location (str): Required. The region name
        index_name (str): Required. The index to update. A fully-qualified index
          resource name or a index ID.  Example:
          "projects/123/locations/us-central1/indexes/my_index_id" or
          "my_index_id".
        index_endpoint_name (str): Required. Index endpoint to deploy the index
          to.
        deployed_index_id (str): Required. The user specified ID of the
          DeployedIndex.
    """
    # Initialize the Vertex AI client
    aiplatform.init(project=project, location=location)

    # Create the index instance from an existing index
    index = aiplatform.MatchingEngineIndex(index_name=index_name)

    # Create the index endpoint instance from an existing endpoint.
    index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=index_endpoint_name
    )

    # Deploy Index to Endpoint
    index_endpoint = index_endpoint.deploy_index(
        index=index, deployed_index_id=deployed_index_id
    )

    print(index_endpoint.deployed_indexes)

Si vous déployez un index pour la première fois sur un point de terminaison d'index, la création et le lancement du backend prennent environ 30 minutes avant que l'index puisse être stocké. Après le premier déploiement, l'index est prêt en quelques secondes. Pour consulter l'état du déploiement de l'index, ouvrez la console Vector Search, sélectionnez l'onglet Points de terminaison de l'index, puis choisissez votre point de terminaison de l'index.

Identifiez le nom de la ressource de votre index et de votre point de terminaison d'index, qui se présentent sous les formats suivants:

projects/${PROJECT_ID}/locations/${LOCATION_ID}/indexes/${INDEX_ID}
projects/${PROJECT_ID}/locations/${LOCATION_ID}/indexEndpoints/${INDEX_ENDPOINT_ID}

Utiliser Vertex AI Vector Search dans le moteur de classification de risque Vertex AI

Une fois l'instance Vector Search configurée, suivez la procédure de cette section pour définir l'instance Vector Search comme base de données vectorielle pour l'application RAG.

Définir la base de données vectorielle pour créer un corpus RAG

Lorsque vous créez le corpus RAG, spécifiez uniquement INDEX_ENDPOINT_NAME et INDEX_NAME complets. Veillez à utiliser l'ID numérique pour les noms de ressources de l'index et du point de terminaison de l'index. Le corpus RAG est créé et automatiquement associé à l'index Vector Search. Les validations sont effectuées sur les critères. Si l'une des exigences n'est pas remplie, la requête est rejetée.

Python

Avant d'essayer cet exemple, suivez les instructions de configuration pour Python décrites dans le guide de démarrage rapide de Vertex AI à l'aide des bibliothèques clientes. Pour en savoir plus, consultez la documentation de référence de l'API Vertex AI Python.

Pour vous authentifier auprès de Vertex AI, configurez le service Identifiants par défaut de l'application. Pour en savoir plus, consultez Configurer l'authentification pour un environnement de développement local.


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# vector_search_index_name = "projects/{PROJECT_ID}/locations/{LOCATION}/indexes/{INDEX_ID}"
# vector_search_index_endpoint_name = "projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}"
# display_name = "test_corpus"
# description = "Corpus Description"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Configure embedding model (Optional)
embedding_model_config = rag.EmbeddingModelConfig(
    publisher_model="publishers/google/models/text-embedding-004"
)

# Configure Vector DB
vector_db = rag.VertexVectorSearch(
    index=vector_search_index_name, index_endpoint=vector_search_index_endpoint_name
)

corpus = rag.create_corpus(
    display_name=display_name,
    description=description,
    embedding_model_config=embedding_model_config,
    vector_db=vector_db,
)
print(corpus)
# Example response:
# RagCorpus(name='projects/1234567890/locations/us-central1/ragCorpora/1234567890',
# display_name='test_corpus', description='Corpus Description', embedding_model_config=...
# ...

REST

  # TODO(developer): Update and un-comment the following lines:
  # CORPUS_DISPLAY_NAME = "YOUR_CORPUS_DISPLAY_NAME"
  # Full index/indexEndpoint resource name
  # Index: projects/${PROJECT_ID}/locations/${LOCATION_ID}/indexes/${INDEX_ID}
  # IndexEndpoint: projects/${PROJECT_ID}/locations/${LOCATION_ID}/indexEndpoints/${INDEX_ENDPOINT_ID}
  # INDEX_RESOURCE_NAME = "YOUR_INDEX_ENDPOINT_RESOURCE_NAME"
  # INDEX_NAME = "YOUR_INDEX_RESOURCE_NAME"
  # Call CreateRagCorpus API to create a new RagCorpus
  curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://${LOCATION_ID}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION_ID}/ragCorpora -d '{
        "display_name" : '\""${CORPUS_DISPLAY_NAME}"\"',
        "rag_vector_db_config" : {
                "vertex_vector_search": {
                  "index":'\""${INDEX_NAME}"\"'
              "index_endpoint":'\""${INDEX_ENDPOINT_NAME}"\"'
                }
          }
    }'

  # Call ListRagCorpora API to verify the RagCorpus is created successfully
  curl -sS -X GET \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://${LOCATION_ID}-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/${LOCATION_ID}/ragCorpora"

Importer des fichiers à l'aide de l'API RAG

Utilisez l'API ImportRagFiles pour importer des fichiers depuis Cloud Storage ou Google Drive dans l'index de recherche vectorielle. Les fichiers sont intégrés et stockés dans l'index Vector Search.

REST

# TODO(developer): Update and uncomment the following lines:
# RAG_CORPUS_ID = "your-rag-corpus-id"
#
# Google Cloud Storage bucket/file location.
# For example, "gs://rag-fos-test/"
# GCS_URIS= "your-gcs-uris"

# Call ImportRagFiles API to embed files and store in the BigQuery table
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles:import \
-d '{
  "import_rag_files_config": {
    "gcs_source": {
      "uris": '\""${GCS_URIS}"\"'
    },
    "rag_file_chunking_config": {
      "chunk_size": 512
    }
  }
}'

# Call ListRagFiles API to verify the files are imported successfully
curl -X GET \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}/ragFiles

Python


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"
# paths = ["https://drive.google.com/file/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.import_files(
    corpus_name=corpus_name,
    paths=paths,
    chunk_size=512,  # Optional
    chunk_overlap=100,  # Optional
    max_embedding_requests_per_min=900,  # Optional
)
print(f"Imported {response.imported_rag_files_count} files.")
# Example response:
# Imported 2 files.

Récupérer des contextes pertinents à l'aide de l'API RAG

Une fois les importations de fichiers terminées, le contexte pertinent peut être récupéré à partir de l'index de recherche vectorielle à l'aide de l'API RetrieveContexts.

REST

# TODO(developer): Update and uncomment the following lines:
# RETRIEVAL_QUERY="your-retrieval-query"
#
# Full RAG corpus resource name
# Format:
# "projects/${PROJECT_ID}/locations/us-central1/ragCorpora/${RAG_CORPUS_ID}"
# RAG_CORPUS_RESOURCE="your-rag-corpus-resource"

# Call RetrieveContexts API to retrieve relevant contexts
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1:retrieveContexts \
  -d '{
    "vertex_rag_store": {
      "rag_resources": {
          "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"',
        },
    },
    "query": {
      "text": '\""${RETRIEVAL_QUERY}"\"',
      "similarity_top_k": 10
    }
  }'

Python


from vertexai.preview import rag
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/[PROJECT_ID]/locations/us-central1/ragCorpora/[rag_corpus_id]"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="Hello World!",
    similarity_top_k=10,  # Optional
    vector_distance_threshold=0.5,  # Optional
    # vector_search_alpha=0.5, # Optional - Only supported for Weaviate
)
print(response)
# Example response:
# contexts {
#   contexts {
#     source_uri: "gs://your-bucket-name/file.txt"
#     text: "....
#   ....

Générer du contenu à l'aide de l'API Vertex AI Gemini

Pour générer du contenu à l'aide de modèles Gemini, appelez l'API GenerateContent Vertex AI. En spécifiant RAG_CORPUS_RESOURCE dans la requête, l'API récupère automatiquement les données de l'index Vector Search.

REST

# TODO(developer): Update and uncomment the following lines:
# MODEL_ID=gemini-1.5-flash-001
# GENERATE_CONTENT_PROMPT="your-generate-content-prompt"

# GenerateContent with contexts retrieved from the FeatureStoreOnline index
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json"  https://us-central1-aiplatform.googleapis.com/v1beta1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:generateContent \
-d '{
  "contents": {
    "role": "user",
    "parts": {
      "text": '\""${GENERATE_CONTENT_PROMPT}"\"'
    }
  },
  "tools": {
    "retrieval": {
      "vertex_rag_store": {
        "rag_resources": {
            "rag_corpus": '\""${RAG_CORPUS_RESOURCE}"\"',
          },
        "similarity_top_k": 8,
        "vector_distance_threshold": 0.32
      }
    }
  }
}'

Python


from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool
import vertexai

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# corpus_name = "projects/{PROJECT_ID}/locations/us-central1/ragCorpora/{rag_corpus_id}"

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            similarity_top_k=3,  # Optional
            vector_distance_threshold=0.5,  # Optional
        ),
    )
)

rag_model = GenerativeModel(
    model_name="gemini-1.5-flash-001", tools=[rag_retrieval_tool]
)
response = rag_model.generate_content("Why is the sky blue?")
print(response.text)
# Example response:
#   The sky appears blue due to a phenomenon called Rayleigh scattering.
#   Sunlight, which contains all colors of the rainbow, is scattered
#   by the tiny particles in the Earth's atmosphere....
#   ...

Étape suivante

Utiliser Vertex AI Search comme backend de récupération à l'aide du moteur de RAG Vertex AI