# Tune an embedding model using the specified parameters
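This code sample demonstrates how to fine-tune an embedding model using Vertex AI. The sample loads a pre-trained text embedding model (`text-embedding-005` by default) and submits a tuning job that uses a corpus, a set of queries, and training and test label files stored in Cloud Storage.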
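To make the dataset inputs more concrete, here is a minimal sketch that writes a toy corpus, query set, and label file to a local directory and checks that every label points at an existing ID. The field names (`_id`, `title`, `text`) and the TSV header (`query-id`, `corpus-id`, `score`) are assumptions based on the commonly documented layout for embedding tuning data; they should be checked against the tuning guide before use. The helper names `write_toy_dataset` and `dangling_labels` are hypothetical and not part of any Vertex AI library.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_toy_dataset(out_dir: Path) -> dict:
    """Write a tiny corpus, query set, and label file (illustrative data only)."""
    # Assumed schema: one JSON object per line, keyed by "_id".
    corpus = [
        {"_id": "doc1", "title": "Placeholder A", "text": "Placeholder passage about topic A."},
        {"_id": "doc2", "title": "Placeholder B", "text": "Placeholder passage about topic B."},
    ]
    queries = [
        {"_id": "q1", "text": "Placeholder question about topic A?"},
        {"_id": "q2", "text": "Placeholder question about topic B?"},
    ]
    # Assumed label layout: (query ID, corpus ID, relevance score).
    labels = [("q1", "doc1", 1), ("q2", "doc2", 1)]

    paths = {
        "corpus": out_dir / "corpus.jsonl",
        "queries": out_dir / "queries.jsonl",
        "train": out_dir / "train.tsv",
    }
    for key, rows in (("corpus", corpus), ("queries", queries)):
        with paths[key].open("w", encoding="utf-8") as f:
            for row in rows:
                f.write(json.dumps(row) + "\n")
    with paths["train"].open("w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["query-id", "corpus-id", "score"])
        writer.writerows(labels)
    return paths


def dangling_labels(paths: dict) -> list:
    """Return label rows whose query or corpus ID does not exist in the data files."""

    def read_ids(path: Path) -> set:
        with path.open(encoding="utf-8") as f:
            return {json.loads(line)["_id"] for line in f if line.strip()}

    corpus_ids = read_ids(paths["corpus"])
    query_ids = read_ids(paths["queries"])
    with paths["train"].open(encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return [
            row
            for row in reader
            if row["query-id"] not in query_ids or row["corpus-id"] not in corpus_ids
        ]


if __name__ == "__main__":
    toy_paths = write_toy_dataset(Path(tempfile.mkdtemp()))
    print("Dangling labels:", dangling_labels(toy_paths))
```

Files like these would still need to be uploaded to a Cloud Storage bucket, since the sample passes `gs://` URIs to the tuning job.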
Explore further
---------------

For detailed documentation that includes this code sample, see the following:

- [Tune text embeddings](/vertex-ai/generative-ai/docs/models/tune-embeddings)
Code sample
-----------
### Python

Before trying this sample, follow the Python setup instructions in the [Vertex AI quickstart using client libraries](/vertex-ai/docs/start/client-libraries).

For more information, see the [Vertex AI Python API reference documentation](/python/docs/reference/aiplatform/latest).

To authenticate to Vertex AI, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

```python
import re

from vertexai.language_models import TextEmbeddingModel


def tune_embedding_model(
    api_endpoint: str,
    base_model_name: str = "text-embedding-005",
    corpus_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/corpus.jsonl",
    queries_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/queries.jsonl",
    train_label_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/train.tsv",
    test_label_path: str = "gs://cloud-samples-data/ai-platform/embedding/goog-10k-2024/r11/test.tsv",
):  # noqa: ANN201
    """Tune an embedding model using the specified parameters.

    Args:
        api_endpoint (str): The API endpoint for the Vertex AI service.
        base_model_name (str): The name of the base model to use for tuning.
        corpus_path (str): GCS URI of the JSONL file containing the corpus data.
        queries_path (str): GCS URI of the JSONL file containing the queries data.
        train_label_path (str): GCS URI of the TSV file containing the training labels.
        test_label_path (str): GCS URI of the TSV file containing the test labels.
    """
    # Take the leading "<word>-<word>" prefix of the endpoint as the region
    # (for example "us-central1"); fall back to us-central1 if there is none.
    match = re.search(r"^(\w+-\w+)", api_endpoint)
    location = match.group(1) if match else "us-central1"
    base_model = TextEmbeddingModel.from_pretrained(base_model_name)
    tuning_job = base_model.tune_model(
        task_type="DEFAULT",
        corpus_data=corpus_path,
        queries_data=queries_path,
        training_data=train_label_path,
        test_data=test_label_path,
        batch_size=128,  # The batch size to use for training.
        train_steps=1000,  # The number of training steps.
        tuned_model_location=location,
        output_dimensionality=768,  # The dimensionality of the output embeddings.
        learning_rate_multiplier=1.0,  # The multiplier for the learning rate.
    )
    return tuning_job
```

What's next
-----------

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=generativeaionvertexai).
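The sample derives the tuning location from `api_endpoint` with a regular expression. The sketch below isolates that step so its behavior on different inputs is easier to see. The helper name `location_from_endpoint` and the example endpoints are hypothetical; only the regular expression and the `us-central1` fallback come from the sample.

```python
import re

DEFAULT_LOCATION = "us-central1"


def location_from_endpoint(api_endpoint: str, default: str = DEFAULT_LOCATION) -> str:
    """Return the leading '<word>-<word>' prefix of the endpoint, or the default."""
    match = re.search(r"^(\w+-\w+)", api_endpoint)
    return match.group(1) if match else default


if __name__ == "__main__":
    examples = [
        "us-central1-aiplatform.googleapis.com",   # regional endpoint
        "europe-west4-aiplatform.googleapis.com",  # another regional endpoint
        "aiplatform.googleapis.com",               # no region prefix: falls back
        "https://us-central1-aiplatform.googleapis.com",  # scheme breaks the match
        "my-proxy.example.com",                    # matches, but is not a region
    ]
    for endpoint in examples:
        print(f"{endpoint} -> {location_from_endpoint(endpoint)}")
```

Two behaviors are worth noting. An endpoint passed with a scheme such as `https://` does not match and silently falls back to the default, and any hostname that starts with a `word-word` pattern is accepted as a location even if it is not a real region. Passing the bare regional hostname avoids both cases.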