> **Metadata compared to Labels**
>
> - Custom metadata is available to processes running on your cluster, and can be used by initialization actions.
> - Labels are not readily available to processes running on your cluster, but can be used when searching through resources with the Dataproc API.
>
> If you need a piece of data to be available to your cluster and also used as an API search parameter, then add it both as metadata and as a label to your cluster.

Dataproc sets special metadata values for the instances that run in your cluster:

| Metadata key | Value |
|---|---|
| `dataproc-bucket` | Name of the cluster's [staging bucket](/dataproc/docs/concepts/configuring-clusters/staging-bucket) |
| `dataproc-region` | [Region](/dataproc/docs/concepts/regional-endpoints) of the cluster's endpoint |
| `dataproc-worker-count` | Number of worker nodes in the cluster. The value is `0` for [single node clusters](/dataproc/docs/concepts/configuring-clusters/single-node-clusters). |
| `dataproc-cluster-name` | Name of the cluster |
| `dataproc-cluster-uuid` | UUID of the cluster |
| `dataproc-role` | Instance's role, either `Master` or `Worker` |
| `dataproc-master` | Hostname of the first master node. The value is either `[CLUSTER_NAME]-m` in a standard or single node cluster, or `[CLUSTER_NAME]-m-0` in a [high-availability cluster](/dataproc/docs/concepts/configuring-clusters/high-availability), where `[CLUSTER_NAME]` is the name of your cluster. |
| `dataproc-master-additional` | Comma-separated list of hostnames for the additional master nodes in a high-availability cluster, for example, `[CLUSTER_NAME]-m-1,[CLUSTER_NAME]-m-2` in a cluster that has 3 master nodes. |
| `SPARK_BQ_CONNECTOR_VERSION` or `SPARK_BQ_CONNECTOR_URL` | The version or URL that points to a Spark BigQuery connector version to use in Spark applications, for example, `0.42.1` or `gs://spark-lib/bigquery/spark-3.5-bigquery-0.42.1.jar`. A default Spark BigQuery connector version is pre-installed in Dataproc `2.1` and later image version clusters. For more information, see [Use the Spark BigQuery connector](/dataproc/docs/tutorials/bigquery-connector-spark-example). |

You can use these values to customize the behavior of [initialization actions](/dataproc/docs/concepts/configuring-clusters/init-actions).

You can use the `--metadata` flag in the [gcloud dataproc clusters create](/sdk/gcloud/reference/dataproc/clusters/create) command to provide your own metadata:

```
gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --metadata=name1=value1,name2=value2... \
    ... other flags ...
```

Last updated: 2025-09-04 (UTC)
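As an illustration, an initialization action can read these metadata values from the Compute Engine metadata server (`metadata.google.internal`, with the `Metadata-Flavor: Google` header) and branch on the node's role. The sketch below is illustrative only: the fallback defaults and the setup commands are placeholders, not part of Dataproc itself.

```shell
#!/bin/bash
# Read an instance metadata attribute from the Compute Engine metadata
# server. Falls back to a default ($2) when the server is unreachable,
# for example when testing the script outside of a cluster VM.
get_metadata() {
  curl -fs -m 5 -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/attributes/$1" \
    || echo "$2"
}

ROLE="$(get_metadata dataproc-role Worker)"               # "Master" or "Worker"
CLUSTER="$(get_metadata dataproc-cluster-name my-cluster)"

if [[ "${ROLE}" == "Master" ]]; then
  echo "running master-only setup for cluster ${CLUSTER}"  # placeholder step
else
  echo "running worker setup for cluster ${CLUSTER}"       # placeholder step
fi
```

Custom keys passed with `--metadata` (for example, `name1` from the command above) can be read the same way, with `get_metadata name1`.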