Note: The Presto Optional Component is available as the Trino Optional Component in 2.1 and later image versions.

You can install additional components like Presto when you create a Dataproc cluster using the Optional components feature. This page describes how you can optionally install the Presto component on a Dataproc cluster.
Presto (Trino) is an open source distributed SQL query engine. The Presto server and Web UI are available by default on port 8060 (or port 7778 if Kerberos is enabled) on the cluster's first master node.
By default, Presto on Dataproc is configured to work with the Hive, BigQuery, Memory, TPCH, and TPCDS connectors.
After creating a cluster with the Presto component, you can run queries:

- from a local terminal with the gcloud dataproc jobs submit presto command
- from a terminal window on the cluster's first master node using the presto CLI (Command Line Interface); see Use Trino with Dataproc
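As a sketch of the first option, assuming a hypothetical cluster named my-cluster in region us-central1 (substitute your own cluster name and region), a query can be passed inline with the --execute flag:

```shell
# Submit a Presto query to the cluster from a local terminal.
# my-cluster and us-central1 are placeholder values.
gcloud dataproc jobs submit presto \
    --cluster=my-cluster \
    --region=us-central1 \
    --execute="SELECT * FROM system.runtime.nodes;"
```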
Install the component
Install the component when you create a Dataproc cluster. Components can be added to clusters created with Dataproc version 1.3 and later. See Supported Dataproc versions for the component version included in each Dataproc image release.
gcloud command

To create a Dataproc cluster that includes the Presto component, use the gcloud dataproc clusters create cluster-name command with the --optional-components flag. Also pass the --enable-component-gateway flag when creating the cluster to enable connecting to the Presto Web UI through the Component Gateway.
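For example, the following command creates a cluster with the Presto component installed and the Component Gateway enabled (cluster-name and region are placeholders you replace with your own values):

```shell
# Create a Dataproc cluster with the Presto optional component.
# cluster-name and region are placeholder values.
gcloud dataproc clusters create cluster-name \
    --optional-components=PRESTO \
    --region=region \
    --enable-component-gateway \
    ... other flags
```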
Configuring properties

Add the --properties flag to the gcloud dataproc clusters create command to set presto, presto-jvm, and presto-catalog configuration properties.
- Application properties: Use cluster properties with the presto: prefix to configure Presto application properties; for example, --properties="presto:join-distribution-type=AUTOMATIC".
- JVM configuration properties: Use cluster properties with the presto-jvm: prefix to configure JVM properties for the Presto coordinator and worker Java processes; for example, --properties="presto-jvm:XX:+HeapDumpOnOutOfMemoryError".
- Creating new catalogs and adding catalog properties: Use presto-catalog:catalog-name.property-name to configure Presto catalogs.
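As an illustrative sketch, the first two prefixes can be combined in a single --properties value at cluster creation; the property names shown are the ones from the examples above, and cluster-name and region are placeholders:

```shell
# Set a Presto application property and a JVM property together.
# cluster-name and region are placeholder values.
gcloud dataproc clusters create cluster-name \
    --optional-components=PRESTO \
    --region=region \
    --properties="presto:join-distribution-type=AUTOMATIC,presto-jvm:XX:+HeapDumpOnOutOfMemoryError"
```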
Example: The following `properties` flag can be used with the `gcloud dataproc clusters create` command to create a Presto cluster with a "prodhive" Hive catalog. A prodhive.properties file will be created under /usr/lib/presto/etc/catalog/ to enable the prodhive catalog.
```
--properties="presto-catalog:prodhive.connector.name=hive-hadoop2,presto-catalog:prodhive.hive.metastore.uri=thrift://localhost:9083"
```

REST API

The Presto component can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request. Using the Dataproc v1 API, set the EndpointConfig.enableHttpPortAccess property to true as part of the clusters.create request to enable connecting to the Presto Web UI using the Component Gateway.

Console

1. Enable the component and component gateway.
   - In the Google Cloud console, open the Dataproc Create a cluster page. The Set up cluster panel is selected.
   - In the Components section:
     - Under Optional components, select Presto and other optional components to install on your cluster.
     - Under Component Gateway, select Enable component gateway (see Viewing and Accessing Component Gateway URLs).

Last updated: 2025-09-04 (UTC)