[[["Facile à comprendre","easyToUnderstand","thumb-up"],["J'ai pu résoudre mon problème","solvedMyProblem","thumb-up"],["Autre","otherUp","thumb-up"]],[["Difficile à comprendre","hardToUnderstand","thumb-down"],["Informations ou exemple de code incorrects","incorrectInformationOrSampleCode","thumb-down"],["Il n'y a pas l'information/les exemples dont j'ai besoin","missingTheInformationSamplesINeed","thumb-down"],["Problème de traduction","translationIssue","thumb-down"],["Autre","otherDown","thumb-down"]],["Dernière mise à jour le 2025/09/04 (UTC)."],[],[],null,["# Use BigLake metastore with Spark stored procedures\n==================================================\n\nThis document explains how to use [Apache Spark stored procedures](/bigquery/docs/spark-procedures) with\nBigLake metastore.\n\nBefore you begin\n----------------\n\n1. Enable billing for your Google Cloud project. Learn how to [check if billing is enabled on a project](/billing/docs/how-to/verify-billing-enabled).\n2. Enable the BigQuery and Dataflow APIs.\n\n [Enable the APIs](/billing/docs/how-to/verify-billing-enabled)\n3. Optional: Learn more about the following:\n\n - Understand how [BigLake metastore works](https://console.cloud.google.com/flows/enableapi?apiid=bigquery.googleapis.com,dataproc.googleapis.com) and why you should use it.\n - Learn how [BigQuery Spark stored procedures work](/bigquery/docs/spark-procedures) and complete the before you begin tasks.\n\n### Required roles\n\nTo use Spark stored procedures, review the required roles for [stored procedures](/bigquery/docs/spark-procedures#required_roles)\nand grant the necessary roles.\n\n\nTo get the permissions that\nyou need to use Spark and stored procedures with BigLake metastore as a metadata store,\n\nask your administrator to grant you the\nfollowing IAM roles:\n\n- Create BigLake metastore tables in Spark:\n - [BigQuery Data Editor](/iam/docs/roles-permissions/bigquery#bigquery.dataEditor) (`roles/bigquery.dataEditor`) on the Spark Connection service account in the project\n - [Storage Object Admin](/iam/docs/roles-permissions/storage#storage.objectAdmin) (`roles/storage.objectAdmin`) on the Spark Connection service account in the project\n- Query BigLake metastore tables in BigQuery:\n - [BigQuery Data Viewer](/iam/docs/roles-permissions/bigquery#bigquery.dataViewer) (`roles/bigquery.dataViewer`) on the project\n - [BigQuery User](/iam/docs/roles-permissions/bigquery#bigquery.user) (`roles/bigquery.user`) on the project\n - [Storage Object Viewer](/iam/docs/roles-permissions/storage#storage.objectViewer) (`roles/storage.objectViewer`) on the project\n\n\nFor more information about granting roles, see [Manage access to projects, folders, and organizations](/iam/docs/granting-changing-revoking-access).\n\n\nYou might also be able to get\nthe required permissions through [custom\nroles](/iam/docs/creating-custom-roles) or other [predefined\nroles](/iam/docs/roles-overview#predefined).\n\nCreate and run a stored procedure\n---------------------------------\n\nThe following example shows you how to create and run a stored procedure with\nBigLake metastore.\n\n1. Go to the **BigQuery** page.\n\n [Go to BigQuery](https://console.cloud.google.com/bigquery)\n2. 
2. In the query editor, add the following sample code for the [`CREATE PROCEDURE` statement](/bigquery/docs/reference/standard-sql/data-definition-language#create_procedure).

   ```googlesql
   CREATE OR REPLACE PROCEDURE
   `PROJECT_ID.BQ_DATASET_ID.PROCEDURE_NAME`()
   WITH CONNECTION `PROJECT_ID.REGION.SPARK_CONNECTION_ID` OPTIONS (engine='SPARK',
   runtime_version='1.1',
   properties=[("spark.sql.catalog.CATALOG_NAME.warehouse",
   "WAREHOUSE_DIRECTORY"),
   ("spark.sql.catalog.CATALOG_NAME.gcp_location",
   "LOCATION"),
   ("spark.sql.catalog.CATALOG_NAME.gcp_project",
   "PROJECT_ID"),
   ("spark.sql.catalog.CATALOG_NAME",
   "org.apache.iceberg.spark.SparkCatalog"),
   ("spark.sql.catalog.CATALOG_NAME.catalog-impl",
   "org.apache.iceberg.gcp.bigquery.BigQueryMetastoreCatalog"),
   ("spark.jars.packages",
   "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.6.1")],
   jar_uris=["gs://spark-lib/bigquery/iceberg-bigquery-catalog-1.6.1-1.0.1-beta.jar"])
   LANGUAGE python AS R"""
   from pyspark.sql import SparkSession

   # Reuse or create the Spark session that the procedure runtime provides.
   spark = SparkSession \
     .builder \
     .appName("BigLake Metastore Iceberg") \
     .getOrCreate()

   # Switch to the Iceberg catalog configured in the connection properties,
   # then create and use a namespace for the table.
   spark.sql("USE CATALOG_NAME;")
   spark.sql("CREATE NAMESPACE IF NOT EXISTS NAMESPACE_NAME;")
   spark.sql("USE NAMESPACE_NAME;")

   # Create an Iceberg table, insert a row, and evolve the table schema.
   spark.sql("CREATE TABLE TABLE_NAME (id int, data string) USING ICEBERG LOCATION 'WAREHOUSE_DIRECTORY'")
   spark.sql("DESCRIBE TABLE_NAME;")
   spark.sql("INSERT INTO TABLE_NAME VALUES (1, \"first row\");")
   spark.sql("SELECT * FROM TABLE_NAME;")
   spark.sql("ALTER TABLE TABLE_NAME ADD COLUMNS (newDoubleCol double);")
   spark.sql("DESCRIBE TABLE_NAME;")
   """;

   CALL `PROJECT_ID.BQ_DATASET_ID.PROCEDURE_NAME`();
   ```

   Replace the following:

   - `PROJECT_ID`: the ID of your Google Cloud project.
   - `BQ_DATASET_ID`: the ID of the dataset in BigQuery that contains the procedure.
   - `PROCEDURE_NAME`: the name of the procedure that you're creating or replacing.
   - `REGION`: the location of your Spark connection.
   - `LOCATION`: the location of your BigQuery resources.
   - `SPARK_CONNECTION_ID`: the ID of your Spark connection.
   - `CATALOG_NAME`: the name of the catalog that you're using.
   - `WAREHOUSE_DIRECTORY`: the URI of the Cloud Storage folder that contains your data warehouse.
   - `NAMESPACE_NAME`: the namespace that you're using.
   - `TABLE_NAME`: the name of the table that you're creating.
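After the procedure runs, the table that it creates is also available to BigQuery. The following query is a minimal verification sketch, not part of the original example: it assumes that the Spark namespace (`NAMESPACE_NAME`) surfaces in BigQuery as a dataset of the same name, which is how the BigLake metastore catalog is expected to map Spark namespaces to BigQuery datasets.

```googlesql
-- Minimal check (assumption: the namespace appears as a BigQuery dataset).
-- This should return the row that the procedure inserted, with the
-- newDoubleCol column added by the ALTER TABLE statement set to NULL.
SELECT * FROM `PROJECT_ID.NAMESPACE_NAME.TABLE_NAME`;
```

To run this check, your account needs the "Query BigLake metastore tables in BigQuery" roles listed earlier on this page.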
What's next
-----------

- Set up [optional BigLake metastore features](/bigquery/docs/blms-features).
- [View and query tables from Spark in the BigQuery console](/bigquery/docs/blms-query-tables).