Analyze object tables by using remote functions
This document describes how to analyze unstructured data in object tables by using remote functions.
Overview
You can analyze the unstructured data represented by an object table by using a remote function. A remote function lets you call a function running on Cloud Run functions or Cloud Run, which you can program to access resources such as the following:
- Google's pre-trained AI models, including Cloud Vision API and Document AI.
- Open source libraries such as Apache Tika.
- Your own custom models.
To analyze object table data by using a remote function, you must generate signed URLs for the objects in the object table and pass them in when you call the remote function. These signed URLs are what grant the remote function access to the objects.
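Required permissions
- To create the connection resource used by the remote function, you need the following permissions:
  - bigquery.connections.create
  - bigquery.connections.get
  - bigquery.connections.list
  - bigquery.connections.update
  - bigquery.connections.use
  - bigquery.connections.delete
- To create a remote function, you need the permissions associated with the Cloud Functions Developer or Cloud Run Developer roles.
- To invoke a remote function, you need the permissions described in Remote functions.
- To analyze an object table with a remote function, you need the bigquery.tables.getData permission on the object table.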
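Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Verify that billing is enabled for your Google Cloud project.
- Enable the BigQuery, BigQuery Connection, and Cloud Run functions APIs.
- Ensure that your BigQuery administrator has created a connection and set up access to Cloud Storage.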
Create a remote function
When you create a remote function to analyze object table data, you must pass in signed URLs that have been generated for the objects in the object table. You can do this by using an input parameter with a STRING data type. The signed URLs are made available to the remote function as input data in the calls field of the HTTP POST request.
In the request, each element of calls is an array that holds the arguments for one row, here a single signed URL such as https://storage.googleapis.com/mybucket/1.pdf?X-Goog-SignedHeaders=abcd.
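As a minimal sketch of what the remote function receives, the following Python snippet parses a request body shaped like the one in the original documentation and extracts the signed URLs. The helper name signed_urls is hypothetical, and the other fields of the request are omitted.

```python
import json

# Example request body; bucket, object names, and query strings are placeholders.
EXAMPLE_REQUEST = """
{
  "calls": [
    ["https://storage.googleapis.com/mybucket/1.pdf?X-Goog-SignedHeaders=abcd"],
    ["https://storage.googleapis.com/mybucket/2.pdf?X-Goog-SignedHeaders=wxyz"]
  ]
}
"""


def signed_urls(request_body):
    """Return the first argument of every call, which is the signed URL."""
    payload = json.loads(request_body)
    return [call[0] for call in payload["calls"]]


if __name__ == "__main__":
    for url in signed_urls(EXAMPLE_REQUEST):
        print(url)
```

Each inner array corresponds to one row of the query, so the function is expected to return one reply per inner array, in the same order.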
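You can read an object in your remote function by using a method that makes an HTTP GET request to the signed URL. The remote function can access the object because the signed URL contains authentication information in its query string.
When you specify the CREATE FUNCTION statement for the remote function, we recommend that you set the max_batching_rows option to 1 in order to avoid Cloud Run functions timeout and increase processing parallelism.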
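The following sketch, adapted from the Python example in the original documentation, reads each object with an HTTP GET request and returns its content length. The function names read_object and object_lengths and the injectable fetch parameter are assumptions added here so that the logic can be exercised without network access.

```python
import json
import urllib.request


def read_object(signed_url):
    """Fetch the object bytes with an HTTP GET request to the signed URL."""
    with urllib.request.urlopen(signed_url) as response:
        return response.read()


def object_lengths(request_json, fetch=read_object):
    """Build the JSON reply body: one content length per entry in 'calls'.

    request_json is the already-parsed request body. fetch is a hypothetical
    hook that defaults to a real HTTP GET.
    """
    replies = []
    for call in request_json["calls"]:
        signed_url = call[0]  # the single STRING argument of the function
        replies.append(len(fetch(signed_url)))
    return json.dumps({"replies": replies})
```

In a Cloud Run functions deployment, an HTTP handler (for example, one decorated with functions_framework.http, as in the original example) would pass request.get_json() to object_lengths and return the result. The original documentation then registers the deployed endpoint with a CREATE FUNCTION statement that declares a signed_url STRING parameter, returns INT64, and sets max_batching_rows = 1.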
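Call a remote function
To call a remote function on object table data, reference the remote function in the select_list of the query, and then call the EXTERNAL_OBJECT_TRANSFORM function in the FROM clause to generate the signed URLs for the objects.
Note: When using one of the AI APIs, be aware of the relevant quotas for the API you are targeting. Use a LIMIT clause to limit the results returned if necessary to stay within quota. To process only a subset of the object table contents, add a WHERE clause, for example WHERE content_type = "application/pdf".
The typical statement syntax is:
SELECT uri, function_name(signed_url) AS function_output FROM EXTERNAL_OBJECT_TRANSFORM(TABLE my_dataset.object_table, ["SIGNED_URL"]) LIMIT 10000;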
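To make the parts of such a statement explicit, the following Python sketch assembles the query text from its components. The helper build_query and its parameters are hypothetical, and the function and table names are placeholders. It uses plain string formatting, so it is only suitable for trusted, hard-coded values.

```python
def build_query(function_name, object_table, content_type=None, limit=None):
    """Compose a query that calls a remote function on signed URLs.

    function_name and object_table are inserted verbatim, so pass only
    trusted identifiers. content_type and limit are optional filters.
    """
    lines = [
        # The remote function is referenced in the select list.
        f"SELECT uri, {function_name}(signed_url) AS function_output",
        # EXTERNAL_OBJECT_TRANSFORM generates the signed_url column.
        f'FROM EXTERNAL_OBJECT_TRANSFORM(TABLE {object_table}, ["SIGNED_URL"])',
    ]
    if content_type is not None:
        lines.append(f'WHERE content_type = "{content_type}"')
    if limit is not None:
        lines.append(f"LIMIT {limit}")
    return "\n".join(lines) + ";"


if __name__ == "__main__":
    print(build_query("mydataset.object_length", "my_dataset.object_table",
                      content_type="application/pdf"))
```

Passing limit keeps the number of calls to the remote function bounded, which may help when the function targets an API with quotas.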
Last updated: 2025-09-04 (UTC).