JDBC to JDBC template

Use the Serverless for Apache Spark JDBC to JDBC template to extract data from one JDBC database and load it into another.

This template supports the following databases:

MySQL
PostgreSQL
Microsoft SQL Server
Oracle

Use the template

Run the template using the gcloud CLI or the Dataproc API.

gcloud

Before using any of the command data below, make the following replacements:

PROJECT_ID: Required. Your Google Cloud project ID listed in the IAM Settings.
REGION: Required. Compute Engine region.
SUBNET: Optional. If a subnet is not specified, the subnet in the specified REGION in the default network is selected.
Example: projects/PROJECT_ID/regions/REGION/subnetworks/SUBNET_NAME
TEMPLATE_VERSION: Required. Specify latest for the latest template version, or the date of a specific version, for example, 2023-03-17_v0.1.0-beta. To list the available template versions, visit gs://dataproc-templates-binaries or run gcloud storage ls gs://dataproc-templates-binaries.
INPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH and OUTPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH: Required. The full Cloud Storage path, including the filename, where the input and output JDBC connector JAR files are stored.
Note: If the input and output JAR files are the same, setting only INPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH is sufficient.

You can use the following commands to download JDBC connectors for uploading to Cloud Storage:
- MySQL:
```
  wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.30.tar.gz
```
- PostgreSQL:
```
  wget https://jdbc.postgresql.org/download/postgresql-42.2.6.jar
```
- Microsoft SQL Server:
```
  wget https://repo1.maven.org/maven2/com/microsoft/sqlserver/mssql-jdbc/6.4.0.jre8/mssql-jdbc-6.4.0.jre8.jar
```
- Oracle:
```
  wget https://repo1.maven.org/maven2/com/oracle/database/jdbc/ojdbc8/21.7.0.0/ojdbc8-21.7.0.0.jar
```

The following variables are used to construct the required input JDBC URL:

INPUT_JDBC_HOST
INPUT_JDBC_PORT
INPUT_JDBC_DATABASE or, for Oracle, INPUT_JDBC_SERVICE
INPUT_JDBC_USERNAME
INPUT_JDBC_PASSWORD

Construct the INPUT_JDBC_CONNECTION_URL using one of the following connector-specific formats:

MySQL:

jdbc:mysql://INPUT_JDBC_HOST:INPUT_JDBC_PORT/INPUT_JDBC_DATABASE?user=INPUT_JDBC_USERNAME&password=INPUT_JDBC_PASSWORD

PostgreSQL:

jdbc:postgresql://INPUT_JDBC_HOST:INPUT_JDBC_PORT/INPUT_JDBC_DATABASE?user=INPUT_JDBC_USERNAME&password=INPUT_JDBC_PASSWORD

Microsoft SQL Server:

jdbc:sqlserver://INPUT_JDBC_HOST:INPUT_JDBC_PORT;databaseName=INPUT_JDBC_DATABASE;user=INPUT_JDBC_USERNAME;password=INPUT_JDBC_PASSWORD

Oracle:

jdbc:oracle:thin:@//INPUT_JDBC_HOST:INPUT_JDBC_PORT/INPUT_JDBC_SERVICE?user=INPUT_JDBC_USERNAME&password=INPUT_JDBC_PASSWORD
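
The connector-specific formats above can be filled in programmatically. The following is a minimal Python sketch, not part of the template: the helper name build_jdbc_url and the short database keys ("mysql", "postgresql", "sqlserver", "oracle") are hypothetical, and the format strings are copied from the list above. It does not escape special characters in the username or password.

```python
# Connector-specific URL formats, copied from the formats listed above.
# The dictionary keys are hypothetical labels chosen for this sketch.
JDBC_URL_FORMATS = {
    "mysql": "jdbc:mysql://{host}:{port}/{database}?user={username}&password={password}",
    "postgresql": "jdbc:postgresql://{host}:{port}/{database}?user={username}&password={password}",
    "sqlserver": "jdbc:sqlserver://{host}:{port};databaseName={database};user={username};password={password}",
    "oracle": "jdbc:oracle:thin:@//{host}:{port}/{database}?user={username}&password={password}",
}


def build_jdbc_url(db_type, host, port, database, username, password):
    """Fill in one of the formats; for Oracle, pass the service name as `database`."""
    try:
        url_format = JDBC_URL_FORMATS[db_type]
    except KeyError:
        raise ValueError(f"unsupported database type: {db_type!r}") from None
    return url_format.format(
        host=host,
        port=port,
        database=database,
        username=username,
        password=password,
    )


# Example with made-up connection details.
input_url = build_jdbc_url("postgresql", "10.0.0.5", 5432, "sales", "reader", "s3cret")
print(input_url)
```

The same helper can build OUTPUT_JDBC_CONNECTION_URL from the OUTPUT_JDBC_* values, because the output formats differ only in the variable names.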

The following variables are used to construct the required output JDBC URL:

OUTPUT_JDBC_HOST
OUTPUT_JDBC_PORT
OUTPUT_JDBC_DATABASE or, for Oracle, OUTPUT_JDBC_SERVICE
OUTPUT_JDBC_USERNAME
OUTPUT_JDBC_PASSWORD

Construct the OUTPUT_JDBC_CONNECTION_URL using one of the following connector-specific formats:

MySQL:

jdbc:mysql://OUTPUT_JDBC_HOST:OUTPUT_JDBC_PORT/OUTPUT_JDBC_DATABASE?user=OUTPUT_JDBC_USERNAME&password=OUTPUT_JDBC_PASSWORD

PostgreSQL:

jdbc:postgresql://OUTPUT_JDBC_HOST:OUTPUT_JDBC_PORT/OUTPUT_JDBC_DATABASE?user=OUTPUT_JDBC_USERNAME&password=OUTPUT_JDBC_PASSWORD

Microsoft SQL Server:

jdbc:sqlserver://OUTPUT_JDBC_HOST:OUTPUT_JDBC_PORT;databaseName=OUTPUT_JDBC_DATABASE;user=OUTPUT_JDBC_USERNAME;password=OUTPUT_JDBC_PASSWORD

Oracle:

jdbc:oracle:thin:@//OUTPUT_JDBC_HOST:OUTPUT_JDBC_PORT/OUTPUT_JDBC_SERVICE?user=OUTPUT_JDBC_USERNAME&password=OUTPUT_JDBC_PASSWORD

INPUT_JDBC_TABLE: Required. Input JDBC table name or a SQL query on the JDBC input table.
Example (the SQL query must be enclosed in parentheses): (select * from TABLE_NAME) as ALIAS_TABLE_NAME
OUTPUT_JDBC_TABLE: Required. JDBC table where the output is stored.
INPUT_DRIVER and OUTPUT_DRIVER: Required. The JDBC input and output driver used for the connection:
- MySQL:
```
com.mysql.cj.jdbc.Driver
```
- PostgreSQL:
```
org.postgresql.Driver
```
- Microsoft SQL Server:
```
com.microsoft.sqlserver.jdbc.SQLServerDriver
```
- Oracle:
```
oracle.jdbc.driver.OracleDriver
```
INPUT_PARTITION_COLUMN, LOWERBOUND, UPPERBOUND, NUM_PARTITIONS: Optional. If used, all of the following parameters must be specified:
- INPUT_PARTITION_COLUMN: JDBC input table partition column name.
- LOWERBOUND: JDBC input table partition column lower bound, used to determine the partition stride.
- UPPERBOUND: JDBC input table partition column upper bound, used to determine the partition stride.
- NUM_PARTITIONS: The maximum number of partitions that can be used for parallelism of table reads and writes. If specified, this value is used for both the JDBC input and output connections.
FETCHSIZE: Optional. Number of rows to fetch per round trip.
BATCH_SIZE: Optional. Number of records to insert per round trip. Default: 1000.
MODE: Optional. Write mode for the JDBC output. Options: Append, Overwrite, Ignore, or ErrorIfExists.
TABLE_PROPERTIES: Optional. Lets you set database-specific table and partition options when creating the output table.
PRIMARY_KEY: Optional. Primary key column for the output table. The column must not contain duplicate values; otherwise, an error is thrown.
JDBC_SESSION_INIT: Optional. Session initialization statement to read Java templates.
LOG_LEVEL: Optional. Logging level. Can be one of ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, or WARN. Default: INFO.
TEMP_VIEW and TEMP_QUERY: Optional. Use these two parameters to apply a Spark SQL transformation while loading data into the output JDBC table. TEMP_VIEW must match the table name used in the query, and TEMP_QUERY is the query statement.
SERVICE_ACCOUNT: Optional. If not provided, the default Compute Engine service account is used.
PROPERTY and PROPERTY_VALUE: Optional. Comma-separated list of Spark property=value pairs.
LABEL and LABEL_VALUE: Optional. Comma-separated list of label=value pairs.
KMS_KEY: Optional. The Cloud Key Management Service key to use for encryption. If a key is not specified, data is encrypted at rest using a Google-owned and Google-managed encryption key.
Example: projects/PROJECT_ID/regions/REGION/keyRings/KEY_RING_NAME/cryptoKeys/KEY_NAME
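
To make the role of LOWERBOUND, UPPERBOUND, and NUM_PARTITIONS more concrete, the following Python sketch approximates how a partitioned JDBC read splits a numeric column into ranges. It assumes the template hands these values to Spark's JDBC data source and is modeled on the behavior described in the Spark documentation; Spark's own arithmetic varies by version and column type, so treat this as an illustration only. The function name partition_predicates is hypothetical.

```python
def partition_predicates(column, lower_bound, upper_bound, num_partitions):
    """Return one WHERE-clause string per partition (None means no filter)."""
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    # Never use more partitions than there are integer values in the range.
    count = min(num_partitions, upper_bound - lower_bound)
    if count <= 1:
        return [None]  # A single partition reads the whole table.
    # Integer stride; this sketch assumes non-negative bounds.
    stride = upper_bound // count - lower_bound // count
    predicates = []
    current = lower_bound
    for index in range(count):
        lower = f"{column} >= {current}" if index > 0 else None
        current += stride
        upper = f"{column} < {current}" if index < count - 1 else None
        if lower and upper:
            predicates.append(f"{lower} AND {upper}")
        elif upper:
            # First partition: open-ended below, and it also picks up NULLs.
            predicates.append(f"{upper} OR {column} IS NULL")
        else:
            # Last partition: open-ended above.
            predicates.append(lower)
    return predicates


for predicate in partition_predicates("id", 0, 100, 4):
    print(predicate)
```

Note that the first and last ranges are open-ended: under this model the bounds only set the stride and do not filter rows, so values below LOWERBOUND or above UPPERBOUND still land in the first or last partition.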

Run the following command:

Linux, macOS, or Cloud Shell

gcloud dataproc batches submit spark \
    --class=com.google.cloud.dataproc.templates.main.DataProcTemplate \
    --project="PROJECT_ID" \
    --region="REGION" \
    --version="1.2" \
    --jars="gs://dataproc-templates-binaries/TEMPLATE_VERSION/java/dataproc-templates.jar,INPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH,OUTPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH" \
    --subnet="SUBNET" \
    --kms-key="KMS_KEY" \
    --service-account="SERVICE_ACCOUNT" \
    --properties="PROPERTY=PROPERTY_VALUE" \
    --labels="LABEL=LABEL_VALUE" \
    -- --template JDBCTOJDBC \
    --templateProperty project.id="PROJECT_ID" \
    --templateProperty log.level="LOG_LEVEL" \
    --templateProperty jdbctojdbc.input.url="INPUT_JDBC_CONNECTION_URL" \
    --templateProperty jdbctojdbc.input.driver="INPUT_DRIVER" \
    --templateProperty jdbctojdbc.input.table="INPUT_JDBC_TABLE" \
    --templateProperty jdbctojdbc.output.url="OUTPUT_JDBC_CONNECTION_URL" \
    --templateProperty jdbctojdbc.output.driver="OUTPUT_DRIVER" \
    --templateProperty jdbctojdbc.output.table="OUTPUT_JDBC_TABLE" \
    --templateProperty jdbctojdbc.input.fetchsize="FETCHSIZE" \
    --templateProperty jdbctojdbc.input.partitioncolumn="INPUT_PARTITION_COLUMN" \
    --templateProperty jdbctojdbc.input.lowerbound="LOWERBOUND" \
    --templateProperty jdbctojdbc.input.upperbound="UPPERBOUND" \
    --templateProperty jdbctojdbc.numpartitions="NUM_PARTITIONS" \
    --templateProperty jdbctojdbc.output.mode="MODE" \
    --templateProperty jdbctojdbc.output.batch.size="BATCH_SIZE" \
    --templateProperty jdbctojdbc.output.primary.key="PRIMARY_KEY" \
    --templateProperty jdbctojdbc.output.create.table.option="TABLE_PROPERTIES" \
    --templateProperty jdbctojdbc.sessioninitstatement="JDBC_SESSION_INIT" \
    --templateProperty jdbctojdbc.temp.view.name="TEMP_VIEW" \
    --templateProperty jdbctojdbc.sql.query="TEMP_QUERY"

Windows (PowerShell)

gcloud dataproc batches submit spark `
    --class=com.google.cloud.dataproc.templates.main.DataProcTemplate `
    --project="PROJECT_ID" `
    --region="REGION" `
    --version="1.2" `
    --jars="gs://dataproc-templates-binaries/TEMPLATE_VERSION/java/dataproc-templates.jar,INPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH,OUTPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH" `
    --subnet="SUBNET" `
    --kms-key="KMS_KEY" `
    --service-account="SERVICE_ACCOUNT" `
    --properties="PROPERTY=PROPERTY_VALUE" `
    --labels="LABEL=LABEL_VALUE" `
    -- --template JDBCTOJDBC `
    --templateProperty project.id="PROJECT_ID" `
    --templateProperty log.level="LOG_LEVEL" `
    --templateProperty jdbctojdbc.input.url="INPUT_JDBC_CONNECTION_URL" `
    --templateProperty jdbctojdbc.input.driver="INPUT_DRIVER" `
    --templateProperty jdbctojdbc.input.table="INPUT_JDBC_TABLE" `
    --templateProperty jdbctojdbc.output.url="OUTPUT_JDBC_CONNECTION_URL" `
    --templateProperty jdbctojdbc.output.driver="OUTPUT_DRIVER" `
    --templateProperty jdbctojdbc.output.table="OUTPUT_JDBC_TABLE" `
    --templateProperty jdbctojdbc.input.fetchsize="FETCHSIZE" `
    --templateProperty jdbctojdbc.input.partitioncolumn="INPUT_PARTITION_COLUMN" `
    --templateProperty jdbctojdbc.input.lowerbound="LOWERBOUND" `
    --templateProperty jdbctojdbc.input.upperbound="UPPERBOUND" `
    --templateProperty jdbctojdbc.numpartitions="NUM_PARTITIONS" `
    --templateProperty jdbctojdbc.output.mode="MODE" `
    --templateProperty jdbctojdbc.output.batch.size="BATCH_SIZE" `
    --templateProperty jdbctojdbc.output.primary.key="PRIMARY_KEY" `
    --templateProperty jdbctojdbc.output.create.table.option="TABLE_PROPERTIES" `
    --templateProperty jdbctojdbc.sessioninitstatement="JDBC_SESSION_INIT" `
    --templateProperty jdbctojdbc.temp.view.name="TEMP_VIEW" `
    --templateProperty jdbctojdbc.sql.query="TEMP_QUERY"

Windows (cmd.exe)

gcloud dataproc batches submit spark ^
    --class=com.google.cloud.dataproc.templates.main.DataProcTemplate ^
    --project="PROJECT_ID" ^
    --region="REGION" ^
    --version="1.2" ^
    --jars="gs://dataproc-templates-binaries/TEMPLATE_VERSION/java/dataproc-templates.jar,INPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH,OUTPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH" ^
    --subnet="SUBNET" ^
    --kms-key="KMS_KEY" ^
    --service-account="SERVICE_ACCOUNT" ^
    --properties="PROPERTY=PROPERTY_VALUE" ^
    --labels="LABEL=LABEL_VALUE" ^
    -- --template JDBCTOJDBC ^
    --templateProperty project.id="PROJECT_ID" ^
    --templateProperty log.level="LOG_LEVEL" ^
    --templateProperty jdbctojdbc.input.url="INPUT_JDBC_CONNECTION_URL" ^
    --templateProperty jdbctojdbc.input.driver="INPUT_DRIVER" ^
    --templateProperty jdbctojdbc.input.table="INPUT_JDBC_TABLE" ^
    --templateProperty jdbctojdbc.output.url="OUTPUT_JDBC_CONNECTION_URL" ^
    --templateProperty jdbctojdbc.output.driver="OUTPUT_DRIVER" ^
    --templateProperty jdbctojdbc.output.table="OUTPUT_JDBC_TABLE" ^
    --templateProperty jdbctojdbc.input.fetchsize="FETCHSIZE" ^
    --templateProperty jdbctojdbc.input.partitioncolumn="INPUT_PARTITION_COLUMN" ^
    --templateProperty jdbctojdbc.input.lowerbound="LOWERBOUND" ^
    --templateProperty jdbctojdbc.input.upperbound="UPPERBOUND" ^
    --templateProperty jdbctojdbc.numpartitions="NUM_PARTITIONS" ^
    --templateProperty jdbctojdbc.output.mode="MODE" ^
    --templateProperty jdbctojdbc.output.batch.size="BATCH_SIZE" ^
    --templateProperty jdbctojdbc.output.primary.key="PRIMARY_KEY" ^
    --templateProperty jdbctojdbc.output.create.table.option="TABLE_PROPERTIES" ^
    --templateProperty jdbctojdbc.sessioninitstatement="JDBC_SESSION_INIT" ^
    --templateProperty jdbctojdbc.temp.view.name="TEMP_VIEW" ^
    --templateProperty jdbctojdbc.sql.query="TEMP_QUERY"

REST

Before using any of the request data, make the same replacements as for the gcloud command above. The REST request uses the same placeholders (PROJECT_ID through KMS_KEY), and the connector download commands, JDBC connection URL formats, and driver class names are identical.

HTTP method and URL:

POST https://dataproc.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/batches

Request JSON body:

{
  "environmentConfig": {
    "executionConfig": {
      "subnetworkUri": "SUBNET",
      "kmsKey": "KMS_KEY",
      "serviceAccount": "SERVICE_ACCOUNT"
    }
  },
  "labels": {
    "LABEL": "LABEL_VALUE"
  },
  "runtimeConfig": {
    "version": "1.2",
    "properties": {
      "PROPERTY": "PROPERTY_VALUE"
    }
  },
  "sparkBatch": {
    "mainClass": "com.google.cloud.dataproc.templates.main.DataProcTemplate",
    "args": [
      "--template","JDBCTOJDBC",
      "--templateProperty","log.level=LOG_LEVEL",
      "--templateProperty","project.id=PROJECT_ID",
      "--templateProperty","jdbctojdbc.input.url=INPUT_JDBC_CONNECTION_URL",
      "--templateProperty","jdbctojdbc.input.driver=INPUT_DRIVER",
      "--templateProperty","jdbctojdbc.input.table=INPUT_JDBC_TABLE",
      "--templateProperty","jdbctojdbc.output.url=OUTPUT_JDBC_CONNECTION_URL",
      "--templateProperty","jdbctojdbc.output.driver=OUTPUT_DRIVER",
      "--templateProperty","jdbctojdbc.output.table=OUTPUT_JDBC_TABLE",
      "--templateProperty","jdbctojdbc.input.fetchsize=FETCHSIZE",
      "--templateProperty","jdbctojdbc.input.partitioncolumn=INPUT_PARTITION_COLUMN",
      "--templateProperty","jdbctojdbc.input.lowerbound=LOWERBOUND",
      "--templateProperty","jdbctojdbc.input.upperbound=UPPERBOUND",
      "--templateProperty","jdbctojdbc.numpartitions=NUM_PARTITIONS",
      "--templateProperty","jdbctojdbc.output.mode=MODE",
      "--templateProperty","jdbctojdbc.output.batch.size=BATCH_SIZE",
      "--templateProperty","jdbctojdbc.output.primary.key=PRIMARY_KEY",
      "--templateProperty","jdbctojdbc.output.create.table.option=TABLE_PROPERTIES",
      "--templateProperty","jdbctojdbc.sessioninitstatement=JDBC_SESSION_INIT",
      "--templateProperty","jdbctojdbc.temp.view.name=TEMP_VIEW",
      "--templateProperty","jdbctojdbc.sql.query=TEMP_QUERY"
    ],
    "jarFileUris": [
      "gs://dataproc-templates-binaries/TEMPLATE_VERSION/java/dataproc-templates.jar",
      "INPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH",
      "OUTPUT_JDBC_CONNECTOR_CLOUD_STORAGE_PATH"
    ]
  }
}
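
The request body above lists every optional property. The following Python sketch shows one way to generate the body so that only the values you actually set are included. It is an illustration, not part of the template: the function name build_request_body and the sample values are hypothetical, and it assumes that unset optional properties can simply be left out of the args list.

```python
import json

# Template properties that the parameter list marks as required.
REQUIRED_PROPERTIES = (
    "jdbctojdbc.input.url",
    "jdbctojdbc.input.driver",
    "jdbctojdbc.input.table",
    "jdbctojdbc.output.url",
    "jdbctojdbc.output.driver",
    "jdbctojdbc.output.table",
)


def build_request_body(template_version, connector_jars, properties, execution_config=None):
    """Build the batch request body, skipping properties whose value is unset."""
    missing = [key for key in REQUIRED_PROPERTIES if not properties.get(key)]
    if missing:
        raise ValueError(f"missing required properties: {missing}")
    args = ["--template", "JDBCTOJDBC"]
    for key, value in properties.items():
        if value is None or value == "":
            continue  # Assumption: an unset optional property is omitted.
        args += ["--templateProperty", f"{key}={value}"]
    body = {
        "runtimeConfig": {"version": "1.2"},
        "sparkBatch": {
            "mainClass": "com.google.cloud.dataproc.templates.main.DataProcTemplate",
            "args": args,
            "jarFileUris": [
                f"gs://dataproc-templates-binaries/{template_version}/java/dataproc-templates.jar",
                *connector_jars,
            ],
        },
    }
    # Only add executionConfig fields (subnetworkUri, kmsKey, serviceAccount) that are set.
    execution = {k: v for k, v in (execution_config or {}).items() if v}
    if execution:
        body["environmentConfig"] = {"executionConfig": execution}
    return body


# Example with made-up values; the bucket and host names are placeholders.
example_body = build_request_body(
    "latest",
    ["gs://my-bucket/jars/postgresql-42.2.6.jar"],
    {
        "project.id": "my-project",
        "jdbctojdbc.input.url": "jdbc:postgresql://10.0.0.5:5432/sales?user=reader&password=s3cret",
        "jdbctojdbc.input.driver": "org.postgresql.Driver",
        "jdbctojdbc.input.table": "(select * from orders) as orders_alias",
        "jdbctojdbc.output.url": "jdbc:postgresql://10.0.0.6:5432/reporting?user=writer&password=s3cret",
        "jdbctojdbc.output.driver": "org.postgresql.Driver",
        "jdbctojdbc.output.table": "orders_copy",
        "jdbctojdbc.input.fetchsize": None,
        "jdbctojdbc.output.mode": "Append",
    },
)
print(json.dumps(example_body, indent=2))
```

Writing the printed JSON to request.json gives a file that can be sent with the commands in the following steps.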

To send your request, choose one of these options:

curl (Linux, macOS, or Cloud Shell)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://dataproc.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/batches"

PowerShell (Windows)

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://dataproc.googleapis.com/v1/projects/PROJECT_ID/locations/REGION/batches" | Select-Object -Expand Content
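
As an alternative to curl and PowerShell, the same POST request can be prepared with the Python standard library. This is a sketch under the same assumption as the notes above, namely that you are logged in to the gcloud CLI. The function names make_create_batch_request and get_access_token are hypothetical; the URL, headers, and token command are the ones used in the commands above.

```python
import json
import subprocess
import urllib.request


def get_access_token():
    """Return an access token by calling the gcloud CLI, as the curl example does."""
    result = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def make_create_batch_request(project_id, region, body, access_token):
    """Prepare (but do not send) the POST request that creates the batch."""
    url = (
        "https://dataproc.googleapis.com/v1/"
        f"projects/{project_id}/locations/{region}/batches"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )


# To send the request, load request.json and pass the result to urlopen:
#
#   with open("request.json", encoding="utf-8") as handle:
#       body = json.load(handle)
#   request = make_create_batch_request("PROJECT_ID", "REGION", body, get_access_token())
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers before anything is submitted.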

You should receive a JSON response similar to the following:


{
  "name": "projects/PROJECT_ID/regions/REGION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.dataproc.v1.BatchOperationMetadata",
    "batch": "projects/PROJECT_ID/locations/REGION/batches/BATCH_ID",
    "batchUuid": "de8af8d4-3599-4a7c-915c-798201ed1583",
    "createTime": "2023-02-24T03:31:03.440329Z",
    "operationType": "BATCH",
    "description": "Batch"
  }
}
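
The metadata.batch field in this response identifies the batch that was created. The following Python sketch extracts the batch ID and derives the URL for a follow-up GET request on the batch resource, which returns its current state. The function name batch_reference is hypothetical, and the sample response mirrors the one shown above.

```python
import json


def batch_reference(operation_response):
    """Return (batch_id, get_url) for the batch named in a create-batch response."""
    batch_name = operation_response["metadata"]["batch"]
    batch_id = batch_name.rsplit("/", 1)[-1]
    # A GET request on this URL returns the batch resource.
    get_url = f"https://dataproc.googleapis.com/v1/{batch_name}"
    return batch_id, get_url


sample_response = json.loads("""
{
  "name": "projects/PROJECT_ID/regions/REGION/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.dataproc.v1.BatchOperationMetadata",
    "batch": "projects/PROJECT_ID/locations/REGION/batches/BATCH_ID",
    "batchUuid": "de8af8d4-3599-4a7c-915c-798201ed1583",
    "createTime": "2023-02-24T03:31:03.440329Z",
    "operationType": "BATCH",
    "description": "Batch"
  }
}
""")

batch_id, get_url = batch_reference(sample_response)
print(batch_id)
print(get_url)
```

The extracted batch ID can also be passed to gcloud dataproc batches describe BATCH_ID --region=REGION to check the batch from the command line.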