This guide describes how to deploy, configure, and run data pipelines that use the SAP OData plugin.
You can use SAP as a source for batch-based data extraction in Cloud Data Fusion using the Open Data Protocol (OData). The SAP OData plugin helps you configure and execute data transfers from SAP OData Catalog Services without any coding.
- Configure the SAP ERP system (activate DataSources in SAP).
- Deploy the plugin in your Cloud Data Fusion environment.
- Download the SAP transport from Cloud Data Fusion and install it in SAP.
- Use Cloud Data Fusion and SAP OData to create data pipelines for integrating SAP data.
Before you begin
To use this plugin, you will need domain knowledge in the following areas:
- Building pipelines in Cloud Data Fusion
- Access management with IAM
- Configuring SAP Cloud and on-premises enterprise resource planning (ERP) systems
The tasks on this page are performed by people with the following roles in Google Cloud or in their SAP system:
|Google Cloud Admin||Users assigned this role are administrators of Google Cloud accounts.|
|Cloud Data Fusion User||Users assigned this role are authorized to design and run data pipelines. They are granted, at minimum, the Data Fusion Viewer role.|
|SAP Admin||Users assigned this role are administrators of the SAP system. They have access to download software from the SAP service site. It is not an IAM role.|
|SAP User||Users assigned this role are authorized to connect to an SAP system. It is not an IAM role.|
Prerequisites for OData extraction
The OData Catalog Service must be activated in the SAP system.
Data must be populated in the OData service.
Prerequisites for your SAP system
In SAP NetWeaver 7.02 to SAP NetWeaver release 7.31, the OData and SAP Gateway functionalities are delivered with the following SAP software components:
In SAP NetWeaver release 7.40 and later, all the functionalities are available in the component SAP_GWFND, which must be made available in SAP NetWeaver.
Optional: Install SAP transport files
The SAP components that are needed for load balancing calls to SAP are delivered as SAP transport files that are archived as a zip file (one transport request, which consists of one cofile and one data file). You can use this step to limit multiple parallel calls to SAP, based on the available work processes in SAP.
The zip file download is available when you deploy the plugin in the Cloud Data Fusion Hub.
The SAP transport request IDs and associated files are provided in the following table:
|Transport ID||Cofile||Data file||Content|
|ED1K900360||K900360.ED1||R900360.ED1||RFC function modules exposed through OData|
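The file names in the table follow the usual SAP transport naming pattern: the request ID is a three-character system ID, the letter K, and a number. As a minimal sketch (the helper name `transport_file_names` is hypothetical and not part of the plugin or of SAP tooling), the cofile and data file names can be derived from a transport ID like this:

```python
def transport_file_names(transport_id: str) -> tuple[str, str]:
    """Return (cofile, data_file) names for an SAP transport request ID.

    Assumes the conventional layout <SID>K<number>, for example ED1K900360.
    """
    sid, number = transport_id[:3], transport_id[4:]
    if len(transport_id) < 5 or transport_id[3] != "K" or not number.isdigit():
        raise ValueError(f"unexpected transport ID format: {transport_id!r}")
    # Cofile names start with K, data file names start with R.
    return f"K{number}.{sid}", f"R{number}.{sid}"


print(transport_file_names("ED1K900360"))  # ('K900360.ED1', 'R900360.ED1')
```

This can help confirm that both files from the zip archive are present before you copy them to the SAP host.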
When you import the transport files into SAP, the following SAP OData projects are created:
ICF service node:
To install the SAP transport, follow these steps:
Step 1: Upload the transport request files
- Log in to the operating system of the SAP instance.
- Use the SAP transaction code AL11 to get the path for the DIR_TRANS folder. Typically, the path is
- Copy the cofiles to the
- Copy the data files to the
- Set the User and Group of data and cofile to
Step 2: Import the transport request files
The SAP administrator can import the transport request files by using one of the following options:
Option 1: Import the transport request files by using the SAP transport management system
- Log in to the SAP system as an SAP administrator.
- Enter the transaction STMS.
- Click Overview > Imports.
- In the Queue column, double-click the current SID.
- Click Extras > Other requests > Add.
- Select the transport request ID and click Continue.
- Select the transport request in the import queue, and then click Request > Import.
- Enter the Client number.
On the Options tab, select Overwrite originals and Ignore invalid component version (if available).
(Optional) To schedule a reimport of the transports for a later time, select Leave transport requests in queue for later import and Import transport requests again. This is useful for SAP system upgrades and backup restorations.
To verify the import, use any transactions, such as
Option 2: Import the transport request files at the operating system level
- Log in to the SAP system as an SAP system administrator.
Add the appropriate requests to the import buffer by running the following command:
tp addtobuffer TRANSPORT_REQUEST_ID SID
For example:
tp addtobuffer IB1K903958 DD1
Import the transport requests by running the following command:
tp import TRANSPORT_REQUEST_ID SID client=NNN U1238
Replace NNN with the client number. For example:
tp import IB1K903958 DD1 client=800 U1238
Verify that the function module and authorization roles were imported successfully by using any appropriate transactions, such as
Get a list of filterable columns for an SAP catalog service
Only some DataSource columns can be used for filter conditions (this is an SAP limitation by design).
To get a list of filterable columns for an SAP catalog service, follow these steps:
- Log in to the SAP system.
- Go to t-code
Enter the OData Project name, which is a substring of Service name. For example:
- Service name:
- Project name:
- Service name:
Go to the entity that you want to filter and select Properties.
You can use the fields shown in the Properties as filters. Supported operations are Equal and Between (Range).
For a list of operators supported in the expression language, see the OData open source documentation: URI Conventions (OData Version 2.0).
Example URI with filters:
/sap/opu/odata/sap/MM_PUR_POITEMS_MONI_SRV/C_PurchaseOrderItemMoni(P_DisplayCurrency='USD')/Results/?$filter=(PurchaseOrder eq '4500000000')
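To make the two supported operations concrete, the following sketch builds Equal and Between (Range) conditions and appends them to an entity URL. It assumes that a range is written as a `ge`/`le` pair joined with `and`, as in the OData Version 2.0 URI conventions. The helper names are hypothetical, and the upper bound in the range example is made up for illustration.

```python
from urllib.parse import quote


def eq_filter(field: str, value: str) -> str:
    """Build an Equal condition; single quotes in the value are doubled."""
    escaped = value.replace("'", "''")
    return f"{field} eq '{escaped}'"


def range_filter(field: str, low: str, high: str) -> str:
    """Build a Between (Range) condition as a ge/le pair joined with 'and'."""
    low_escaped = low.replace("'", "''")
    high_escaped = high.replace("'", "''")
    return f"{field} ge '{low_escaped}' and {field} le '{high_escaped}'"


def with_filter(entity_url: str, condition: str) -> str:
    """Append a percent-encoded $filter option to an entity URL."""
    encoded = quote(condition, safe="()'")
    return f"{entity_url}?$filter={encoded}"


# Service and entity path copied from the example URI above.
url = ("/sap/opu/odata/sap/MM_PUR_POITEMS_MONI_SRV/"
       "C_PurchaseOrderItemMoni(P_DisplayCurrency='USD')/Results")
print(with_filter(url, eq_filter("PurchaseOrder", "4500000000")))
# The upper bound below is a hypothetical value.
print(with_filter(url, range_filter("PurchaseOrder", "4500000000", "4500000099")))
```

Percent-encoding the spaces keeps the URI valid when it is pasted into a browser or an HTTP client for testing against SAP Gateway.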
Configure the SAP ERP system
The SAP OData plugin uses an OData service that is activated on each SAP Server from which the data is extracted. This OData service can be a standard provided by SAP or a custom OData service developed on your SAP system.
Step 1: Install SAP Gateway 2.0
The SAP (Basis) admin must verify that the SAP Gateway 2.0 components are available in the SAP source system, depending on the NetWeaver release. For more information about installing the SAP Gateway 2.0, log in to SAP ONE Support Launchpad and see Note 1569624 (login required).
Step 2: Activate the OData service
Activate the required OData service on the source system. For more information, see Front-end server: Activate OData services.
Step 3: Create an Authorization Role
To connect to the DataSource, create an Authorization Role with the required authorizations in SAP, and then grant it to the SAP user.
To create the Authorization Role in SAP, follow these steps:
- In the SAP GUI, enter the transaction code PFCG to open the Role Maintenance window.
In the Role field, enter a name for the role.
Click Single Role.
The Create Roles window opens.
In the Description field, enter a description and click Save.
For example: Authorizations for SAP OData plugin.
Click the Authorizations tab. The title of the window changes to Change Roles.
Under Edit Authorization Data and Generate Profiles, click Change Authorization Data.
The Choose Template window opens.
Click Do not select templates.
The Change role: Authorizations window opens.
Provide the authorizations shown in the following SAP Authorization table.
To activate the Authorization Role, click the Generate icon.
|Object Class||Object Class Text||Authorization object||Authorization object Text||Authorization||Text||Value|
|AAAB||Cross-application Authorization Objects||S_SERVICE||Check at Start of External Services||SRV_NAME||Program, transaction or function module name||
|AAAB||Cross-application Authorization Objects||S_SERVICE||Check at Start of External Services||SRV_TYPE||Type of Check Flag and Authorization Default Values||
|FI||Financial Accounting||F_UNI_HIER||Universal Hierarchy Access||ACTVT||Activity||
|FI||Financial Accounting||F_UNI_HIER||Universal Hierarchy Access||HRYTYPE||Hierarchy Type||
|FI||Financial Accounting||F_UNI_HIER||Universal Hierarchy Access||HRYID||Hierarchy ID||
To design and run a data pipeline in Cloud Data Fusion (as the Cloud Data Fusion user), you need SAP user credentials (username and password) to configure the plugin to connect to the DataSource.
The SAP user must be of the Communications or Dialog types. To avoid using SAP dialog resources, the Communications type is recommended. Users can be created using SAP transaction code SU01.
Optional: Step 4: Secure the connection
You can secure the communication over the network between your private Cloud Data Fusion instance and SAP.
To secure the connection, follow these steps:
- The SAP admin must generate an X509 certificate. To generate the certificate, see Creating an SSL Server PSE.
- The Google Cloud Admin must copy the X509 file to a readable Cloud Storage bucket in the same project as the Cloud Data Fusion instance and give the bucket path to the Cloud Data Fusion user, who enters it when they configure the plugin.
- The Google Cloud Admin must grant read access for the X509 file to the Cloud Data Fusion user who designs and runs pipelines.
Optional: Step 5: Create custom OData services
You can customize how data is extracted by creating custom OData services in SAP:
- To create custom OData services, see Creation of OData services for beginners.
- To create custom OData services using core data services (CDS) views, see How to Create an OData service and Exposing CDS Views as an OData service.
- Any custom OData service must support $count queries. These queries let the plugin partition the data for sequential and parallel extraction. If used, the $select queries must also be supported.
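To illustrate why a record count matters for partitioning: once the total is known, the records can be divided into ranges that are fetched independently. The sketch below is not the plugin's implementation. It assumes that paging uses the standard OData `$skip` and `$top` query options, and the function name `plan_splits` is hypothetical.

```python
def plan_splits(total_records: int, num_splits: int) -> list[tuple[int, int]]:
    """Divide total_records into at most num_splits (skip, top) ranges."""
    if total_records <= 0 or num_splits <= 0:
        return []
    # Never create more splits than there are records.
    num_splits = min(num_splits, total_records)
    base, extra = divmod(total_records, num_splits)
    ranges, skip = [], 0
    for i in range(num_splits):
        # The first `extra` splits take one additional record each.
        top = base + (1 if i < extra else 0)
        ranges.append((skip, top))
        skip += top
    return ranges


for skip, top in plan_splits(10, 3):
    print(f"$skip={skip}&$top={top}")
# $skip=0&$top=4
# $skip=4&$top=3
# $skip=7&$top=3
```

Each range could then be read by a separate worker, which is why a service that cannot report its record count cannot be partitioned this way.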
Set up Cloud Data Fusion
Ensure that communication is enabled between the Cloud Data Fusion instance and the SAP server. For private instances, set up network peering. After network peering is established with the project where the SAP Systems are hosted, no additional configuration is required to connect to your Cloud Data Fusion instance. Both the SAP system and Cloud Data Fusion instance need to be inside of the same project.
Step 1: Set up your Cloud Data Fusion environment
To configure your Cloud Data Fusion environment for the plugin:
Go to the instance details:
In the Google Cloud console, go to the Cloud Data Fusion page.
Click Instances, and then click the instance's name to go to the Instance details page.
Check that the instance has been upgraded to version 6.4.0 or later. If the instance is on an earlier version, you need to upgrade it.
Click View instance. When the Cloud Data Fusion UI opens, click Hub.
Select the SAP tab > SAP OData.
If the SAP tab is not visible, see Troubleshooting SAP integrations.
Click Deploy SAP OData Plugin.
The plugin now appears in the Source menu on the Studio page.
Step 2: Configure the plugin
The SAP OData plugin reads the content of an SAP DataSource.
To filter the records, you can configure the following properties on the SAP OData Properties page.
|Reference Name||Name used to uniquely identify this source for lineage, annotating metadata, etc.|
|SAP OData Base URL||SAP Gateway OData Base URL (use the complete URL path, similar to
|OData Version||Supported SAP OData version.|
|Service Name||Name of the SAP OData service from which you want to extract an entity.|
|Entity Name||Name of the entity that is being extracted, such as
|Get Schema button||Generates a schema based on the metadata from SAP, with automatic mapping of SAP data types to corresponding Cloud Data Fusion data types (same functionality as the Validate button).|
|SAP Type||Basic (via Username and Password).|
|SAP Logon Username||SAP Username
Recommended: If the SAP Logon Username changes periodically, use a macro.
|SAP Logon Password||SAP User password
Recommended: Use secure macros for sensitive values, such as passwords.
|SAP X.509 Client Certificate
(See Using X.509 Client Certificates on SAP NetWeaver Application Server for ABAP.
|GCP Project ID||A globally unique identifier for your project. This field is mandatory if the X.509 Certificate Cloud Storage Path field does not contain a macro value.|
|GCS Path||The Cloud Storage bucket path that contains the user-uploaded X.509 certificate, which corresponds to the SAP application server for secure calls based on your requirements (see the Secure the connection step).|
|Passphrase||Passphrase corresponding to the provided X.509 certificate.|
|Filter Options||Indicates the value a field must have to be read. Use this filter condition to restrict the output data volume. For example: `Price gt 200` selects the records with a `Price` field value greater than `200`. (See Get a list of filterable columns for an SAP catalog service.)|
|Select Fields||Fields to be preserved in the extracted data (for example: Category, Price, Name, Supplier/Address).|
|Expand Fields||List of complex fields to be expanded in the extracted output data (for example: Products/Suppliers).|
|Number of Rows to Skip||Total number of rows to skip (for example: 10).|
|Number of Rows to Fetch||Total number of rows to be extracted.|
|Number of Splits to Generate||The number of splits used to partition the input data. More partitions
increase the level of parallelism, but require more resources and
If left blank, the plugin chooses an optimal value (recommended).
|Batch Size||Number of rows to fetch in each network call to SAP. A small size causes
frequent network calls repeating the associated overhead. A large size
might slow down data retrieval and cause excessive resource usage in SAP.
If the value is set to
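The filter, select, expand, skip, and fetch properties correspond naturally to the standard OData query options `$filter`, `$select`, `$expand`, `$skip`, and `$top`. How the plugin builds its requests internally is not described in this guide, so the following is only a sketch of that correspondence. The function name, its parameter names, and the fetch value of 500 are hypothetical.

```python
from urllib.parse import quote


def build_query(filter_expr=None, select=(), expand=(), skip=0, top=0) -> str:
    """Assemble an OData query string from optional extraction settings."""
    options = []
    if filter_expr:
        options.append(("$filter", filter_expr))
    if select:
        options.append(("$select", ",".join(select)))
    if expand:
        options.append(("$expand", ",".join(expand)))
    if skip:
        options.append(("$skip", str(skip)))
    if top:
        options.append(("$top", str(top)))
    # Percent-encode spaces and other reserved characters in each value.
    return "&".join(name + "=" + quote(value, safe=",/'()") for name, value in options)


print(build_query(
    filter_expr="Price gt 200",
    select=["Category", "Price", "Name", "Supplier/Address"],
    expand=["Products/Suppliers"],
    skip=10,
    top=500,  # hypothetical number of rows to fetch
))
```

Sending the resulting query string to the service with an HTTP client is one way to preview what a given combination of properties returns before running a pipeline.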
Supported OData types
The following table shows the mapping between OData v2 data types used in SAP applications and Cloud Data Fusion data types.
|OData type||Description (SAP)||Cloud Data Fusion data type|
|SByte||Signed 8-bit integer value||
|Byte||Unsigned 8-bit integer value||
|Int16||Signed 16-bit integer value||
|Int32||Signed 32-bit integer value||
|Int64||Signed 64-bit integer value appended with the character: 'L'
|Single||Floating point number with 7-digit precision that can represent values
with an approximate range of ± 1.18e -38 through ± 3.40e +38, appended
with the character: 'f'
|Double||Floating point number with 15-digit precision that can represent values
with approximate ranges of ± 2.23e -308 through ± 1.79e +308, appended
with the character: 'd'
|Decimal||Numeric values with fixed precision and scale describing a numeric value
ranging from negative 10^255 + 1 to positive 10^255 -1, appended with the
character: 'M' or 'm'
|Guid||A 16-byte (128-bit) unique identifier value, starting with the
|String||Fixed or variable-length character data encoded in UTF-8||
|Binary||Fixed or variable-length binary data, starting with either 'X' or
'binary' (both are case-sensitive)
|Boolean||Mathematical concept of binary-valued logic||
|Date/Time||Date and time with values ranging from 12:00:00 AM on January 1, 1753 to 11:59:59 PM on December 31, 9999||
|Time||Time of day with values ranging from 0:00:00.x to 23:59:59.y, where 'x' and 'y' depend on precision||
|DateTimeOffset||Date and time as an Offset, in minutes from GMT, with values ranging from 12:00:00 AM on January 1, 1753 to 11:59:59 PM, December 31, 9999||
|Navigation and Non-Navigation Properties (multiplicity = *)||Collections of a simple type, with a multiplicity of one-to-many.||
|Properties (multiplicity = 0..1)||References to other complex types with a multiplicity of one-to-one||
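The suffix characters mentioned in the table ('L', 'f', 'd', and 'M' or 'm') identify the numeric type of an OData literal. As a rough illustration only, the sketch below converts such literals to Python values. The Python types chosen here are an assumption for the example and do not represent the Cloud Data Fusion data type mapping; the function name is hypothetical.

```python
from decimal import Decimal


def parse_numeric_literal(literal: str):
    """Convert an OData v2 numeric literal to a Python value using its suffix."""
    suffix = literal[-1:]
    if suffix == "L":            # Int64
        return int(literal[:-1])
    if suffix in ("f", "d"):     # Single or Double
        return float(literal[:-1])
    if suffix in ("M", "m"):     # Decimal, kept exact
        return Decimal(literal[:-1])
    return int(literal)          # unsuffixed integer types


for text in ("64L", "2.5f", "1.5d", "12.50M", "42"):
    print(text, "->", repr(parse_numeric_literal(text)))
```

Using `Decimal` for the 'M' suffix preserves fixed precision and scale, which a binary floating point value would not.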
Click Validate on the top right or Get Schema.
The plugin validates the properties and generates a schema based on the metadata from SAP. It automatically maps SAP data types to corresponding Cloud Data Fusion data types.
Run a data pipeline
- After deploying the pipeline, click Configure on the top center panel.
- Select Resources.
- If needed, change the Executor CPU and Memory based on the overall data size and the number of transformations used in the pipeline.
- Click Save.
- To start the data pipeline, click Run.
The plugin uses Cloud Data Fusion's parallelization capabilities. The following guidelines can help you configure the runtime environment so that you provide sufficient resources to the runtime engine to achieve the intended degree of parallelism and performance.
Optimize the plugin configuration
Recommended: Unless you are familiar with your SAP system's memory settings, leave the Number of Splits to Generate and Batch Size blank (unspecified).
For better performance when you run your pipeline, use the following configurations:
Number of Splits to Generate: values between
16 are recommended. But they can increase to 32, or even 64, with appropriate configurations on the SAP side (allocating appropriate memory resources for the work processes in SAP). This configuration improves parallelism on the Cloud Data Fusion side. The runtime engine creates the specified number of partitions (and SAP connections) while extracting the records.
If the Configuration Service (which comes with the plugin when you import the SAP transport file) is available, the plugin defaults to the SAP system's configuration. The splits are 50% of the available dialog work processes in SAP. Note: The Configuration Service can only be imported from S/4HANA systems.
If the Configuration Service isn't available, the default is
In either case, if you specify a different value, the value you provide prevails over the default split value, except that it is capped by the available dialog processes in SAP, minus two splits.
If the number of records to extract is less than 2500, the number of splits is
Batch Size: this is the count of records to fetch in every network call to SAP. A smaller batch size causes frequent network calls, repeating the associated overhead. By default, the minimum count is 1000 and the maximum is
For more information, see OData entity limits.
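The split rules for the case where the Configuration Service is available can be summarized in a few lines. This is a sketch under stated assumptions, not the plugin's code: rounding 50% down and never returning fewer than one split are assumptions, and the function name is hypothetical. It does not model the case where the Configuration Service is unavailable or the small-record-count rule.

```python
from typing import Optional


def effective_splits(available_dialog_processes: int,
                     requested: Optional[int] = None) -> int:
    """Estimate the number of splits when the Configuration Service is available."""
    # Default: 50% of the available dialog work processes (rounded down here).
    default = available_dialog_processes // 2
    # A user-provided value is capped at the available processes minus two.
    cap = available_dialog_processes - 2
    if requested is None:
        return max(1, default)
    return max(1, min(requested, cap))


print(effective_splits(20))        # 10: default of 50%
print(effective_splits(20, 8))     # 8: the provided value prevails
print(effective_splits(20, 64))    # 18: capped at 20 - 2
```

The third call shows why a large Number of Splits to Generate value has no effect unless the SAP side has enough dialog work processes available.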
Cloud Data Fusion resource settings
Recommended: Use 1 CPU and 4 GB of memory per Executor (this value applies to each Executor process). Set these in the Configure > Resources dialog.
Dataproc cluster settings
Recommended: Allocate a total number of CPUs (across workers) that is greater than the intended number of splits (see Plugin configuration).
Each worker must have 6.5 GB or more memory allocated per CPU in the Dataproc settings (this translates to 4 GB or more available per Cloud Data Fusion Executor). Other settings can be kept at the default values.
Recommended: Use a persistent Dataproc cluster to reduce the data pipeline runtime (this eliminates the Provisioning step which might take a few minutes or more). Set this in the Compute Engine configuration section.
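The two sizing guidelines above (more total CPUs than splits, and at least 6.5 GB of memory per CPU on each worker) can be checked with simple arithmetic. The helper below is a hypothetical sketch for planning purposes and does not call any Dataproc API.

```python
def check_cluster(workers: int, cpus_per_worker: int,
                  memory_gb_per_worker: float, splits: int) -> list[str]:
    """Return a list of guideline violations; an empty list means none found."""
    problems = []
    total_cpus = workers * cpus_per_worker
    if total_cpus <= splits:
        problems.append(
            f"total CPUs ({total_cpus}) should be greater than splits ({splits})")
    memory_per_cpu = memory_gb_per_worker / cpus_per_worker
    if memory_per_cpu < 6.5:
        problems.append(
            f"memory per CPU ({memory_per_cpu:.1f} GB) is below 6.5 GB")
    return problems


# 8 workers with 4 CPUs and 26 GB each, 28 splits: 32 CPUs and 6.5 GB per CPU.
print(check_cluster(8, 4, 26, 28))   # []
# A smaller, hypothetical cluster that violates both guidelines.
print(check_cluster(2, 4, 16, 16))
```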
Sample configurations and throughput
Sample development and test configurations
- Dataproc cluster with 8 workers, each with 4 CPUs and 26 GB memory. Generate up to 28 splits.
- Dataproc cluster with 2 workers, each with 8 CPUs and 52 GB memory. Generate up to 12 splits.
Sample production configurations and throughput
- Dataproc cluster with 8 workers, each with 8 CPUs and 32 GB memory. Generate up to 32 splits (half of the available CPUs).
- Dataproc cluster with 16 workers, each with 8 CPUs and 32 GB memory. Generate up to 64 splits (half the available CPUs).
Sample throughput for an SAP S/4HANA 1909 production source system
The following table has sample throughput. Throughput shown is without filter options unless specified otherwise. When using filter options, throughput is reduced.
|Batch size||Splits||OData Service||Total rows||Rows extracted||Throughput (rows per second)|
|1000||4||ZACDOCA_CDS||5.37 M||5.37 M||1069|
|2500||10||ZACDOCA_CDS||5.37 M||5.37 M||3384|
|5000||8||ZACDOCA_CDS||5.37 M||5.37 M||4630|
|5000||9||ZACDOCA_CDS||5.37 M||5.37 M||4817|
Sample throughput for an SAP S/4HANA cloud production source system
|Batch size||Splits||OData Service||Total rows||Rows extracted||Throughput (GB/hour)|
|2500||40||TEST_04_UOM_ODATA_CDS/||201 M||10 M||25.48|
|5000||50||TEST_04_UOM_ODATA_CDS/||201 M||10 M||26.78|
Supported SAP products and versions
Supported sources include SAP S/4HANA 1909 and later, S/4HANA on SAP cloud, and any SAP application capable of exposing OData Services.
The transport file that contains the custom OData service for load balancing the calls to SAP must be imported in S/4HANA 1909 and later. The service helps calculate the number of splits (data partitions) that the plugin can read in parallel (see number of splits).
OData version 2 is supported.
The plugin was tested with SAP S/4HANA servers deployed on Google Cloud.
SAP OData Catalog Services are supported for extraction
The plugin supports the following DataSource types:
- Transaction data
- CDS views exposed through OData
No SAP notes are required before extraction, but the SAP system must have SAP Gateway available. For more information, see note 1560585 (this external site requires an SAP login).
Limits on the volume of data or record width
There is no defined limit to the volume of data extracted. We have tested with up to 6 million rows extracted in one call, with a record width of 1 KB. For SAP S/4HANA on cloud, we have tested with up to 10 million rows extracted in one call, with a record width of 1 KB.
Expected plugin throughput
For an environment configured according to the guidelines in the Performance section, the plugin can extract around 38 GB per hour. Actual performance might vary with the Cloud Data Fusion and SAP system loads or network traffic.
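As a back-of-the-envelope illustration of what this figure means for planning, the sketch below estimates the extraction time from a row count, a record width, and a throughput. It assumes binary units (1 GB = 1024 × 1024 KB) and a steady throughput for the whole run; the function name is hypothetical.

```python
def estimated_hours(rows: int, record_width_kb: float,
                    throughput_gb_per_hour: float) -> float:
    """Estimate extraction time in hours for a given data volume."""
    volume_gb = rows * record_width_kb / (1024 * 1024)
    return volume_gb / throughput_gb_per_hour


# 10 million rows of 1 KB each at about 38 GB per hour.
minutes = estimated_hours(10_000_000, 1, 38) * 60
print(f"about {minutes:.0f} minutes")  # about 15 minutes
```

Treat the result as a lower bound, because filters, system load, and network traffic can all reduce the actual throughput.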
Delta (changed data) extraction
Delta extraction isn't supported.
At runtime, the plugin writes log entries in the Cloud Data Fusion data pipeline log. These entries are prefixed with CDF_SAP for easy identification.
At design time, when you validate the plugin settings, messages are displayed in the Properties tab and are highlighted in red.
The following table lists some common error messages (text in
|Message ID||Message||Recommended action|
|None||Required property '
||Enter an actual value or macro variable.|
|None||Invalid value for property '
||Enter a non-negative whole number (0 or greater, without a decimal) or macro variable.|
|CDF_SAP_ODATA_01505||Failed to prepare the Cloud Data Fusion output schema. Please check the provided runtime macros value.||Ensure the provided macro values are correct.|
|N/A||SAP X509 certificated '<UI input in GCS Path>' is missing. Please make sure the required X509 certificate is uploaded to your specified Google Cloud Storage bucket '<GCS bucket name>'.||Ensure the provided Cloud Storage path is correct.|
|CDF_SAP_ODATA_01532||Generic error code for anything related to SAP OData connectivity issues.
Failed to call given SAP OData service. Root Cause: <SAP OData service root cause message>
|Check the root cause displayed in the message and take appropriate action.|
|CDF_SAP_ODATA_01534||Generic error code for anything related to SAP OData service errors.
Service validation failed. Root Cause:
|Check the root cause displayed in the message and take appropriate action.|
|CDF_SAP_ODATA_01503||Failed to fetch total available record count from <SAP OData service entity name>. Root Cause: <SAP OData service root cause message>||Check the root cause displayed in the message and take appropriate action.|
|CDF_SAP_ODATA_01506||No records found to extract in <SAP OData service entity name>. Please ensure that the provided entity contains records.||Check the root cause displayed in the message and take appropriate action.|
|CDF_SAP_ODATA_01537||Failed to process records for <SAP OData service entity name>. Root Cause: <SAP OData service root cause message>||Check the root cause displayed in the message and take appropriate action.|
|CDF_SAP_ODATA_01536||Failed to pull records from <SAP OData service entity name>. Root Cause: <SAP OData service root cause message>||Check the root cause displayed in the message and take appropriate action.|
|CDF_SAP_ODATA_01504||Failed to generate the encoded metadata string for the given OData service <SAP OData service name>. Root Cause: <SAP OData service root cause message>||Check the root cause displayed in the message and take appropriate action.|
|CDF_SAP_ODATA_01533||Failed to decode the metadata from the given encoded metadata string for service <SAP OData service name>. Root Cause: <SAP OData service root cause message>||Check the root cause displayed in the message and take appropriate action.|
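When you review an exported pipeline log, the CDF_SAP prefix and the message IDs in the table above make the plugin's entries easy to pick out. The following sketch assumes the log is available as plain text lines. The sample lines and their layout (timestamp, level, message) are hypothetical, and the function name is not part of any Cloud Data Fusion API.

```python
import re

# Message IDs in the table follow the pattern CDF_SAP_ODATA_ plus five digits.
MESSAGE_ID = re.compile(r"CDF_SAP_ODATA_\d{5}")


def plugin_entries(log_lines):
    """Yield (message_id, line) for each line that mentions the CDF_SAP prefix."""
    for line in log_lines:
        if "CDF_SAP" not in line:
            continue
        match = MESSAGE_ID.search(line)
        yield (match.group(0) if match else None, line)


# Hypothetical log lines for illustration only.
sample = [
    "2024-01-01 10:00:00 INFO  Pipeline started",
    "2024-01-01 10:00:05 ERROR CDF_SAP_ODATA_01506 No records found to extract in Products.",
    "2024-01-01 10:00:06 INFO  CDF_SAP extraction finished",
]
for message_id, line in plugin_entries(sample):
    print(message_id, "|", line)
```

The extracted message ID can then be looked up in the table to find the recommended action.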