With the Replication Flow feature of SAP Datasphere, you can replicate data from SAP S/4HANA to BigQuery.
This guide explains how to replicate data from SAP S/4HANA to BigQuery through SAP Datasphere when you're using SAP LT Replication Server (SLT)-based replication for SAP S/4HANA.
The high-level steps are as follows:
- Connect SAP Datasphere to the SAP S/4HANA source system.
- Connect SAP Datasphere to the Google Cloud project that contains the target BigQuery dataset.
- Create a replication flow.
- Run the replication flow.
- Validate the replicated data in BigQuery.
For information about setting up CDS-based replication, see Set up CDS-based replication: SAP S/4HANA to BigQuery through SAP Datasphere.
Before you begin
Before you begin, make sure that you or your administrators have completed the following prerequisites:
In the Tenant Configuration page of your SAP Datasphere tenant, enable the Premium Outbound Integration blocks. For information about how to do this, see the SAP documentation Configure the Size of Your SAP Datasphere Tenant.
Validate the latest considerations and limitations of SAP Datasphere replication flows provided in SAP Note 3297105 - Important considerations for SAP Datasphere Replication Flows.
Review the information about the required SAP software versions, recommended system landscape, considerations for the supported source objects, and more, provided in the SAP Note 2890171 - SAP Data Intelligence / SAP Datasphere - ABAP Integration.
The SLT that is embedded in SAP S/4HANA is supported by SAP Datasphere replication flows only as of SAP S/4HANA 2022. If you're using an earlier version of SAP S/4HANA, or an older NetWeaver-based SAP application such as SAP ECC, then you need a standalone SLT server.
You have a Google Cloud account and project.
Billing is enabled for your project. For more information, see how to confirm that billing is enabled for your project.
Make sure that the BigQuery API is enabled in your Google Cloud project.
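If you prefer the command line, the Google Cloud prerequisites above can be checked and completed with the gcloud CLI. This is a sketch that assumes the gcloud CLI is installed and authenticated; `PROJECT_ID` is a placeholder for your own project ID.

```shell
# Set the project that will contain the target BigQuery dataset.
# Replace PROJECT_ID with your Google Cloud project ID.
gcloud config set project PROJECT_ID

# Confirm that billing is enabled for the project (prints True or False).
gcloud billing projects describe PROJECT_ID --format="value(billingEnabled)"

# Enable the BigQuery API in the project.
gcloud services enable bigquery.googleapis.com
```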
Connect SAP Datasphere to the SAP S/4HANA source system
This section provides instructions to establish a connection between SAP Datasphere and the SAP S/4HANA source system.
Install SAP Cloud Connector
SAP Cloud Connector is required to securely connect your SAP Datasphere tenant to the SAP S/4HANA source system when the source system runs on-premises, is hosted in a cloud environment, or is SAP S/4HANA Cloud Private Edition. If you're using SAP S/4HANA Cloud Public Edition, then SAP Cloud Connector is not needed. In that case, skip the SAP Cloud Connector installation and configuration, and move to Create a connection to the SAP S/4HANA source system.
If your SAP S/4HANA source system is running on-premises or hosted on any cloud environment, then you need to install and configure the SAP Cloud Connector on your operating system (OS). For information about OS-specific requirements and instructions to install SAP Cloud Connector, see the SAP documentation Preparing Cloud Connector Connectivity.
If you're using the SAP S/4HANA Cloud Private Edition, then the SAP Cloud Connector is pre-installed as part of the SAP S/4HANA setup. In that case, skip the SAP Cloud Connector installation, and move to Configure SAP Cloud Connector.
Configure SAP Cloud Connector
You configure SAP Cloud Connector to specify your SAP Datasphere subaccount, the mapping to the SAP S/4HANA source system in your network, and the accessible resources.
This section highlights the most important steps involved in configuring SAP Cloud Connector. For detailed information about configuring SAP Cloud Connector, see the SAP documentation Configure Cloud Connector.
The most important steps are as follows:
In your web browser, open the SAP Cloud Connector administration UI at the host and port where your SAP Cloud Connector is installed. For example: https://localhost:8443.
Log in to SAP Cloud Connector. If you're logging in for the first time after installing SAP Cloud Connector, then use the following default credentials:
- Username: Administrator
- Password: manage
Before proceeding, change the default password. For more information, see the SAP documentation Initial Configuration.
Specify the following details to connect your SAP Cloud Connector to your SAP BTP subaccount:
- Details about your SAP Datasphere subaccount, including the subaccount name, region, and subaccount user. For more information about these fields, see the SAP documentation Configure Cloud Connector.
- For the specified subaccount, a location ID that identifies the location of your SAP Cloud Connector.
To provide access to the SAP S/4HANA source system, add the system mapping information, including information about the internal host and the virtual host system.
To access table-based data with SAP LT Replication Server, you must specify the following resources:
- LTAMB_ (prefix)
- LTAPE_ (prefix)
- RFC_FUNCTION_SEARCH
Save your configuration.
Create a mass transfer configuration
Before running the replication from SAP S/4HANA using SLT, you need to create a mass transfer configuration to specify details of your source system connection, target system connection, and transfer settings.
To create a mass transfer configuration, perform the following steps:
In the SAP GUI, enter transaction code LTRC.
Click the Create configuration icon. The Create Configuration wizard opens.
In the Configuration Name and Description fields, enter a name and a description for the configuration, and then click Next.
In the Source System Connection Details panel:
- Select the RFC Connection radio button.
In the RFC Destination field, specify the name of the RFC connection to the SAP S/4HANA source system.
Select the checkboxes for Allow Multiple Usage and Read from Single Client as appropriate. For more information about these options, see the SAP LT Replication Server documentation.
Click Next.
In the Target System Connection Details panel:
- Select the radio button for Other.
- In the Scenario field, select SAP Data Intelligence (Replication Management service).
- Click Next.
On the Specify Transfer Settings panel:
In the Data Transfer Settings section, for the Initial Load Mode field, select Performance Optimized.
In the Job options section, enter starting values for the following fields:
- Number of Data Transfer Jobs
- Number of Initial Load Jobs
- Number of Calculation Jobs
In the Replication Options section, select the Real Time radio button.
Click Next.
Review the configuration, and click Save.
Make a note of the three-digit ID in the Mass Transfer column. You use it in a later step.
Create a connection to the SAP S/4HANA source system
In SAP Datasphere, create a source connection to use the SAP S/4HANA source system for data access. You use this connection to create replication flows.
To create a connection to the SAP S/4HANA source system, perform the following steps:
In SAP Datasphere, go to Connections, and create a new connection in your space.
Select the connection type SAP ABAP.
Specify the following connection properties:
- Protocol: select RFC.
- SAP Logon Connection Type: select Application Server.
- Use Cloud Connector: set to True.
Specify other properties specific for your application server and SAP system. For more information, see the SAP documentation SAP ABAP Connections.
To validate the connection between SAP Datasphere and SAP S/4HANA, select your connection, and click the Validate Connection icon.
For more information about how to create a connection between SAP Datasphere and SAP S/4HANA, see the SAP documentation Create a Connection.
Before you can use the connection for replication flows, check the SAP Notes relevant to replication flows and implement any necessary notes on your SAP S/4HANA system. For more information about the required SAP Notes, see:
- SAP Notes listed under the section Replication Flows.
- SAP Notes listed under the section Source Systems for SAP Data Intelligence.
Connect SAP Datasphere to Google Cloud project
This section provides instructions to establish a connection between SAP Datasphere and your Google Cloud project that contains the target BigQuery dataset.
Create a service account
For the authentication and authorization of SAP Datasphere, you need an IAM service account in your Google Cloud project. You grant the service account roles that contain the permissions required to interact with BigQuery.
You also need to create a JSON key for the service account. You upload the JSON key to SAP Datasphere to authenticate with Google Cloud.
To create a service account, perform the following steps:
In the Google Cloud console, go to the IAM & Admin Service accounts page.
If prompted, select your Google Cloud project.
Click Create Service Account.
Specify a name for the service account and, optionally, a description.
Click Create and Continue.
In the Grant this service account access to project panel, select the following roles:
- BigQuery Data Owner
- BigQuery Job User
Click Continue.
Click Done. The service account appears in the list of service accounts for the project.
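Alternatively, the service account and role grants above can be sketched with the gcloud CLI. The account name `datasphere-bq` is an example, and `PROJECT_ID` is a placeholder for your project ID.

```shell
# Create the service account that SAP Datasphere uses to authenticate.
gcloud iam service-accounts create datasphere-bq \
    --display-name="SAP Datasphere BigQuery replication"

# Grant the BigQuery Data Owner role on the project.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:datasphere-bq@PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/bigquery.dataOwner"

# Grant the BigQuery Job User role on the project.
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:datasphere-bq@PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/bigquery.jobUser"
```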
Download JSON key for the service account
To download a JSON key for the service account, perform the following steps:
- Click the email address of the service account that you want to create a key for.
- Click the Keys tab.
- Click the Add key drop-down menu, then select Create new key.
- Select JSON as the Key type and click Create.
Clicking Create downloads a service account key file. Make sure to store the key file securely, because it can be used to authenticate as your service account. For more information, see Create and delete service account keys.
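As a sketch, the same key can be created with the gcloud CLI; the key file name and service account name are examples, and `PROJECT_ID` is a placeholder.

```shell
# Create and download a JSON key for the service account.
# Store the resulting file securely; it authenticates as the account.
gcloud iam service-accounts keys create datasphere-bq-key.json \
    --iam-account=datasphere-bq@PROJECT_ID.iam.gserviceaccount.com
```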
Create a BigQuery dataset
To create a BigQuery dataset, your user account must have the proper IAM permissions for BigQuery. For more information, see Required permissions.
To create a BigQuery dataset, perform the following steps:
In the Google Cloud console, go to the BigQuery page.
Next to your project ID, click the View actions icon, and then click Create dataset.
In the Dataset ID field, enter a unique name. For more information, see Name datasets.
In the Location type field, choose the geographic location that you plan to use for the dataset. After a dataset is created, the location can't be changed.
For more information about how to create BigQuery datasets, see Create datasets.
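The dataset can also be created with the bq command-line tool. This sketch assumes the gcloud CLI with bq is installed and authenticated; the dataset name `sap_replication`, the location `us-central1`, and `PROJECT_ID` are example placeholders.

```shell
# Create the target dataset in the chosen location.
# The location can't be changed after the dataset is created.
bq --location=us-central1 mk --dataset PROJECT_ID:sap_replication
```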
Upload SSL certificates to SAP Datasphere
To encrypt the data transmitted between SAP and Google Cloud, you need to upload the required Google SSL certificates to SAP Datasphere.
To upload the SSL certificates, perform the following steps:
From the Google Trust Services repository, download the following certificates:
- GTS Root R1
- GTS CA 1C3
In SAP Datasphere, go to System > Configuration > Security.
Click Add Certificate.
Browse your local directory and select the certificates that you downloaded from the Google Trust Services repository.
Click Upload.
For more information from SAP about uploading certificates to SAP Datasphere, see Manage Certificates for Connections.
Upload the driver for BigQuery to SAP Datasphere
The BigQuery ODBC driver acts as a bridge between SAP Datasphere and BigQuery for replication flows. To enable access to BigQuery, you need to upload the required ODBC driver files to SAP Datasphere.
For more information from SAP about uploading the required ODBC driver files to SAP Datasphere, see Upload Third-Party ODBC Drivers (Required for Data Flows).
To upload the driver files, perform the following steps:
From ODBC and JDBC drivers for BigQuery, download the required BigQuery ODBC driver.
In SAP Datasphere, go to System > Configuration > Data Integration.
Go to Third-Party Drivers and click Upload.
Browse your local directory and select the driver file that you downloaded from ODBC and JDBC drivers for BigQuery.
Click Upload.
Click Sync to synchronize the driver changes. After the synchronization is finished, you can use data flows with the connection.
Create a connection to Google Cloud project
To replicate data from your SAP S/4HANA source system to the target BigQuery dataset, you need to create a connection to your Google Cloud project in your SAP Datasphere tenant.
To create a connection to your Google Cloud project, perform the following steps:
In SAP Datasphere, go to Connections, and create a new connection in your space.
Choose the connection type as Google BigQuery.
In the Connection Details section, specify the following:
- Project ID: enter your Google Cloud project ID in lowercase.
- Location: enter the location of your target BigQuery dataset.
In the Credential section, upload the JSON key file that is used for authentication. For more information, see Download JSON key for the service account.
To validate the connection between SAP Datasphere and BigQuery, select your connection, and click the Validate Connection icon.
For more information from SAP about connecting to and accessing data from BigQuery, see Google BigQuery Connections.
Create a replication flow
You create a replication flow to copy SAP data from your SAP S/4HANA source system to the target BigQuery dataset.
To create a replication flow through SLT, perform the following steps:
In SAP Datasphere, go to Data Builder, and click New Replication Flow.
Specify the source for your replication flow:
Select the source connection of the type SAP ABAP that you created in the section Create a connection to the SAP S/4HANA source system.
Select SLT-SAP LT Replication Server as a source container, and then add the mass transfer ID of your configuration that you created in the section Create a mass transfer configuration.
Add source objects as required.
For more information, see the SAP documentation Add a Source.
Select one of the load types: Initial only or Initial and delta.
Specify the target environment for your replication flow:
Select the connection to the Google Cloud project that contains the target BigQuery dataset.
Select the container, which is the dataset in BigQuery, to which you want to replicate your data.
For more information, see the SAP documentation Add a Target.
Create mappings to specify how the source data is to be changed on its way into the target. For more information, see the SAP documentation Define Mapping.
Save the replication flow.
Deploy the replication flow.
For more information, see the SAP documentation Creating a Replication Flow.
Run the replication flow
After your replication flow is configured and deployed, you can run it.
To run a replication flow, select the replication flow, and click Run.
After the run completes, the Run Status section in the Properties panel is updated. For more information, see the SAP documentation Running a flow.
Monitor replication flow status
You can view and monitor the execution details of replication flows.
To monitor replication flow status, perform the following steps:
In SAP Datasphere, go to Data Integration Monitor > Flows.
Select a flow run in the left panel to view its details.
For more information, see the SAP documentation Monitoring flows.
Validate the replicated data in BigQuery
After the replication flow run is complete, validate the replicated table and data in BigQuery.
To validate the replicated data in BigQuery, perform the following steps:
In the Google Cloud console, go to the BigQuery page.
In the Explorer section, expand your project to view the dataset and its tables.
Select the required table. The table information is displayed under a tab in the content pane on the right side of the page.
In the table information section, click the following tabs to view the SAP data:
- Preview: shows the data replicated from the SAP S/4HANA source system.
- Details: shows the table size, the total number of rows, and other details.
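The same validation can be sketched from the command line with the bq tool. The dataset name `sap_replication` and the table name `mara` are example placeholders for your own dataset and a replicated table; `PROJECT_ID` is a placeholder for your project ID.

```shell
# Preview the first rows of a replicated table.
bq head --max_rows=10 PROJECT_ID:sap_replication.mara

# Count the replicated rows to compare against the source system.
bq query --use_legacy_sql=false \
    'SELECT COUNT(*) AS row_count FROM `PROJECT_ID.sap_replication.mara`'
```

Comparing the row count against the record count reported in the SLT mass transfer (transaction LTRC) is a quick consistency check after the initial load.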