Best practices for the Salesforce batch source

This page describes best practices for improving performance when you use a Salesforce batch source in Cloud Data Fusion.

Improve performance with PK chunking

PK (primary key) chunking splits a large dataset into smaller datasets, or chunks, based on ranges of the records' primary keys (record IDs).
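As a rough illustration of how chunk size relates to the number of chunks, the following Python sketch splits a record count into fixed-size ranges. It is a simplification: Salesforce chunks by record ID ranges rather than by position, so real chunks can hold fewer records than the chunk size. The function name `chunk_ranges` and the record count are hypothetical.

```python
def chunk_ranges(total_records, chunk_size=100000):
    """Return (start, end) pairs, end exclusive, that cover total_records.

    Simplified model: real PK chunking splits on record ID ranges,
    not on positional indexes.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, total_records))
        for start in range(0, total_records, chunk_size)
    ]


# Hypothetical dataset of 1,050,000 records with the maximum chunk size.
ranges = chunk_ranges(1_050_000, chunk_size=250_000)
print(len(ranges))   # number of chunks
print(ranges[-1])    # the last, partially filled chunk
```

A larger chunk size produces fewer chunks, while a smaller chunk size produces more chunks that can be read independently.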

Enabling PK chunking in the Salesforce batch source plugin has the following benefits:

To use PK chunking, follow these steps:

1. Go to the Cloud Data Fusion web interface and open your pipeline on the Studio page.
2. Optional: If you haven't added a Salesforce node in your pipeline, add one:
   1. In the Source menu, click Salesforce. The Salesforce node appears in your pipeline. If you don't see the Salesforce source on the Studio page, deploy the Salesforce plugins from the Cloud Data Fusion Hub.
3. To configure the source, go to the Salesforce node and click Properties.
4. Turn on Enable PK chunking.
5. In the Chunk size field, enter the number of records per chunk. The default value is 100000 records. The maximum is 250000 records.
6. Click Validate.
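The chunk size limits in the preceding steps can be checked before you enter a value. The following Python sketch validates a chunk size against the default and maximum stated above, and formats it as the `Sforce-Enable-PKChunking` request header that the Salesforce Bulk API uses for PK chunking. How the plugin passes the value to Salesforce internally is an assumption here, and the function name `pk_chunking_header` is hypothetical.

```python
DEFAULT_CHUNK_SIZE = 100000  # default stated in the steps above
MAX_CHUNK_SIZE = 250000      # maximum stated in the steps above


def pk_chunking_header(chunk_size=DEFAULT_CHUNK_SIZE):
    """Validate chunk_size and return a Bulk API PK chunking header.

    Assumption: shown for illustration only; the plugin builds its
    own requests when Enable PK chunking is turned on.
    """
    if not 1 <= chunk_size <= MAX_CHUNK_SIZE:
        raise ValueError(
            f"chunk size must be between 1 and {MAX_CHUNK_SIZE}, got {chunk_size}"
        )
    return {"Sforce-Enable-PKChunking": f"chunkSize={chunk_size}"}


header = pk_chunking_header(200000)
print(header)
```

A value above 250000 raises `ValueError` in this sketch, mirroring the maximum that the Chunk size field accepts.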

To reduce the number of API calls in Salesforce, retrieve records with SObject query filters or SOQL queries.

SObject query filters: configure the filter in the Salesforce plugin properties in the SObject name field. For more information, see Configure the plugin.
SOQL queries: configure the queries in the Salesforce plugin properties in the SOQL query field. For more information, see SOQL queries for the Salesforce source.
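To show how a filter narrows the records that a query retrieves, the following Python sketch assembles a SOQL query string with an optional WHERE clause. The helper `build_soql` is hypothetical, and the `Account` object, field names, and date are example values, not values taken from the plugin.

```python
def build_soql(sobject, fields, where=None):
    """Build a SOQL query string for one SObject.

    Selecting only the needed fields and adding a WHERE clause
    limits the data that Salesforce has to return.
    """
    if not fields:
        raise ValueError("at least one field is required")
    query = f"SELECT {', '.join(fields)} FROM {sobject}"
    if where:
        query += f" WHERE {where}"
    return query


# Example values only: fetch accounts modified on or after a given date.
query = build_soql(
    "Account",
    ["Id", "Name"],
    where="LastModifiedDate >= 2024-01-01T00:00:00Z",
)
print(query)
```

You can paste a query of this shape into the SOQL query field. Without the `where` argument, the sketch returns an unfiltered query that retrieves every record of the object.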