Use case: SOQL queries for the Salesforce source

This page shows how to use SOQL relationship queries when you use the Salesforce source in Cloud Data Fusion.

The Salesforce source lets you seamlessly connect to Salesforce and load large amounts of data into Google Cloud. To simplify loading the data, you can use SOQL relationship queries to retrieve records and reduce the number of API calls in Salesforce.

Before you begin

  1. Deploy and configure the properties for the Salesforce source in Cloud Data Fusion. For more information, see Salesforce batch source.

  2. On the Salesforce node in your pipeline, click Properties. This opens the Salesforce plugin properties page.

The following sections describe how to configure the SOQL query field on the Properties page.

Scenario 1: Relationship query with polymorphic key and limits

The following relationship query example has a polymorphic key and a limit:

SELECT Id, Owner.Name FROM Task WHERE Owner.FirstName LIKE 'B%' LIMIT 100

This query fetches data from the Task SObject and its related Owner. Owner is a polymorphic relationship, meaning it can reference more than one type of object. The query reads only the selected fields. Its WHERE clause filters on a LIKE pattern ('B%'), which you can replace with a value that suits your data.

The LIMIT clause restricts the number of records fetched to 100.
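
As an illustration of swapping the pattern and limit for other values, the following Python sketch assembles the same query string from two parameters. The function names are hypothetical and are not part of Cloud Data Fusion or Salesforce. The sketch only produces text that you could paste into the SOQL query field. It assumes the standard SOQL rule that backslashes and single quotes inside a string literal are escaped with a backslash.

```python
def escape_soql_literal(value: str) -> str:
    """Escape backslashes and single quotes for use inside a SOQL string literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_task_query(first_name_prefix: str, limit: int = 100) -> str:
    """Build the Scenario 1 query for an owner first-name prefix and a row limit.

    Hypothetical helper for illustration only.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    prefix = escape_soql_literal(first_name_prefix)
    return (
        "SELECT Id, Owner.Name FROM Task "
        f"WHERE Owner.FirstName LIKE '{prefix}%' LIMIT {limit}"
    )


print(build_task_query("B"))
```

Calling build_task_query("B") reproduces the query shown in this scenario.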

Scenario 2: Child-to-parent relationship query with custom objects

The following query fetches data from an object through a custom child-to-parent relationship:

SELECT Email, newsales__c, Account__r.OwnerId FROM Lead WHERE Account__r.LeadSource LIKE 'C%'

This query uses a SELECT clause to fetch data from the Lead SObject in Salesforce with the reference field, Account__r.OwnerId.

The query returns data from the selected fields in the Lead SObject and the related fields from the lookup linked to the Account parent object. In this way, you can query fields from several related SObjects in a single query.
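
The Account__r prefix follows the Salesforce naming convention for custom relationships: a custom lookup field's API name ends in __c, and relationship queries traverse it with the __r suffix. The following Python sketch shows that mapping. The helper names are hypothetical, and the lookup field name Account__c is an assumption inferred from Account__r in the query above.

```python
def relationship_name(lookup_field: str) -> str:
    """Convert a custom lookup field API name (ending in __c) to its __r relationship name."""
    if not lookup_field.endswith("__c"):
        raise ValueError("expected a custom field API name ending in '__c'")
    return lookup_field[: -len("__c")] + "__r"


def parent_field(lookup_field: str, parent_field_name: str) -> str:
    """Build the dotted path used to read a parent field in a child-to-parent query."""
    return f"{relationship_name(lookup_field)}.{parent_field_name}"


# Account__c is an assumed lookup field name, used here only for illustration.
print(parent_field("Account__c", "OwnerId"))
```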

Scenario 3: Relationship query with WHERE and OFFSET clauses

The following query fetches data from multiple SObjects, Account and its related Contacts, filters on a specific Industry value, and uses an OFFSET clause:

SELECT Name, (SELECT LastName FROM Contacts WHERE CreatedBy.Alias = 'x') FROM Account WHERE Industry = 'media' OFFSET 4

The OFFSET clause skips the specified number of rows (4 in this example), which lets you return results across multiple pages. This is an efficient way to handle large result sets.
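
To make the paging idea concrete, the following Python sketch generates one query per page by combining LIMIT and OFFSET. It is only an illustration under stated assumptions: the function name, the page size, and the ORDER BY field are hypothetical choices, and the sketch builds query strings only. It does not show how the Salesforce source itself pages through results. The cap of 2,000 reflects the maximum OFFSET value documented for SOQL.

```python
from typing import Iterator

MAX_OFFSET = 2000  # maximum OFFSET value that SOQL accepts


def paged_queries(base_query: str, order_by: str, page_size: int, pages: int) -> Iterator[str]:
    """Yield one SOQL query per page, using LIMIT for page size and OFFSET for position.

    An ORDER BY clause is added so that rows keep a stable order from page to page.
    """
    if page_size < 1:
        raise ValueError("page_size must be a positive integer")
    for page in range(pages):
        offset = page * page_size
        if offset > MAX_OFFSET:
            break  # stop before requesting an offset that SOQL would reject
        yield f"{base_query} ORDER BY {order_by} LIMIT {page_size} OFFSET {offset}"


base = (
    "SELECT Name, (SELECT LastName FROM Contacts WHERE CreatedBy.Alias = 'x') "
    "FROM Account WHERE Industry = 'media'"
)
for query in paged_queries(base, order_by="Name", page_size=4, pages=3):
    print(query)
```

With a page size of 4, the second generated query uses OFFSET 4, which matches the offset in this scenario.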

What's next