Integration with Google Ads
This page describes the required configurations to bring data from Google Ads as a data source of the marketing workload of Cortex Framework Data Foundation.
Google Ads is an online advertising platform that allows businesses to advertise their products or services across various Google properties. Cortex Framework brings your Google Ads data together with other marketing channels, analyzes it comprehensively, and uses AI to improve your campaign results.
The following diagram describes how Google Ads data is available through the marketing workload of Cortex Framework Data Foundation:
Configuration file
The config.json file configures the settings required to transfer data from any data source, including Google Ads. This file contains the following parameters for Google Ads:
"marketing": {
    "deployGoogleAds": true,
    "GoogleAds": {
        "deployCDC": true,
        "lookbackDays": 180,
        "datasets": {
            "cdc": "",
            "raw": "",
            "reporting": "REPORTING_GoogleAds"
        }
    }
}
The following table describes the value for each Google Ads marketing parameter:
Parameter | Meaning | Default Value | Description |
marketing.deployGoogleAds | Deploy Google Ads | true | Execute the deployment for the Google Ads data source. |
marketing.GoogleAds.deployCDC | Deploy CDC for Google Ads | true | Generate Google Ads CDC processing scripts to run as DAGs in Cloud Composer. |
marketing.GoogleAds.lookbackDays | Lookback days for Google Ads | 180 | Number of days in the past from which to start fetching data from the Google Ads API. |
marketing.GoogleAds.datasets.cdc | CDC dataset for Google Ads | | CDC dataset for Google Ads. |
marketing.GoogleAds.datasets.raw | Raw dataset for Google Ads | | Raw dataset for Google Ads. |
marketing.GoogleAds.datasets.reporting | Reporting dataset for Google Ads | "REPORTING_GoogleAds" | Reporting dataset for Google Ads. |
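As a minimal sketch of how these parameters fit together, the following Python snippet parses a config fragment with the default values and reports settings that still need attention. The helper name `check_google_ads_config` and its rules are illustrative assumptions and not part of Cortex Framework. In particular, treating an empty dataset name as a problem is an assumption based on the empty defaults shown above.

```python
import json

# Hypothetical helper (not part of Cortex Framework): lists problems found in
# the Google Ads block of a parsed config.json.
def check_google_ads_config(config: dict) -> list[str]:
    problems = []
    marketing = config.get("marketing", {})
    if not marketing.get("deployGoogleAds", False):
        return problems  # Google Ads is not deployed, so there is nothing to check.
    ads = marketing.get("GoogleAds", {})
    lookback = ads.get("lookbackDays")
    if not isinstance(lookback, int) or lookback <= 0:
        problems.append("lookbackDays must be a positive integer")
    datasets = ads.get("datasets", {})
    for name in ("cdc", "raw", "reporting"):
        # Assumption: every dataset name must be filled in before deployment.
        if not datasets.get(name):
            problems.append(f"datasets.{name} is empty")
    return problems

# The default values from the fragment above, wrapped in a top-level object.
sample = json.loads("""
{
  "marketing": {
    "deployGoogleAds": true,
    "GoogleAds": {
      "deployCDC": true,
      "lookbackDays": 180,
      "datasets": {"cdc": "", "raw": "", "reporting": "REPORTING_GoogleAds"}
    }
  }
}
""")

print(check_google_ads_config(sample))
```

With the defaults, the sketch flags the `cdc` and `raw` dataset names as empty, which are the two values without a default in the table.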
Data Model
This section describes the Google Ads Data Model using the Entity Relationship Diagram (ERD).
Base views
These are the blue objects in the ERD and are views on CDC tables with no transforms other than some column name aliases. See scripts in src/marketing/src/GoogleAds/src/reporting/ddls.
Reporting views
These are the green objects in the ERD and are reporting views that contain aggregate metrics. See scripts in src/marketing/src/GoogleAds/src/reporting/ddls.
API connection
Cortex Framework ingestion templates use the Google Ads API to retrieve reporting attributes and metrics from Google Ads. The current Cortex Framework templates use Google Ads API version 17.1. Consider the Google Ads API limitations:
- Basic access operations per day: 15000 (paginated requests containing a valid next_page_token are not counted).
- Max page size: 10000 rows per page.
- Recommended default parameters: page size equal to 10000 rows per page.
For more information about the API connection, see the Google Ads API documentation.
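The limits above lend themselves to a quick back-of-the-envelope check. The following sketch is plain arithmetic and does not call the Google Ads API; the function names are hypothetical. It assumes that only the first request of each query counts toward the daily limit, which is one reading of the note that requests carrying a valid next_page_token are not counted.

```python
import math

# Limits quoted in the list above.
MAX_PAGE_SIZE = 10_000
BASIC_ACCESS_OPS_PER_DAY = 15_000

def pages_needed(total_rows: int, page_size: int = MAX_PAGE_SIZE) -> int:
    """Number of pages required to read total_rows at the given page size."""
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    return math.ceil(total_rows / page_size)

def within_daily_quota(initial_requests: int) -> bool:
    """Assumption: only the first request of each query is counted;
    follow-up pages with a valid next_page_token are not."""
    return initial_requests <= BASIC_ACCESS_OPS_PER_DAY

print(pages_needed(25_000))        # 25,000 rows at 10,000 rows per page
print(within_daily_quota(15_001))  # one request over the daily limit
```

Using the maximum page size keeps the number of round trips per query as low as the API allows.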
Account authentication
Follow these steps to set up account authentication:
1. In the Google Cloud console, click Navigation menu > APIs & Services > Credentials > Create credentials.
2. Create an OAuth Client ID credential with the following characteristics. For more information, see Using OAuth 2.0 to Access Google APIs.
   Application type: "Web Application"
   Name: CHOSEN_NAME  # For example, "Cortex Authentication Client".
   Authorized redirect URIs: http://127.0.0.1
   Replace CHOSEN_NAME with the name you chose for the OAuth Client ID credential.
3. Save the Client ID and Client secret after the credential is configured. You need them later.
4. Generate a refresh token using OAuth 2.0 (see Using OAuth 2.0 to Access Google APIs). Cortex Data Foundation automatically detects and ingests data from all customers (accounts) that are accessible to the credentials used to generate the token.
5. Create a secret using Secret Manager:
   - In the Google Cloud console, click Secret Manager.
   - Create a secret called cortex-framework-google-ads-yaml using the following format, changing the values according to your settings:
     {"developer_token": "DEVELOPER_TOKEN_VALUE", "refresh_token": "REFRESH_TOKEN_VALUE", "client_id": "CLIENT_ID_VALUE", "client_secret": "CLIENT_SECRET_VALUE", "use_proto_plus": False}
     Replace the following:
     - DEVELOPER_TOKEN_VALUE with the developer token value available in your Google Ads account.
     - REFRESH_TOKEN_VALUE with the refresh token value obtained in step 4.
     - CLIENT_ID_VALUE with the client ID value obtained in the OAuth setup in step 2.
     - CLIENT_SECRET_VALUE with the client secret value obtained from the OAuth setup in step 2.
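To avoid typos when assembling the secret value by hand, the payload can be rendered programmatically. The following is a minimal sketch that only builds the string in the format shown above; it does not create the secret. The function name `render_google_ads_secret` is hypothetical, and the sample argument values are placeholders.

```python
import json

# Hypothetical helper: renders the secret payload in the format shown above.
# It only builds the string; creating the secret in Secret Manager is a
# separate step.
def render_google_ads_secret(developer_token: str, refresh_token: str,
                             client_id: str, client_secret: str) -> str:
    fields = {
        "developer_token": developer_token,
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }
    for name, value in fields.items():
        if not value:
            raise ValueError(f"{name} must not be empty")
    # json.dumps quotes and escapes each key and value safely.
    parts = [f"{json.dumps(k)}: {json.dumps(v)}" for k, v in fields.items()]
    # The documented format spells the boolean as False, so it is appended
    # verbatim instead of going through json.dumps (which would emit false).
    parts.append('"use_proto_plus": False')
    return "{" + ", ".join(parts) + "}"

print(render_google_ads_secret("dev-token", "refresh-token", "client-id", "client-secret"))
```

Keep the rendered value out of logs and version control, because it contains credentials.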
Data Freshness and Delay
As a general rule, data freshness for Cortex Framework data sources is limited by what the upstream connection allows, as well as by the frequency of your DAG execution. Adjust your DAG execution frequency to align with upstream frequency, resource constraints, and your business needs.
Data retrieved using the Google Ads API is generally available with a latency of 3 or more hours. The data may be adjusted afterwards due to conversions and invalid traffic detection. For more information, see the About data freshness article in the Google Ads Help Center.
Cloud Composer connections permissions
Create the following connections in Cloud Composer. See more details in the Manage Airflow connections documentation.
Connection Name | Purpose |
googleads_raw_dataflow | For Google Ads API > BigQuery Raw Dataset. |
googleads_cdc_bq | For Raw dataset > CDC dataset transfer. |
googleads_reporting_bq | For CDC dataset > Reporting dataset transfer. |
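A small pre-deployment check can confirm that all three connections exist. The sketch below is an illustration only: `missing_connections` is a hypothetical helper, and the list of existing connection IDs is assumed to come from wherever you inspect your Airflow environment.

```python
# Connection names and purposes from the table above.
REQUIRED_CONNECTIONS = {
    "googleads_raw_dataflow": "Google Ads API > BigQuery Raw Dataset",
    "googleads_cdc_bq": "Raw dataset > CDC dataset transfer",
    "googleads_reporting_bq": "CDC dataset > Reporting dataset transfer",
}

# Hypothetical helper: returns the required connection names that are absent
# from the connection IDs found in the Airflow environment.
def missing_connections(existing_ids) -> list[str]:
    existing = set(existing_ids)
    return [name for name in REQUIRED_CONNECTIONS if name not in existing]

for name in missing_connections(["googleads_cdc_bq"]):
    print(f"Missing connection {name}: {REQUIRED_CONNECTIONS[name]}")
```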
Cloud Composer service account permissions
Grant Dataflow permissions to the service account used in Cloud Composer (as configured in the googleads_raw_dataflow connection). See the instructions in the Dataflow documentation.
Ingestion settings
Control the Source to Raw and Raw to CDC data pipelines through the settings in the file src/GoogleAds/config/ingestion_settings.yaml. This section describes the parameters of each data pipeline.
Source to raw tables
This section describes which entities are fetched by APIs and how. Each entry corresponds with one Google Ads entity. Based on this config, Cortex creates Airflow DAGs that run Dataflow pipelines to fetch data using Google Ads APIs.
The following parameters control the settings for Source to Raw for each entry:
Parameter | Description |
load_frequency | How frequently a DAG for this entity runs to fetch data from Google Ads. For more information about possible values, see Airflow documentation. |
api_name | API Resource Name (for example, customer). |
table_name | Table in Raw dataset where the fetched data is stored (for example, customer). |
schema_file | Schema file in src/table_schema directory that maps API response fields to the destination table's column names. |
key | Columns (separated by comma) that form a unique record for this table. |
is_metrics_table | Indicates if a given entry is for a metric entity (in Google Ads API). The system treats these tables slightly differently because of their aggregated nature. |
partition_details | Optional: Set this if you want the table to be partitioned for performance. For more information, see Table Partition. |
cluster_details | Optional: Set this if you want the table to be clustered for performance. For more information, see Cluster Settings. |
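To make the shape of a Source to Raw entry concrete, the following sketch represents one entry as a Python dictionary and checks it against the parameters in the table. This is an illustration under stated assumptions: the helper `validate_source_to_raw_entry` is hypothetical, every parameter not marked optional is assumed to be required, and the sample values "@daily", "customer.csv", and "customer_id" are placeholders rather than values taken from the Cortex repository.

```python
# Parameter names from the table above. Assumption: everything not marked
# "Optional" in the table is required.
REQUIRED_PARAMS = ("load_frequency", "api_name", "table_name",
                   "schema_file", "key", "is_metrics_table")
OPTIONAL_PARAMS = ("partition_details", "cluster_details")

# Hypothetical helper: lists missing and unrecognized parameters in one entry.
def validate_source_to_raw_entry(entry: dict) -> list[str]:
    errors = [f"missing parameter: {p}" for p in REQUIRED_PARAMS if p not in entry]
    known = set(REQUIRED_PARAMS) | set(OPTIONAL_PARAMS)
    errors += [f"unknown parameter: {p}" for p in sorted(entry) if p not in known]
    return errors

# Example entry; all values except "customer" are placeholders.
entry = {
    "load_frequency": "@daily",
    "api_name": "customer",
    "table_name": "customer",
    "schema_file": "customer.csv",
    "key": "customer_id",
    "is_metrics_table": False,
}

print(validate_source_to_raw_entry(entry))
print(validate_source_to_raw_entry({"api_name": "customer", "tabel_name": "customer"}))
```

The second call shows how a misspelled parameter name surfaces both as a missing required parameter and as an unknown one.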
Raw to CDC tables
This section describes which entries control how data is moved from raw tables to CDC tables. Each entry corresponds with a raw table (which in turn corresponds with Google Ads API entity as mentioned).
The following parameters control the settings for Raw to CDC for each entry:
Parameter | Description |
table_name | Table in CDC dataset where the raw data after CDC transformation is stored (for example, customer). |
raw_table | Table to which the raw data has been replicated. |
key | Columns (separated by comma) that form a unique record for this table. |
load_frequency | How frequently a DAG for this entity runs to populate the CDC table. For more information about possible values, see Airflow documentation. |
schema_file | Schema file in src/table_schema directory that maps raw columns to CDC columns and the data type of each CDC column. This is the same schema file that's referred to in the previous section. |
partition_details | Optional: Set this if you want the table to be partitioned for performance. For more information, see Table Partition. |
cluster_details | Optional: Set this if you want the table to be clustered for performance. For more information, see Cluster Settings. |
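The key parameter lists the columns that together identify a unique record. The following sketch illustrates that idea only and is not how Cortex Framework implements CDC: it splits a comma-separated key into column names and finds key values that occur more than once in a set of rows. The helper names and the sample columns customer_id, campaign_id, and clicks are hypothetical.

```python
from collections import Counter

def parse_key(key: str) -> list[str]:
    """Split a comma-separated key setting into trimmed column names."""
    return [col.strip() for col in key.split(",") if col.strip()]

def duplicate_keys(rows: list[dict], key: str) -> list[tuple]:
    """Return the key values that appear in more than one row."""
    cols = parse_key(key)
    counts = Counter(tuple(row[c] for c in cols) for row in rows)
    return [value for value, n in counts.items() if n > 1]

# Sample rows with hypothetical columns.
rows = [
    {"customer_id": 1, "campaign_id": 10, "clicks": 5},
    {"customer_id": 1, "campaign_id": 11, "clicks": 7},
    {"customer_id": 1, "campaign_id": 10, "clicks": 9},
]

print(parse_key("customer_id, campaign_id"))
print(duplicate_keys(rows, "customer_id, campaign_id"))
```

A check like this can help confirm that the columns chosen for key really are unique in the target table.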
Reporting settings
You can configure and control how Cortex Framework generates data for the Google Ads final reporting layer using the reporting settings file src/GoogleAds/config/reporting_settings.yaml. This file controls how reporting layer BigQuery objects (tables, views, functions, or stored procedures) are generated. For more information, see Customizing reporting settings file.
What's next?
- For more information about other data sources and workloads, see Data sources and workloads.
- For more information about the steps for deployment in production environments, see Cortex Framework Data Foundation deployment prerequisites.