Integration with TikTok
This page describes the required configurations to bring data from TikTok as a data source of the marketing workload of Cortex Framework Data Foundation.
TikTok is a popular social media app known for short-form videos. Cortex Framework can ingest TikTok data so that you can analyze overall marketing performance. By combining data from TikTok with data from other sources, you can gain a more comprehensive understanding of your target audience and of the effectiveness of your social media campaigns across platforms.
The following diagram describes how TikTok data is available through the marketing workload of Cortex Framework Data Foundation:
Configuration file
The config.json file configures the settings required to connect to data sources for transferring data from various workloads. This file contains the following parameters for TikTok:
"marketing": {
    "deployTikTok": true,
    "TikTok": {
        "deployCDC": true,
        "datasets": {
            "cdc": "",
            "raw": "",
            "reporting": "REPORTING_TikTok"
        }
    }
}
The following table describes the value for each marketing parameter:
Parameter | Meaning | Default Value | Description |
marketing.deployTikTok | Deploy TikTok | true | Execute the deployment for TikTok data source. |
marketing.TikTok.deployCDC | Deploy CDC scripts for TikTok | true | Generate TikTok CDC processing scripts to run as DAGs in Cloud Composer. |
marketing.TikTok.datasets.cdc | CDC dataset for TikTok | | CDC dataset for TikTok. |
marketing.TikTok.datasets.raw | Raw dataset for TikTok | | Raw dataset for TikTok. |
marketing.TikTok.datasets.reporting | Reporting dataset for TikTok | "REPORTING_TikTok" | Reporting dataset for TikTok. |
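As a quick sanity check before deployment, the parameters in the table can be validated with a few lines of Python. The following is a minimal sketch, not part of Cortex Framework: the function name check_tiktok_config and the dataset names CDC_TikTok and RAW_TikTok are hypothetical, and it assumes that the datasets listed without a default value must be filled in when marketing.deployTikTok is true.

```python
import json

# Hypothetical excerpt of config.json, limited to the TikTok parameters.
# "CDC_TikTok" and "RAW_TikTok" are placeholder dataset names.
CONFIG_TEXT = """
{
  "marketing": {
    "deployTikTok": true,
    "TikTok": {
      "deployCDC": true,
      "datasets": {
        "cdc": "CDC_TikTok",
        "raw": "RAW_TikTok",
        "reporting": "REPORTING_TikTok"
      }
    }
  }
}
"""


def check_tiktok_config(config: dict) -> list[str]:
    """Return a list of problems found in the TikTok section (empty if none)."""
    marketing = config.get("marketing", {})
    if not marketing.get("deployTikTok", False):
        return []  # TikTok is not deployed, so nothing else is required.
    tiktok = marketing.get("TikTok")
    if not isinstance(tiktok, dict):
        return ["marketing.TikTok is missing"]
    datasets = tiktok.get("datasets", {})
    problems = []
    for key in ("cdc", "raw", "reporting"):
        # Assumption: every dataset name must be non-empty before deployment.
        if not datasets.get(key):
            problems.append(f"marketing.TikTok.datasets.{key} is empty")
    return problems


problems = check_tiktok_config(json.loads(CONFIG_TEXT))
print(problems or "TikTok configuration looks complete")
```

Running the same check against a config.json whose cdc and raw values are still empty strings reports both parameters as missing.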
Data Model
This section describes the TikTok Data Model using the Entity Relationship Diagram (ERD).
Base views
These are the blue objects in the ERD. They are views on CDC tables with no transformations other than some column name aliases. See the scripts in src/marketing/src/TikTok/src/reporting/ddls.
Reporting views
These are the green objects in the ERD. They are reporting views that contain aggregate metrics. See the scripts in src/marketing/src/TikTok/src/reporting/ddls.
API connection
Cortex Framework uses TikTok Reporting APIs, version v1.3, as the authoritative source for TikTok data. Cortex Framework uses the synchronous mode and calls Basic Reporting APIs to retrieve performance metrics for advertisements and ad groups. This ensures that Cortex Framework has access to up-to-date and accurate information from TikTok, enabling effective data analysis and reporting.
For more information about the API connection, see TikTok Reporting APIs.
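To illustrate what a synchronous Basic Reporting request looks like, the following sketch only builds the request URL and sends nothing. It is not the code that Cortex Framework runs. The endpoint path, the parameter names, and the sample dimensions and metrics are assumptions based on the public v1.3 Reporting API and should be verified against the TikTok Reporting APIs documentation. The advertiser ID is a placeholder.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the v1.3 synchronous reporting API; verify before use.
REPORT_URL = "https://business-api.tiktok.com/open_api/v1.3/report/integrated/get/"


def build_basic_report_url(advertiser_id, data_level, dimensions, metrics,
                           start_date, end_date):
    """Build the query URL for a Basic report request (nothing is sent)."""
    params = {
        "advertiser_id": advertiser_id,
        "report_type": "BASIC",
        "data_level": data_level,
        # List-valued parameters are passed as JSON-encoded strings (assumption).
        "dimensions": json.dumps(dimensions),
        "metrics": json.dumps(metrics),
        "start_date": start_date,
        "end_date": end_date,
    }
    return f"{REPORT_URL}?{urlencode(params)}"


url = build_basic_report_url(
    advertiser_id="1234567890",           # placeholder advertiser ID
    data_level="AUCTION_AD",              # assumed value for ad-level data
    dimensions=["ad_id", "stat_time_day"],
    metrics=["spend", "impressions", "clicks"],
    start_date="2024-01-01",
    end_date="2024-01-07",
)
print(url)
```

An actual call would also need to carry the long-term access token, which this sketch assumes is sent in a request header rather than in the URL.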
Account authentication
To configure a TikTok account and account authentication, follow these steps:
- Set up a TikTok Developer Account, if you don't already have one.
- Create an app for Cortex Framework integration. See TikTok API for Business for more information. Ensure that you select the following two scopes for the app:
  - Ad Account Management/Ad Account Information
  - Reporting/All
- Get the app ID, secret, and long-term access token as described in the TikTok guide, and store them in Secret Manager with the following names:
  - App ID: cortex_tiktok_app_id
  - Secret: cortex_tiktok_app_secret
  - Long-term access token: cortex_tiktok_access_token
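The three secret names map to Secret Manager resource paths of the form projects/PROJECT_ID/secrets/SECRET_NAME/versions/VERSION. The following sketch builds those paths so that the names can be checked in one place. The helper secret_version_paths and the project ID my-project are hypothetical, and the sketch doesn't contact Secret Manager.

```python
# Secret names required by the TikTok integration, as listed in the steps.
SECRET_NAMES = {
    "app_id": "cortex_tiktok_app_id",
    "app_secret": "cortex_tiktok_app_secret",
    "access_token": "cortex_tiktok_access_token",
}


def secret_version_paths(project_id: str, version: str = "latest") -> dict[str, str]:
    """Return the Secret Manager resource path for each TikTok secret."""
    return {
        key: f"projects/{project_id}/secrets/{name}/versions/{version}"
        for key, name in SECRET_NAMES.items()
    }


# "my-project" is a placeholder project ID.
for key, path in secret_version_paths("my-project").items():
    print(f"{key}: {path}")
```

Each path can then be passed as the name argument when accessing a secret version, for example with the access_secret_version method of the google-cloud-secret-manager client library.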
Data Freshness and Delay
As a general rule, data freshness for Cortex Framework data sources is limited by what the upstream connection allows and by the frequency of your DAG execution. Adjust your DAG execution frequency to align with the upstream frequency, your resource constraints, and your business needs.
With the TikTok Marketing API, most data (excluding conversions) is available in near real time.
Cloud Composer connections
Create the following connections in Cloud Composer. For more details, see the Manage Airflow connections documentation.
Connection Name | Purpose |
tiktok_raw_dataflow | For TikTok API > BigQuery Raw Dataset |
tiktok_cdc_bq | For Raw dataset > CDC dataset transfer |
tiktok_reporting_bq | For CDC dataset > Reporting dataset transfer |
Cloud Composer service account permissions
Grant Dataflow permissions to the service account used in Cloud Composer (as configured in the tiktok_raw_dataflow connection). See the instructions in the Dataflow documentation.
The same service account must also have Secret Manager Secret Accessor access.
Ingestion settings
Control the Source to Raw and Raw to CDC data pipelines through the settings in the file src/TikTok/config/ingestion_settings.yaml. This section describes the parameters of each data pipeline.
Source to raw tables
This section has entries that control how data is fetched from TikTok and where that data ends up in the raw dataset. Each entry corresponds to one raw table that holds the data fetched from the TikTok API for that entity. Based on these configuration parameters, Cortex Framework creates Airflow DAGs that run Dataflow pipelines to process data from TikTok APIs.
The following parameters control the settings for Source to Raw for each entry:
Parameter | Description |
base_table | Table in the Raw dataset where the data for an entity is stored (for example, 'Ad' data). |
load_frequency | How often a DAG is run for this entity to process data. See Airflow documentation for details on possible values. |
schema_file | Schema file in the src/table_schema directory that maps API response fields to the destination table's column names. |
partition_details | Optional: Partition settings for this table, used to improve performance. For more information, see Table Partition. |
cluster_details | Optional: Cluster settings for this table, used to improve performance. For more information, see Cluster Settings. |
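The parameters in the table can be checked programmatically before a deployment. The following sketch validates one Source to Raw entry represented as a Python dictionary. It assumes that the three parameters not marked as optional are required. The function name and the sample values (including the schema file name) are hypothetical. "@daily" is one of the schedule presets that Airflow accepts.

```python
REQUIRED = ("base_table", "load_frequency", "schema_file")
OPTIONAL = ("partition_details", "cluster_details")


def validate_source_to_raw_entry(entry: dict) -> list[str]:
    """Return a list of problems for one Source to Raw entry (empty if none)."""
    problems = [
        f"missing required parameter: {key}"
        for key in REQUIRED
        if not entry.get(key)
    ]
    # Flag misspelled or unsupported keys, in a stable (sorted) order.
    unknown = sorted(set(entry) - set(REQUIRED) - set(OPTIONAL))
    problems += [f"unknown parameter: {key}" for key in unknown]
    return problems


# Hypothetical entry; the values are placeholders for illustration only.
entry = {
    "base_table": "auction_ad_performance",
    "load_frequency": "@daily",
    "schema_file": "auction_ad_performance.csv",
}
print(validate_source_to_raw_entry(entry) or "entry looks valid")
```

A check like this catches a misspelled key such as load_frequencey early, before the DAG generation step runs.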
Raw to CDC tables
This section has entries that control how data moves from raw tables to CDC tables. Each entry corresponds to a CDC table (which in turn corresponds to an entity mentioned for the Source to Raw table).
The following parameters control the settings for Raw to CDC for each entry:
Parameter | Description |
base_table | Table in the CDC dataset where the raw data is stored after CDC transformation (for example, auction_ad_performance). |
load_frequency | How frequently a DAG for this entity runs to populate the CDC table. See Airflow documentation for details on possible values. |
row_identifiers | Comma-separated list of columns that together identify a unique record in this table. |
partition_details | Optional: Partition settings for this table, used to improve performance. For more information, see Table Partition. |
cluster_details | Optional: Cluster settings for this table, used to improve performance. For more information, see Cluster Settings. |
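The role of row_identifiers can be illustrated with a small in-memory example: rows that share the same identifier values are treated as versions of one record, and only the most recent version is kept. This sketch shows the general CDC idea only. It is not the SQL that Cortex Framework generates. The column names, the loaded_at recency field, and the sample rows are hypothetical.

```python
def latest_per_key(rows, row_identifiers, recency_field):
    """Keep only the most recent row for each combination of row_identifiers."""
    latest = {}
    for row in rows:
        key = tuple(row[col] for col in row_identifiers)
        current = latest.get(key)
        # On ties, the later row in the input wins.
        if current is None or row[recency_field] >= current[recency_field]:
            latest[key] = row
    return list(latest.values())


# row_identifiers as a comma-separated string, split into column names.
row_identifiers = [col.strip() for col in "ad_id, stat_time_day".split(",")]

# Hypothetical raw rows: the first two describe the same record.
raw_rows = [
    {"ad_id": 1, "stat_time_day": "2024-01-01", "clicks": 10, "loaded_at": 1},
    {"ad_id": 1, "stat_time_day": "2024-01-01", "clicks": 12, "loaded_at": 2},
    {"ad_id": 2, "stat_time_day": "2024-01-01", "clicks": 5, "loaded_at": 1},
]

cdc_rows = latest_per_key(raw_rows, row_identifiers, "loaded_at")
print(cdc_rows)
```

If the identifier list omitted stat_time_day, rows for different days of the same ad would collapse into one record, which is why the list should contain every column needed to make a record unique.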
Reporting settings
Configure and control how Cortex Framework generates data for the TikTok final reporting layer using the reporting settings file src/TikTok/config/reporting_settings.yaml. This file controls how reporting layer BigQuery objects (tables, views, functions, or stored procedures) are generated.
For more information, see Customizing reporting settings file.
What's next?
- For more information about other data sources and workloads, see Data sources and workloads.
- For more information about the steps for deployment in production environments, see Cortex Framework Data Foundation deployment prerequisites.