Cortex Framework: integration with TikTok

This page describes the configuration required to bring in data from TikTok as a data source for the marketing workload of Cortex Framework Data Foundation.

TikTok is a popular social media app known for short-form videos. Cortex Framework can bring in TikTok data so that you can analyze overall marketing performance. By combining data from TikTok with data from other sources, you can gain a more comprehensive understanding of your target audience and the effectiveness of your social media campaigns across platforms.

The following diagram describes how TikTok data is available through the marketing workload of Cortex Framework Data Foundation:

Figure 1. TikTok data source.

Configuration file

The config.json file configures the settings required to connect to data sources for transferring data from various workloads. This file contains the following parameters for TikTok:

   "marketing": {
        "deployTikTok": true,
        "TikTok": {
            "deployCDC": true,
            "datasets": {
                "cdc": "",
                "raw": "",
                "reporting": "REPORTING_TikTok"
            }
        }
   }

The following table describes the value for each marketing parameter:

Parameter	Meaning	Default Value	Description
`marketing.deployTikTok`	Deploy TikTok	`true`	Execute the deployment for TikTok data source.
`marketing.TikTok.deployCDC`	Deploy CDC scripts for TikTok	`true`	Generate TikTok CDC processing scripts to run as DAGs in Cloud Composer.
`marketing.TikTok.datasets.cdc`	CDC dataset for TikTok		CDC dataset for TikTok.
`marketing.TikTok.datasets.raw`	Raw dataset for TikTok		Raw dataset for TikTok.
`marketing.TikTok.datasets.reporting`	Reporting dataset for TikTok	`"REPORTING_TikTok"`	Reporting dataset for TikTok.
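
As a minimal sketch of how these parameters fit together, the following Python snippet checks that the TikTok section of a config.json file is filled in before deployment. The function name `check_tiktok_config` and the dataset names `CDC_TikTok` and `RAW_TikTok` are hypothetical and only serve the illustration; the nesting follows the dotted parameter names in the table.

```python
import json

# Hypothetical excerpt of config.json. CDC_TikTok and RAW_TikTok are
# placeholder dataset names, not defaults defined by Cortex Framework.
CONFIG_TEXT = """
{
  "marketing": {
    "deployTikTok": true,
    "TikTok": {
      "deployCDC": true,
      "datasets": {
        "cdc": "CDC_TikTok",
        "raw": "RAW_TikTok",
        "reporting": "REPORTING_TikTok"
      }
    }
  }
}
"""


def check_tiktok_config(config: dict) -> list:
    """Return a list of problems; an empty list means the section looks complete."""
    marketing = config.get("marketing", {})
    if not marketing.get("deployTikTok", False):
        # TikTok is not being deployed, so the remaining settings are not checked.
        return []
    tiktok = marketing.get("TikTok")
    if not isinstance(tiktok, dict):
        return ["marketing.TikTok is missing"]
    problems = []
    if not isinstance(tiktok.get("deployCDC"), bool):
        problems.append("marketing.TikTok.deployCDC must be true or false")
    datasets = tiktok.get("datasets", {})
    for key in ("cdc", "raw", "reporting"):
        if not datasets.get(key):
            problems.append(f"marketing.TikTok.datasets.{key} is empty")
    return problems


problems = check_tiktok_config(json.loads(CONFIG_TEXT))
print(problems or "TikTok settings look complete")
```

Running the same check on a file that leaves `cdc` or `raw` empty, as in the template, reports those dataset names as still to be filled in.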

Data Model

This section describes the TikTok Data Model using the Entity Relationship Diagram (ERD).

Figure 2. TikTok: Entity Relationship Diagram.

Base views

These are the blue objects in the ERD and are views on CDC tables with no transforms other than some column name aliases. See scripts in src/marketing/src/TikTok/src/reporting/ddls.

Reporting views

These are the green objects in the ERD and are reporting views that contain aggregate metrics. See scripts in src/marketing/src/TikTok/src/reporting/ddls.

API connection

Cortex Framework uses TikTok Reporting APIs, version v1.3, as the authoritative source for TikTok data. Cortex Framework uses the synchronous mode and calls Basic Reporting APIs to retrieve performance metrics for advertisements and ad groups. This ensures that Cortex Framework has access to up-to-date and accurate information from TikTok, enabling effective data analysis and reporting.

For more information about the API connection, see TikTok Reporting APIs.
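
To make the synchronous Basic Reporting call more concrete, the sketch below assembles a request URL and headers without sending anything. The endpoint path, the parameter names, and the chosen metrics reflect one reading of the TikTok v1.3 Reporting API and are assumptions here; confirm them against the TikTok Reporting APIs documentation. The helper `build_basic_report_request` is hypothetical and is not part of Cortex Framework.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint for the v1.3 synchronous reporting API; verify it in the
# TikTok Reporting APIs documentation before relying on it.
REPORT_URL = "https://business-api.tiktok.com/open_api/v1.3/report/integrated/get/"

# Assumed mapping from data level to the ID dimension used for that level.
ID_DIMENSIONS = {"AUCTION_AD": "ad_id", "AUCTION_ADGROUP": "adgroup_id"}


def build_basic_report_request(advertiser_id, access_token, start_date, end_date,
                               data_level="AUCTION_AD"):
    """Return (url, headers) for a Basic report at ad or ad group level."""
    params = {
        "advertiser_id": advertiser_id,
        "report_type": "BASIC",
        "data_level": data_level,
        # List-valued parameters are passed as JSON-encoded strings.
        "dimensions": json.dumps([ID_DIMENSIONS[data_level], "stat_time_day"]),
        "metrics": json.dumps(["spend", "impressions", "clicks"]),
        "start_date": start_date,
        "end_date": end_date,
    }
    headers = {"Access-Token": access_token}
    return f"{REPORT_URL}?{urlencode(params)}", headers


# Placeholder advertiser ID and token, for illustration only.
url, headers = build_basic_report_request(
    "1234567890", "PLACEHOLDER_TOKEN", "2024-01-01", "2024-01-07"
)
print(url)
```

Passing `data_level="AUCTION_ADGROUP"` switches the ID dimension to the ad group level, which matches the two levels of metrics mentioned in this section.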

Account authentication

To configure a TikTok account and account authentication, follow these steps:

1. Set up a TikTok Developer Account, if you don't already have one.
2. Create an app for the Cortex Framework integration. See TikTok API for Business for more information. Make sure that you select the following two scopes for the app:
   - Ad Account Management/Ad Account Information
   - Reporting/All
3. Get the app ID, secret, and long-term access token as described in the TikTok guide, and store each one in Secret Manager under the following name:
   - App ID: cortex_tiktok_app_id
   - Secret: cortex_tiktok_app_secret
   - Long-term access token: cortex_tiktok_access_token
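
The secret names above could be read back at runtime along the following lines. This is a sketch under stated assumptions: `load_tiktok_credentials` and the `fetch` callable are hypothetical helpers, and the project ID `my-project` is a placeholder. Only the resource name format `projects/PROJECT/secrets/SECRET/versions/VERSION` comes from Secret Manager itself.

```python
# Secret names as listed in the steps above.
TIKTOK_SECRETS = {
    "app_id": "cortex_tiktok_app_id",
    "app_secret": "cortex_tiktok_app_secret",
    "access_token": "cortex_tiktok_access_token",
}


def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Build the Secret Manager resource name for one secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def load_tiktok_credentials(project_id: str, fetch) -> dict:
    """Read the three TikTok secrets.

    `fetch` is any callable that maps a resource name to the secret's text.
    With the google-cloud-secret-manager package it could wrap
    SecretManagerServiceClient().access_secret_version(name=...) and decode
    the returned payload data.
    """
    return {
        key: fetch(secret_version_name(project_id, secret_id))
        for key, secret_id in TIKTOK_SECRETS.items()
    }


# In-memory stand-in for Secret Manager with made-up values, so the sketch
# runs without Google Cloud credentials.
fake_store = {
    secret_version_name("my-project", secret_id): f"value-of-{secret_id}"
    for secret_id in TIKTOK_SECRETS.values()
}
credentials = load_tiktok_credentials("my-project", fake_store.__getitem__)
print(sorted(credentials))
```

Keeping the lookup behind a callable makes it easy to swap the in-memory store for a real Secret Manager client in the environment where the pipelines run.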

Cloud Composer connections

Create the following connections in Cloud Composer. For more details, see Manage Airflow connections documentation.

Connection Name	Purpose
`tiktok_raw_dataflow`	For TikTok API > BigQuery Raw Dataset
`tiktok_cdc_bq`	For Raw dataset > CDC dataset transfer
`tiktok_reporting_bq`	For CDC dataset > Reporting dataset transfer

Cloud Composer service account permissions

Grant Dataflow permissions to the service account used in Cloud Composer (as configured in the tiktok_raw_dataflow connection). See instructions in Dataflow documentation.

The same service account must also have Secret Manager Secret Accessor access.

Ingestion settings

Control Source to Raw and Raw to CDC data pipelines through the settings in the file src/TikTok/config/ingestion_settings.yaml. This section describes the parameters of each data pipeline.

Source to raw tables

This section has entries that control how data is fetched from TikTok and where it ends up in the raw dataset. Each entry corresponds to one raw table that holds the data fetched from the TikTok API for that entity. Based on these configuration parameters, Cortex Framework creates Airflow DAGs that run Dataflow pipelines to process data from TikTok APIs.

The following parameters control the settings for Source to Raw for each entry:

Parameter	Description
`base_table`	Table in the raw dataset where the data for an entity is stored (for example, 'Ad' data).
`load_frequency`	How often a DAG is run for this entity to process data. See Airflow documentation for details on possible values.
`schema_file`	Schema file in `src/table_schema` directory that maps API response fields to destination table's column names.
`partition_details`	Optional: Set this parameter if you want the table to be partitioned for performance reasons. For more information, see Table Partition.
`cluster_details`	Optional: Set this parameter if you want the table to be clustered for performance reasons. For more information, see Cluster Settings.
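
As an illustration of the parameters in this table, the sketch below represents one Source to Raw entry as a Python dictionary and checks it for missing or misspelled parameters. Treating the three parameters not marked as optional as required is an assumption, and the entry values (including the schema file name) are hypothetical.

```python
# Parameters from the table above. The split into required and optional
# follows the "Optional" markers in the descriptions and is an assumption.
REQUIRED_KEYS = {"base_table", "load_frequency", "schema_file"}
OPTIONAL_KEYS = {"partition_details", "cluster_details"}


def validate_source_to_raw_entry(entry: dict) -> list:
    """Return a list of problems found in one Source to Raw entry."""
    keys = set(entry)
    missing = REQUIRED_KEYS - keys
    unknown = keys - REQUIRED_KEYS - OPTIONAL_KEYS
    problems = [f"missing required parameter: {key}" for key in sorted(missing)]
    problems += [f"unknown parameter: {key}" for key in sorted(unknown)]
    return problems


# Hypothetical entry; "@daily" is one of the Airflow schedule presets.
entry = {
    "base_table": "auction_ad_performance",
    "load_frequency": "@daily",
    "schema_file": "auction_ad_performance.csv",
}
print(validate_source_to_raw_entry(entry))
```

A check like this catches typos such as `schema_fle` before the DAGs are generated, rather than at pipeline run time.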

Raw to CDC tables

This section has entries that control how data moves from raw tables to CDC tables. Each entry corresponds to one CDC table, which in turn corresponds to an entity listed under Source to Raw.

The following parameters control the settings for Raw to CDC for each entry:

Parameter	Description
`base_table`	Table in the CDC dataset where the raw data is stored after CDC transformation (for example, `auction_ad_performance`).
`load_frequency`	How often a DAG for this entity runs to populate the CDC table. See Airflow documentation for details on possible values.
`row_identifiers`	Comma-separated list of columns that together form a unique record for this table.
`partition_details`	Optional: Set this parameter if you want the table to be partitioned for performance reasons. For more information, see Table Partition.
`cluster_details`	Optional: Set this parameter if you want the table to be clustered for performance reasons. For more information, see Cluster Settings.
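
The role of `row_identifiers` can be illustrated with a small check that a chosen column combination really identifies each record uniquely. This sketch does not reproduce the CDC logic that Cortex Framework generates; the column names `ad_id` and `stat_time_day` and the sample rows are hypothetical.

```python
def parse_row_identifiers(value: str) -> list:
    """Split a comma-separated column list into clean column names."""
    return [column.strip() for column in value.split(",") if column.strip()]


def find_duplicate_keys(rows, row_identifiers: str) -> list:
    """Return each key (as a tuple) that appears in more than one row."""
    columns = parse_row_identifiers(row_identifiers)
    seen = set()
    duplicates = []
    for row in rows:
        key = tuple(row[column] for column in columns)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Hypothetical sample rows: the first and third share the same ad and day.
rows = [
    {"ad_id": "a1", "stat_time_day": "2024-01-01", "clicks": 10},
    {"ad_id": "a1", "stat_time_day": "2024-01-02", "clicks": 12},
    {"ad_id": "a1", "stat_time_day": "2024-01-01", "clicks": 11},
]
print(find_duplicate_keys(rows, "ad_id, stat_time_day"))
print(find_duplicate_keys(rows, "ad_id, stat_time_day, clicks"))
```

If the first call returns any keys for real data, the chosen columns do not form a unique record and more columns need to be added to `row_identifiers`.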

Reporting settings

Configure and control how Cortex Framework generates data for the TikTok final reporting layer using the reporting settings file src/TikTok/config/reporting_settings.yaml. This file controls how reporting layer BigQuery objects (tables, views, functions, or stored procedures) are generated.

For more information, see Customizing reporting settings file.

What's next?

For more information about other data sources and workloads, see Data sources and workloads.
For more information about the steps for deployment in production environments, see Cortex Framework Data Foundation deployment prerequisites.