An Amazon Managed Streaming for Apache Kafka (Amazon MSK) import topic lets you continuously ingest data from Amazon MSK, as an external source, into Pub/Sub. You can then stream the data into any of the destinations that Pub/Sub supports.
This document shows you how to create and manage Amazon MSK import topics. To create a standard topic, see Create a standard topic.
For more information about import topics, see About import topics.
Before you begin
Learn about the Pub/Sub publish process.
Configure the required roles and permissions to manage Amazon MSK import topics including the following:
Set up workload identity federation so that Google Cloud can access the external streaming service.
Required roles and permissions
To get the permissions that you need to create and manage Amazon MSK import topics, ask your administrator to grant you the Pub/Sub Editor (roles/pubsub.editor) IAM role on your topic or project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to create and manage Amazon MSK import topics. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to create and manage Amazon MSK import topics:
- Create an import topic: pubsub.topics.create
- Delete an import topic: pubsub.topics.delete
- Get an import topic: pubsub.topics.get
- List import topics: pubsub.topics.list
- Publish to an import topic: pubsub.topics.publish
- Update an import topic: pubsub.topics.update
- Get the IAM policy for an import topic: pubsub.topics.getIamPolicy
- Configure the IAM policy for an import topic: pubsub.topics.setIamPolicy
You might also be able to get these permissions with custom roles or other predefined roles.
You can configure access control at the project level and the individual resource level.
Set up federated identity to access Amazon MSK
Workload Identity Federation lets Google Cloud services access workloads running outside of Google Cloud. With identity federation, you don't need to maintain or pass credentials to Google Cloud to access your resources in other clouds. Instead, you can use the identities of the workloads themselves to authenticate to Google Cloud and access resources.
Create a service account in Google Cloud
This is an optional step. If you already have a service account, you can use it in this procedure instead of creating a new service account. If you are using an existing service account, go to Record the service account unique ID for the next step.
For Amazon MSK import topics, Pub/Sub uses the service account as the identity to access resources from AWS.
For more information about creating a service account, including prerequisites, required roles and permissions, and naming guidelines, see Create service accounts. After you create a service account, you might need to wait for 60 seconds or more before you use the service account. This behavior occurs because read operations are eventually consistent; it can take time for the new service account to become visible.
Record the service account unique ID
You need a service account unique ID to set up a role in the AWS console.
In the Google Cloud console, go to the Service account details page.
Click the service account that you just created or the one that you are planning to use.
From the Service account details page, record the Unique ID number.
You need the ID as part of the workflow to set up a role in the AWS console.
Add the service account token creator role to the Pub/Sub service account
The Service account token creator role (roles/iam.serviceAccountTokenCreator) lets principals create short-lived credentials for a service account. These tokens or credentials are used to impersonate the service account.
For more information about service account impersonation, see Service account impersonation.
You can also add the Pub/Sub publisher role (roles/pubsub.publisher) during this procedure. For more information about the role and why you are adding it, see Add the Pub/Sub publisher role to the Pub/Sub service account.
In the Google Cloud console, go to the IAM page.
Click the Include Google-provided role grants checkbox.
Look for the service account that has the format service-{PROJECT_NUMBER}@gcp-sa-pubsub.iam.gserviceaccount.com.
For this service account, click the Edit Principal button.
If required, click Add another role.
Search for and click the Service account token creator role (roles/iam.serviceAccountTokenCreator).
Click Save.
Create a policy in AWS
You need a policy in AWS to let Pub/Sub authenticate to AWS so that Pub/Sub can ingest data from Amazon MSK.
For more methods and information about how to create a policy in AWS, see Creating IAM policies.
To create a policy in AWS, perform the following steps:
Sign in to the AWS Management Console and open the IAM console.
In the navigation pane of the console for IAM, click Access Management > Policies.
Click Create policy.
For Select a service, click MSK.
For Action allowed, click Read > GetBootstrapBrokers.
This action grants permission to get the bootstrap brokers that Pub/Sub uses to connect to the MSK cluster.
Click Add more permissions.
For Select a service, click Apache Kafka APIs for MSK.
For Action allowed, select the following:
List > DescribeTopic
This action grants permission to allow the Pub/Sub ingestion topic to get details about the Amazon MSK Kafka topic.
Read > ReadData
This action grants permission to read data from the Amazon MSK Kafka topic.
Write > Connect
This action grants permission to connect and authenticate to the Amazon MSK Kafka cluster.
For Resources, specify the cluster ARN to restrict the policy to specific clusters. Restricting the policy this way is recommended.
Click Add more permissions.
For Select a service, click STS.
For Action allowed, click Write > AssumeRoleWithWebIdentity.
This action grants permission to obtain a set of temporary security credentials for Pub/Sub to authenticate to Amazon MSK by using identity federation.
Click Next.
Enter a policy name and description.
Click Create policy.
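The console steps above produce a JSON policy document. As a rough sketch of what that document might look like, the following Python builds an equivalent policy as a dictionary. The IAM action identifiers (the `kafka:` and `kafka-cluster:` prefixes) are an assumed mapping of the console action names to their JSON form, and `build_msk_ingestion_policy` and its `resources` parameter are hypothetical names used only for illustration. Compare the output with the JSON tab of the AWS policy editor before relying on it.

```python
import json


def build_msk_ingestion_policy(resources=None):
    """Return a policy document (dict) with the actions selected above.

    `resources` is a hypothetical parameter: a list of ARNs that the Kafka
    statements are restricted to. When omitted, "*" is used (no restriction).
    """
    kafka_resources = list(resources) if resources else ["*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets Pub/Sub look up the bootstrap brokers of the cluster.
                "Effect": "Allow",
                "Action": ["kafka:GetBootstrapBrokers"],
                "Resource": kafka_resources,
            },
            {
                # Assumed identifiers for the "Apache Kafka APIs for MSK" actions.
                "Effect": "Allow",
                "Action": [
                    "kafka-cluster:DescribeTopic",
                    "kafka-cluster:ReadData",
                    "kafka-cluster:Connect",
                ],
                "Resource": kafka_resources,
            },
            {
                # Temporary credentials obtained through identity federation.
                "Effect": "Allow",
                "Action": ["sts:AssumeRoleWithWebIdentity"],
                "Resource": ["*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_msk_ingestion_policy(), indent=2))
```

If you restrict the Kafka statements to specific resources, check that every ARN you pass matches the resource type that each action expects.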
Create a role in AWS using a custom trust policy
You must create a role in AWS so that Pub/Sub can authenticate to AWS to ingest data from Amazon MSK.
Sign in to the AWS Management Console and open the IAM console.
In the navigation pane of the console for IAM, click Roles.
Click Create role.
For Select trusted entity, click Custom trust policy.
In the Custom trust policy section, enter or paste the following:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "accounts.google.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "accounts.google.com:sub": "<SERVICE_ACCOUNT_UNIQUE_ID>"
        }
      }
    }
  ]
}
Replace <SERVICE_ACCOUNT_UNIQUE_ID> with the unique ID of the service account that you recorded in Record the service account unique ID.
Click Next.
For Add permissions, search for and click the custom policy that you just created.
Click Next.
Enter a role name and description.
Click Create role.
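If you script this setup, it can help to generate the custom trust policy rather than edit the JSON by hand. The following sketch fills the service account unique ID into the trust policy shown in the procedure. The function name `build_trust_policy` is hypothetical, and the digits-only check is an assumption based on the unique ID being shown as a number on the Service account details page.

```python
import json


def build_trust_policy(service_account_unique_id: str) -> dict:
    """Return the custom trust policy for the given service account unique ID."""
    # Assumption: the unique ID is a numeric string. Rejecting anything else
    # catches the common mistake of pasting the service account email instead.
    if not (service_account_unique_id.isascii() and service_account_unique_id.isdigit()):
        raise ValueError("expected the numeric unique ID, not an email or name")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": "accounts.google.com"},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        "accounts.google.com:sub": service_account_unique_id
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ID for illustration only.
    print(json.dumps(build_trust_policy("123456789012345678901"), indent=2))
```

You can paste the printed JSON into the Custom trust policy section.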
Add the Pub/Sub publisher role to the Pub/Sub principal
To enable publishing, you must assign a publisher role to the Pub/Sub service account so that Pub/Sub is able to publish to the Amazon MSK import topic.
Enable publishing from all topics
Use this method if you have not created any Amazon MSK import topics.
In the Google Cloud console, go to the IAM page.
Click the Include Google-provided role grants checkbox.
Look for the service account that has the format service-{PROJECT_NUMBER}@gcp-sa-pubsub.iam.gserviceaccount.com.
For this service account, click the Edit Principal button.
If required, click Add another role.
Search for and click the Pub/Sub publisher role (roles/pubsub.publisher).
Click Save.
Enable publishing from a single topic
Use this method only if the Amazon MSK import topic already exists.
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Run the gcloud pubsub topics add-iam-policy-binding command:
gcloud pubsub topics add-iam-policy-binding TOPIC_ID \
    --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com" \
    --role="roles/pubsub.publisher"
Replace the following:
- TOPIC_ID: the topic ID of the Amazon MSK import topic.
- PROJECT_NUMBER: the project number. To view the project number, see Identifying projects.
Add the service account user role to the service account
The Service Account User role (roles/iam.serviceAccountUser) includes the permission iam.serviceAccounts.actAs, which lets a principal attach a service account to the Amazon MSK import topic's ingestion settings and use that service account for federated identity.
In the Google Cloud console, go to the IAM page.
For the principal that's issuing the create or update topic calls, click the Edit Principal button.
If required, click Add another role.
Search for and click the Service account user role (roles/iam.serviceAccountUser).
Click Save.
Use Amazon MSK import topics
You can create a new import topic or edit an existing topic.
Considerations
Creating the topic and subscription separately, even if done in rapid succession, can lead to data loss. There's a short window where the topic exists without a subscription. If any data is sent to the topic during this time, it is lost. By creating the topic first, creating the subscription, and then converting the topic to an import topic, you guarantee that no messages are missed during the import process.
If you need to re-create the Kafka topic of an existing import topic with the same name, you can't just delete the Kafka topic and re-create it. This action can invalidate Pub/Sub's offset management, which can lead to data loss. To mitigate this, follow these steps:
- Delete the Pub/Sub import topic.
- Delete the Kafka topic.
- Create the Kafka topic.
- Create the Pub/Sub import topic.
Data from an Amazon MSK Kafka topic is always read from the earliest offset.
Create Amazon MSK import topics
To know more about properties associated with a topic, see Properties of a topic.
Ensure that you have completed the following procedures:
To create an Amazon MSK import topic, follow these steps:
Console
In the Google Cloud console, go to the Topics page.
Click Create topic.
In the Topic ID field, enter an ID for your Amazon MSK import topic. For more information about naming topics, see the naming guidelines.
Select Add a default subscription.
Select Enable ingestion.
For ingestion source, select Amazon MSK.
Enter the following details:
- Cluster ARN: The ARN of the Amazon MSK cluster that you are ingesting into Pub/Sub. The ARN format is as follows: arn:aws:kafka:${Region}:${Account}:cluster/${ClusterName}/${ClusterId}.
- Topic: The name of the Amazon MSK Kafka topic that you are ingesting into Pub/Sub.
- AWS Role ARN: The ARN of the AWS role. The ARN format of the role is as follows: arn:aws:iam::${Account}:role/${RoleName}.
- Service account: The service account that you created in Create a service account in Google Cloud.
Click Create topic.
gcloud
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Run the gcloud pubsub topics create command:
gcloud pubsub topics create TOPIC_ID \
    --aws-msk-ingestion-cluster-arn MSK_CLUSTER_ARN \
    --aws-msk-ingestion-topic MSK_TOPIC \
    --aws-msk-ingestion-aws-role-arn MSK_ROLE_ARN \
    --aws-msk-ingestion-service-account PUBSUB_SERVICE_ACCOUNT
Replace the following:
- TOPIC_ID: the name or ID of your Pub/Sub topic.
- MSK_CLUSTER_ARN: the ARN for the Amazon MSK cluster that you are ingesting into Pub/Sub. The ARN format is as follows: arn:aws:kafka:${Region}:${Account}:cluster/${ClusterName}/${ClusterId}.
- MSK_TOPIC: the name of the Amazon MSK Kafka topic that you are ingesting into Pub/Sub.
- MSK_ROLE_ARN: the ARN of the AWS role. The ARN format of the role is as follows: arn:aws:iam::${Account}:role/${RoleName}.
- PUBSUB_SERVICE_ACCOUNT: the service account that you created in Create a service account in Google Cloud.
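When the create command runs from automation rather than an interactive shell, one option is to assemble the arguments as a list so that no value needs manual shell quoting. The following is a minimal sketch: it uses only the flags shown in the command above, the helper name `build_create_topic_command` is hypothetical, and every value in the example is a placeholder.

```python
import shlex


def build_create_topic_command(topic_id, cluster_arn, msk_topic, role_arn, service_account):
    """Return the gcloud argument list for creating an Amazon MSK import topic."""
    return [
        "gcloud", "pubsub", "topics", "create", topic_id,
        "--aws-msk-ingestion-cluster-arn", cluster_arn,
        "--aws-msk-ingestion-topic", msk_topic,
        "--aws-msk-ingestion-aws-role-arn", role_arn,
        "--aws-msk-ingestion-service-account", service_account,
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    command = build_create_topic_command(
        topic_id="my-import-topic",
        cluster_arn="arn:aws:kafka:us-east-1:123456789012:cluster/demo/abcd-1234",
        msk_topic="orders",
        role_arn="arn:aws:iam::123456789012:role/pubsub-msk-ingestion",
        service_account="my-sa@my-project.iam.gserviceaccount.com",
    )
    # Print a copy-pasteable form. To execute instead, pass the list to
    # subprocess.run(command, check=True).
    print(shlex.join(command))
```

Passing a list to `subprocess.run` avoids shell interpretation of the ARN values.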
Go
Before trying this sample, follow the Go setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Go API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Java
Before trying this sample, follow the Java setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Java API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js
Before trying this sample, follow the Node.js setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Node.js API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Python
Before trying this sample, follow the Python setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Python API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
C++
Before trying this sample, follow the C++ setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub C++ API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
Node.js (TypeScript)
Before trying this sample, follow the Node.js setup instructions in the Pub/Sub quickstart using client libraries. For more information, see the Pub/Sub Node.js API reference documentation.
To authenticate to Pub/Sub, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
For more information about ARNs, see Amazon Resource Names (ARNs) and IAM Identifiers.
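A malformed ARN is an easy mistake to make when filling in these fields. The following sketch checks the two ARN shapes used on this page before you submit them. It assumes the standard `aws` partition and 12-digit account IDs, and the function names are hypothetical. Note that IAM role ARNs leave the Region field empty, which is why the role pattern contains a double colon.

```python
import re

# arn:aws:kafka:${Region}:${Account}:cluster/${ClusterName}/${ClusterId}
MSK_CLUSTER_ARN = re.compile(r"arn:aws:kafka:[a-z0-9-]+:\d{12}:cluster/[^/\s]+/[^/\s]+")

# arn:aws:iam::${Account}:role/${RoleName}  (Region field is empty for IAM)
IAM_ROLE_ARN = re.compile(r"arn:aws:iam::\d{12}:role/\S+")


def is_msk_cluster_arn(value: str) -> bool:
    """Return True if value has the shape of an Amazon MSK cluster ARN."""
    return MSK_CLUSTER_ARN.fullmatch(value) is not None


def is_iam_role_arn(value: str) -> bool:
    """Return True if value has the shape of an IAM role ARN."""
    return IAM_ROLE_ARN.fullmatch(value) is not None


if __name__ == "__main__":
    # Placeholder ARNs for illustration only.
    print(is_msk_cluster_arn("arn:aws:kafka:us-east-1:123456789012:cluster/demo/abcd-1234"))
    print(is_iam_role_arn("arn:aws:iam::123456789012:role/pubsub-msk-ingestion"))
```

These checks only validate the shape of the string. They don't confirm that the cluster or role exists.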
If you run into issues, see Troubleshooting an Amazon MSK import topic.
Edit Amazon MSK import topics
To edit the ingestion data source settings of an Amazon MSK import topic, follow these steps:
Console
In the Google Cloud console, go to the Topics page.
Click the Amazon MSK import topic.
In the topic details page, click Edit.
Update the fields that you want to change.
Click Update.
gcloud
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
Run the gcloud pubsub topics update command with all the flags mentioned in the following sample:
gcloud pubsub topics update TOPIC_ID \
    --aws-msk-ingestion-cluster-arn MSK_CLUSTER_ARN \
    --aws-msk-ingestion-topic MSK_TOPIC \
    --aws-msk-ingestion-aws-role-arn MSK_ROLE_ARN \
    --aws-msk-ingestion-service-account PUBSUB_SERVICE_ACCOUNT
Replace the following:
- TOPIC_ID: the name or ID of your Pub/Sub topic.
- MSK_CLUSTER_ARN: the ARN for the Amazon MSK cluster that you are ingesting into Pub/Sub. The ARN format is as follows: arn:aws:kafka:${Region}:${Account}:cluster/${ClusterName}/${ClusterId}.
- MSK_TOPIC: the name of the Amazon MSK Kafka topic that you are ingesting into Pub/Sub.
- MSK_ROLE_ARN: the ARN of the AWS role. The ARN format of the role is as follows: arn:aws:iam::${Account}:role/${RoleName}.
- PUBSUB_SERVICE_ACCOUNT: the service account that you created in Create a service account in Google Cloud.
Quotas and limits
The publisher throughput for import topics is bound by the publish quota of the topic. For more information, see Pub/Sub quotas and limits.
What's next
Choose the type of subscription for your topic.
Learn how to publish a message to a topic.
Create or modify a topic with the gcloud CLI, REST APIs, or client libraries.