[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-08-29。"],[[["\u003cp\u003eThis page demonstrates how to use Google Transfer Operators in Cloud Composer to transfer data from external services into Google Cloud, specifically showing examples for Amazon S3 and Azure FileShare.\u003c/p\u003e\n"],["\u003cp\u003eTransferring data from Amazon S3 to Cloud Storage requires installing the \u003ccode\u003eapache-airflow-providers-amazon\u003c/code\u003e package, configuring an Amazon S3 connection in the Airflow UI, and using the \u003ccode\u003eS3ToGCSOperator\u003c/code\u003e in a DAG.\u003c/p\u003e\n"],["\u003cp\u003eTransferring data from Azure FileShare to Cloud Storage requires installing the \u003ccode\u003eapache-airflow-providers-microsoft-azure\u003c/code\u003e package, setting up an Azure FileShare connection, and using the \u003ccode\u003eAzureFileShareToGCSOperator\u003c/code\u003e within a DAG.\u003c/p\u003e\n"],["\u003cp\u003eCredentials for connections to external services, such as Amazon S3 and Azure FileShare, should be securely stored in Secret Manager for enhanced security.\u003c/p\u003e\n"],["\u003cp\u003eAfter data is transferred from external sources to Cloud Storage, it can be accessed and used by other Airflow tasks or DAGs by placing it into the \u003ccode\u003e/data\u003c/code\u003e folder of your environment's bucket.\u003c/p\u003e\n"]]],[],null,["# Transfer data with Google Transfer Operators\n\n**Cloud Composer 3** \\| [Cloud Composer 2](/composer/docs/composer-2/transfer-data-with-transfer-operators \"View this page for Cloud Composer 2\") \\| [Cloud Composer 1](/composer/docs/composer-1/transfer-data-with-transfer-operators \"View this page for Cloud Composer 1\")\n\n\u003cbr /\u003e\n\n| **Note:** This page is **not yet revised for Cloud Composer 3** and displays content for Cloud Composer 2.\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\nThis page demonstrates how to transfer data from other services with Google\nTransfer Operators in your DAGs.\n\nAbout Google Transfer Operators\n-------------------------------\n\n[Google Transfer Operators](https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/transfer/index.html) are a\nset of Airflow operators that you can use to pull data from other services into\nGoogle Cloud.\n\nThis guide shows operators for Azure FileShare Storage and Amazon S3 that work\nwith Cloud Storage. There are many more transfer operators that work\nwith services within Google Cloud and with services other than\nGoogle Cloud.\n\nAmazon S3 to Cloud Storage\n--------------------------\n\nThis section demonstrates how to synchronize data from Amazon S3 to a\nCloud Storage bucket.\n\n### Install the Amazon provider package\n\nThe `apache-airflow-providers-amazon` package contains the connection\ntypes and functionality that interacts with Amazon S3.\n[Install this PyPI package](/composer/docs/composer-2/install-python-dependencies#install-pypi) in your\nenvironment.\n\n### Configure a connection to Amazon S3\n\nThe Amazon provider package provides a connection type for Amazon S3. You\ncreate a connection of this type. 
### Transfer data from Amazon S3

If you want to operate on the synchronized data later in another DAG or task, pull it to the `/data` folder of your environment's bucket. This folder is synchronized to other Airflow workers, so that tasks in your DAG can operate on it.

The following example DAG does the following:

- Synchronizes the contents of the `/data-for-gcs` directory from an S3 bucket to the `/data/from-s3/data-for-gcs/` folder in your environment's bucket.
- Waits for two minutes so that the data synchronizes to all Airflow workers in your environment.
- Outputs the list of files in this directory using the `ls` command. Replace this task with other Airflow operators that work with your data.

    import datetime
    import airflow
    from airflow.providers.google.cloud.transfers.s3_to_gcs import S3ToGCSOperator
    from airflow.operators.bash import BashOperator

    with airflow.DAG(
        'composer_sample_aws_to_gcs',
        start_date=datetime.datetime(2022, 1, 1),
        schedule_interval=None,
    ) as dag:

        transfer_dir_from_s3 = S3ToGCSOperator(
            task_id='transfer_dir_from_s3',
            aws_conn_id='aws_s3',
            prefix='data-for-gcs',
            bucket='example-s3-bucket-transfer-operators',
            dest_gcs='gs://us-central1-example-environ-361f2312-bucket/data/from-s3/')

        sleep_2min = BashOperator(
            task_id='sleep_2min',
            bash_command='sleep 2m')

        print_dir_files = BashOperator(
            task_id='print_dir_files',
            bash_command='ls /home/airflow/gcs/data/from-s3/data-for-gcs/')

        transfer_dir_from_s3 >> sleep_2min >> print_dir_files
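The last task in this example only lists the transferred files. As a minimal sketch of how you might operate on the data from another DAG, the following hypothetical DAG reads the synchronized files from the `/data` folder, which is available to Airflow workers at `/home/airflow/gcs/data`. The DAG name and the `process_files` function are illustrative placeholders, not part of the sample above.

    import datetime
    import pathlib

    import airflow
    from airflow.operators.python import PythonOperator


    def process_files(directory: str) -> None:
        """Placeholder logic: prints the name and size of each transferred file."""
        for path in pathlib.Path(directory).glob('*'):
            if path.is_file():
                print(f'{path.name}: {path.stat().st_size} bytes')


    with airflow.DAG(
        'composer_sample_process_from_s3',
        start_date=datetime.datetime(2022, 1, 1),
        schedule_interval=None,
    ) as dag:

        # Operates on files that the transfer DAG placed into the /data folder.
        process_transferred_files = PythonOperator(
            task_id='process_transferred_files',
            python_callable=process_files,
            op_kwargs={'directory': '/home/airflow/gcs/data/from-s3/data-for-gcs/'},
        )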
Azure FileShare to Cloud Storage
--------------------------------

This section demonstrates how to synchronize data from Azure FileShare to a Cloud Storage bucket.

### Install the Microsoft Azure provider package

The `apache-airflow-providers-microsoft-azure` package contains the connection types and functionality needed to interact with Microsoft Azure. [Install this PyPI package](/composer/docs/composer-2/install-python-dependencies#install-pypi) in your environment.

### Configure a connection to Azure FileShare

The Microsoft Azure provider package provides a connection type for Azure File Share. You create a connection of this type. The connection for Cloud Storage, named `google_cloud_default`, is already set up in your environment.

Set up a connection to Azure FileShare in the following way:

1. In the [Airflow UI](/composer/docs/composer-2/access-airflow-web-interface), go to **Admin** > **Connections**.
2. Create a new connection.
3. Select `Azure FileShare` as the connection type.
4. The following example uses a connection named `azure_fileshare`. You can use this name, or any other name for the connection.
5. Specify connection parameters as described in the Airflow documentation for [Microsoft Azure File Share Connection](https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/connections/azure_fileshare.html). For example, you can specify a connection string or your storage account access key.

> **Note:** We recommend that you **store all credentials for connections in Secret Manager**. For more information, see [Configure Secret Manager for your environment](/composer/docs/composer-2/configure-secret-manager). For example, you can create a secret named `airflow-connections-azure_fileshare` that stores credentials for the `azure_fileshare` connection.

### Transfer data from Azure FileShare

If you want to operate on the synchronized data later in another DAG or task, pull it to the `/data` folder of your environment's bucket. This folder is synchronized to other Airflow workers, so that tasks in your DAG can operate on it.

The following example DAG does the following:

- Synchronizes the contents of the `/data-for-gcs` directory from Azure FileShare to the `/data/from-azure` folder in your environment's bucket.
- Waits for two minutes so that the data synchronizes to all Airflow workers in your environment.
- Outputs the list of files in this directory using the `ls` command. Replace this task with other Airflow operators that work with your data.

    import datetime
    import airflow
    from airflow.providers.google.cloud.transfers.azure_fileshare_to_gcs import AzureFileShareToGCSOperator
    from airflow.operators.bash import BashOperator

    with airflow.DAG(
        'composer_sample_azure_to_gcs',
        start_date=datetime.datetime(2022, 1, 1),
        schedule_interval=None,
    ) as dag:

        transfer_dir_from_azure = AzureFileShareToGCSOperator(
            task_id='transfer_dir_from_azure',
            azure_fileshare_conn_id='azure_fileshare',
            share_name='example-file-share',
            directory_name='data-for-gcs',
            dest_gcs='gs://us-central1-example-environ-361f2312-bucket/data/from-azure/')

        sleep_2min = BashOperator(
            task_id='sleep_2min',
            bash_command='sleep 2m')

        print_dir_files = BashOperator(
            task_id='print_dir_files',
            bash_command='ls /home/airflow/gcs/data/from-azure/')

        transfer_dir_from_azure >> sleep_2min >> print_dir_files

What's next
-----------

- [Use GKE operators](/composer/docs/composer-2/use-gke-operator)

Last updated 2025-08-29 (UTC).