[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-08-29。"],[[["\u003cp\u003eThis page demonstrates how to use Google Transfer Operators in Cloud Composer 1 to transfer data from external services like Amazon S3 and Azure FileShare into Google Cloud Storage.\u003c/p\u003e\n"],["\u003cp\u003eTo use these operators, you must first install the relevant provider packages, such as \u003ccode\u003eapache-airflow-providers-amazon\u003c/code\u003e for Amazon S3 and \u003ccode\u003eapache-airflow-providers-microsoft-azure\u003c/code\u003e for Azure FileShare, in your Cloud Composer environment.\u003c/p\u003e\n"],["\u003cp\u003eA connection to the external service must be configured through the Airflow UI, where you can define parameters like access keys, connection strings, or account details, and it is recommended to store all credentials for connections in Secret Manager.\u003c/p\u003e\n"],["\u003cp\u003eExample DAGs are provided that synchronize data from both Amazon S3 and Azure FileShare to a specified folder within the Cloud Storage bucket of the environment, where the data can then be used in tasks or other DAGs.\u003c/p\u003e\n"],["\u003cp\u003eThe page emphasizes that these operators support transferring data from various services, not just the mentioned examples, and recommends the use of backport provider packages for Airflow 1 environments.\u003c/p\u003e\n"]]],[],null,["\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\n[Cloud Composer 3](/composer/docs/composer-3/transfer-data-with-transfer-operators \"View this page for Cloud Composer 3\") \\| [Cloud Composer 2](/composer/docs/composer-2/transfer-data-with-transfer-operators \"View this page for Cloud Composer 2\") \\| **Cloud Composer 1**\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\n\u003cbr /\u003e\n\nThis page demonstrates how to transfer data from other services with Google\nTransfer Operators in your DAGs.\n\nAbout Google Transfer Operators\n\n[Google Transfer Operators](https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/transfer/index.html) are a\nset of Airflow operators that you can use to pull data from other services into\nGoogle Cloud.\n\nThis guide shows operators for Azure FileShare Storage and Amazon S3 that work\nwith Cloud Storage. There are many more transfer operators that work\nwith services within Google Cloud and with services other than\nGoogle Cloud.\n\nBefore you begin\n\n- This guide is for Airflow 2. If your environment uses Airflow 1, use [backport provider packages](/composer/docs/composer-1/backport-packages) to import operators and to make required connection types available in your environment.\n\nAmazon S3 to Cloud Storage\n\nThis section demonstrates how to synchronize data from Amazon S3 to a\nCloud Storage bucket.\n\nInstall the Amazon provider package\n\nThe `apache-airflow-providers-amazon` package contains the connection\ntypes and functionality that interacts with Amazon S3.\n[Install this PyPI package](/composer/docs/composer-1/install-python-dependencies#install-pypi) in your\nenvironment.\n\nConfigure a connection to Amazon S3\n\nThe Amazon provider package provides a connection type for Amazon S3. 
### Transfer data from Amazon S3

If you want to operate on the synchronized data later in another DAG or task, pull it to the `/data` folder of your environment's bucket. This folder is synchronized to other Airflow workers, so that tasks in your DAG can operate on it.

The following example DAG does the following:

- Synchronizes contents of the `/data-for-gcs` directory from an S3 bucket to the `/data/from-s3/data-for-gcs/` folder in your environment's bucket.
- Waits for two minutes for the data to synchronize to all Airflow workers in your environment.
- Outputs the list of files in this directory using the `ls` command. Replace this task with other Airflow operators that work with your data.

```python
import datetime

import airflow
from airflow.providers.google.cloud.transfers.s3_to_gcs import S3ToGCSOperator
from airflow.operators.bash_operator import BashOperator

with airflow.DAG(
    'composer_sample_aws_to_gcs',
    start_date=datetime.datetime(2022, 1, 1),
    schedule_interval=None,
) as dag:

    transfer_dir_from_s3 = S3ToGCSOperator(
        task_id='transfer_dir_from_s3',
        aws_conn_id='aws_s3',
        prefix='data-for-gcs',
        bucket='example-s3-bucket-transfer-operators',
        dest_gcs='gs://us-central1-example-environ-361f2312-bucket/data/from-s3/')

    sleep_2min = BashOperator(
        task_id='sleep_2min',
        bash_command='sleep 2m')

    print_dir_files = BashOperator(
        task_id='print_dir_files',
        bash_command='ls /home/airflow/gcs/data/from-s3/data-for-gcs/')

    transfer_dir_from_s3 >> sleep_2min >> print_dir_files
```

## Azure FileShare to Cloud Storage

This section demonstrates how to synchronize data from Azure FileShare to a Cloud Storage bucket.

### Install the Microsoft Azure provider package

The `apache-airflow-providers-microsoft-azure` package contains the connection types and functionality that interact with Microsoft Azure. [Install this PyPI package](/composer/docs/composer-1/install-python-dependencies#install-pypi) in your environment.

### Configure a connection to Azure FileShare

The Microsoft Azure provider package provides a connection type for Azure File Share. You create a connection of this type. The connection for Cloud Storage, named `google_cloud_default`, is already set up in your environment.

Set up a connection to Azure FileShare in the following way:

1. In the [Airflow UI](/composer/docs/composer-1/access-airflow-web-interface), go to **Admin** > **Connections**.
2. Create a new connection.
3. Select `Azure FileShare` as the connection type.
4. The following example uses a connection named `azure_fileshare`. You can use this name or any other name for the connection.
5. Specify connection parameters as described in the Airflow documentation for [Microsoft Azure File Share Connection](https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/connections/azure_fileshare.html). For example, you can specify a connection string for your storage account access key.

> **Note:** We recommend **storing all credentials for connections in Secret Manager**. For more information, see [Configure Secret Manager for your environment](/composer/docs/composer-1/configure-secret-manager). For example, you can create a secret named `airflow-connections-azure_fileshare` that stores credentials for the `azure_fileshare` connection.

### Transfer data from Azure FileShare

If you want to operate on the synchronized data later in another DAG or task, pull it to the `/data` folder of your environment's bucket. This folder is synchronized to other Airflow workers, so that tasks in your DAG can operate on it.

The following example DAG does the following:

- Synchronizes contents of the `/data-for-gcs` directory from Azure File Share to the `/data/from-azure` folder in your environment's bucket.
- Waits for two minutes for the data to synchronize to all Airflow workers in your environment.
- Outputs the list of files in this directory using the `ls` command. Replace this task with other Airflow operators that work with your data.

```python
import datetime

import airflow
from airflow.providers.google.cloud.transfers.azure_fileshare_to_gcs import AzureFileShareToGCSOperator
from airflow.operators.bash_operator import BashOperator

with airflow.DAG(
    'composer_sample_azure_to_gcs',
    start_date=datetime.datetime(2022, 1, 1),
    schedule_interval=None,
) as dag:

    transfer_dir_from_azure = AzureFileShareToGCSOperator(
        task_id='transfer_dir_from_azure',
        azure_fileshare_conn_id='azure_fileshare',
        share_name='example-file-share',
        directory_name='data-for-gcs',
        dest_gcs='gs://us-central1-example-environ-361f2312-bucket/data/from-azure/')

    sleep_2min = BashOperator(
        task_id='sleep_2min',
        bash_command='sleep 2m')

    print_dir_files = BashOperator(
        task_id='print_dir_files',
        bash_command='ls /home/airflow/gcs/data/from-azure/')

    transfer_dir_from_azure >> sleep_2min >> print_dir_files
```
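In both examples, the final task only lists the synchronized files with `ls`. As a sketch of how you might replace that task with one that actually processes the data, the snippet below reads the files from the environment bucket's `/data` folder, which is available on Airflow workers at `/home/airflow/gcs/data` (the same path used by the `ls` tasks above). The function name, the `from-azure` folder, and the "print file sizes" logic are placeholders for your own processing.

```python
import os


def process_synced_files():
    """Placeholder processing: print the size of every synchronized file."""
    # The /data folder of the environment's bucket, as mounted on Airflow workers
    # (the same path that the `ls` task in the example lists).
    data_dir = '/home/airflow/gcs/data/from-azure/'
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        if os.path.isfile(path):
            print(f'{name}: {os.path.getsize(path)} bytes')


# Inside the `with airflow.DAG(...)` block of the example above, replace the
# `print_dir_files` task with a PythonOperator that calls this function:
#
#   from airflow.operators.python import PythonOperator
#
#   process_files = PythonOperator(
#       task_id='process_files',
#       python_callable=process_synced_files)
#
#   transfer_dir_from_azure >> sleep_2min >> process_files
```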
## What's next

- [Use GKE operators](/composer/docs/composer-1/use-gke-operator)