Manage transfer agents


Storage Transfer Service agents are applications that run inside a Docker container. They coordinate with Storage Transfer Service to read data from POSIX file system sources, write data to POSIX file system sinks, or both.

If your transfer does not involve a POSIX file system, you do not need to set up agents.

This document describes how to administer transfer agents on your servers.

Overview

  • Agent processes are dynamic. While you are running a transfer, you can add agents to increase performance. Newly started agents join the assigned agent pool and perform work from existing transfers. You can use this to adjust how many agents are running, or to adapt transfer performance to changing transfer demand.

  • Agent processes are a fault-tolerant collective. If one agent stops running, the remaining agents continue to do work. If all of your agents stop, when you restart the agents the transfer resumes where the agents stopped. This enables you to avoid monitoring agents, retrying transfers, or implementing recovery logic. You can patch, move, and dynamically scale your agent pools without transfer downtime by coordinating agents with Google Kubernetes Engine.

    For example, you submit two transfers while two agents are running. If one of the agents stops due to a machine reboot or operating system patch, the remaining agent continues working. The two transfers are still running, but slower since a single agent is moving data. If the remaining agent also stops, then all transfers stop making progress, since there are no agents running. When you restart the agent processes, the transfers resume where they left off.

  • Agent processes belong to a pool. They collectively move your data in parallel. Because of this, all agents within a pool must have the same access to all data sources that you want to transfer.

    For example, if you are transferring data from a particular file system, you must mount the file system to every machine that is hosting agents in your agent pool. If some agents in your pool can reach a data source and others can't, transfers from that data source won't succeed.
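
    The access requirement above can be checked on each machine before you start agents there. The following is a minimal sketch, not part of Storage Transfer Service: the function name source_is_reachable and the example path are hypothetical, and the check covers only the host, so it assumes the agent container is given the same view of the file system.

```shell
# Hypothetical helper: succeeds only if the given path is a directory that
# the current user can read and traverse on this machine.
source_is_reachable() {
  [ -d "$1" ] && [ -r "$1" ] && [ -x "$1" ]
}

# Example (the path is a placeholder):
#   source_is_reachable /mnt/shared-data || echo "mount the file system first"
```

    Running the same check on every machine in the pool helps catch a missing mount before a transfer fails because of it.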

Before you begin

Before configuring your transfers, make sure you have configured access for users and service accounts.

If you'll be using gcloud commands, install the gcloud CLI.

Install and run transfer agents

To install and run a transfer agent:

Google Cloud console

  1. In the Google Cloud console, go to the Agent pools page.

    Go to Agent pools

  2. Select the agent pool to which to add the new agent.

  3. Click Install agent.

  4. Follow the instructions to install and run the agent.

    For more information about the agent's command-line options, see Agent command-line options.

gcloud CLI

To install one or more agents using the gcloud CLI, run gcloud transfer agents install:

gcloud transfer agents install --pool=POOL_NAME --count=NUM_AGENTS

The tool walks you through any required steps to install the agent(s). This command installs NUM_AGENTS agent(s) on your machine, mapped to the pool name specified as POOL_NAME, and authenticates the agent using your gcloud credentials. The pool name must exist, or an error is returned.

To run agents using a service account key, use the --creds-file option:

gcloud transfer agents install --pool=POOL_NAME --count=NUM_AGENTS \
  --creds-file=/relative/path/to/service-account-key.json

For a full list of optional flags, run gcloud transfer agents install --help or read the gcloud transfer reference.

We recommend installing more than one agent for each machine. For more information about determining how many agents to run, see Maximizing transfer agent performance.

Confirm agent connections

After you install your transfer agents, you can verify that they're connected to your agent pool.

To confirm that your agents are connected:

  1. In the Google Cloud console, go to the Agent pools page.

    Go to Agent pools

    Your agent pools are displayed, with the number of connected agents.

  2. Select an agent pool to view details on connected agents.

If a new agent doesn't show up in the agent pool page within 10 minutes of its creation, see Agents are not connected.
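
Before troubleshooting a missing agent, it can help to confirm that the agent containers are actually running on the machine. The following is a minimal sketch that assumes the agents were started from the gcr.io/cloud-ingest/tsop-agent image; the function name count_running_agents is hypothetical. A running container is not necessarily connected to the pool, so treat this as a first local check only.

```shell
# Image name used for agent containers (taken from the delete instructions
# in this document; adjust it if your agents use a different image).
AGENT_IMAGE="gcr.io/cloud-ingest/tsop-agent"

# Hypothetical helper: prints how many agent containers are running locally.
count_running_agents() {
  docker ps --quiet \
    --filter "ancestor=${AGENT_IMAGE}" \
    --filter "status=running" | wc -l | tr -d ' '
}

# Example:
#   echo "Running agents on this machine: $(count_running_agents)"
```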

Monitor agent activity

You can use Cloud Monitoring alerts to monitor agent activity.

Monitoring is available along project, agent_pool, and agent_id dimensions.

Using this monitoring data, you can set up alerts to notify you of potential issues with your transfer. To do so, create an alert on either of the following Google Cloud metrics:

  • Metric name: storagetransfer.googleapis.com/agent/transferred_bytes_count

    What it describes: Measures how quickly a specific agent is moving data across all jobs that it services at a point in time.

    Suggested uses: Alert for dips in performance.

  • Metric name: storagetransfer.googleapis.com/agent/connected

    What it describes: A boolean that is True for each agent from which Google Cloud recently received a heartbeat message.

    Suggested uses:

      • Alert for failing agents
      • Alert when the number of connected agents falls below the number that you consider necessary for reasonable performance
      • Signal an issue with agent machines

Stop an agent

To stop an agent, run docker stop on the agent's Docker container ID. To find the ID and stop the agent:

  1. In the Google Cloud console, go to the Agent pools page.

    Go to Agent pools

  2. Select the agent pool containing the agent to stop.

  3. Select an agent from the list. Use the Filter field to search for prefixes, agent status, agent age, and more.

  4. Click Stop agent. The docker stop command with the specific container ID is displayed.

  5. Run the command on the machine on which the agent is running. A successful docker stop command returns the container ID.

Once stopped, the agent is shown in the agent pools list as Disconnected.
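
If you need to stop every agent on a machine, for example before a planned reboot, you can loop over the running agent containers instead of stopping them one at a time. The following is a minimal sketch that assumes the agents were started from the gcr.io/cloud-ingest/tsop-agent image; the function name stop_all_agents is hypothetical.

```shell
# Image name used for agent containers (adjust it if yours differs).
AGENT_IMAGE="gcr.io/cloud-ingest/tsop-agent"

# Hypothetical helper: runs docker stop on each running agent container.
# docker stop prints the ID of each container that it stops.
stop_all_agents() {
  docker ps --quiet --filter "ancestor=${AGENT_IMAGE}" |
    while read -r container_id; do
      docker stop "$container_id" </dev/null
    done
}

# Example:
#   stop_all_agents
```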

Delete an agent

To delete specific agents, list which agents are running on your machine:

docker container list --all --filter ancestor=gcr.io/cloud-ingest/tsop-agent

Then pass the agent IDs to gcloud transfer agents delete:

gcloud transfer agents delete --ids=id1,id2,…

To delete all agents running on the machine, use either the --all flag or the --uninstall flag. Both flags delete all agents on the machine; the --uninstall flag additionally uninstalls the agent Docker image.

gcloud transfer agents delete --all
gcloud transfer agents delete --uninstall

File system transfer details

Incremental transfers

Storage Transfer Service begins all transfers by computing the data present at the source and destination to determine which source files are new, updated, or deleted since the last transfer. We do this to reduce the amount of data we send from your machines, to use bandwidth effectively, and to reduce transfer times.

To detect whether a file has changed, we use an algorithm similar to gsutil rsync: we check the last modified time and size of the source file, and compare that to the last modified time and size recorded when the file was last copied. When we detect a new or changed file, we copy the entire file to its destination. For more information about file freshness, see Data consistency details.
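
The comparison described above can be sketched in a few lines. This is an illustration only, not the service's implementation: the function names file_signature and needs_copy are hypothetical, and stat -c assumes GNU coreutils.

```shell
# Hypothetical helper: prints "<mtime in seconds> <size in bytes>" for a file.
# Assumes GNU stat; BSD and macOS stat use different options.
file_signature() {
  stat -c '%Y %s' -- "$1"
}

# Hypothetical helper: succeeds (exit status 0) when the file's current
# signature differs from the signature recorded at the last copy.
needs_copy() {
  [ "$(file_signature "$1")" != "$2" ]
}

# Example (the path and variable are placeholders):
#   recorded="$(file_signature /data/report.csv)"   # saved after a copy
#   needs_copy /data/report.csv "$recorded" && echo "copy the whole file again"
```

A file whose content changes while its modification time and size stay the same is not detected by this kind of check.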

By default, we detect, but do not act on, files deleted on the source. If you choose the sync option Delete destination files that aren't also in the source when creating or editing a transfer, the transfer deletes the corresponding objects at the destination.

If you choose the sync option Delete destination files that aren't also in the source, files that are accidentally deleted at the source are also deleted at the destination. To prevent data loss from accidental deletions, we recommend enabling object versioning in your destination bucket if you choose to use this option. Then, if you delete a file accidentally, you can restore your objects in Cloud Storage to an older version.
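
Object versioning can be enabled on the destination bucket with the gcloud CLI. The following is a minimal sketch: the wrapper function enable_versioning and the bucket name in the example are hypothetical, and the wrapper runs gcloud storage buckets update with the --versioning flag.

```shell
# Hypothetical wrapper: turns on object versioning for the named bucket.
enable_versioning() {
  gcloud storage buckets update "gs://$1" --versioning
}

# Example (the bucket name is a placeholder):
#   enable_versioning my-destination-bucket
```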

Data consistency details

A successful transfer operation transfers all source files that existed and were not modified during the operation's entire running time. Source files that were created, updated, or deleted during a transfer may or may not have those changes reflected in the destination data set.

Storage Transfer Service uses a file's last modification time and size to determine if it changed. If a file is updated without changing its last modification time or size, and you enable the delete-objects-from-source option, you may lose data from that change.

When using the delete-objects-from-source feature, we strongly recommend that you freeze writes to the source for the duration of the transfer to protect against data loss.

To freeze writes to your source, do either of the following:

  • Clone the directory you intend to transfer, and then use the cloned directory as the transfer source.
  • Halt applications that write to the source directory.
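
The first option can be sketched with a plain recursive copy. This is a minimal example under stated assumptions: the function name clone_source and the example paths are hypothetical, and cp -a is used to preserve modification times and permissions. A plain copy is not atomic, so writes that occur while the copy runs can still leave the clone inconsistent.

```shell
# Hypothetical helper: copies a source directory to a new location,
# preserving timestamps and permissions. Refuses to reuse an existing path,
# because cp would otherwise nest the copy inside it.
clone_source() {
  src="$1"
  dst="$2"
  if [ -e "$dst" ]; then
    echo "refusing to overwrite existing path: $dst" >&2
    return 1
  fi
  cp -a "$src" "$dst"
}

# Example (the paths are placeholders):
#   clone_source /data/live /data/transfer-clone
```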

If it's important to capture changes that occurred during a transfer, you can either re-run the transfer, or set the source file system as read-only while the operation is running.

Since Cloud Storage doesn't have the notion of directories, empty source directories are not transferred.
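
To see which directories would be left out for this reason, you can list the empty directories under the source before you transfer. The following is a minimal sketch: the function name list_empty_dirs and the example path are hypothetical, and the -empty test is a find extension available in GNU and BSD find rather than a POSIX requirement.

```shell
# Hypothetical helper: prints every empty directory under the given path.
list_empty_dirs() {
  find "$1" -type d -empty
}

# Example (the path is a placeholder):
#   list_empty_dirs /data/live
```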