Storage Transfer Service agents are applications running inside a Docker container that coordinate with Storage Transfer Service to read data from POSIX file system sources, write data to POSIX file system sinks, or both.
If your transfer does not involve a POSIX file system, you do not need to set up agents.
This document describes how to administer transfer agents on your servers.
Agent processes are dynamic. While you are running a transfer, you can add agents to increase performance. Newly started agents join the assigned agent pool and perform work from existing transfers. You can use this to adjust how many agents are running, or to adapt transfer performance to changing transfer demand.
Agent processes are a fault-tolerant collective. If one agent stops running, the remaining agents continue to do work. If all of your agents stop, when you restart the agents the transfer resumes where the agents stopped. This enables you to avoid monitoring agents, retrying transfers, or implementing recovery logic. You can patch, move, and dynamically scale your agent pools without transfer downtime by coordinating agents with Google Kubernetes Engine.
For example, you submit two transfers while two agents are running. If one of the agents stops due to a machine reboot or operating system patch, the remaining agent continues working. The two transfers are still running, but slower since a single agent is moving data. If the remaining agent also stops, then all transfers stop making progress, since there are no agents running. When you restart the agent processes, the transfers resume where they left off.
Agent processes belong to a pool. They collectively move your data in parallel. Because of this, all agents within a pool must have the same access to all data sources that you want to transfer.
For example, if you are transferring data from a particular file system, you must mount the file system to every machine that is hosting agents in your agent pool. If some agents in your pool can reach a data source and others can't, transfers from that data source won't succeed.
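As a minimal sketch of a preflight check, the function below could be run on each machine that hosts agents to confirm that the source directory is present and readable before a transfer starts. The function name and the example path are hypothetical; the check only verifies local access and is not part of Storage Transfer Service.

```shell
# Preflight check to run on every machine that hosts agents in the pool.
# Succeeds (exit 0) only if the directory exists, is readable, and can be traversed.
source_is_readable() {
  dir=$1
  [ -d "$dir" ] && [ -r "$dir" ] && [ -x "$dir" ]
}

# Example (the path is a placeholder for your own mount point):
#   source_is_readable /mnt/transfer-source || echo "source not accessible" >&2
```

Running the same check on every host is one way to catch a machine where the file system was never mounted.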
Before you begin
Before configuring your transfers, make sure you have configured access for users and service accounts.
If you'll be using gcloud commands, install the gcloud CLI.
Install and run transfer agents
To install and run a transfer agent:
To install one or more agents using the gcloud CLI, run gcloud transfer agents install:
gcloud transfer agents install --pool=POOL_NAME --count=NUM_AGENTS
The tool walks you through any required steps to install the agents. This command installs NUM_AGENTS agents on your machine, mapped to the pool specified as POOL_NAME, and authenticates the agents using your gcloud credentials. The pool must already exist; otherwise, an error is returned.
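Because the install command returns an error when the pool does not exist, it can help to create the pool first. The following is a sketch only, assuming the gcloud CLI is installed and authenticated. The function name is hypothetical, and a failed describe call is treated as "pool not found", which may also hide other errors such as missing permissions.

```shell
# Create the agent pool only if describing it fails.
# Assumption: a non-zero exit from "describe" means the pool does not exist.
ensure_agent_pool() {
  pool=$1
  if gcloud transfer agent-pools describe "$pool" >/dev/null 2>&1; then
    echo "agent pool $pool already exists"
  else
    gcloud transfer agent-pools create "$pool"
  fi
}

# Example (POOL_NAME is a placeholder):
#   ensure_agent_pool POOL_NAME
```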
To run agents using a service account key, use the --creds-file option:
gcloud transfer agents install --pool=POOL_NAME --count=NUM_AGENTS \
  --creds-file=/relative/path/to/service-account-key.json
For a full list of optional flags, run gcloud transfer agents install --help or read the gcloud transfer reference.
We recommend installing more than one agent for each machine. For more information about determining how many agents to run, see Maximizing transfer agent performance.
Confirm agent connections
After you install your transfer agents, you can verify that they're connected to your agent pool.
To confirm that your agents are connected:
In the Google Cloud console, go to the Agent pools page.
Your agent pools are displayed, with the number of connected agents.
Select an agent pool to view details on connected agents.
If a new agent doesn't show up in the agent pool page within 10 minutes of its creation, see Agents are not connected.
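If an agent does not appear in the console, a first local check is whether its container is running at all. The sketch below counts running containers started from the agent image named later in this document. The function name is hypothetical, and a running container does not by itself prove that the agent has connected to the pool.

```shell
# Image name as used in the "Delete an agent" section of this document.
AGENT_IMAGE=gcr.io/cloud-ingest/tsop-agent

# Print the number of running containers created from the agent image.
running_agent_count() {
  # "docker ps -q" prints one container ID per line for running containers only.
  # "|| true" keeps the exit status zero when grep counts no lines.
  docker ps -q --filter "ancestor=$AGENT_IMAGE" | grep -c . || true
}

# Example:
#   echo "agents running on this machine: $(running_agent_count)"
```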
Monitor agent activity
You can use Cloud Monitoring alerts to monitor agent activity.
Monitoring is available along
Using this monitoring data, you can set up alerts to notify you of potential issues with your transfer. To do so, create an alert on either of the following Google Cloud metrics:
|Metric name|What it describes|Suggested uses|
|---|---|---|
|storagetransfer.googleapis.com/agent/transferred_bytes_count|Measures how quickly a specific agent is moving data across all jobs that it services at a point in time.|Alert for dips in performance.|
|storagetransfer.googleapis.com/agent/connected|A boolean that is True for each agent that Google Cloud received a recent heartbeat message from.||
Stop an agent
To stop an agent, run docker stop on the agent's Docker container ID. To find the ID and stop the agent:
In the Google Cloud console, go to the Agent pools page.
Select the agent pool containing the agent to stop.
Select an agent from the list. Use the Filter field to search for prefixes, agent status, agent age, and more.
Click Stop agent. The docker stop command with the specific container ID is displayed.
Run the command on the machine on which the agent is running. A successful docker stop command returns the container ID.
Once stopped, the agent is shown in the agent pools list as Disconnected.
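When every agent on a machine needs to stop, for example before a reboot, the per-agent steps above can be looped locally. This is a sketch under the assumption that all agent containers on the machine were started from the image shown in this document; the function name is hypothetical.

```shell
# Image name as used in the "Delete an agent" section of this document.
AGENT_IMAGE=gcr.io/cloud-ingest/tsop-agent

# Stop every running container created from the agent image.
# docker stop prints the ID of each container it stops.
stop_local_agents() {
  for id in $(docker ps -q --filter "ancestor=$AGENT_IMAGE"); do
    docker stop "$id"
  done
}

# Example:
#   stop_local_agents
```

Agents on other machines in the pool keep working, so transfers continue at reduced throughput while this machine is offline.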
Delete an agent
To delete specific agents, first list the agent containers on your machine:
docker container list --all --filter ancestor=gcr.io/cloud-ingest/tsop-agent
Then pass the agent IDs to gcloud transfer agents delete:
gcloud transfer agents delete --ids=id1,id2,…
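The --ids flag takes a comma-separated list. As a small convenience sketch, the hypothetical helper below joins IDs supplied one per line into that form. Which identifiers to pass is up to you; the helper only handles the formatting.

```shell
# Join IDs given one per line on stdin into a single comma-separated string.
join_ids() {
  paste -sd, -
}

# Example (id1 and id2 are placeholders for IDs you selected from the listing):
#   ids=$(printf '%s\n' id1 id2 | join_ids)
#   gcloud transfer agents delete --ids="$ids"
```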
To delete all agents running on the machine, use either the --all flag or the --uninstall flag. Both flags delete all agents on the machine; the --uninstall flag additionally uninstalls the agent Docker image.
gcloud transfer agents delete --all
gcloud transfer agents delete --uninstall
File system transfer details
Storage Transfer Service begins all transfers by comparing the data present at the source and destination to determine which source files are new, updated, or deleted since the last transfer. We do this to reduce the amount of data we send from your machines, to use bandwidth effectively, and to reduce transfer times.
To detect whether a file has changed, we check the last modified time and size of the source file and compare them to the last modified time and size recorded when the file was last copied. When we detect a new or changed file, we copy the entire file to its destination. For more information about file freshness, see Data consistency details.
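The comparison described above can be illustrated locally. This sketch is not the service's implementation; it only mirrors the idea of comparing a file's modification time and size against a previously recorded pair. The function names are hypothetical, and the stat flags assume GNU coreutils.

```shell
# Print "<mtime-in-seconds> <size-in-bytes>" for a file.
# Assumption: GNU coreutils stat; BSD and macOS stat use different flags.
file_signature() {
  stat -c '%Y %s' "$1"
}

# Succeed (exit 0) when the file's current signature differs from the recorded one.
has_changed() {
  [ "$(file_signature "$1")" != "$2" ]
}

# Example:
#   recorded=$(file_signature data.bin)
#   ...later...
#   has_changed data.bin "$recorded" && echo "data.bin would be copied again"
```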
By default, we detect, but do not act on, files deleted at the source. If you choose the sync option Delete destination files that aren't also in the source when creating or editing your transfer, the transfer deletes the corresponding object at the destination.
If you choose the sync option Delete destination files that aren't also in the source, files that are accidentally deleted at the source are also deleted at the destination. To prevent data loss from accidental deletions, we recommend enabling object versioning in your destination bucket if you choose to use this option. Then, if you delete a file accidentally, you can restore your objects in Cloud Storage to an older version.
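Object versioning can be turned on from the command line before you enable the delete option. The wrapper below is a sketch that assumes the gcloud CLI is installed and authenticated; the function name is hypothetical and the bucket name is a placeholder.

```shell
# Enable object versioning on a destination bucket.
# $1 is the bucket name without the gs:// prefix.
enable_versioning() {
  gcloud storage buckets update "gs://$1" --versioning
}

# Example (BUCKET_NAME is a placeholder):
#   enable_versioning BUCKET_NAME
```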
Data consistency details
A successful transfer operation will transfer all source files which existed and were not modified during the operation's entire running time. Source files that were created, updated, or deleted during a transfer may or may not have those changes reflected in the destination data set.
Storage Transfer Service uses a file's last modification time and size to determine if it changed. If a file is updated without changing its last modification time or size, and you enable the delete-objects-from-source option, you may lose data from that change.
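To make the risk concrete, the following local demonstration rewrites a file with different content of the same size and then restores its original modification time. A check based only on modification time and size cannot see the difference. This is an illustrative sketch with hypothetical function names, and it assumes GNU coreutils stat.

```shell
# Print "<mtime-in-seconds> <size-in-bytes>" for a file (GNU coreutils stat assumed).
file_signature() {
  stat -c '%Y %s' "$1"
}

demo_undetected_change() {
  f=$(mktemp)
  ref=$(mktemp)
  printf 'aaaa' > "$f"
  before=$(file_signature "$f")
  touch -r "$f" "$ref"   # remember the original modification time
  printf 'bbbb' > "$f"   # same size, different content
  touch -r "$ref" "$f"   # restore the original modification time
  after=$(file_signature "$f")
  if [ "$before" = "$after" ]; then
    echo "change not detectable"
  else
    echo "change detected"
  fi
  rm -f "$f" "$ref"
}

# Example:
#   demo_undetected_change
```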
When using the delete-objects-from-source feature, we strongly recommend that you freeze writes to the source for the duration of the transfer to protect against data loss.
To freeze writes to your source, do either of the following:
- Clone the directory you intend to transfer, and then use the cloned directory as the transfer source.
- Halt applications that write to the source directory.
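The cloning option can be sketched with a plain recursive copy. The function name and paths are hypothetical, and the sketch assumes that nothing writes to the source while the copy runs; a file system snapshot, where available, may be a better fit for busy directories.

```shell
# Copy a source directory to a new location to use as the transfer source.
clone_source() {
  src=$1
  clone=$2
  # Refuse to overwrite or merge into an existing path.
  if [ -e "$clone" ]; then
    echo "clone path already exists: $clone" >&2
    return 1
  fi
  # -a copies recursively and preserves modification times and permissions.
  cp -a "$src" "$clone"
}

# Example (both paths are placeholders):
#   clone_source /mnt/data /mnt/data-transfer-clone
```

Preserving modification times matters here because later transfers compare them to decide which files changed.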
If it's important to capture changes that occurred during a transfer, you can either re-run the transfer, or set the source file system as read-only while the operation is running.
Since Cloud Storage doesn't have the notion of directories, empty source directories are not transferred.
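To see in advance which directories will have no counterpart at the destination, you can list the empty directories under the source. This is a sketch with a hypothetical function name; the -empty test is supported by GNU and BSD find but is not part of POSIX.

```shell
# List directories under the given path that contain no entries.
list_empty_dirs() {
  find "$1" -type d -empty
}

# Example (the path is a placeholder):
#   list_empty_dirs /mnt/transfer-source
```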