Preparing for Data Transfer

Before using Google Transfer Appliance, complete each of the following tasks to make sure your data migration project runs smoothly:

  • Identify the data you want to capture.
  • Prepare the data you want to capture.
  • Review and, if necessary, adjust your network settings to make sure that Transfer Appliance can connect to and move data from computers on your network.

Plan for data transfer

Planning for capturing data with Transfer Appliance is similar to planning for a file system backup. You need to identify the servers and directories or folders that contain the data you want to capture. The basic methodology is the same whether the data is on a Microsoft Windows or Linux file system.

Once you have identified the data you want to capture, make sure that you have the correct permissions to access it.

To create capture jobs, you either run the capture utility from a workstation or perform an appliance capture from Transfer Appliance against the identified locations. Transfer Appliance can run any number of parallel capture jobs (both workstation and appliance), limited only by your system resources and network bandwidth. To transfer data more quickly, we recommend running multiple capture jobs at once where possible.

Plan to suspend operations for backup software (NetWorker, NetBackup, Tivoli Storage Manager, etc.) when running data capture jobs, so that there aren't any data access conflicts.

To make your migration project easier to manage, plan to aggregate data into capture jobs using whatever criteria make the most sense to you. For example, you can create one job for web data and another for accounting data, or you can create a separate job for each data source.

Give some thought to meaningful job names as well. They are used to identify each capture job, and the files it contains, for the rest of the data migration project. For example, if you capture the file e:\sourcedatafolder\data1\file_001 with the Windows capture utility command tacapture.exe this-job e:\sourcedatafolder\data1, the file will reside at gs://<bucket_name>/this-job/e:/sourcedatafolder/data1/file_001 in the Google Cloud Storage staging bucket when your data is uploaded into Cloud Platform.
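
The naming rule in the example above can be sketched in a few lines of Python. This is only an illustration inferred from that single example (job name first, then the source path with backslashes turned into forward slashes). The function name is hypothetical, and behavior for other path shapes, such as Linux paths, is not confirmed by this document.

```python
def staging_object_path(bucket_name: str, job_name: str, source_file: str) -> str:
    """Predict where a captured Windows file lands in the staging bucket.

    Assumption: the rule is inferred from the one example in the text.
    Other path shapes are untested guesses.
    """
    # Windows separators become forward slashes in the object name.
    normalized = source_file.replace("\\", "/")
    return f"gs://{bucket_name}/{job_name}/{normalized}"


if __name__ == "__main__":
    print(staging_object_path("<bucket_name>", "this-job",
                              r"e:\sourcedatafolder\data1\file_001"))
```

Running a helper like this over a planned list of jobs makes it easier to see, before capture, whether the chosen job names produce readable and distinct prefixes in the bucket.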

You should plan to allocate time for verifying the captured data before shipping the appliance back. The verification step is optional but recommended. The time taken to verify the data will depend on the amount of data collected and its deduplication ratio. The approximate time required for verification is displayed in the Transfer Appliance web interface when you start the Prepare for Shipping operation.

Prepare your data

To prepare your data, make sure the files that will be captured meet certain requirements, and see if it makes sense to consolidate the data.

Review your files for Transfer Appliance requirements

Make sure all of the files in the directories you want to capture meet the following criteria:

  • Are less than 5 terabytes (TB) in size.
  • Have names that follow these object name requirements. If any of your files don't meet these requirements, you can get around that by aggregating those files into a file with an acceptable name by using zip or a similar utility.
  • Aren't symbolic links.

Files with any of these properties are skipped by the data capture process. Transfer Appliance tracks skipped files, so you can fix these files and try again to capture them.
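
To find such files before a capture job skips them, a pre-scan along the following lines may help. It is a minimal sketch, not part of the Transfer Appliance tooling: the function name is hypothetical, "5 TB" is assumed to mean 5 * 1024**4 bytes, and object-name rules are not checked because they are not spelled out here.

```python
import os

# Assumption: "5 TB" is treated as 5 * 1024**4 bytes; adjust if needed.
MAX_FILE_BYTES = 5 * 1024**4


def find_skippable_files(root, max_bytes=MAX_FILE_BYTES):
    """Return (path, reason) pairs for entries the capture would likely skip.

    Only the size and symbolic-link criteria are checked.
    """
    problems = []
    # os.walk does not follow directory symlinks by default, so links to
    # directories show up in dirnames and links to files in filenames.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames + dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                problems.append((path, "symbolic link"))
            elif os.path.isfile(path) and os.path.getsize(path) >= max_bytes:
                problems.append((path, "too large"))
    return problems
```

For example, find_skippable_files("/data/export") (a hypothetical directory) returns an empty list when nothing under that directory is a symbolic link or reaches the size limit.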

Consider consolidating your data

Depending on the location of your data and the type of capture you want to do, it is worth considering whether to consolidate the files you want to capture in a centralized set of folders or directories, and possibly to aggregate them as well.

Consolidation offers the following benefits:

  • Makes it easier to set up capture jobs by having related data all in the same place.
  • Saves Transfer Appliance storage space by avoiding accidental capture of files you don't actually want to transfer.
  • Makes it easier to configure security and network access to the data if such changes are needed.

Consolidation may also improve capture job performance if it makes it possible for Transfer Appliance to be on the same subnet as the servers it is capturing data from, as well as the workstation you are using if you are performing a workstation capture. For more information on network requirements, see the following Prepare the network section.

Prepare the network

Many networks implement strict security policies. These policies can prevent Transfer Appliance from functioning properly because they block certain network ports or filter certain types of network traffic. Use the following instructions for verifying that your network is configured to allow Transfer Appliance to connect and capture data.

In both workstation and appliance capture scenarios, the data to be captured is transferred over the network at least once. As such, the configuration and capabilities of the network environment have a significant impact on the speed and success of the capture process. For better capture performance, we recommend that there be as few network hops between the data source and the Transfer Appliance as possible. For an appliance capture scenario, place the Transfer Appliance on the same subnet as the data source, if at all possible. Similarly, for a workstation capture scenario, place both the workstation and the data source on the same subnet as Transfer Appliance.
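
Whether two machines share a subnet can be checked from their addresses and the prefix length. The sketch below uses Python's standard ipaddress module. The function name and the addresses in the usage note are hypothetical, and the prefix length must come from your own network configuration.

```python
import ipaddress


def same_subnet(host_a: str, host_b: str, prefix_len: int) -> bool:
    """Return True when both addresses fall in the same network."""
    # ip_interface keeps the host bits; .network masks them off.
    net_a = ipaddress.ip_interface(f"{host_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{host_b}/{prefix_len}").network
    return net_a == net_b
```

For instance, same_subnet("192.168.1.10", "192.168.1.200", 24) is True, while same_subnet("192.168.1.10", "192.168.2.10", 24) is False, which would suggest at least one routed hop between the two hosts.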

Transfer Appliance doesn't support Network Address Translation (NAT) or Port Address Translation (PAT), so it won't work in network configurations where Transfer Appliance and the workstation are separated by a network device performing NAT or PAT.

Before running a capture job, we recommend testing network connectivity between Transfer Appliance and either the workstation for a workstation capture or the network share containing the data for an appliance capture. You can test network connectivity by using the procedure at Testing Network Connectivity.
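
As a quick supplementary check (not a replacement for the Testing Network Connectivity procedure), you could probe whether individual TCP ports accept connections. The following is a minimal sketch using only the Python standard library; the function names are hypothetical.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to True or False depending on whether it accepted a connection."""
    return {port: tcp_port_reachable(host, port, timeout) for port in ports}
```

A call such as check_ports("192.0.2.10", [443, 6422, 25025]), where 192.0.2.10 stands in for your appliance's address and the ports are taken from the firewall tables in this document, reports which of those ports are reachable from the machine running the script. A False result only shows that the connection failed; it does not say whether a firewall, NAT device, or stopped service is the cause.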

Firewall port requirements

You might need to open ports in the firewall so that Transfer Appliance can communicate with computers on the network. The following tables describe the ports that are open on Transfer Appliance and the ports that need to be open on computers that communicate with Transfer Appliance.

Transfer Appliance ports that are open

Service                    Port   Protocol  Type
SSH                        6422   TCP       ingress
FTP                        21     TCP       ingress
HTTPS                      443    TCP       ingress
TA Appliance Media Server  25025  TCP       ingress
NFS                        2049   TCP       ingress
RPC                        111    TCP       ingress

Linux workstation ports that must be open

Service                          Port         Protocol  Type
SSH (default)                    22           TCP       ingress
SSH for Transfer Appliance       6422         TCP       egress
FTP                              21           TCP       egress
HTTPS                            443          TCP       egress
Transfer Appliance media server  25025        TCP       egress
Dynamic ports for data capture   32768-61000  TCP       ingress

The Internet Assigned Numbers Authority (IANA) suggests the range 49152 to 65535 for dynamic or private ports. Many Linux kernels use the port range 32768 to 61000. You can find the dynamic ports on most Linux systems by executing the command $ cat /proc/sys/net/ipv4/ip_local_port_range.
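
If you want to read that range from a script rather than by hand, a sketch like the following may be enough. It assumes the file holds two whitespace-separated integers, which is the usual layout on Linux; the function names are hypothetical.

```python
def parse_port_range(text: str):
    """Parse the two whitespace-separated integers found in ip_local_port_range."""
    low, high = (int(part) for part in text.split())
    return low, high


def read_linux_dynamic_ports(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the kernel's local (dynamic) port range on a Linux system."""
    with open(path) as handle:
        return parse_port_range(handle.read())
```

The returned pair is the range that would need to be open for ingress on that workstation, unless you choose an alternative range for the capture job.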

Windows workstation ports that must be open

Service                          Port         Protocol  Type
SSH for Transfer Appliance       6422         TCP       egress
FTP                              21           TCP       egress
HTTPS                            443          TCP       egress
Transfer Appliance media server  25025        TCP       egress
Dynamic ports for data capture   49152-65535  TCP       ingress

For a list of dynamic ports used by Windows, see Dynamic Ports in Windows Server 2008 and Windows Vista (or: How I learned to stop worrying and love the IANA).

If you prefer not to open the dynamic ports for access for either Linux or Windows, you can specify the starting port for an alternative port range by using the -p option with the capture utility. For more information, see Capture Utility Reference. If you specify a port range, make sure it is unique for each job, as having port overlap between jobs can cause errors.
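
When planning several jobs, it can help to check the intended ranges for overlap before starting any of them. The sketch below is one way to do that. Note the assumptions: this document only says that -p sets the starting port, so the last port of each range is a value you supply yourself, and the function name and job names are hypothetical.

```python
def overlapping_jobs(job_ranges):
    """Return pairs of job names whose inclusive port ranges overlap.

    job_ranges maps a job name to (first_port, last_port).
    """
    ordered = sorted(job_ranges.items(), key=lambda item: item[1])
    clashes = []
    for i, (name_a, (_, last_a)) in enumerate(ordered):
        for name_b, (first_b, _) in ordered[i + 1:]:
            if first_b > last_a:
                break  # sorted by first port, so no later range can overlap
            clashes.append((name_a, name_b))
    return clashes
```

For example, with {"web": (10000, 10100), "accounting": (10050, 10150), "logs": (10200, 10300)} the function reports the pair ("web", "accounting"), so one of those two jobs would need a different starting port.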

Make firewall modifications

When using a workstation for data capture, it might be necessary to adjust or disable the operating system’s built-in firewall. The following sections outline how to open the required ports in the firewall, or disable the firewall altogether.

Add firewall exceptions

To add exceptions to firewalls:

Microsoft Windows

For Microsoft Windows, you can add firewall exceptions to allow network traffic to and from the Transfer Appliance. To do this, open a command prompt as administrator and type the following command:

netsh advfirewall firewall add rule name=rule-name dir=in localport=port-range protocol=TCP action=allow

For example, you would run the following command to open ports 10000-10100:

C:\Users\Administrator>netsh advfirewall firewall add rule name="open_10000-10100" dir=in localport=10000-10100 protocol=TCP action=allow

This allows you to initiate capture over the opened ports by using the -p option with the capture command. For example, you would use the following command to run a data capture job on the port range starting at port 10000:

C:\Program Files\DaTA>DaTACapture.exe myJob C:\Dump -p 10000
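
If you need one rule per capture job, generating the netsh command text from the port numbers helps avoid typing mistakes. The following is a small sketch that only builds the command string in the form shown above; it does not run anything. The function name and the rule-name pattern are illustrative choices.

```python
def netsh_allow_rule(first_port: int, last_port: int) -> str:
    """Build the netsh command that opens an inbound TCP port range."""
    if not 1 <= first_port <= last_port <= 65535:
        raise ValueError("ports must satisfy 1 <= first <= last <= 65535")
    port_range = f"{first_port}-{last_port}"
    return (
        "netsh advfirewall firewall add rule "
        f'name="open_{port_range}" dir=in localport={port_range} '
        "protocol=TCP action=allow"
    )


if __name__ == "__main__":
    print(netsh_allow_rule(10000, 10100))
```

The printed text can then be pasted into a command prompt opened as administrator.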

RHEL/CentOS 6.4/6.8

For Red Hat Enterprise Linux (RHEL) or CentOS 6.4 or 6.8, you can add firewall exceptions to allow network traffic to and from the Transfer Appliance. This requires adding firewall exceptions to the file /etc/sysconfig/iptables.

  1. Open the file /etc/sysconfig/iptables in a text editor and add a rule of the following form above the COMMIT line (and above any REJECT rule), where service is the protocol (tcp or udp):

    -A INPUT -p service -m service --dport starting-port:ending-port -j ACCEPT

    For example, to open ports 10000-10100 to TCP and UDP traffic, add the following lines:

    -A INPUT -p tcp -m tcp --dport 10000:10100 -j ACCEPT
    -A INPUT -p udp -m udp --dport 10000:10100 -j ACCEPT
  2. Restart the iptables service by running the following command:

    $ sudo service iptables restart
  3. Verify that the firewall rule was added by running the following command to check the iptables status:

    $ sudo service iptables status

    Look for the following line in the results:

    ACCEPT service -- 0.0.0.0/0 0.0.0.0/0 service dpts:starting-port:ending-port
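
The rule lines from step 1 and the check from step 3 can also be produced and verified with a short script. This is a sketch under assumptions: the function names are hypothetical, and the check simply looks for an ACCEPT line that mentions the protocol and the dpts range, following the expected line shown above.

```python
def iptables_accept_lines(first_port, last_port, protocols=("tcp", "udp")):
    """Build the rule lines to add to /etc/sysconfig/iptables for a port range."""
    return [
        f"-A INPUT -p {proto} -m {proto} --dport {first_port}:{last_port} -j ACCEPT"
        for proto in protocols
    ]


def accept_rule_listed(status_output, protocol, first_port, last_port):
    """Return True if the status text lists an ACCEPT rule for the port range."""
    wanted = f"dpts:{first_port}:{last_port}"
    for line in status_output.splitlines():
        fields = line.split()
        if "ACCEPT" in fields and protocol in fields and wanted in fields:
            return True
    return False
```

You would pass the text printed by the status command to accept_rule_listed, for example after saving it to a file or capturing it with subprocess.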

RHEL/CentOS 7.0/7.2

For Red Hat Enterprise Linux (RHEL) or CentOS 7.0 or 7.2, you can add firewall exceptions to allow network traffic to and from the Transfer Appliance. This requires adding firewall exceptions to the file /etc/sysconfig/iptables.

  1. Open the file /etc/sysconfig/iptables in a text editor and add a rule of the following form above the COMMIT line (and above any REJECT rule), where service is the protocol (tcp or udp):

    -A INPUT -p service -m service --dport starting-port:ending-port -j ACCEPT

    For example, to open ports 10000-10100 to TCP and UDP traffic, add the following lines:

    -A INPUT -p tcp -m tcp --dport 10000:10100 -j ACCEPT
    -A INPUT -p udp -m udp --dport 10000:10100 -j ACCEPT
  2. Restart the iptables service by running the following command:

    $ sudo systemctl restart iptables
  3. Verify that the firewall restarted by checking the iptables status:

    $ sudo systemctl status iptables

    Look for the following line in the results:

    Active: active (exited) since [date and time of iptables restart]

  4. Check that the firewall rules were added by running the following command:

    $ sudo iptables -L -n

    Look for the following line in the results:

    ACCEPT service -- 0.0.0.0/0 0.0.0.0/0 service dpts:starting-port:ending-port

Ubuntu 12.04/14.04

For Ubuntu 12.04 and 14.04, you can add firewall exceptions to allow network traffic to and from the Transfer Appliance. This is done using the ufw allow command.

  1. Open a terminal and type the following command to open a port range to a particular type of traffic:

    $ sudo ufw allow proto service to any port starting-port:ending-port

    For example, to open ports 10000-10100 to TCP traffic, run the following command:

    $ sudo ufw allow proto tcp to any port 10000:10100
  2. Verify that the firewall rule was added by checking ufw status:

    $ sudo ufw status

    Look for the following lines in the results:

    starting-port:ending-port/service ALLOW Anywhere
    starting-port:ending-port/service ALLOW Anywhere (v6)
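
The check in step 2 can be scripted by scanning the ufw status text for the expected ALLOW entries. The sketch below is illustrative only: the function name is hypothetical, and the matching is deliberately loose (first column equals the port range and protocol, ALLOW appears on the line, and a "(v6)" marker distinguishes the IPv6 entry) because the exact column layout can vary between ufw versions.

```python
def ufw_rule_listed(status_output, protocol, first_port, last_port):
    """Return (ipv4_listed, ipv6_listed) for an ALLOW rule on the port range."""
    target = f"{first_port}:{last_port}/{protocol}"
    ipv4 = ipv6 = False
    for line in status_output.splitlines():
        fields = line.split()
        if fields and fields[0] == target and "ALLOW" in fields:
            if "(v6)" in fields:
                ipv6 = True
            else:
                ipv4 = True
    return ipv4, ipv6
```

A result of (True, True) means both the IPv4 and the IPv6 entries were found in the text you passed in.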

Disable firewalls

To disable firewalls:

Microsoft Windows

To temporarily disable the firewall on a Microsoft Windows workstation, open a command prompt as administrator, and issue the following command:

netsh advfirewall set allprofiles state off

For example:

C:\Users\Administrator>netsh advfirewall set allprofiles state off

RHEL/CentOS 6.4/6.8

To temporarily disable the iptables firewall on a Red Hat Enterprise Linux (RHEL) or CentOS 6.4 or 6.8 system, open a terminal, and run the following command:

$ sudo service iptables stop

Verify the firewall is off by checking the iptables status:

$ sudo service iptables status

And look for the following line in the results:

iptables: Firewall is not running.

RHEL/CentOS 7.0/7.2

To temporarily disable the iptables firewall on a Red Hat Enterprise Linux (RHEL) or CentOS 7.0 or 7.2 system, open a terminal and run the following command:

$ sudo systemctl stop iptables

Verify the firewall is off by checking the iptables status:

$ sudo systemctl status iptables

And look for the following lines in the results:

Active: inactive (dead) since [date and time of iptables stop]
[start of iptables stop] compute1.slfs systemd[1]: Stopped IPv4 firewall with iptables.
[completion of iptables stop] compute1.slfs systemd[1]: Stopped IPv4 firewall with iptables

Ubuntu 12.04/14.04

To temporarily disable the ufw firewall on Ubuntu 12.04 or 14.04, open a terminal and run the following command:

$ sudo ufw disable

Verify the firewall is off by checking the ufw status:

$ sudo ufw status

And look for the following line in the results:

Status: inactive
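
If you verify several Linux workstations, the three status checks above can be folded into one helper that looks for the "firewall is off" text each command is expected to print. This is a minimal sketch: the function name is hypothetical and the marker strings are copied from the expected results shown in this section, so it will miss output that is worded differently.

```python
# Marker strings taken from the expected status output in this section.
INACTIVE_MARKERS = (
    "iptables: Firewall is not running.",  # RHEL/CentOS 6: service iptables status
    "Active: inactive (dead)",             # RHEL/CentOS 7: systemctl status iptables
    "Status: inactive",                    # Ubuntu: ufw status
)


def firewall_reported_off(status_output: str) -> bool:
    """Return True if the status text contains one of the known 'off' markers."""
    return any(marker in status_output for marker in INACTIVE_MARKERS)
```

Pass in the text printed by the relevant status command; a False result means none of the markers was found, not necessarily that the firewall is running.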

Next steps

Once you receive a Transfer Appliance, use the procedures in Setting Up and Configuring Transfer Appliance to set it up and configure it.
