Performing a Linux Workstation Capture

Use the following procedures to capture data by using a Linux workstation. For supported versions of Linux, see supported operating systems.

Prerequisites

You need to know the following information so that you can connect to Google Transfer Appliance from a workstation:

  • The URL for Transfer Appliance. This is displayed on the Transfer Appliance console.
  • The password for the capture user credentials provided to you by Google.

Also, before proceeding, make sure you have read Preparing for Data Transfer and performed any necessary data or network preparation.

Install the capture utility on a Linux workstation

Use the following procedure to install the capture utility on a Linux workstation.

  1. On a workstation, connect to the Transfer Appliance web interface.
  2. Select Download Linux Capture Utility in the Operations pane to download the capture utility installer.
  3. Untar the Linux capture utility download:

    $ tar -xvzf CaptureUtility.tar.gz

    This creates a directory called CaptureUtility under the current directory. The CaptureUtility directory contains the installer file installer.sh.

  4. Switch to the CaptureUtility directory:

    $ cd CaptureUtility

  5. Install the Linux capture utility by running installer.sh:

    $ ./installer.sh

    The installer installs the capture utility as TACapture.sh.

Test connectivity to the Transfer Appliance

Before running your first data capture job, check to make sure the workstation can connect to Transfer Appliance.

  1. On the Linux workstation, open a terminal.
  2. Test connectivity with Transfer Appliance by running TACapture.sh with the -t option.

    $ ./TACapture.sh -t

  3. You are prompted to enter the capture user password provided to you by Google.

    Enter Transfer Appliance <Transfer Appliance IP address> "cuser" password for SSH

    The capture utility runs a series of connectivity tests against Transfer Appliance and returns results similar to the following:

    Test: Ping Transfer Appliance (123.456.78.90)...OK
    Test: SSH to Transfer Appliance (123.456.78.90:6422)...OK
    Test: Control connection with Transfer Appliance (123.456.78.90:25025)...OK
    Test: Return connection from Transfer Appliance to this workstation, (could take several minutes)...OK

    If the connection test fails, refer to the error message returned to determine the cause. The most common reason for connectivity test failure is a firewall blocking the ports needed for data capture. Refer to Prepare the network for details on which ports need to be open and instructions for opening them.
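If the connectivity test reports a failure, it can help to check individual TCP ports by hand before changing firewall rules. The following is a minimal sketch, not part of the capture utility: check_port is a hypothetical helper name, and the port numbers 6422 and 25025 mentioned in the note after the block are taken from the sample output above and may differ in your environment. It assumes bash and the coreutils timeout command are available on the workstation.

```shell
#!/usr/bin/env bash
# check_port is a hypothetical helper, not part of TACapture.sh.
# It returns 0 if a TCP connection to HOST:PORT opens within 5 seconds,
# and a nonzero status otherwise (refused, filtered, or timed out).
check_port() {
  local host="$1" port="$2"
  # /dev/tcp/HOST/PORT is a bash feature, so the probe runs inside bash.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For example, running check_port with the appliance IP address and port 6422, and then again with port 25025, shows whether the SSH and control ports from the sample output are reachable from the workstation. This only checks outbound connections; it cannot verify the return connection from Transfer Appliance to the workstation.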

Perform a workstation capture from a Linux workstation

After checking connectivity, use the capture utility to start a data capture job.

  1. Run TACapture.sh and specify a job name and capture target to run a data capture job:

    $ ./TACapture.sh [JOB NAME] [CAPTURE DIRECTORY]

    where [JOB NAME] is the name of the data capture job and [CAPTURE DIRECTORY] is the directory that contains the data to capture. The capture utility captures all data in the specified directory and, recursively, in all of its subdirectories.

  2. Enter the capture user password provided to you by Google when prompted:

    Enter Transfer Appliance <Transfer Appliance IP address> "cuser" password for SSH

    The capture job runs and displays a completion message when it is finished. Don't close the terminal window while the capture job is running; closing it terminates the job.

For example, the following command creates a job named data-capture to capture the data in the directory /mnt/data and all subdirectories.

$ ./TACapture.sh data-capture /mnt/data

Use a specific port range for data capture

By default, a data capture job on Linux dynamically chooses the ports it uses from the range 32768 to 61000. If those ports aren't open on your network, or you prefer to use a different port range, use the -p option to specify the first port of the range. Each data capture task in the job requires its own data streaming port, so the job uses the ports from the specified port number through ([port number] + [data capture tasks] - 1).

For example, the following command creates a job named dataFactory that uses 8 parallel data capture tasks on ports 60000 - 60007 to capture the data in the directory /mnt/data and all subdirectories.

$ ./TACapture.sh dataFactory /mnt/data -p 60000
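The port arithmetic above can be sketched as a small shell function, which may be handy when requesting firewall changes. port_range is a hypothetical helper name, not part of the capture utility; it only applies the formula [port number] + [data capture tasks] - 1 and assumes the default of 8 tasks when no task count is given.

```shell
# port_range is a hypothetical helper, not part of TACapture.sh.
# Usage: port_range START_PORT [TASKS]
# Prints "FIRST-LAST" for the ports a job would need, using
# LAST = START_PORT + TASKS - 1. TASKS defaults to 8.
port_range() {
  local start="$1" tasks="${2:-8}"
  local end=$(( start + tasks - 1 ))
  # Reject ranges that fall outside the valid TCP port numbers.
  if [ "$start" -lt 1 ] || [ "$end" -gt 65535 ]; then
    echo "port range ${start}-${end} is outside 1-65535" >&2
    return 1
  fi
  echo "${start}-${end}"
}
```

With a starting port of 60000 and the default 8 tasks, this prints 60000-60007, which matches the example above.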

For more information about the options for the capture utility, see Capture Utility Reference.

Use a specified number of data capture tasks

By default, a data capture job uses 8 parallel data capture tasks for optimal throughput. To use fewer parallel capture tasks, use the -m option with TACapture.sh. This is useful if you want to open fewer ports or if your network bandwidth is limited.

For example, the following command creates a job named dataFactory that uses 6 parallel data capture tasks to capture the data in the directory /mnt/data and all subdirectories.

$ ./TACapture.sh dataFactory /mnt/data -m 6

For more information about the options for the capture utility, see Capture Utility Reference.

Use a file specification to identify data for capture

By default, a job captures data from a single directory that you specify. To capture data from one or more files instead, create a text file containing a file specification and use the -f option with TACapture.sh. The text file must contain one absolute path to a target data file on each line, as illustrated below:

/home/user/finances
/usr/bin/app/archive/data
/etc/acme/logs/log_001

For example, the following command creates a job named dataFactory that captures the data identified in the file filespec.

$ ./TACapture.sh dataFactory -f /home/user/filespec
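Because each line of the file specification must be an absolute path, it can be worth checking the file before starting a job. The following is a minimal sketch under that one stated requirement: validate_filespec is a hypothetical helper name, not part of the capture utility, and the check that each path exists is an extra assumption about what you would want to verify rather than documented behavior of TACapture.sh.

```shell
# validate_filespec is a hypothetical helper, not part of TACapture.sh.
# Usage: validate_filespec FILESPEC
# Returns 0 if every non-empty line is an absolute path that exists,
# and 1 otherwise, reporting each problem line on stderr.
validate_filespec() {
  local spec="$1" status=0 path
  # The "|| [ -n ... ]" clause handles a last line with no trailing newline.
  while IFS= read -r path || [ -n "$path" ]; do
    [ -z "$path" ] && continue
    case "$path" in
      /*) [ -e "$path" ] || { echo "missing: $path" >&2; status=1; } ;;
      *)  echo "not absolute: $path" >&2; status=1 ;;
    esac
  done < "$spec"
  return "$status"
}
```

For example, running validate_filespec /home/user/filespec before the capture command above reports any relative or missing paths without starting a job.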

For more information about the options for the capture utility, see Capture Utility Reference.

Next steps

To capture data by using a Microsoft Windows workstation or an appliance capture instead, use the procedures in Performing a Microsoft Windows Workstation Capture or Performing an Appliance Capture.

To monitor data capture jobs, use the procedures in Monitoring Capture Jobs.

If you are done capturing data, use the procedures in Preparing and Shipping Transfer Appliance.
