Discover and collect data

The first phase in an assessment is to discover and collect data. For VMware and AWS, you can optionally perform a discovery. For other platforms, you run a collection.

  • Discovery (VMware and AWS only): A discovery lets you list all VMs, and is sufficient to assess suitability for some modernization journeys. However, you may also want to run a collection, which provides additional data for all journeys.

  • Collection: A collection gathers data about the VM and writes that data to a tar file (Linux) or zip file (Windows). You can run the collection script locally on the VM, or run it remotely. For a remote run, the machine running the mfit CLI uploads the collection script to the guest machine and then downloads the results. For Linux and Windows VMs hosted on VMware, mfit supports remote execution; otherwise you can use mfit discover ssh.

Perform a discovery

VMware

You can perform a discovery using the vSphere API or RVTools.

vSphere API discovery

Use the vSphere API to collect data about all VMs in a vCenter visible to the user running the tool. You can also scope a discovery to a specific folder, cluster, or data center.

  1. Change to the mfit directory:

    cd ~/mfit
    
  2. To perform the discovery, run the following command:

    ./mfit discover vsphere -u USERNAME --url https://VSPHERE_URL
    
  3. Enter the vCenter password when prompted.

  4. You can now assess the collected data and run a report, as described in Generate fit assessment reports.

Scope a discovery

  • Run the following command to perform the discovery at the root:

    ./mfit discover vsphere --url https://VSPHERE_URL -u USERNAME --path /
    
  • Run the following command to perform the discovery at a specific folder:

    ./mfit discover vsphere --url https://VSPHERE_URL -u USERNAME --path datacenter/vm/folder
    
  • Run the following command to perform the discovery at a specific cluster:

    ./mfit discover vsphere --url https://VSPHERE_URL -u USERNAME --path datacenter/host/cluster
    
  • Run the following command to perform the discovery at a specific data center:

    ./mfit discover vsphere --url https://VSPHERE_URL -u USERNAME --path datacenter
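
The four scopes above follow one pattern for the --path value. As a minimal sketch, assuming the path layouts shown in the commands above, a small shell helper can assemble the value. The function name vsphere_scope_path and the sample names dc-east, web-servers, and cluster-01 are hypothetical and not part of mfit.

```shell
# Hypothetical helper (not part of mfit): build a --path value for a scope.
# Usage:
#   vsphere_scope_path root
#   vsphere_scope_path datacenter DATACENTER
#   vsphere_scope_path folder DATACENTER FOLDER
#   vsphere_scope_path cluster DATACENTER CLUSTER
vsphere_scope_path() {
  case "$1" in
    root)       echo "/" ;;
    datacenter) echo "$2" ;;
    folder)     echo "$2/vm/$3" ;;
    cluster)    echo "$2/host/$3" ;;
    *)          echo "unknown scope: $1" >&2; return 1 ;;
  esac
}

# Example values only; substitute your own data center, folder, and cluster.
vsphere_scope_path folder dc-east web-servers
vsphere_scope_path cluster dc-east cluster-01
```

The printed value can then be passed to the --path flag of ./mfit discover vsphere.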
    

Adjust timeout

The default timeout is 15 minutes. However, when you perform inventory collection against a vCenter server with more than 1000 VMs or guest collection against a vCenter server with more than 100 VMs, increase the timeout in proportion to the total number of VMs. Follow this general rule:

  • If you perform an inventory collection, increase the timeout by 15 minutes for every 1000 additional VMs. For example, if you have 2000 VMs, set the timeout value to 30 minutes.
  • If you perform a guest collection, increase the timeout by 15 minutes for every 100 additional VMs. For example, if you have 300 VMs, set the timeout value to 45 minutes.

To change the timeout setting, set the --timeout flag to the required timeout in seconds:

./mfit discover vsphere -u USERNAME --url https://VSPHERE_URL --timeout TIMEOUT_IN_SECONDS
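
Because the --timeout flag takes seconds while the rule above is stated in minutes, a small calculation can help. The following is a minimal sketch, not part of mfit: the function names are hypothetical, and it assumes that a partial block of VMs is rounded up to a full 15-minute step.

```shell
# Hypothetical helper (not part of mfit): timeout in seconds for a VM count.
# Assumption: 15 minutes per block of VMs, with partial blocks rounded up.
mfit_timeout_seconds() {
  vm_count=$1
  block_size=$2
  blocks=$(( (vm_count + block_size - 1) / block_size ))
  if [ "$blocks" -lt 1 ]; then
    blocks=1   # never go below the 15-minute default
  fi
  echo $(( blocks * 15 * 60 ))
}

# Inventory collection: one block per 1000 VMs.
inventory_timeout_seconds() { mfit_timeout_seconds "$1" 1000; }

# Guest collection: one block per 100 VMs.
guest_timeout_seconds() { mfit_timeout_seconds "$1" 100; }

inventory_timeout_seconds 2000   # prints 1800 (30 minutes)
guest_timeout_seconds 300        # prints 2700 (45 minutes)
```

For example, you could pass the result as --timeout "$(inventory_timeout_seconds 2000)" in the command above.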

RVTools discovery

The fit assessment tool supports analyzing the RVTools .xlsx report files from a single VMware vCenter. To generate detailed fit assessment reports based on your existing RVTools export, run the following command:

./mfit discover rvtools name.xlsx

AWS

To perform the discovery, run the following command:

mfit discover aws -r REGION

The CLI prompts you for your access key ID and secret access key.

The output of the discover command should look similar to the following:

Collected 166 VMs
[✓] Collection completed.

Adjust timeout

By default, the timeout is set to 15 minutes. However, when you perform inventory collection against an AWS region with more than 10,000 VMs, increase the timeout in proportion to the total number of VMs in that region. For every 10,000 more VMs, increase the timeout by 15 minutes.

To change the timeout setting, set the --timeout flag to the required timeout in seconds:

mfit discover aws -r REGION --timeout TIMEOUT_IN_SECONDS
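
The same arithmetic applies to AWS with blocks of 10,000 VMs. This is a minimal sketch under the same assumption that a partial block is rounded up; the function name aws_inventory_timeout_seconds is hypothetical and not part of mfit.

```shell
# Hypothetical helper (not part of mfit): AWS inventory timeout in seconds.
# Assumption: 15 minutes per block of 10,000 VMs, partial blocks rounded up.
aws_inventory_timeout_seconds() {
  vm_count=$1
  blocks=$(( (vm_count + 9999) / 10000 ))
  if [ "$blocks" -lt 1 ]; then
    blocks=1   # never go below the 15-minute default
  fi
  echo $(( blocks * 15 * 60 ))
}

aws_inventory_timeout_seconds 20000   # prints 1800 (30 minutes)
```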

Collect data

The following sections describe how to run the collection scripts:

Collect data remotely using VMware tools

For VMs hosted on vSphere, mfit can use VMware tools to deploy and run the collection scripts remotely on both Linux and Windows VMs. When using VMware tools, the mfit tool:

  • Uploads the collection script to the VM
  • Runs the script on the VM
  • Downloads and imports the results

Two sets of credentials are required to collect data remotely:

  • The vCenter server user passed to the tool to connect to vSphere must have the following privileges on the VM:

    • Guest operation modifications
    • Guest operation program execution
    • Guest operation queries
  • User credentials for the VM:

    • On Windows, you must have administrator privileges.
    • On Linux, root access is not required, but root access guarantees that mfit can collect all the fit assessment data.

To collect data using VMware tools:

  1. Log in to your Linux VM hosting mfit.

  2. Change to the mfit directory:

    cd ~/mfit
    
  3. Make sure the VM is powered on. Pass the vCenter server user, the VM user, and the VM_ID (the name of the VM or MOREF) to the command:

    mfit discover vsphere guest --url https://VSPHERE_URL \
       -u VCENTER_USER --vm-user VM_USER VM_ID
    

    You're prompted to enter the passwords for the VCENTER_USER and VM_USER.

    If the vSphere cluster has multiple data centers, you must use the --dc option to specify the data center name:

    mfit discover vsphere guest --url https://VSPHERE_URL --dc DATACENTER_NAME \
       -u VCENTER_USER --vm-user VM_USER VM_ID
    

To collect data from multiple vSphere VMs in parallel using VMware tools:

  1. All the VMs you would like to collect from must have the same username and password. If you have separate credentials for your Windows or Linux VMs, you can filter to one or the other using the --os-family flag.

  2. Log in to your Linux VM hosting mfit.

  3. Change to the mfit directory:

    cd ~/mfit
    
  4. Make sure that all VMs are powered on, and pass the vCenter server user and the VM user to the command:

    mfit discover vsphere guest all --url https://VSPHERE_URL \
       -u VCENTER_USER --vm-user VM_USER --timeout TIMEOUT_IN_SECONDS
    

    You're prompted to enter the passwords for the VCENTER_USER and VM_USER.

    You can optionally limit the collection to only Windows or only Linux VMs, change the level of parallelism, and increase the verbosity of stdout using flags:

    mfit discover vsphere guest all --url https://VSPHERE_URL \
       -u VCENTER_USER --vm-user VM_USER --os-family windows --max-parallelism MAX_PARALLELISM -v --timeout TIMEOUT_IN_SECONDS
    

    You can also scope the collection to specific VMs using the --path flag, as described for VMware in the Perform a discovery section.
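
When Windows and Linux VMs use different credentials, one approach is to run the parallel collection once per OS family. The sketch below only prints the two commands so you can review them before running anything. The function name print_guest_commands is hypothetical, and the value linux for --os-family is an assumption, because the example above shows only windows.

```shell
# Hypothetical helper (not part of mfit): print, but do not run, one
# guest-collection command per OS family.
# Assumption: "linux" is accepted by --os-family; only "windows" is shown
# in the documented example.
print_guest_commands() {
  url=$1
  vcenter_user=$2
  windows_user=$3
  linux_user=$4
  timeout=$5
  for pair in "windows:$windows_user" "linux:$linux_user"; do
    family=${pair%%:*}    # text before the first colon
    vm_user=${pair#*:}    # text after the first colon
    echo "mfit discover vsphere guest all --url $url -u $vcenter_user --vm-user $vm_user --os-family $family --timeout $timeout"
  done
}

# Placeholder values; substitute your own URL, users, and timeout.
print_guest_commands https://VSPHERE_URL VCENTER_USER WINDOWS_VM_USER LINUX_VM_USER 2700
```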

Collect data remotely over SSH

If the Linux machine hosting mfit has ssh access to the source Linux VM (Windows VMs are not supported), then mfit can connect to the remote VM over ssh to collect data.

When using ssh, the mfit tool:

  • Uploads the collection script to the VM.
  • Runs the script on the VM with the VM user credentials passed to mfit. While the VM user credentials do not require root access, having root access guarantees that mfit can collect all fit assessment data.
  • Downloads and imports the results.

You can use two modes to run ssh:

  • Native (default): Uses the ssh binary and configurations on the mfit machine. By default, native mode uses the local SSH configuration files of the workstation hosting mfit, such as ~/.ssh/config and ~/.ssh/known_hosts.

    Enter the password when prompted, or use sshpass to pass the password or private key file passphrase on the command line. For example:

    sshpass -p password mfit discover ssh IP-ADDRESS
    
  • Embedded: Uses the built-in ssh library. This mode lets you use the embedded ssh client if native mode malfunctions in your environment. However, it does not use the local SSH configuration files by default. You can use the -i option to specify an SSH private key file.

To collect data over ssh:

  1. Log in to the Linux VM hosting mfit.

  2. Change to the mfit directory:

    cd ~/mfit
    
  3. Run mfit:

    1. Use native mode (default) to collect data:

      mfit discover ssh VM_IP_HOSTNAME
      

      The SSH private key file of the user invoking mfit is used for SSH authentication.

      Enter the username of an account on the Linux VM when prompted. The collection script runs using these credentials. If the SSH private key of the user invoking mfit fails to authenticate to the VM with the username, you're also prompted for a password.

    2. Specify the VM user with native mode:

      mfit discover ssh -u USER VM_IP_HOSTNAME
      

      Enter the password for the user when prompted.

    3. Use the -v option to specify verbose mode:

      mfit discover ssh -u USER -v VM_IP_HOSTNAME
      
    4. Use the -i option to specify the SSH private key file. For example, to specify .ssh/my_private_key:

      mfit discover ssh -i ~/.ssh/my_private_key -u USER VM_IP_HOSTNAME
      
    5. Use embedded mode to specify the password on the command line:

      mfit discover ssh --ssh-client embedded -u USER --password PASSWORD VM_IP_HOSTNAME
      

      Because the embedded form of the command does not use the local SSH configuration files by default, the USER specified in the command must be able to access the VM over ssh and have privileges on the VM to execute the collection script.

    6. Use the -i option with embedded mode, and specify the passphrase of the private key file:

      mfit discover ssh --ssh-client embedded -i ~/.ssh/id_rsa -u USER --passphrase PASSPHRASE VM_IP_HOSTNAME
      
    7. You can pass most ssh flags through to the ssh command using the -a/--ssh-args option. For example, to use a SOCKS proxy, replacing PROXY_PORT with the port of your proxy:

      mfit discover ssh -u USER \
          -a '-o' -a 'ProxyCommand=nc -X 5 -x 127.0.0.1:PROXY_PORT %h %p' \
          VM_IP_HOSTNAME
      

Collect data on an individual Linux VM

You can run the mfit-linux-collect script on a VM to gather data about that VM. To import the data, download the tar file to the machine with mfit installed and run mfit discover import path-to-tar. You typically run the script with sudo. You can optionally run the script using the privileges of the user running the tool. However, the script might then be unable to collect all assessment data.

  1. Log in to your VM.

  2. Change to the mfit directory:

    cd ~/mfit
    
  3. Run the collection script on the VM:

    sudo ./mfit-linux-collect.sh
    

    The script outputs a tar file named m4a-collect-MACHINE_NAME-TIMESTAMP.tar to the current directory and to /var/mfit/m4a-collect-TIMESTAMP.tar. The timestamp is in the format YYYY-MM-DD-hh-mm. Learn more about the tar file format.

    You can pass the following optional arguments to the script:

    • --readonly to omit writing the output to /var/mfit/m4a-collect-TIMESTAMP.tar. Some of the features of Migrate to Containers rely on this information. See Integration with workload containerization for more information.
    • --output to save the tar file within the specified path.
  4. To import the collected data, run the following command:

    mfit discover import path-to-tar
    
  5. You can now assess the collected data and run a report, as described in Generate fit assessment reports.
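
If you gather tar files from several VMs into one directory, it can help to check file names against the documented pattern m4a-collect-MACHINE_NAME-TIMESTAMP.tar before importing them. The following is a minimal sketch; the function name is_collect_tar and the sample file names are hypothetical, and the check covers only the name written to the current directory, not the /var/mfit copy.

```shell
# Hypothetical helper (not part of mfit): succeed if a file name matches
# m4a-collect-MACHINE_NAME-TIMESTAMP.tar with a YYYY-MM-DD-hh-mm timestamp.
is_collect_tar() {
  printf '%s\n' "$1" |
    grep -Eq '^m4a-collect-.+-[0-9]{4}(-[0-9]{2}){4}\.tar$'
}

# Sample names for illustration only.
for name in m4a-collect-web01-2024-05-17-09-30.tar notes.tar; do
  if is_collect_tar "$name"; then
    echo "$name: looks like a collection archive"
  else
    echo "$name: skipped"
  fi
done
```

Each file that passes the check can then be imported with mfit discover import path-to-tar.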

Collect data on an individual Windows VM

  1. Log in to your VM.

  2. Open PowerShell using the Run as Administrator option.

  3. Change to the mfit directory:

    cd ~/mfit
    
  4. Run the collection script on the VM:

    powershell -ExecutionPolicy ByPass -File .\mfit-windows-collect.ps1
    

    The script outputs a zip file named m4a-collect-MACHINE_NAME-TIMESTAMP.zip to the current directory. Include an output path to specify a different location:

    .\mfit-windows-collect.ps1 \path\to\output\file.zip
    
  5. Although you run data collection on the Windows VM, you must run the fit assessment on the Linux machine. Move the collected data file to your Linux machine (administrator console). You can then assess the collected data and run a report, as described in Generate fit assessment reports.

Collect data using StratoZone

You can use StratoZone StratoProbe version 5.0.2.1 or later for assessing VMs.

  1. Create a local folder ./collection_data on a workstation where mFIT is installed.

  2. Download the collected tar files stored on the StratoProbe workstation at C:\Program Files\StratoProbe\data\mFIT to the ./collection_data folder on the workstation where mFIT is installed.

  3. Run the following command to perform discovery from the collected files:

    ./mfit discover import ./collection_data
    

Delete information from a local database

After you've performed a collection and run a report on the data, you can delete the data from the local database using the following command:

mfit discover purge-db -db DATABASE_NAME
