Using the Linux discovery tool

Migrate for Anthos and GKE provides a self-service tool that you run on a Linux VM workload to determine the workload's fit for migration to a container.

The tool outputs a report that describes the analysis results for the VM, including any issues that must be resolved before migration, and assigns one of the following overall fit assessments:

  • Excellent fit.
  • Good fit with some manual work.
  • Needs minor work before migrating.
  • Needs moderate work before migrating.
  • Needs major work before migrating.
  • No fit.

See Calculating the fit assessment for a description of how the tool determines the overall fit assessment for a VM.

How the tool works

The Linux discovery tool operates in two distinct phases:

  • Collect phase - A bash script named m4a-fit-collect.sh collects information about the Linux VM to be migrated and writes the collected data to a tar file. A copy of the data remains on the VM filesystem for later use during migration.

  • Analysis phase - The m4a-fit-analysis tool parses the output of the collect phase, applies a set of rules, and creates a fit assessment along with a detailed report describing the tool's findings. You can view the report as an HTML file or as a JSON file.

You can run the collect tool and analysis tool on the same VM. However, if you have multiple VMs, you can instead run the collect tool on each VM separately, then upload the tar file from each VM to a single machine for analysis. The m4a-fit-analysis tool can process multiple tar files at once to output a fit assessment and analysis for each VM.
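The multi-VM workflow above can be sketched as follows. This is a dry run that only echoes each command rather than executing it; the hostnames vm-a and vm-b and the run() helper are illustrative, and in a real run you would execute the commands over SSH:

```shell
#!/bin/bash
# Dry-run sketch of the multi-VM workflow: each command is echoed rather
# than executed. Hostnames (vm-a, vm-b) and the run() helper are
# illustrative, not part of the tool.
run() { echo "+ $*"; }

for vm in vm-a vm-b; do
  # Run the collect script on each source VM...
  run ssh "$vm" sudo ./m4a-fit-collect.sh
  # ...then copy its tar file to the central analysis machine.
  run scp "$vm:m4a-collect-$vm-*.tar" .
done

# One analysis run produces a single report with one row per VM.
run ./m4a-fit-analysis m4a-collect-*.tar
```

Replacing run() with direct execution (or eval) turns the sketch into a working script, provided you have SSH access to each VM.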

Integration with workload migration

Data obtained by the Linux discovery tool about a source VM during the collect phase can be used by Migrate for Anthos and GKE to generate parts of the migration plan for the VM.

For example, data collected by the Linux discovery tool is used to discover information about Service endpoints exposed by the migrated VM. Therefore, you must run the tool on a source VM if you want to auto-populate information about Service endpoints. See Customizing Service endpoints for more information.

View the fit assessment report

To view the detailed report output by the m4a-fit-analysis tool, you can either:

  • Open the HTML file in a browser
  • Upload the JSON file to the Google Cloud Console

View the HTML output

Open the HTML file in a browser to view the report. The following image shows the HTML output of the tool for the evaluation of a VM named "my-vm":

The HTML output of the LDT.

Where:

  • The table contains one row for each VM analyzed, including a link to details about the VM.

  • The Data collection date, Identified OS, M4A fit score, and Workload type columns contain summary information about the VM and the analysis results.

  • The Additional info column contains a link to details about each VM, including information such as listening ports, mount points, NFS mount points, and other information.

  • Each rule column shows the rule ID and description, for example "A1-STO-1: Unsupported network mount", and the results from applying a rule to the VM. A value of:

    • Not Detected means that rule detected no migration issue.
    • Detected means the rule detected a migration issue for the VM. Click Detected to view details about the rule output.

Upload the JSON file to the Google Cloud Console

To view the report in the Google Cloud Console:

  1. Open the Migrate for Anthos and GKE page in the Google Cloud Console.


  2. Select the Fit Assessment tab.

  3. Select Browse, and then select the JSON file for upload.

  4. Select Open to view the report.

    The report displays:

    • The Prepared by, Assessment date, and Fit assessment tool fields contain summary information about the report.
    • The Migration Journey Breakdown area displays:
      • Total number of VMs analyzed.
      • Number of VMs ready to migrate, or Containerize.
      • Number of VMs not ready to migrate.
    • One row for each VM analyzed, including fit assessment for the VM.
  5. In the Assessed VMs table, select the name of a VM to view details about the VM, including information such as listening ports, mount points, NFS mount points, and other information.

    For the selected VM, you also see each rule description and rule ID, and the result of applying the rule to the VM:

    The Console output of the LDT.

Prerequisites

The Linux discovery tool has the following prerequisites:

  • The target VM being evaluated must be running to ensure that applications, processes, and open ports are discoverable.

  • The machine used to run the analysis tool, m4a-fit-analysis, must run a Linux kernel version later than 2.6.23.

  • You must run the collect script as sudo.
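A quick pre-flight check of the kernel requirement might look like the following. The version_gt helper is hypothetical, not part of the tool:

```shell
#!/bin/bash
# Pre-flight sketch: verify the analysis machine's kernel is newer than
# 2.6.23. version_gt is a hypothetical helper, not part of the tool.
version_gt() {
  # True if $1 sorts strictly after $2 in version order (GNU sort -V).
  [ "$1" != "$2" ] && \
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

kernel="$(uname -r | cut -d- -f1)"
if version_gt "$kernel" "2.6.23"; then
  echo "Kernel $kernel: OK for m4a-fit-analysis"
else
  echo "Kernel $kernel: too old for m4a-fit-analysis" >&2
fi
```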

Installing and running the tool

You must download the collection script and the analysis tool. You can either:

  • Download both tools to a single VM.
  • If you have multiple VMs, download the collection script to each workload VM, then upload the collected data to a central machine for analysis by the analysis tool.

To evaluate a VM:

  1. Log in to your VM.

  2. Create a directory for the collection script and analysis tool:

    mkdir m4a
    cd m4a
  3. Download the collection script to the VM and make it executable:

    wget https://anthos-migrate-release.storage.googleapis.com/v1.9.0/linux/amd64/m4a-fit-collect.sh
    chmod +x m4a-fit-collect.sh
  4. Download the analysis tool to the VM and make it executable:

    wget https://anthos-migrate-release.storage.googleapis.com/v1.9.0/linux/amd64/m4a-fit-analysis
    chmod +x m4a-fit-analysis
  5. Run the collect script on the VM:

    sudo ./m4a-fit-collect.sh

    The script outputs a tar file named m4a-collect-machinename-timestamp.tar to the current directory and to /var/m4a/m4a-collect-timestamp.tar.

    The timestamp is in the format YYYY-MM-DD-hh-mm. See Collect script operation for a description of the tar file format.

    Note: If you installed the analysis tool on a central machine, upload the tar file to that machine for processing.

  6. Run the analysis tool on the tar file:

    ./m4a-fit-analysis m4a-collect-machinename-timestamp.tar

    The tool outputs two files to the current directory:

    • An HTML file named analysis-report-timestamp.html. The timestamp is in the format YYYY-MM-DD-hh-mm. View this file in a browser to examine the report.

    • A JSON file named analysis-report-timestamp.json containing a JSON format of the output. You can use the file as input to the Google Cloud Console.

    The output files contain information about the analysis, including the fit assessment. See Report file format for more information.

    To run the analysis tool on multiple tar files, you can use the command:

    ./m4a-fit-analysis tarFile1 tarFile2 tarFile3 ...

    The tool outputs a single row to the output files for each input tar file. Within the report, you can identify each VM by its hostname, meaning the value returned by running the hostname command on the VM.

    Use the --verbosity option to control the logging output of the tool. Valid values are: panic, fatal, error (the default), warning, info, debug, and trace.

  7. Open analysis-report-timestamp.html in a browser to view the report. See Analysis tool operation for a description of the file format.

Collect script operation

The collect script runs a series of Linux commands to gather information about the source VM and also collects information from several files on the VM.

The following sections describe the operation of the script. You can also examine the script in a text editor to see more detailed information.

Script commands

The script runs the following Linux commands:

Command | Description
netstat -tlnp | List all active listening ports
ps -o pid,user,%mem,comm,args -e | List all running user processes
dpkg -l | List installed packages (Debian based)
rpm -qa | List installed packages (RPM based)
sestatus | Get SELinux status
lsmod | Get loaded kernel modules
systemctl | List running services (systemd based)
service --status-all | List running services (init.d/Upstart based)
lsof /dev/ | List open handles to files and hardware devices
docker ps | List running Docker containers
ip addr | List IP addresses assigned to NICs
ifconfig | Show NIC configs and assigned IPs
blkid | List block device attributes
lsblk --json -p --output NAME,PARTFLAGS,PARTTYPE,UUID,LABEL,FSTYPE | List block devices

Files collected

The script copies the following files to the generated tar file:

Path | Description
/etc/fstab | List of mounts to be mounted at startup
/etc/hosts, /etc/resolv.conf, /etc/hostname, /etc/HOSTNAME, /proc/sys/kernel/hostname | Aliases for hosts and DNS data
/etc/issue, /etc/*-release | The name of the Linux distribution
/etc/network/interfaces, /etc/dhcp/dhclient-up-hooks, /etc/NetworkManager/conf.d/*, /etc/systemd/resolved.conf, /etc/sysconfig/network-scripts/*, /etc/sysconfig/network/* | The configured interfaces
/proc/cpuinfo | CPU information
/proc/meminfo | The current memory usage/total on the VM
/proc/self/mounts | The currently mounted devices
/etc/exports | List of NFS exports
/opt/IBM/WebSphere/AppServer/properties/version/installed.xml | WebSphere version (when installed at the default path)
/opt/IBM/WebSphere/AppServer/properties/version/WAS.product | WebSphere info (when installed at the default path)
/sys/class/net/* | NIC information

Directories examined

The script searches the following directories, to a depth of two, to locate the directories of installed utilities and software:

  • /opt/
  • /usr/share/
  • /etc/
  • /usr/sbin/
  • /usr/local/bin/
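The depth-limited search can be approximated with find. The exact invocation used by the script may differ; examine m4a-fit-collect.sh for the real command:

```shell
#!/bin/bash
# Approximation of the collect script's depth-two directory scan.
# The exact find flags used by m4a-fit-collect.sh may differ.
for dir in /opt /usr/share /etc /usr/sbin /usr/local/bin; do
  find "$dir" -maxdepth 2 -type d 2>/dev/null
done
```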

Collect tar file format

The m4a-fit-collect.sh script outputs a tar file named m4a-collect-machinename-timestamp.tar to the current directory and to /var/m4a/m4a-collect-timestamp.tar.

While not required, you can expand the tar file by using the command:

tar xvf m4a-collect-machinename-timestamp.tar

The tar file has the following format:

collect.log # Log output of the script
files # Directory containing files with their full path from root. For example:
   |- etc/fstab
   |- etc/hostname
   |- etc/network/interfaces
   |- ...
commands # Output of commands run by the script:
   |- dpkg
   |- netstat
   |- ps
   |- ...
found_paths # Text file with the list of installation directories
machinename # Text file with machine name
ostype # Text file with operating system type (Linux)
timestamp # Text file with collection timestamp
version # Text file with version number of the script
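You can inspect individual members of the archive without extracting everything. For example, the following sketch lists the members and prints the machinename and version files to stdout (the tar filename shown is illustrative):

```shell
#!/bin/bash
# Sketch: peek into a collect archive without a full extraction.
# The tar filename below is illustrative.
TARBALL="m4a-collect-my-vm-2021-05-01-10-30.tar"
if [ -f "$TARBALL" ]; then
  tar tf "$TARBALL"                  # list every member
  tar xf "$TARBALL" -O machinename   # print the machine name to stdout
  tar xf "$TARBALL" -O version       # print the collect script version
fi
```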

Analysis tool operation

The analysis tool examines the contents of the tar file from a VM, applies a set of rules, and outputs the following report files containing the fit assessment and analysis results:

  • An HTML file named analysis-report-timestamp.html to the current directory. The timestamp is in the format YYYY-MM-DD-hh-mm.

  • A JSON file named analysis-report-timestamp.json containing a JSON format of the output. You can use the JSON file as input to the Google Cloud Console.

Calculating the fit assessment

A rule violation detected by the tool affects the final fit assessment, where each rule has its own predefined severity. For example, the tool detects that SELinux is enabled on the VM, corresponding to rule A1-STO-3: SELinux enforced. That rule has a severity of "Needs moderate work before migrating", so the final fit assessment is "Needs moderate work before migrating".

If the tool detects multiple rule violations, then only the rule with the highest severity is applied to the final fit assessment. For example, two rule violations are detected:

  • An incompatible file system is detected, rule A1-STO-2: incompatible filesystem, with a severity of "No fit".

  • SELinux enabled is detected with a severity of "Needs moderate work before migrating".

The tool returns only the fit assessment associated with the more severe of the two rules. Therefore, it returns "No fit".
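This "highest severity wins" aggregation can be sketched as follows. The numeric ranks are illustrative; the tool's internal representation is not documented:

```shell
#!/bin/bash
# Sketch of the "highest severity wins" aggregation. The numeric ranks
# are illustrative (higher means a worse fit), not the tool's internals.
declare -A rank=(
  ["Excellent fit"]=0
  ["Good fit"]=1
  ["Needs minor work"]=2
  ["Needs moderate work"]=3
  ["Needs major work"]=4
  ["No fit"]=5
)

overall="Excellent fit"
# The two detected violations from the example above.
for violation in "Needs moderate work" "No fit"; do
  if [ "${rank[$violation]}" -gt "${rank[$overall]}" ]; then
    overall="$violation"
  fi
done
echo "$overall"   # prints "No fit"
```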

Report content

The report contains the following information for each VM:

Field | Description
VM name | The name of the VM.
Data collection date | The timestamp of the analysis.
Identified OS | The operating system, which is always Linux.
Fit assessment | The fit assessment. See Calculating the fit assessment for information on interpreting this assessment.
Workload type | If detected, shows IBM WebSphere.
Additional info | Summary of information about the VM, including listening ports discovered on the VM, detected disk mounts on the VM, and NFS mount points on the VM.

The report also contains the results of applying each rule to the VM:

Rule ID | Description | Severity | Notes
A1-STO-1 | Unsupported network mount. If /etc/fstab or /proc/self/mounts has network mounts, you need to manually add the CSI volumes to the migration plan. | Good fit | See Mounting External Volumes for more on how to attach NFS/CIFS volumes to the deployment YAML.
A1-STO-2 | Incompatible file system. | No fit | You cannot migrate workloads with an incompatible file system.
A1-STO-3 | SELinux enforced on the VM. SELinux does not work well in nested containers, so the recommendation is to disable it before migrating. | Needs moderate work | Disable SELinux or manually apply an AppArmor profile before migrating.
A1-STO-4-5 | NFS share exported. Detected an NFS export file, and the NFS server kernel module is loaded. Two rules are applied, each returning a different severity: A1-STO-4 (NFS server detected): Needs moderate effort; A1-STO-5 (NFS server detected and workload type is web server): Needs minor effort. | See description | Migrate NFS servers to Cloud Filestore.
A1-NET-1-3 | Found a listener on a non-0.0.0.0 IP address, meaning a binding to a specific network interface was detected. If ports are being listened to on a specific NIC (and not 0.0.0.0, *, or loopback), it usually means there is a multi-NIC setup. Three rules are applied, each returning a different severity: A1-NET-1 (one IP detected): Good fit; A1-NET-2 (two or more IPs detected): Needs minor work; A1-NET-3 (two or more IPs detected listening on the same port): Needs moderate work. | See description | Update the VM to listen on any one NIC, because Migrate for Anthos and GKE supports only one NIC.
A1-NET-4 | Found usage of multiple NICs. The tool ignores virtual devices (symlinks) and devices that are down, according to /sys/class/net/DEVICE/operstate. The report details list all the detected NICs. | Needs moderate effort | The existence of multiple NICs on the source VM can mean that the VM uses multiple IP addresses. However, GKE and Anthos do not support multiple IP addresses, so a source VM that relies on multiple NICs might not work properly after migration.
A1-APP-2 | Running a database inside a container. Checks for running database applications that are not a good fit for migration: mysqld, postgres, mongodb, redis-server, cassandra, elasticsearch. | Needs minor work | Consider migrating to Cloud SQL.
A1-APP-3 | Docker running on the VM. Nesting Docker inside containers is not supported. Applied if dockerd is running. | Good fit | Consider using Migrate for Compute Engine or running the containers directly on GKE/Anthos.
A1-NET-5 | Usage of static hosts. Static host definitions detected in /etc/hosts. | Good fit | See Adding entries to Pod /etc/hosts with HostAliases for information on modifying your static hosts.
A1-STO-7 | Open block device detected by lsof. | No fit | Not compatible with Migrate for Anthos and GKE.

Version history

Changes to the tool for Migrate for Anthos and GKE 1.8.0

For the Migrate for Anthos and GKE 1.8.0 release, we added new features and changed existing features of the tool. The following table describes these changes:

Change | Description
Changed evaluation of rule A1-STO-2 | No longer assigns a low fit score when temporary filesystems are detected.

Changes to the tool for Migrate for Anthos and GKE 1.7.5

For the Migrate for Anthos and GKE 1.7.5 release, we added new features and changed existing features of the tool. The following table describes these changes:

Change | Description
Can now view the fit assessment report in the Google Cloud Console | Upload the JSON file to the Google Cloud Console for viewing.
Column names changed | The column names in the generated report have been renamed.
Added multiple NIC rule | The tool now tests for multiple NICs. If detected, the tool applies a fit assessment of "Needs moderate effort".
Changed fit assessment of rule A1-NET-3 | If two or more IPs are detected listening on the same port, the fit assessment is now "Needs moderate work".

Changes to the tool for Migrate for Anthos and GKE 1.7

For the Migrate for Anthos and GKE 1.7 release, we added new features and changed existing features of the tool. The following table describes these changes:

Change | Description
Removed the fit score | In the previous release, the fit score was in the range of 0 (no fit) to 10 (great fit). The score has been replaced by an assessment value as shown above.
Removed rule weights | The weights of all rules have been removed and replaced with an assessment result.
Replaced the CSV report format with an HTML file and a JSON file | Use the HTML file to view the report in a browser, and use the JSON file as input to a data visualization tool. See HTML report file format for more.
Changed the location of the tar file created by m4a-fit-collect.sh | As in the previous release, the script writes the tar file to the current directory, but now it also writes it to /var/m4a/m4a-collect-timestamp.tar.
Added a version file to the tar file created by m4a-fit-collect.sh | The file contains the version of the script.
Added a new column for the detected Workload type | If detected, shows IBM WebSphere.
Added new summary fields for each VM in the report | These fields include the listening ports, mount points, NFS mount points, and other information.

What's next