Using the Linux discovery tool

Migrate for Anthos provides a self-service tool that you run on a Linux VM workload to determine the workload's fit for migration to a container. The tool outputs a fit score from 0 to 10, along with a report describing the analysis results. A score of 10 indicates a great fit for migration with little manual effort required, and a score of 0 indicates no fit or issues that must be resolved before migration.

How the tool works

The Linux discovery tool operates in two distinct phases:

  • Collect phase - A bash script named m4a-fit-collect.sh collects information about the Linux VM to be migrated and writes the collected data to a tar file.

  • Analysis phase - The m4a-fit-analysis tool parses the output of the collect phase, applies a set of rules, and creates a fit score between 0 and 10 along with a detailed report describing the tool's findings.

    You interpret the fit score as follows:

    Fit score     Description
    Score 9-10    Great fit with little to no manual work.
    Score 7-8     Fit with some manual work.
    Score 5-6     Fit with non-trivial manual work.
    Score 2-4     Possible fit with substantial manual work and troubleshooting.
    Score 0-1     No fit or severe issues are expected.
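The score bands above can be expressed as a small lookup. The following is a minimal sketch, not part of the tool: the function name fit_description is hypothetical, and it assumes that a fractional score such as 9.5 belongs to the band of its integer part.

```shell
# Map a numeric fit score (0 to 10) to its band description.
# fit_description is a hypothetical helper; it is not shipped with the tool.
# Assumption: a fractional score falls into the band of its integer part.
fit_description() {
  awk -v s="$1" 'BEGIN {
    if (s < 0 || s > 10) { print "Score out of range" > "/dev/stderr"; exit 1 }
    n = int(s)
    if (n >= 9)      print "Great fit"
    else if (n >= 7) print "Fit with some manual work"
    else if (n >= 5) print "Fit with non-trivial manual work"
    else if (n >= 2) print "Possible fit with substantial manual work and troubleshooting"
    else             print "No fit or severe issues expected"
  }'
}
```

For example, fit_description 5.0 prints the description for the 5-6 band.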

You can run the collect tool and analysis tool on the same VM. However, if you have multiple VMs, you can instead run the collect tool on each VM separately, then upload the tar file from each VM to a single machine for analysis. The m4a-fit-analysis tool can process multiple tar files at once to output a fit score and analysis for each VM.

Prerequisites

The Linux discovery tool has the following prerequisites:

  • The target VM being evaluated must be running to ensure that applications, processes, and open ports are discoverable.

  • The machine used to run the analysis tool, m4a-fit-analysis, must run a Linux kernel version later than 2.6.23.
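One way to check the kernel prerequisite before installing the analysis tool is to compare the output of uname -r against 2.6.23. This is a sketch under stated assumptions: the function name kernel_later_than is hypothetical, and the comparison relies on the version sort (sort -V) provided by GNU coreutils.

```shell
# Succeed if kernel release $1 is strictly later than version $2.
# kernel_later_than is a hypothetical helper name.
# Assumes GNU sort, which supports version ordering with -V.
kernel_later_than() (
  required="$2"
  # Drop any distribution suffix such as "-100-generic" before comparing.
  current="${1%%-*}"
  [ "$current" != "$required" ] &&
    [ "$(printf '%s\n%s\n' "$required" "$current" | sort -V | tail -n 1)" = "$current" ]
)

# Check the machine that will run m4a-fit-analysis.
if kernel_later_than "$(uname -r)" 2.6.23; then
  echo "Kernel $(uname -r) meets the prerequisite."
else
  echo "Kernel $(uname -r) is too old for the analysis tool."
fi
```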

Installing and running the tool

You must download the collection script and the analysis tool. You can either:

  • Download both tools to a single VM.
  • If you have multiple VMs, download the collection script to each workload VM, then upload the collected data to a central machine for analysis by the analysis tool.

To evaluate a VM:

  1. Log in to your VM.

  2. Create a directory for the collection script and analysis tool:

    mkdir m4a
    cd m4a

  3. Download the collection script to the VM and make it executable:

    wget https://anthos-migrate-release.storage.googleapis.com/v1.5.0/linux/amd64/m4a-fit-collect.sh
    chmod +x m4a-fit-collect.sh

  4. Download the analysis tool to the VM and make it executable:

    wget https://anthos-migrate-release.storage.googleapis.com/v1.5.0/linux/amd64/m4a-fit-analysis
    chmod +x m4a-fit-analysis

  5. Run the collect script on the VM:

    sudo ./m4a-fit-collect.sh

    The script outputs a tar file named m4a-collect-machinename-timestamp.tar to the current directory. The timestamp is in the format YYYY-MM-DD-hh-mm. See Collect script operation for a description of the tar file format.

    Note: If you installed the analysis tool on a central machine, upload the tar file to that machine for processing.

  6. Run the analysis tool on the tar file:

    ./m4a-fit-analysis tarFile

    The tool outputs a csv file named analysis-report-timestamp.csv to the current directory. The timestamp is in the format YYYY-MM-DD-hh-mm. See Analysis tool operation for a description of the csv file format.

    The output csv file contains information about the analysis, including the fit score. If the score is lower than 10, examine the csv file to determine the factors that led to the score. See CSV file format for more information.

    To run the analysis tool on multiple tar files, you can use the command:

    ./m4a-fit-analysis tarFile1 tarFile2 tarFile3 ...

    The tool outputs a single row to the csv file for each input tar file. Within the report, you can identify each VM by its hostname, meaning the value returned by running the hostname command on the VM.

    Use the --verbosity option to control the output of the tool. Options include: panic, fatal, error (default), warning, info, debug, and trace.
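When tar files from several VMs have been uploaded to one directory on the central machine, the multi-file invocation can be wrapped as follows. This is a rough sketch: the function name analyze_all and the ANALYSIS_TOOL variable are hypothetical, and it assumes that every tar file keeps the m4a-collect- prefix produced by the collect script.

```shell
# Path to the analysis tool; override if it is installed elsewhere.
# ANALYSIS_TOOL and analyze_all are hypothetical names for this sketch.
ANALYSIS_TOOL="${ANALYSIS_TOOL:-./m4a-fit-analysis}"

# Run the analysis tool once on every collected tar file in a directory.
analyze_all() (
  dir="${1:-.}"
  # Expand the glob into the positional parameters.
  set -- "$dir"/m4a-collect-*.tar
  # If nothing matched, the pattern is left unexpanded and does not exist.
  [ -e "$1" ] || { echo "no m4a-collect-*.tar files in $dir" >&2; exit 1; }
  # One invocation; the tool writes one csv row per input tar file.
  "$ANALYSIS_TOOL" "$@"
)
```

Calling analyze_all with the upload directory as its argument passes all matching tar files to the tool in a single run, so the results land in one csv report.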

Collect script operation

The collect script runs a series of Linux commands to gather information about the source VM and also collects information from several files on the VM.

Script commands

The script runs the following Linux commands:

Command                        Description
netstat -tulnp                 List all active listening ports.
hostname                       Get the name of the host.
ps -o pid,user,%mem,comm -e    List all running user processes.
dpkg -l                        List installed packages (Debian-based).
rpm -qa                        List installed packages (RPM-based).
sestatus                       Get SELinux status.
lsmod                          Get loaded kernel modules.
systemctl                      List running services (systemd-based).
service --status-all           List running services (init.d/Upstart-based).
lsof | grep /dev               List processes with direct access to devices.
docker ps                      List running Docker containers.
ifconfig                       List network interfaces.
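Not every command in this list exists on every distribution (for example, a given VM normally has either dpkg or rpm, not both). The source does not describe how the collect script handles a missing command, so the following sketch only reports which commands are absent from PATH before you run the script. The function name missing_commands is hypothetical.

```shell
# Print each named command that cannot be found on PATH.
# missing_commands is a hypothetical helper, not part of the collect script.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Check the commands that the collect script runs.
missing_commands netstat hostname ps dpkg rpm sestatus lsmod \
  systemctl service lsof docker ifconfig
```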

Files collected

The script copies the following files to the generated tar file:

Path                                 Description
/etc/fstab                           List of mounts to be mounted at startup.

/etc/hosts                           Aliases for hosts and DNS data.
/etc/resolv.conf
/etc/hostname
/etc/HOSTNAME
/proc/sys/kernel/hostname

/etc/issue                           The name of the Linux distribution.
/etc/*-release

/etc/network/interfaces              The configured interfaces.
/etc/dhcp/dhclient-up-hooks
/etc/NetworkManager/conf.d
/etc/systemd/resolved.conf
/etc/sysconfig/network-scripts/*
/etc/sysconfig/network/*

/proc/self/mounts                    The currently mounted devices.
/proc/meminfo                        The current memory usage/total on the VM.
/etc/exports                         List of NFS exports.

Collect tar file format

The m4a-fit-collect.sh script outputs a tar file named m4a-collect-machinename-timestamp.tar to the current directory. Expand the tar file by using the command:

tar xvf m4a-collect-machinename-timestamp.tar

The tar file has the following format:

collect.log # Log output of the script
files # Directory containing files with their full path from root. For example:
   |- etc/fstab
   |- etc/hostname
   |- etc/network/interfaces
   |- ...
commands # Output of commands run by the script:
   |- dpkg
   |- netstat
   |- ps 
   |- ...
machinename # Text file with machine name
timestamp # Text file with collection timestamp
ostype # Text file with operating system type (Linux)
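To confirm which VM and collection run a tar file belongs to without expanding it into your working directory, you can read the three metadata files directly. This is a minimal sketch; the function name summarize_collect_tar is hypothetical, and it assumes that machinename, timestamp, and ostype sit at the top level of the archive as shown in the layout above.

```shell
# Print the machine name, timestamp, and OS type stored in a collect tar file.
# summarize_collect_tar is a hypothetical helper name.
# Assumes the three metadata files are at the top level of the archive.
summarize_collect_tar() (
  tarfile="$1"
  workdir="$(mktemp -d)" || exit 1
  # Extract into a scratch directory so the current directory stays clean.
  if ! tar xf "$tarfile" -C "$workdir"; then
    rm -rf "$workdir"
    exit 1
  fi
  for f in machinename timestamp ostype; do
    printf '%s: %s\n' "$f" "$(cat "$workdir/$f")"
  done
  rm -rf "$workdir"
)
```

Call it with the path to a tar file produced by m4a-fit-collect.sh.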

Analysis tool operation

The analysis tool examines the contents of the tar file from a VM, applies a set of rules, and outputs a csv file containing the fit score and analysis results.

A rule violation detected by the tool lowers the fit score: the tool subtracts the rule's weight from 10.0, where each rule has its own predetermined weight. For example, if the tool detects that SELinux is enabled on the VM, corresponding to rule SEL01: SELinux enforced, then the tool subtracts the rule's weight from 10.0. The weight for this rule is 5.0, so the fit score for the VM is 5.0.

If the tool determines that multiple rules apply, then only the highest weight among those rules is subtracted from 10.0. For example, if an incompatible file system is detected, rule IFS01: incompatible filesystem with a weight of 10.0, and SELinux is enabled with a weight of 5.0, the tool subtracts the larger of the two weights from 10.0 so the final fit score is 0.0.
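The scoring rule described above amounts to subtracting the largest triggered weight from 10.0. The following sketch reproduces that arithmetic for illustration only; the function name fit_score is hypothetical and this is not the tool's own implementation.

```shell
# Compute a fit score as 10.0 minus the largest weight among the triggered rules.
# fit_score is a hypothetical helper; pass each triggered rule's weight as an argument.
fit_score() {
  printf '%s\n' "$@" | awk '
    BEGIN { max = 0 }
    $1 + 0 > max { max = $1 + 0 }          # keep only the highest weight
    END { printf "%.1f\n", 10.0 - max }'
}

fit_score 5.0         # SEL01 only: prints 5.0
fit_score 10.0 5.0    # IFS01 and SEL01: prints 0.0
fit_score             # no rules triggered: prints 10.0
```

Because only the maximum counts, several low-weight findings such as 0.5 and 1.0 together still yield 9.0 rather than 8.5.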

CSV file format

The csv file contains a row describing each VM analyzed, and a set of columns that show the results from applying each rule to the VMs. For example, if a VM has SELinux enabled, then the fit score will have a maximum value of 5.0, and the csv file contains that information in a column named "SEL01: SELinux enforced".

The csv file also contains a glossary describing the fit score and instructions on interpreting the score.

The csv file contains the following columns:

Each entry below gives the column name, followed by its short description and, for rule columns, the rule's weight and notes.

vm_name
    Description: The name of the VM.

data_collection_date
    Description: The timestamp of the analysis, in the form YYYY-MM-DD-hh-mm.

identified_OS
    Description: The operating system, which is always Linux.

m4a_fit_score
    Description: The fit score between 0.0 (no fit or severe issues) and 10.0 (great fit). See How the tool works for information on interpreting this score.

IFS01: incompatible filesystem
    Description: Incompatible file system.
    Weight: 10.0
    Notes: You cannot migrate workloads with an incompatible file system.

NET01: unsupported network mount
    Description: If /etc/fstab or /proc/self/mounts has network mounts, you must manually add the CSI volumes to the migration plan.
    Weight: 0.5
    Notes: See Mounting External Volumes for more on how to attach NFS/CIFS volumes to deployment YAML.

SEL01: SELinux enforced
    Description: SELinux does not work well in nested containers, so the recommendation is to disable it before migrating.
    Weight: 5.0
    Notes: Disable SELinux or manually apply an AppArmor profile before migrating.

DBC01: running DB inside container
    Description: Check for running database applications that are not a good fit for migration:
      • Mysqld
      • Postgres
      • Mongod
      • Redis-server
      • Cassandra
      • Elasticsearch
    Weight: 2.0
    Notes: Consider migrating to Cloud SQL.

DR01: docker running on VM
    Description: Nesting Docker inside containers is not supported. Detected when dockerd is running.
    Weight: 1.0
    Notes: Consider using Migrate for Compute Engine or running the containers directly on GKE/Anthos.

FSE01: NFS share exported
    Description: Detected an NFS export file and the NFS server kernel module is loaded. This configuration is not supported.
    Weight: 10.0
    Notes: Migrate NFS servers to Cloud Filestore.

LIP01: found listener on non 0.0.0.0 IP address
    Description: Binding to a specific network interface detected. If ports are being listened to on a specific NIC (and not 0.0.0.0, *, or loopback), it usually means there is a multi-NIC setup.
    Weight: 1.0
    Notes: Update the VM to listen on any one NIC because Migrate for Anthos only supports one NIC.

BD01: open block device
    Description: Open block device detected by lsof.
    Weight: 10.0
    Notes: Not compatible with Migrate for Anthos.

SHD01: Static hosts detected
    Description: Static hosts definitions detected in /etc/hosts.
    Weight: 0.5
    Notes: See Adding entries to Pod /etc/hosts with HostAliases for information on modifying your static hosts.
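To pull just the VM names and fit scores out of a report, you can look up the vm_name and m4a_fit_score columns by header name. The sketch below rests on several assumptions that the source does not confirm: that fields are plain comma-separated values without quoted commas, that the header row appears before the VM rows, and that an empty row separates the VM rows from the glossary. The function name print_fit_scores is hypothetical.

```shell
# Print "vm_name<TAB>m4a_fit_score" for each VM row in an analysis report.
# print_fit_scores is a hypothetical helper name.
# Assumptions: no quoted commas in fields, and a blank row ends the VM rows.
print_fit_scores() {
  awk -F',' '
    !found {
      # Locate the header row and remember the two column positions.
      name = 0; score = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "vm_name") name = i
        if ($i == "m4a_fit_score") score = i
      }
      if (name && score) found = 1
      next
    }
    # Stop at the first row with no VM name (assumed end of the VM rows).
    $name == "" { exit }
    { printf "%s\t%s\n", $name, $score }
  ' "$1"
}
```

Call it with the path to an analysis-report csv file. If your report uses quoted fields, a dedicated CSV parser is a safer choice.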

What's next