This page describes how to generate and analyze the network dependencies report in Migration Center.
Overview
The network dependencies report provides daily aggregated data about the connections to your servers and databases. It lets you see all the connections to the assets in your infrastructure and the number of connections per day.
To collect the network dependencies data, you let the discovery client run for several days and enable syncing the data with Migration Center. The discovery client then identifies all the network connections from the scanned assets. The target assets in the connection can be any asset in your Migration Center inventory that you discovered with the discovery client or that you manually imported, or even an unknown asset.
The network dependencies report is useful in the following scenarios:
- Collecting data about connections to servers and databases, to identify assets that belong to the same application
- Identifying network connections of interest within a group of assets, such as all the servers using the MySQL standard port
- Identifying missing assets in your inventory
You can download the network dependencies report as a CSV file from Migration Center. You can then perform your analysis using BigQuery and the sample queries provided by Migration Center, or use any other third-party tool.
Limitations
- To collect connection data in your infrastructure, you must use the discovery client.
- Network connection data is collected only with the OS scan method. The vSphere scan doesn't support network data collection.
Before you begin
Before you create a network dependencies report, you must have performance collection working with the discovery client.
Before you analyze the network dependencies report with BigQuery, do the following:
- Learn how to import local data to BigQuery.
- Learn how to run queries.
Generate the network dependencies report
To generate a network dependencies report, follow these steps:
1. In the Google Cloud console, go to the Create reports page.
2. Click Network dependencies exports.
3. From the list of groups, select the groups for which you want to generate the report, then click Export.
4. In the dialog that appears, select the number of days for which you want to export the data, from a minimum of 10 and up to 90, then click Export.
5. After your file is generated, click Download.
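After you download the file, a quick local check can confirm that it contains the days you expect before you upload it anywhere. The following is a minimal Python sketch, not part of Migration Center. It assumes that the CSV file has a header row with the column names shown in the sample outputs on this page (such as `Day` and `LocalVMName`) and that `Day` uses the `YYYY-MM-DD` format. The function names `load_report` and `date_range` are hypothetical.

```python
import csv
import sys


def load_report(path):
    """Read the exported CSV into a list of dicts keyed by column name.

    Assumes the first row of the file is a header row.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def date_range(rows):
    """Return (first_day, last_day) found in the Day column, or None if empty.

    Assumes Day is formatted as YYYY-MM-DD, so string order is date order.
    """
    days = sorted({row["Day"] for row in rows})
    if not days:
        return None
    return days[0], days[-1]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_report.py path/to/report.csv
    report_rows = load_report(sys.argv[1])
    print(len(report_rows), "rows covering", date_range(report_rows))
```

The rows returned by `load_report` are plain dictionaries with string values, so numeric columns such as `ConnectionCount` need to be converted with `int()` before you do arithmetic on them.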
Analyze the network dependencies report in BigQuery
The following sections provide you with some sample queries to analyze common scenarios in BigQuery. Before you can run a query, you must upload your CSV file to BigQuery.
Your use of BigQuery is billed according to BigQuery pricing.
Identify assets with most connections
The following query is useful to identify the assets that have the largest number of connections in the group.
    SELECT
      LocalVMName,
      SUM(ConnectionCount) AS TotalCount
    FROM
      PROJECT.DATASET.TABLE
    GROUP BY ALL
    ORDER BY TotalCount DESC
Replace the following:
- PROJECT: The Google Cloud project where you uploaded the CSV file.
- DATASET: The BigQuery dataset.
- TABLE: The BigQuery table.
The following is a sample output from this query:
LocalVMName | TotalCount |
---|---|
VM-x5ua3o2w | 9970 |
VM-glg5np3w | 9763 |
VM-q3z4zfp8 | 9557 |
VM-2nnsrt37 | 9372 |
VM-1oah56hn | 9350 |
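If you prefer to run the same aggregation locally on the downloaded CSV file, the following Python sketch mirrors the query above. It is an illustration only. It assumes that the CSV has a header row with `LocalVMName` and `ConnectionCount` columns, as in the sample outputs on this page. The function name `total_connections` and the inline sample data are hypothetical.

```python
import csv
import io
from collections import Counter


def total_connections(rows):
    """Sum ConnectionCount per LocalVMName, largest total first."""
    totals = Counter()
    for row in rows:
        # csv.DictReader yields strings, so convert the count explicitly.
        totals[row["LocalVMName"]] += int(row["ConnectionCount"])
    return totals.most_common()


# Hypothetical excerpt standing in for the exported report.
SAMPLE_CSV = """LocalVMName,RemoteVMName,ConnectionCount
VM-a,VM-b,10
VM-a,VM-c,5
VM-b,VM-c,7
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(total_connections(rows))  # [('VM-a', 15), ('VM-b', 7)]
```

To run it on a real export, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the downloaded report.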
Identify connections by graph's depth
The following query identifies all the assets that are connected to a given asset within a specific number of hops, called the graph depth. Connections to unscanned devices are ignored. For example:
- With graph depth equal to 1, you find all the assets directly connected to the main asset.
- With graph depth equal to 2, you also find the assets directly connected to those assets, which are in turn directly connected to the main asset.
    DECLARE local_vm_name STRING DEFAULT MAIN_ASSET;
    DECLARE depth INT64 DEFAULT DEPTH;

    CREATE TEMP FUNCTION recursiveConnections(
      localVmName STRING,
      connectionsArray ARRAY<STRING>,
      depth INT64)
    RETURNS STRING
    LANGUAGE js AS r"""
      const connections = connectionsArray
        .map(connection => connection.split('|||'))
        .filter(connectionTuple => connectionTuple[1] !== 'Unscanned Device');
      const connectedAssets = new Set([localVmName]);
      for (let i = 0; i < depth; i++) {
        const currentSet = new Set(connectedAssets);
        for (const connection of connections) {
          /* Look for connections where the asset is the local asset */
          if (currentSet.has(connection[0])) {
            connectedAssets.add(connection[1]);
          }
          /* Look for connections where the asset is the remote asset */
          if (currentSet.has(connection[1])) {
            connectedAssets.add(connection[0]);
          }
        }
      }
      connectedAssets.delete(localVmName);
      return Array.from(connectedAssets).sort().join(', ');
    """;

    SELECT
      local_vm_name AS LocalVMName,
      recursiveConnections(
        local_vm_name,
        ARRAY_AGG(CONCAT(LocalVMName, '|||', RemoteVMName)),
        depth) AS Connections
    FROM
      PROJECT.DATASET.TABLE
Replace the following:
- MAIN_ASSET: The name of the asset for which you want to identify the connections, as a quoted string, for example "VM-lv8s148f".
- DEPTH: The depth of the graph.
The following is a sample output from this query:
LocalVMName | Connections |
---|---|
VM-lv8s148f | VM-2z8wp3ey, VM-66rq2x2y, VM-94uwyy8h, VM-ccgmqqmb, VM-ctqddf0u, VM-og4n77lb, ... |
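The traversal performed by the JavaScript function in the query can also be sketched in Python for local analysis. The following is a minimal sketch under the same assumptions as the query: each connection is a pair of local and remote asset names, and pairs whose remote name is `Unscanned Device` are skipped. The function name `connected_assets` and the sample pairs are hypothetical.

```python
def connected_assets(pairs, main_asset, depth):
    """Return the sorted names of assets within `depth` hops of `main_asset`.

    `pairs` is an iterable of (LocalVMName, RemoteVMName) tuples. Connections
    are treated as undirected, and unscanned remote devices are ignored.
    """
    edges = [(local, remote) for local, remote in pairs
             if remote != "Unscanned Device"]
    connected = {main_asset}
    for _ in range(depth):
        # Snapshot the set so that each pass expands by exactly one hop.
        current = set(connected)
        for local, remote in edges:
            if local in current:
                connected.add(remote)
            if remote in current:
                connected.add(local)
    connected.discard(main_asset)
    return sorted(connected)


# Hypothetical connections: VM-a - VM-b - VM-c - VM-d, plus one unscanned device.
pairs = [
    ("VM-a", "VM-b"),
    ("VM-b", "VM-c"),
    ("VM-c", "VM-d"),
    ("VM-a", "Unscanned Device"),
]
print(connected_assets(pairs, "VM-a", 1))  # ['VM-b']
print(connected_assets(pairs, "VM-a", 2))  # ['VM-b', 'VM-c']
```

Because the result accumulates across passes, a depth of 2 returns the assets found at depth 1 as well as those one hop further away.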
Filter connections by IP and port ranges
The following query lets you identify assets that use IP addresses and ports in ranges that you define.
    CREATE TEMP FUNCTION ipBetween(value STRING, low STRING, high STRING) AS (
      NET.IPV4_TO_INT64(NET.IP_FROM_STRING(value))
        BETWEEN NET.IPV4_TO_INT64(NET.IP_FROM_STRING(low))
        AND NET.IPV4_TO_INT64(NET.IP_FROM_STRING(high))
    );

    SELECT
      *
    FROM
      PROJECT.DATASET.TABLE
    WHERE
      ((LocalPort BETWEEN PORT_START AND PORT_END)
        OR (RemotePort BETWEEN PORT_START AND PORT_END))
      AND (ipBetween(LocalIP, IP_START, IP_END)
        OR ipBetween(RemoteIP, IP_START, IP_END))
Replace the following:
- PORT_START: The initial port of the port range, for example 0.
- PORT_END: The final port of the port range, for example 1024.
- IP_START: The initial IP address of the range, for example "10.26.0.0".
- IP_END: The final IP address of the range, for example "10.26.255.255".
The following is a sample output from this query:
Day | LocalVMName | LocalAssetID | LocalGroups | LocalIP | LocalPort | Protocol | LocalProcessName | RemoteVMName | RemoteAssetID | RemoteGroups | RemoteIP | RemotePort | ConnectionCount |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2024-04-18 | VM-0lf60off | projects/982941055174/locations/us-central1/assets/0lf60off | Group 1 | 10.0.45.138 | 272 | tcp | bash | VM-0spdofr9 | projects/982941055174/locations/us-central1/assets/0spdofr9 | | 144.35.88.1 | 272 | 499 |
2024-04-18 | VM-goa5uxhi | projects/982941055174/locations/us-central1/assets/goa5uxhi | Group 3 | 10.187.175.82 | 781 | tcp | bash | VM-27i5d2uj | projects/982941055174/locations/us-central1/assets/27i5d2uj | | 22.99.72.109 | 781 | 980 |
2024-04-19 | VM-7vwy31hg | projects/982941055174/locations/us-central1/assets/7vwy31hg | Group 1 | 10.58.166.132 | 21 | tcp | bash | VM-2gq0fl37 | projects/982941055174/locations/us-central1/assets/2gq0fl37 | | 147.19.84.135 | 21 | 514 |
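The same range filter can be approximated locally with the Python standard library. The following sketch is illustrative and assumes that each row is a dictionary with the `LocalIP`, `LocalPort`, `RemoteIP`, and `RemotePort` columns shown in the sample output, and that all addresses are IPv4. The function names `ip_between` and `filter_connections` and the sample rows are hypothetical.

```python
import ipaddress


def ip_between(value, low, high):
    """Return True if IPv4 address `value` lies in the inclusive range [low, high].

    Raises ValueError if any argument is not a valid IPv4 address.
    """
    address = ipaddress.IPv4Address(value)
    return ipaddress.IPv4Address(low) <= address <= ipaddress.IPv4Address(high)


def filter_connections(rows, port_start, port_end, ip_start, ip_end):
    """Keep rows where either port and either IP address fall in the ranges."""
    matches = []
    for row in rows:
        port_ok = (port_start <= int(row["LocalPort"]) <= port_end
                   or port_start <= int(row["RemotePort"]) <= port_end)
        ip_ok = (ip_between(row["LocalIP"], ip_start, ip_end)
                 or ip_between(row["RemoteIP"], ip_start, ip_end))
        if port_ok and ip_ok:
            matches.append(row)
    return matches


# Hypothetical rows; values are strings, as csv.DictReader would return them.
sample_rows = [
    {"LocalIP": "10.26.1.5", "LocalPort": "443",
     "RemoteIP": "192.168.1.1", "RemotePort": "50000"},   # port and IP match
    {"LocalIP": "10.27.0.1", "LocalPort": "22",
     "RemoteIP": "10.28.0.1", "RemotePort": "22"},        # no IP in range
    {"LocalIP": "10.26.0.9", "LocalPort": "8080",
     "RemoteIP": "10.26.0.10", "RemotePort": "9090"},     # no port in range
]
selected = filter_connections(sample_rows, 0, 1024, "10.26.0.0", "10.26.255.255")
print(len(selected))  # 1
```

Comparing `IPv4Address` objects orders them numerically, which plays the same role as converting the addresses to integers in the query.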
Identify unscanned assets in the network
The following query lets you identify any unscanned asset in your network. An unscanned asset is a remote IP address that appears in a connection but is not associated with any asset in your Migration Center inventory. This lets you identify potentially missing assets that you need to scan for your assessment.
    CREATE TEMP FUNCTION ipBetween(value STRING, low STRING, high STRING) AS (
      NET.IPV4_TO_INT64(NET.IP_FROM_STRING(value))
        BETWEEN NET.IPV4_TO_INT64(NET.IP_FROM_STRING(low))
        AND NET.IPV4_TO_INT64(NET.IP_FROM_STRING(high))
    );

    SELECT
      STRING_AGG(LocalIP, ', ') AS LocalIPs,
      RemoteIP
    FROM
      PROJECT.DATASET.TABLE
    WHERE
      RemoteVMName = 'Unscanned Device'
      AND ipBetween(LocalIP, IP_START, IP_END)
      AND ipBetween(RemoteIP, IP_START, IP_END)
    GROUP BY
      RemoteIP
Replace the following:
- IP_START: The initial IP address of the range, for example "10.26.0.0".
- IP_END: The final IP address of the range, for example "10.26.255.255".
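For a local check on the downloaded CSV file, the grouping in this query can be sketched in Python. The sketch assumes that unscanned remote endpoints carry the `RemoteVMName` value `Unscanned Device`, as in the query, that all addresses are IPv4, and that each row is a dictionary with `LocalIP`, `RemoteIP`, and `RemoteVMName` keys. The function names `ip_between` and `unscanned_remote_ips` and the sample rows are hypothetical.

```python
import ipaddress
from collections import defaultdict


def ip_between(value, low, high):
    """Return True if IPv4 address `value` lies in the inclusive range [low, high]."""
    address = ipaddress.IPv4Address(value)
    return ipaddress.IPv4Address(low) <= address <= ipaddress.IPv4Address(high)


def unscanned_remote_ips(rows, ip_start, ip_end):
    """Map each unscanned remote IP to the list of local IPs that connect to it.

    Only rows whose local and remote addresses are both inside the range are
    kept. Repeated local IPs are not deduplicated.
    """
    local_ips_by_remote = defaultdict(list)
    for row in rows:
        if row["RemoteVMName"] != "Unscanned Device":
            continue
        if (ip_between(row["LocalIP"], ip_start, ip_end)
                and ip_between(row["RemoteIP"], ip_start, ip_end)):
            local_ips_by_remote[row["RemoteIP"]].append(row["LocalIP"])
    return dict(local_ips_by_remote)


# Hypothetical rows standing in for the exported report.
report_rows = [
    {"LocalIP": "10.26.0.5", "RemoteIP": "10.26.9.9", "RemoteVMName": "Unscanned Device"},
    {"LocalIP": "10.26.0.6", "RemoteIP": "10.26.9.9", "RemoteVMName": "Unscanned Device"},
    {"LocalIP": "10.26.0.5", "RemoteIP": "10.26.0.7", "RemoteVMName": "VM-b"},
    {"LocalIP": "10.26.0.5", "RemoteIP": "8.8.8.8", "RemoteVMName": "Unscanned Device"},
]
print(unscanned_remote_ips(report_rows, "10.26.0.0", "10.26.255.255"))
# {'10.26.9.9': ['10.26.0.5', '10.26.0.6']}
```

Restricting both addresses to the same range limits the result to unscanned devices inside your own network, which are the most likely candidates for assets missing from your inventory.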