Work with paths

This page describes how to work with graph paths in Spanner Graph.

In graph databases, the graph path data type represents a sequence of nodes interleaved with edges and shows how these nodes and edges are related. To learn more about the path data type, see Graph path type.

With the Spanner Graph Language (GQL), you can construct graph paths and perform queries on them. The examples in this document use the same Spanner Graph schema as found on the Set up and query Spanner Graph page.

Construct a graph path

You can construct a graph path by creating a path variable in a graph pattern or with the PATH function.

We recommend constructing a graph path by using the path variable. The format to create a path variable is:

MATCH p = PATH_PATTERN

For more information, see Graph pattern.

Example

In the following example, the query finds patterns of money transfers between accounts within FinGraph.

GRAPH FinGraph
MATCH p = (src:Account {id: 16})-[t1:Transfers]->(mid:Account)-[t2:Transfers]->
  (dst:Account {id: 7})
RETURN TO_JSON(p) AS full_path;

Result

full_path
[{"identifier": ..., "properties": {"id": 16, ...}, ...}, {"identifier": ..., "properties": {"amount": 300.0, ...}, ...}, ...]

The result indicates that the query found the Account -> Transfers -> Account pattern in the database.

Query a graph path

You can use the following path-specific functions to query a graph path. For more general information about Spanner Graph queries, see Queries overview.

EDGES

The EDGES function returns all the edges in a graph path. For detailed semantics, see EDGES.

Example

This query finds a path between two accounts that pass through a middle account. It returns the amount of the second Transfers edge in the path which might be between src and mid or between mid and dst.

GRAPH FinGraph
MATCH p = (src:Account {id: 7})-[t1:Transfers]->{1,3}(mid:Account)-[t2:Transfers]->
  {1,3}(dst:Account {id: 16})
LET second_edge = EDGES(p)[1]
RETURN DISTINCT src.id AS src, dst.id AS dst, second_edge.amount AS second_edge_amount;

Result

src dst second_edge_amount
7 16 300

NODES

The NODES function returns all the nodes in a graph path. For detailed semantics, see NODES.

Example

This query finds the graph path of two transfers, and then returns a JSON list representing the path.

GRAPH FinGraph
MATCH p = (src:Account)-[t1:Transfers]->(mid:Account)-[t2:Transfers]->(dst:Account)
RETURN TO_JSON(NODES(p)) AS nodes;

Result

nodes
[{"identifier": "...", "properties": {"id": 16}, ...}, {"identifier": "...", "properties": {"id": 20, ...}, ...]
...

PATH_FIRST

The PATH_FIRST function finds the first node in a graph path. For detailed semantics, see PATH_FIRST.

Example

This query finds the first node in a graph path of two transfers. It returns the label of the Account node and the account's nickname.

GRAPH FinGraph
MATCH p = -[:Transfers]->{1,3}(dst:Account{id: 7})
RETURN DISTINCT PATH_FIRST(p).id AS can_reach_target;

Result

can_reach_target
7
16
20

PATH_LAST

The PATH_LAST function finds the last node in a graph path. For detailed semantics, see PATH_LAST.

Example

This query finds the last node in a graph path of two transfers. It returns the label of the Account node and the account's nickname.

GRAPH FinGraph
MATCH p =(start:Account{id: 7})-[:Transfers]->{1,3}
RETURN DISTINCT PATH_LAST(p).id as can_reach_target;

Result

can_reach_target
7
16
20

PATH_LENGTH

The PATH_LENGTH function finds the number of edges in a graph path. For detailed semantics, see PATH_LENGTH.

Example

This query finds the number of edges in a graph path that contains one to three transfers.

GRAPH FinGraph
MATCH p = (src:Account)-[e:Transfers]->{1,3}(dst:Account)
RETURN PATH_LENGTH(p) AS num_transfers, COUNT(*) AS num_paths;

Result

num_transfers num_paths
1 5
2 7
3 11

IS_ACYCLIC

The IS_ACYCLIC function checks if a graph path has repeating nodes. It returns TRUE if repetition is found, otherwise it returns FALSE. For detailed semantics, see IS_ACYCLIC.

Example

This query checks if this graph path has repeating nodes.

GRAPH FinGraph
MATCH p = (src:Account)-[t1:Transfers]->(mid:Account)-[t2:Transfers]->(dst:Account)
RETURN IS_ACYCLIC(p) AS is_acyclic_path, src.id AS source_account_id,
  mid.id AS mid_account_id, dst.id AS dst_account_id;

Result

is_acyclic_path source_account_id mid_account_id dst_account_id
TRUE 16 20 7
TRUE 20 7 16
TRUE 20 7 16
FALSE 16 20 16
TRUE 7 16 20
TRUE 7 16 20
FALSE 20 16 20

IS_TRAIL

The IS_TRAIL function checks if a graph path has repeating edges. It returns TRUE if repetition is found, otherwise it returns FALSE. For detailed semantics, see IS_TRAIL.

Example

This query checks if this graph path has repeating edges.

GRAPH FinGraph
MATCH p = (src:Account)-[t1:Transfers]->(mid1:Account)-[t2:Transfers]->
  (mid2:Account)-[t3:Transfers]->(dst:Account)
WHERE src.id < dst.id
RETURN IS_TRAIL(p) AS is_trail_path, t1.id AS t1_id, t2.id AS t2_id, t3.id AS t3_id;

Result

is_trail_path t1_id t2_id t3_id
FALSE 16 20 16
TRUE 7 16 20
TRUE 7 16 20

Path modes

In Spanner Graph, repeating nodes and edges are returned by default. You can use the following path modes to include or exclude paths that have repeating nodes and edges based on the specified mode. For detailed semantics, see Path mode.

WALK

The default WALK path mode keeps all paths, including ones with repeating nodes and edges.

Example

The following query demonstrates the use of the WALK path mode on a non-quantified path pattern. The first path in the results uses the same edge for t1 and t3.

GRAPH FinGraph
MATCH p = WALK (src:Account)-[t1:Transfers]->(mid1:Account)-[t2:Transfers]->
  (mid2:Account)-[t3:Transfers]->(dst:Account)
WHERE src.id < dst.id
RETURN t1.id AS transfer1_id, t2.id AS transfer2_id, t3.id AS transfer3_id;

Result

transfer1_id transfer2_id transfer3_id
16 20 16
7 16 20
7 16 20

ACYCLIC

The default ACYCLIC path mode filters out paths that have repeating nodes.

Example

The following query demonstrates the use of the ACYCLIC path mode on a non-quantified path pattern. The path with equal src and dst nodes is filtered out.

GRAPH FinGraph
MATCH p = ACYCLIC (src:Account)-[t1:Transfers]->(mid:Account)-[t2:Transfers]->(dst:Account)
RETURN src.id AS account1_id, mid.id AS account2_id, dst.id AS account3_id;

Result

account1_id account2_id account3_id
20 7 16
20 7 16
7 16 20
7 16 20
16 20 7

TRAIL

The default TRAIL path mode filters out paths that have repeating edges.

Example

The following query demonstrates the use of the TRAIL path mode on a non-quantified path pattern. The path whose t1 and t3 edges are equal is filtered out.

GRAPH FinGraph
MATCH p = TRAIL (src:Account)-[t1:Transfers]->(mid1:Account)-[t2:Transfers]->
  (mid2:Account)-[t3:Transfers]->(dst:Account)
RETURN
  t1.id AS transfer1_id, t2.id AS transfer2_id, t3.id AS transfer3_id;

Result

transfer1_id transfer2_id transfer3_id
16 20 7
16 20 7
20 7 16
20 7 16
7 16 20
7 16 20
7 16 20
7 16 20
20 16 20

Path search prefix

You can use a path search prefix to restrict a path pattern to return the shortest path from each data partition. For detailed semantics, seePath search prefix.

ANY SHORTEST

The ANY SHORTEST path search prefix returns the shortest path (the path with the least number of edges) that matches the pattern from each data partition. If there are more than one shortest paths per partition, returns any one of them.

Example

The following query matches any path between each pair of [a, b].

GRAPH FinGraph
MATCH p = ANY SHORTEST (a:Account)-[t:Transfers]->{1,4}(b:Account)
WHERE a.is_blocked
LET total_amount = SUM(t.amount)
RETURN a.id AS account1_id, total_amount, b.id AS account2_id;

Result

account1_id total_amount account2_id
16 500 16
16 800 7
16 300 20

Conversion rules

For more information, see GRAPH_PATH conversion rules.

Use case example

In the following use case example, you find all accounts have been routed through one to three accounts, from account ID 20.

GRAPH FinGraph
MATCH p = (start:Account {id: 20})-[:Transfers]->{1,3}(dst:Account)
RETURN DISTINCT dst.id AS dst;

Result

dst
7
16
20

However, a query that returns to account ID 20 might be an overly broad query because it starts with account ID 20. To show more specific results, you can enforce your query to show only acyclic graph paths without any repeating nodes. To do so, you can:

  • Use MATCH p = ACYCLIC <path_pattern>; or
  • Apply an IS_ACYCLIC(p) filter in your query

The following query uses MATCH p = ACYCLIC PATH_PATTERN:

GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})-[:Transfers]->{1,3}(dst:Account)
RETURN DISTINCT dst.id AS dst;

Result

dst
7
16

If you want to know the first account that money is transferred through, then you can run the following query:

GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})(-[:Transfers]->
  (nexts:Account)){1,3}(dst:Account)
RETURN dst.id AS dst, ARRAY_AGG(DISTINCT nexts[0].id) AS unique_starts;

This query is unconventional because it introduces a new variable inside the quantified path using nexts to get the result. With path variables, you can simplify the query:

GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})-[:Transfers]->{1,3}(dst:Account)
RETURN dst.id AS dst, ARRAY_AGG(DISTINCT NODES(p)[OFFSET(1)].id) AS unique_starts;

Using NODES(p) returns all nodes along the path. Because the first node account is specified as start, the next one (at the first offset) is the first account that money is transferred through.

Result

dst unique_starts
7 16, 7

Paths are more useful when there are multiple quantified paths. You can add a constraint that the paths found from start must pass through account ID 7:

GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})-[:Transfers]->
  {1,3}(mid:Account {id: 7})-[:Transfers]->{1,3}(dst:Account)
RETURN dst.id AS dst,
  ARRAY_AGG(DISTINCT NODES(p)[OFFSET(1)].id) AS unique_starts;

Although the MATCH statement changed, the rest of the query doesn't need to change. Without using path variables, there are cases where it's not possible for Spanner to statically know which quantified path to inspect.

Using a path variable, you can get the sum of all transfers:

GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})-[:Transfers]->
  {1,3}(mid:Account {id: 7})-[:Transfers]->{1,3}(dst:Account)
LET all_transfers = EDGES(p)
LET transfer_amounts = SUM(all_transfers.amount)
RETURN dst.id AS dst,
  ARRAY_AGG(DISTINCT NODES(p)[OFFSET(1)].id) AS participating_neighbor_nodes, transfer_amounts;

Result

dst participating_neighbor_nodes transfer_amounts
16 7 600
16 7 800

What's next