This page describes how to work with graph paths in Spanner Graph.
In graph databases, the graph path data type represents a sequence of nodes interleaved with edges and shows how these nodes and edges are related. To learn more about the path data type, see Graph path type.
With the Spanner Graph Language (GQL), you can construct graph paths and perform queries on them. The examples in this document use the same Spanner Graph schema as found on the Set up and query Spanner Graph page.
Construct a graph path
You can construct a graph path by creating a path variable in a graph pattern or
with the PATH
function.
We recommend constructing a graph path by using the path variable. The format to create a path variable is:
MATCH p = PATH_PATTERN
For more information, see Graph pattern.
Example
In the following example, the query finds patterns of money transfers between
accounts within FinGraph
.
GRAPH FinGraph
MATCH p = (src:Account {id: 16})-[t1:Transfers]->(mid:Account)-[t2:Transfers]->
(dst:Account {id: 7})
RETURN TO_JSON(p) AS full_path;
Result
full_path |
---|
[{"identifier": ..., "properties": {"id": 16, ...}, ...}, {"identifier": ..., "properties": {"amount": 300.0, ...}, ...}, ...] |
The result indicates that the query found the Account -> Transfers -> Account
pattern in the database.
Query a graph path
You can use the following path-specific functions to query a graph path. For more general information about Spanner Graph queries, see Queries overview.
EDGES
The EDGES
function returns all the edges in a graph path. For detailed
semantics, see EDGES
.
Example
This query finds a path between two accounts that pass through a middle account.
It returns the amount of the second Transfers
edge in the path which might be
between src
and mid
or between mid
and dst
.
GRAPH FinGraph
MATCH p = (src:Account {id: 7})-[t1:Transfers]->{1,3}(mid:Account)-[t2:Transfers]->
{1,3}(dst:Account {id: 16})
LET second_edge = EDGES(p)[1]
RETURN DISTINCT src.id AS src, dst.id AS dst, second_edge.amount AS second_edge_amount;
Result
src | dst | second_edge_amount |
---|---|---|
7 | 16 | 300 |
NODES
The NODES
function returns all the nodes in a graph path. For detailed
semantics, see NODES
.
Example
This query finds the graph path of two transfers, and then returns a JSON list representing the path.
GRAPH FinGraph
MATCH p = (src:Account)-[t1:Transfers]->(mid:Account)-[t2:Transfers]->(dst:Account)
RETURN TO_JSON(NODES(p)) AS nodes;
Result
nodes |
---|
[{"identifier": "...", "properties": {"id": 16}, ...}, {"identifier": "...", "properties": {"id": 20, ...}, ...] |
... |
PATH_FIRST
The PATH_FIRST
function finds the first node in a graph path. For detailed
semantics, see PATH_FIRST
.
Example
This query finds the first node in a graph path of two transfers. It returns the
label of the Account
node and the account's nickname.
GRAPH FinGraph
MATCH p = -[:Transfers]->{1,3}(dst:Account{id: 7})
RETURN DISTINCT PATH_FIRST(p).id AS can_reach_target;
Result
can_reach_target |
---|
7 |
16 |
20 |
PATH_LAST
The PATH_LAST
function finds the last node in a graph path. For detailed
semantics, see PATH_LAST
.
Example
This query finds the last node in a graph path of two transfers. It returns the
label of the Account
node and the account's nickname.
GRAPH FinGraph
MATCH p =(start:Account{id: 7})-[:Transfers]->{1,3}
RETURN DISTINCT PATH_LAST(p).id as can_reach_target;
Result
can_reach_target |
---|
7 |
16 |
20 |
PATH_LENGTH
The PATH_LENGTH
function finds the number of edges in a graph path. For
detailed semantics, see PATH_LENGTH
.
Example
This query finds the number of edges in a graph path that contains one to three transfers.
GRAPH FinGraph
MATCH p = (src:Account)-[e:Transfers]->{1,3}(dst:Account)
RETURN PATH_LENGTH(p) AS num_transfers, COUNT(*) AS num_paths;
Result
num_transfers | num_paths |
---|---|
1 | 5 |
2 | 7 |
3 | 11 |
IS_ACYCLIC
The IS_ACYCLIC
function checks if a graph path has repeating nodes. It returns
TRUE
if repetition is found, otherwise it returns FALSE
. For detailed
semantics, see IS_ACYCLIC
.
Example
This query checks if this graph path has repeating nodes.
GRAPH FinGraph
MATCH p = (src:Account)-[t1:Transfers]->(mid:Account)-[t2:Transfers]->(dst:Account)
RETURN IS_ACYCLIC(p) AS is_acyclic_path, src.id AS source_account_id,
mid.id AS mid_account_id, dst.id AS dst_account_id;
Result
is_acyclic_path | source_account_id | mid_account_id | dst_account_id |
---|---|---|---|
TRUE | 16 | 20 | 7 |
TRUE | 20 | 7 | 16 |
TRUE | 20 | 7 | 16 |
FALSE | 16 | 20 | 16 |
TRUE | 7 | 16 | 20 |
TRUE | 7 | 16 | 20 |
FALSE | 20 | 16 | 20 |
IS_TRAIL
The IS_TRAIL
function checks if a graph path has repeating edges. It returns
TRUE
if repetition is found, otherwise it returns FALSE
. For detailed
semantics, see IS_TRAIL
.
Example
This query checks if this graph path has repeating edges.
GRAPH FinGraph
MATCH p = (src:Account)-[t1:Transfers]->(mid1:Account)-[t2:Transfers]->
(mid2:Account)-[t3:Transfers]->(dst:Account)
WHERE src.id < dst.id
RETURN IS_TRAIL(p) AS is_trail_path, t1.id AS t1_id, t2.id AS t2_id, t3.id AS t3_id;
Result
is_trail_path | t1_id | t2_id | t3_id |
---|---|---|---|
FALSE | 16 | 20 | 16 |
TRUE | 7 | 16 | 20 |
TRUE | 7 | 16 | 20 |
Path modes
In Spanner Graph, repeating nodes and edges are returned by default. You can use the following path modes to include or exclude paths that have repeating nodes and edges based on the specified mode. For detailed semantics, see Path mode.
WALK
The default WALK
path mode keeps all paths, including ones with repeating
nodes and edges.
Example
The following query demonstrates the use of the WALK
path mode on a
non-quantified path pattern. The first path in the results uses the same edge
for t1
and t3
.
GRAPH FinGraph
MATCH p = WALK (src:Account)-[t1:Transfers]->(mid1:Account)-[t2:Transfers]->
(mid2:Account)-[t3:Transfers]->(dst:Account)
WHERE src.id < dst.id
RETURN t1.id AS transfer1_id, t2.id AS transfer2_id, t3.id AS transfer3_id;
Result
transfer1_id | transfer2_id | transfer3_id |
---|---|---|
16 | 20 | 16 |
7 | 16 | 20 |
7 | 16 | 20 |
ACYCLIC
The default ACYCLIC
path mode filters out paths that have repeating nodes.
Example
The following query demonstrates the use of the ACYCLIC
path mode on a
non-quantified path pattern.
The path with equal src
and dst
nodes is filtered out.
GRAPH FinGraph
MATCH p = ACYCLIC (src:Account)-[t1:Transfers]->(mid:Account)-[t2:Transfers]->(dst:Account)
RETURN src.id AS account1_id, mid.id AS account2_id, dst.id AS account3_id;
Result
account1_id | account2_id | account3_id |
---|---|---|
20 | 7 | 16 |
20 | 7 | 16 |
7 | 16 | 20 |
7 | 16 | 20 |
16 | 20 | 7 |
TRAIL
The default TRAIL
path mode filters out paths that have repeating edges.
Example
The following query demonstrates the use of the TRAIL
path mode on a
non-quantified path pattern. The path whose t1
and t3
edges are equal is
filtered out.
GRAPH FinGraph
MATCH p = TRAIL (src:Account)-[t1:Transfers]->(mid1:Account)-[t2:Transfers]->
(mid2:Account)-[t3:Transfers]->(dst:Account)
RETURN
t1.id AS transfer1_id, t2.id AS transfer2_id, t3.id AS transfer3_id;
Result
transfer1_id | transfer2_id | transfer3_id |
---|---|---|
16 | 20 | 7 |
16 | 20 | 7 |
20 | 7 | 16 |
20 | 7 | 16 |
7 | 16 | 20 |
7 | 16 | 20 |
7 | 16 | 20 |
7 | 16 | 20 |
20 | 16 | 20 |
Path search prefix
You can use a path search prefix to restrict a path pattern to return the shortest path from each data partition. For detailed semantics, seePath search prefix.
ANY SHORTEST
The ANY SHORTEST
path search prefix returns the shortest path (the path with
the least number of edges) that matches the pattern from each data partition. If
there are more than one shortest paths per partition, returns any one of them.
Example
The following query matches any path between each pair of [a, b]
.
GRAPH FinGraph
MATCH p = ANY SHORTEST (a:Account)-[t:Transfers]->{1,4}(b:Account)
WHERE a.is_blocked
LET total_amount = SUM(t.amount)
RETURN a.id AS account1_id, total_amount, b.id AS account2_id;
Result
account1_id | total_amount | account2_id |
---|---|---|
16 | 500 | 16 |
16 | 800 | 7 |
16 | 300 | 20 |
Conversion rules
For more information, see GRAPH_PATH conversion rules.
Use case example
In the following use case example, you find all accounts have been routed
through one to three accounts, from account ID 20
.
GRAPH FinGraph
MATCH p = (start:Account {id: 20})-[:Transfers]->{1,3}(dst:Account)
RETURN DISTINCT dst.id AS dst;
Result
dst |
---|
7 |
16 |
20 |
However, a query that returns to account ID 20
might be an overly broad query
because it starts with account ID 20
. To show more specific results, you can
enforce your query to show only acyclic graph paths without any repeating nodes.
To do so, you can:
- Use
MATCH p = ACYCLIC <path_pattern>
; or - Apply an
IS_ACYCLIC(p)
filter in your query
The following query uses MATCH p = ACYCLIC PATH_PATTERN
:
GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})-[:Transfers]->{1,3}(dst:Account)
RETURN DISTINCT dst.id AS dst;
Result
dst |
---|
7 |
16 |
If you want to know the first account that money is transferred through, then you can run the following query:
GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})(-[:Transfers]->
(nexts:Account)){1,3}(dst:Account)
RETURN dst.id AS dst, ARRAY_AGG(DISTINCT nexts[0].id) AS unique_starts;
This query is unconventional because it introduces a new variable inside the
quantified path using nexts
to get the result. With path variables, you can
simplify the query:
GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})-[:Transfers]->{1,3}(dst:Account)
RETURN dst.id AS dst, ARRAY_AGG(DISTINCT NODES(p)[OFFSET(1)].id) AS unique_starts;
Using NODES(p)
returns all nodes along the path. Because the first node
account is specified as start
, the next one (at the first offset) is the first
account that money is transferred through.
Result
dst | unique_starts |
---|---|
7 | 16, 7 |
Paths are more useful when there are multiple quantified paths. You can add
a constraint that the paths found from start
must pass through account ID
7
:
GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})-[:Transfers]->
{1,3}(mid:Account {id: 7})-[:Transfers]->{1,3}(dst:Account)
RETURN dst.id AS dst,
ARRAY_AGG(DISTINCT NODES(p)[OFFSET(1)].id) AS unique_starts;
Although the MATCH
statement changed, the rest of the query doesn't need to
change. Without using path variables, there are cases where it's not possible
for Spanner to statically know which quantified path to inspect.
Using a path variable, you can get the sum of all transfers:
GRAPH FinGraph
MATCH p = ACYCLIC (start:Account {id: 20})-[:Transfers]->
{1,3}(mid:Account {id: 7})-[:Transfers]->{1,3}(dst:Account)
LET all_transfers = EDGES(p)
LET transfer_amounts = SUM(all_transfers.amount)
RETURN dst.id AS dst,
ARRAY_AGG(DISTINCT NODES(p)[OFFSET(1)].id) AS participating_neighbor_nodes, transfer_amounts;
Result
dst | participating_neighbor_nodes | transfer_amounts |
---|---|---|
16 | 7 | 600 |
16 | 7 | 800 |