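Cypher projection (deprecated)
This page describes the legacy Cypher projection, which is deprecated. The replacement is the new Cypher projection, described in Projecting graphs using Cypher. A migration guide is available in Appendix C, Migrating from Legacy to new Cypher projection.
Compared to the native projection, the legacy Cypher projection is a more flexible and expressive approach. It uses Cypher to create (*project*) an in-memory graph from the Neo4j database.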
Considerations
Lifecycle
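The projected graphs will reside in the catalog until:
- the graph is dropped using gds.graph.drop
- the Neo4j database from which the graph was projected is stopped or dropped
- the Neo4j database management system is stopped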
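A projected graph can be removed from the catalog explicitly with gds.graph.drop. The following is a minimal sketch: the graph name 'persons' is only a placeholder for whichever graph was projected, and the second argument (failIfMissing) is set to false on the assumption that a missing graph should not raise an error.

```cypher
// Remove the in-memory graph named 'persons' from the graph catalog.
// 'persons' is a placeholder name; false = do not fail if the graph does not exist.
CALL gds.graph.drop('persons', false)
YIELD graphName
RETURN graphName
```

Dropping graphs that are no longer needed frees the memory they occupy.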
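Node property support
The legacy Cypher projection can only project a limited set of node property types from a Cypher query. The Node Properties page details which node property types are supported. Other types of node properties have to be transformed or encoded into one of the supported types in order to be projected with a legacy Cypher projection.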
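As an illustration of such an encoding, the sketch below maps a boolean property to an integer inside the node query. It assumes a hypothetical data model (Item nodes with a boolean inStock property and SIMILAR relationships) that is not part of this page, and it assumes that booleans are not among the supported types; check the Node Properties page for the authoritative list.

```cypher
// Hypothetical data model: (:Item {inStock: true|false}) connected by [:SIMILAR].
// The boolean is encoded as 1/0 so that it is projected as an integer property.
CALL gds.graph.project.cypher(
  'itemsWithStockFlag',
  'MATCH (n:Item)
   RETURN id(n) AS id,
          CASE WHEN n.inStock THEN 1 ELSE 0 END AS inStock',
  'MATCH (n:Item)-[:SIMILAR]->(m:Item) RETURN id(n) AS source, id(m) AS target'
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
```

Nodes without the property fall into the ELSE branch and are projected with the value 0.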
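Syntax
A legacy Cypher projection takes three mandatory arguments: graphName, nodeQuery, and relationshipQuery. In addition, the optional configuration parameter allows us to further configure the graph creation.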
CALL gds.graph.project.cypher(
    graphName: String,
    nodeQuery: String,
    relationshipQuery: String,
    configuration: Map
) YIELD
    graphName: String,
    nodeQuery: String,
    nodeCount: Integer,
    relationshipQuery: String,
    relationshipCount: Integer,
    projectMillis: Integer
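Name | Optional | Description |
---|---|---|
graphName | no | The name under which the graph is stored in the catalog. |
nodeQuery | no | Cypher query to project nodes. The query result must contain an id column. Optionally, a labels column can be specified to represent node labels. Additional columns are interpreted as properties. |
relationshipQuery | no | Cypher query to project relationships. The query result must contain source and target columns. Optionally, a type column can be specified to represent the relationship type. Additional columns are interpreted as properties. |
configuration | yes | Additional parameters to configure the legacy Cypher projection. |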
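Name | Type | Default | Description |
---|---|---|---|
readConcurrency | Integer | 4 | The number of concurrent threads used for creating the graph. |
validateRelationships | Boolean | true | If true, an error is thrown when the relationshipQuery returns relationships between nodes that were not returned by the nodeQuery. |
parameters | Map | {} | A map of user-defined query parameters that are passed into the node and relationship queries. |
jobId | String | Generated internally | An ID that can be provided to more easily track the projection's progress. |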
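The configuration map is passed as the fourth argument. The sketch below shows how two of these options might be combined; it borrows the Person label and the KNOWS and READ relationship types from the example graph in the Examples section, and the graph name 'personsLenient' is made up for illustration. With validateRelationships set to false, relationships whose endpoints were not returned by the node query are expected to be skipped instead of causing an error.

```cypher
CALL gds.graph.project.cypher(
  'personsLenient',
  'MATCH (n:Person) RETURN id(n) AS id',
  // READ relationships end at Book nodes, which the node query does not return.
  'MATCH (n)-[r:KNOWS|READ]->(m) RETURN id(n) AS source, id(m) AS target',
  { validateRelationships: false, readConcurrency: 2 }
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
```

Leaving validateRelationships at its default of true is the safer choice when a mismatch between the two queries would indicate a mistake.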
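Name | Type | Description |
---|---|---|
graphName | String | The name under which the graph is stored in the catalog. |
nodeQuery | String | The Cypher query used to project the nodes in the graph. |
nodeCount | Integer | The number of nodes stored in the projected graph. |
relationshipQuery | String | The Cypher query used to project the relationships in the graph. |
relationshipCount | Integer | The number of relationships stored in the projected graph. |
projectMillis | Integer | Milliseconds for projecting the graph. |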
To get information about a stored graph, such as its schema, one can use gds.graph.list.
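A call to gds.graph.list might look like the following sketch. The graph name 'persons' is a placeholder, and the selection of yielded fields is only one possibility; the available fields can differ between GDS versions.

```cypher
// List catalog information for the graph named 'persons' (placeholder name).
CALL gds.graph.list('persons')
YIELD graphName, nodeCount, relationshipCount, schema
RETURN graphName, nodeCount, relationshipCount, schema
```

Calling gds.graph.list() without a graph name lists all graphs in the catalog.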
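Examples
All the examples below should be run in an empty database.
In order to demonstrate the GDS graph projection capabilities, we are going to create a small social network graph in Neo4j. The following Cypher statement creates the example graph: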
CREATE
(florentin:Person { name: 'Florentin', age: 16 }),
(adam:Person { name: 'Adam', age: 18 }),
(veselin:Person { name: 'Veselin', age: 20, ratings: [5.0] }),
(hobbit:Book { name: 'The Hobbit', isbn: 1234, numberOfPages: 310, ratings: [1.0, 2.0, 3.0, 4.5] }),
(frankenstein:Book { name: 'Frankenstein', isbn: 4242, price: 19.99 }),
(florentin)-[:KNOWS { since: 2010 }]->(adam),
(florentin)-[:KNOWS { since: 2018 }]->(veselin),
(florentin)-[:READ { numberOfPages: 4 }]->(hobbit),
(florentin)-[:READ { numberOfPages: 42 }]->(hobbit),
(adam)-[:READ { numberOfPages: 30 }]->(hobbit),
(veselin)-[:READ]->(frankenstein)
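Simple graph
A simple graph is a graph with only one node label and one relationship type, i.e., a monopartite graph. We are going to start by demonstrating how to load a simple graph by projecting only the Person node label and the KNOWS relationship type.
Project Person nodes and KNOWS relationships:
CALL gds.graph.project.cypher(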
'persons',
'MATCH (n:Person) RETURN id(n) AS id',
'MATCH (n:Person)-[r:KNOWS]->(m:Person) RETURN id(n) AS source, id(m) AS target')
YIELD
graphName AS graph, nodeQuery, nodeCount AS nodes, relationshipQuery, relationshipCount AS rels
graph | nodeQuery | nodes | relationshipQuery | rels |
---|---|---|---|---|
"persons" | "MATCH (n:Person) RETURN id(n) AS id" | 3 | "MATCH (n:Person)-[r:KNOWS]->(m:Person) RETURN id(n) AS source, id(m) AS target" | 2 |
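Multi-graph
A multi-graph is a graph with multiple node labels and relationship types.
To retain the label and type information when we load multiple node labels and relationship types, we can add a labels column to the node query and a type column to the relationship query.
Project Person and Book nodes and KNOWS and READ relationships:
CALL gds.graph.project.cypher(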
'personsAndBooks',
'MATCH (n) WHERE n:Person OR n:Book RETURN id(n) AS id, labels(n) AS labels',
'MATCH (n)-[r:KNOWS|READ]->(m) RETURN id(n) AS source, id(m) AS target, type(r) AS type')
YIELD
graphName AS graph, nodeQuery, nodeCount AS nodes, relationshipCount AS rels
graph | nodeQuery | nodes | rels |
---|---|---|---|
"personsAndBooks" | "MATCH (n) WHERE n:Person OR n:Book RETURN id(n) AS id, labels(n) AS labels" | 5 | 6 |
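Relationship orientation
The native projection supports specifying an orientation per relationship type. The legacy Cypher projection treats every relationship returned by the relationship query as if it were in NATURAL orientation and creates a directed relationship from the first provided id (source) to the second (target). Projecting in REVERSE orientation can be achieved by switching the order of the ids in the RETURN clause, such as MATCH (n)-[r:KNOWS]->(m) RETURN id(m) AS source, id(n) AS target, type(r) AS type.
It is not possible to project graphs in UNDIRECTED orientation when using the legacy Cypher projection.
Some algorithms require the graph to be projected with UNDIRECTED orientation. These algorithms cannot be used with a graph created by a legacy Cypher projection.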
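Put into a complete call, the REVERSE variant described above could be sketched as follows. The graph name 'personsReversed' is made up for illustration; the queries use the Person label and KNOWS relationship type of the example graph.

```cypher
// Swap source and target in the RETURN clause so that every KNOWS
// relationship is projected pointing from the original target to the original source.
CALL gds.graph.project.cypher(
  'personsReversed',
  'MATCH (n:Person) RETURN id(n) AS id',
  'MATCH (n:Person)-[r:KNOWS]->(m:Person)
   RETURN id(m) AS source, id(n) AS target, type(r) AS type'
)
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount
```

The projected relationships are still directed; only their direction is inverted relative to the database.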
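Node properties
To load node properties, we add a column to the result of the node query for each property. We use the Cypher function coalesce() to specify a default value for nodes that do not have the property.
Project Person and Book nodes and KNOWS and READ relationships:
CALL gds.graph.project.cypher(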
'graphWithProperties',
'MATCH (n)
WHERE n:Book OR n:Person
RETURN
id(n) AS id,
labels(n) AS labels,
coalesce(n.age, 18) AS age,
coalesce(n.price, 5.0) AS price,
n.ratings AS ratings',
'MATCH (n)-[r:KNOWS|READ]->(m) RETURN id(n) AS source, id(m) AS target, type(r) AS type'
)
YIELD
graphName, nodeCount AS nodes, relationshipCount AS rels
RETURN graphName, nodes, rels
graphName | nodes | rels |
---|---|---|
"graphWithProperties" | 5 | 6 |
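The projected graphWithProperties graph contains five nodes and six relationships. In a legacy Cypher projection every node from the nodeQuery gets the same node properties, which means you cannot have label-specific properties. For instance, in the example above the Person nodes also get the ratings and price properties, while the Book nodes get the age property.
Further, the price property has a default value of 5.0. Not every book in the example graph has a price specified. In the following, we check whether the price was projected correctly: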
MATCH (n:Book)
RETURN n.name AS name, gds.util.nodeProperty('graphWithProperties', id(n), 'price') AS price
ORDER BY price
name | price |
---|---|
"The Hobbit" | 5.0 |
"Frankenstein" | 19.99 |
We can see that the price was projected, with The Hobbit receiving the default price of 5.0.
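Relationship properties
Analogous to node properties, we can project relationship properties using the relationshipQuery.
Project Person and Book nodes and READ relationships with the numberOfPages property:
CALL gds.graph.project.cypher(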
'readWithProperties',
'MATCH (n) RETURN id(n) AS id, labels(n) AS labels',
'MATCH (n)-[r:READ]->(m)
RETURN id(n) AS source, id(m) AS target, type(r) AS type, r.numberOfPages AS numberOfPages'
)
YIELD
graphName AS graph, nodeCount AS nodes, relationshipCount AS rels
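graph | nodes | rels |
---|---|---|
"readWithProperties" | 5 | 4 |
Next, we verify that the relationship property numberOfPages was loaded correctly.
Stream the relationship property numberOfPages for each relationship: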
CALL gds.graph.relationshipProperty.stream('readWithProperties', 'numberOfPages')
YIELD sourceNodeId, targetNodeId, propertyValue AS numberOfPages
RETURN
gds.util.asNode(sourceNodeId).name AS person,
gds.util.asNode(targetNodeId).name AS book,
numberOfPages
ORDER BY person ASC, numberOfPages DESC
person | book | numberOfPages |
---|---|---|
"Adam" | "The Hobbit" | 30.0 |
"Florentin" | "The Hobbit" | 42.0 |
"Florentin" | "The Hobbit" | 4.0 |
"Veselin" | "Frankenstein" | NaN |
We can see that numberOfPages was loaded. The default property value is Double.NaN, which can be changed by using the Cypher function coalesce(), as in the previous Node properties example.
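Applied to the relationship query, a default value might be set as in the sketch below. The graph name 'readWithDefaultPages' and the default of 0 are arbitrary choices made for illustration.

```cypher
// READ relationships without a numberOfPages property are projected
// with the value 0 instead of NaN.
CALL gds.graph.project.cypher(
  'readWithDefaultPages',
  'MATCH (n) RETURN id(n) AS id, labels(n) AS labels',
  'MATCH (n)-[r:READ]->(m)
   RETURN id(n) AS source, id(m) AS target, type(r) AS type,
          coalesce(r.numberOfPages, 0) AS numberOfPages'
)
YIELD graphName AS graph, nodeCount AS nodes, relationshipCount AS rels
RETURN graph, nodes, rels
```

Whether 0 is a sensible default depends on the algorithm that later consumes the property, for example whether it is interpreted as a weight or a cost.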
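Parallel relationships
The property graph model in Neo4j supports parallel relationships, i.e., multiple relationships between two nodes. By default, GDS preserves parallel relationships. For some algorithms, we want the projected graph to contain at most one relationship between two nodes.
The simplest way to achieve relationship deduplication is to use the DISTINCT operator in the relationship query. Alternatively, we can aggregate the parallel relationships by using the count() function and store the count as a relationship property.
Project Person and Book nodes and COUNT aggregated READ relationships:
CALL gds.graph.project.cypher(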
'readCount',
'MATCH (n) RETURN id(n) AS id, labels(n) AS labels',
'MATCH (n)-[r:READ]->(m)
RETURN id(n) AS source, id(m) AS target, type(r) AS type, count(r) AS numberOfReads'
)
YIELD
graphName AS graph, nodeCount AS nodes, relationshipCount AS rels
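graph | nodes | rels |
---|---|---|
"readCount" | 5 | 3 |
Next, we verify that the READ relationships were aggregated correctly.
Stream the relationship property numberOfReads for each relationship: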
CALL gds.graph.relationshipProperty.stream('readCount', 'numberOfReads')
YIELD sourceNodeId, targetNodeId, propertyValue AS numberOfReads
RETURN
gds.util.asNode(sourceNodeId).name AS person,
gds.util.asNode(targetNodeId).name AS book,
numberOfReads
ORDER BY numberOfReads DESC, person
person | book | numberOfReads |
---|---|---|
"Florentin" | "The Hobbit" | 2.0 |
"Adam" | "The Hobbit" | 1.0 |
"Veselin" | "Frankenstein" | 1.0 |
We can see that the two READ relationships between Florentin and The Hobbit result in a numberOfReads of 2.
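For comparison, the DISTINCT variant mentioned at the start of this section could be sketched as follows. The graph name 'readDistinct' is made up for illustration. No relationship property is returned here, because distinct property values would make otherwise parallel rows differ and therefore survive deduplication.

```cypher
// DISTINCT collapses rows with identical (source, target, type),
// so parallel READ relationships between the same pair of nodes are projected once.
CALL gds.graph.project.cypher(
  'readDistinct',
  'MATCH (n) RETURN id(n) AS id, labels(n) AS labels',
  'MATCH (n)-[r:READ]->(m)
   RETURN DISTINCT id(n) AS source, id(m) AS target, type(r) AS type'
)
YIELD graphName AS graph, nodeCount AS nodes, relationshipCount AS rels
RETURN graph, nodes, rels
```

DISTINCT is the lighter option when only the existence of a relationship matters; the count() aggregation keeps the multiplicity available as a property.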
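Parallel relationships with properties
For graphs with relationship properties we can also use the other aggregations documented in the Cypher Manual.
Project Person and Book nodes and READ relationships aggregated by summing numberOfPages:
CALL gds.graph.project.cypher(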
'readSums',
'MATCH (n) RETURN id(n) AS id, labels(n) AS labels',
'MATCH (n)-[r:READ]->(m)
RETURN id(n) AS source, id(m) AS target, type(r) AS type, sum(r.numberOfPages) AS numberOfPages'
)
YIELD
graphName AS graph, nodeCount AS nodes, relationshipCount AS rels
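graph | nodes | rels |
---|---|---|
"readSums" | 5 | 3 |
Next, we verify that the relationship property numberOfPages was aggregated correctly.
Stream the relationship property numberOfPages for each relationship: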
CALL gds.graph.relationshipProperty.stream('readSums', 'numberOfPages')
YIELD sourceNodeId, targetNodeId, propertyValue AS numberOfPages
RETURN
gds.util.asNode(sourceNodeId).name AS person,
gds.util.asNode(targetNodeId).name AS book,
numberOfPages
ORDER BY numberOfPages DESC, person
person | book | numberOfPages |
---|---|---|
"Florentin" | "The Hobbit" | 46.0 |
"Adam" | "The Hobbit" | 30.0 |
"Veselin" | "Frankenstein" | 0.0 |
We can see that the two READ relationships between Florentin and The Hobbit sum up to a numberOfPages of 46.
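Projecting filtered Neo4j graphs
Cypher projections allow us to specify the graph to project in a more fine-grained way. The following example demonstrates how to filter out READ relationships that do not have a numberOfPages property.
Project Person and Book nodes and READ relationships where numberOfPages is present:
CALL gds.graph.project.cypher(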
'existingNumberOfPages',
'MATCH (n) RETURN id(n) AS id, labels(n) AS labels',
'MATCH (n)-[r:READ]->(m)
WHERE r.numberOfPages IS NOT NULL
RETURN id(n) AS source, id(m) AS target, type(r) AS type, r.numberOfPages AS numberOfPages'
)
YIELD
graphName AS graph, nodeCount AS nodes, relationshipCount AS rels
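graph | nodes | rels |
---|---|---|
"existingNumberOfPages" | 5 | 3 |
Next, we verify that the relationship property numberOfPages was loaded correctly.
Stream the relationship property numberOfPages for each relationship: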
CALL gds.graph.relationshipProperty.stream('existingNumberOfPages', 'numberOfPages')
YIELD sourceNodeId, targetNodeId, propertyValue AS numberOfPages
RETURN
gds.util.asNode(sourceNodeId).name AS person,
gds.util.asNode(targetNodeId).name AS book,
numberOfPages
ORDER BY person ASC, numberOfPages DESC
person | book | numberOfPages |
---|---|---|
"Adam" | "The Hobbit" | 30.0 |
"Florentin" | "The Hobbit" | 42.0 |
"Florentin" | "The Hobbit" | 4.0 |
Using query parameters
Similar to Cypher, it is also possible to set query parameters. In the following example we supply a minimum number of pages to restrict which READ relationships are projected.
Project Person and Book nodes and READ relationships where numberOfPages is greater than 9:
CALL gds.graph.project.cypher(
'readMoreThanNinePages',
'MATCH (n) RETURN id(n) AS id, labels(n) AS labels',
'MATCH (n)-[r:READ]->(m)
WHERE r.numberOfPages > $minNumberOfPages
RETURN id(n) AS source, id(m) AS target, type(r) AS type, r.numberOfPages AS numberOfPages',
{ parameters: { minNumberOfPages: 9 } }
)
YIELD
graphName AS graph, nodeCount AS nodes, relationshipCount AS rels
graph | nodes | rels |
---|---|---|
"readMoreThanNinePages" | 5 | 2 |
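Further usage of parameters
Parameters can also be used to directly pass in a list of nodes or a list of relationships. For example, pre-computing the list of nodes can be useful if the node filter is expensive.
Project a subset of Person nodes and the KNOWS relationships between them:
CALL gds.graph.project.cypher(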
'personSubset',
'MATCH (n)
WHERE n.age < 20 AND NOT n.name STARTS WITH "V"
RETURN id(n) AS id, labels(n) AS labels',
'MATCH (n)-[r:KNOWS]->(m)
WHERE (n.age < 20 AND NOT n.name STARTS WITH "V") AND
(m.age < 20 AND NOT m.name STARTS WITH "V")
RETURN id(n) AS source, id(m) AS target, type(r) AS type, r.numberOfPages AS numberOfPages'
)
YIELD
graphName, nodeCount AS nodes, relationshipCount AS rels
graphName | nodes | rels |
---|---|---|
"personSubset" | 2 | 1 |
By passing the relevant Person nodes as a parameter, the above query can be transformed into the following:
Project the same subset of Person nodes and KNOWS relationships by passing the nodes as a parameter:
MATCH (n)
WHERE n.age < 20 AND NOT n.name STARTS WITH "V"
WITH collect(n) AS youngPersons
CALL gds.graph.project.cypher(
'personSubsetViaParameters',
'UNWIND $nodes AS n RETURN id(n) AS id, labels(n) AS labels',
'MATCH (n)-[r:KNOWS]->(m)
WHERE (n IN $nodes) AND (m IN $nodes)
RETURN id(n) AS source, id(m) AS target, type(r) AS type, r.numberOfPages AS numberOfPages',
{ parameters: { nodes: youngPersons } }
)
YIELD
graphName, nodeCount AS nodes, relationshipCount AS rels
RETURN graphName, nodes, rels
graphName | nodes | rels |
---|---|---|
"personSubsetViaParameters" | 2 | 1 |