导出到 CSV
CSV 导出过程将数据导出为 Python 和 R 生态系统中的数据科学库更支持的格式。我们也可能希望将数据导出为 JSON 格式,以便导入其他工具或用于查询结果的通用共享。本节中描述的过程支持导出到文件或作为流。
对于 apoc.export.csv.all
、apoc.export.csv.data
和 apoc.export.csv.graph
,节点和关系属性按字母顺序排列,使用以下结构:
_id,_labels,<自然排序的节点属性列表>,_start,_end,_type,<自然排序的关系属性列表>
.
对于包含节点属性 age
、city
、kids
、male
、name
和 street
以及关系属性 bar
和 foo
的图,我们将得到以下结果:
_id,_labels,age,city,kids,male,name,street,_start,_end,_type,bar,foo
导出的标签按字母顺序排列。labels()
函数的输出未排序,请与 apoc.coll.sort()
结合使用。
请注意,为了执行正确的点序列化,不建议导出具有坐标 x,y 和 crs: 'wgs-84' 的点,例如 point({x: 56.7, y: 12.78, crs: 'wgs-84'})
。否则,该点将以经度和纬度(以及高度)而不是 x 和 y(以及 z)导出。
可用过程
下表描述了可用过程:
合格名称 | 类型 |
---|---|
apoc.export.csv.all |
|
apoc.export.csv.data |
|
apoc.export.csv.graph |
|
apoc.export.csv.query |
|
导出到文件
默认情况下,禁用导出到文件系统。我们可以通过在 apoc.conf
中设置以下属性来启用它:
apoc.export.file.enabled=true
有关访问 apoc.conf
的更多信息,请参阅配置选项一章。
如果我们尝试使用任何导出过程而未首先设置此属性,我们将收到以下错误消息:
调用过程失败:原因:java.lang.RuntimeException: 文件导出未启用,请在您的 apoc.conf 中设置 apoc.export.file.enabled=true。否则,如果您在没有文件系统访问权限的云环境中运行,请使用 |
导出文件会写入 import
目录,该目录由 server.directories.import
属性定义。这意味着我们提供的任何文件路径都是相对于此目录的。如果我们尝试写入绝对路径,例如 /tmp/filename
,我们将收到类似以下内容的错误消息:
调用过程失败:原因:java.io.FileNotFoundException: /path/to/neo4j/import/tmp/fileName (没有此类文件或目录) |
我们可以通过在 apoc.conf
中设置以下属性来启用在文件系统上的任何位置写入:
apoc.import.file.use_neo4j_config=false
Neo4j 现在将能够在文件系统上的任何位置写入,因此在设置此属性之前请确保这是您的意图。 |
示例
本节包含演示如何使用 CSV 导出过程的示例。这些示例基于电影数据集,可以通过运行以下 Cypher 查询导入:
CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})
CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (Carrie:Person {name:'Carrie-Anne Moss', born:1967})
CREATE (Laurence:Person {name:'Laurence Fishburne', born:1961})
CREATE (Hugo:Person {name:'Hugo Weaving', born:1960})
CREATE (LillyW:Person {name:'Lilly Wachowski', born:1967})
CREATE (LanaW:Person {name:'Lana Wachowski', born:1965})
CREATE (JoelS:Person {name:'Joel Silver', born:1952})
CREATE
(Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrix),
(Carrie)-[:ACTED_IN {roles:['Trinity']}]->(TheMatrix),
(Laurence)-[:ACTED_IN {roles:['Morpheus']}]->(TheMatrix),
(Hugo)-[:ACTED_IN {roles:['Agent Smith']}]->(TheMatrix),
(LillyW)-[:DIRECTED]->(TheMatrix),
(LanaW)-[:DIRECTED]->(TheMatrix),
(JoelS)-[:PRODUCED]->(TheMatrix);
下面的 Neo4j Browser 可视化显示了导入的图:

将整个数据库导出到 CSV
apoc.export.csv.all
过程将整个数据库导出到 CSV 文件或作为流。
movies.csv
:CALL apoc.export.csv.all("movies.csv", {})
文件 | 来源 | 格式 | 节点 | 关系 | 属性 | 时间 | 行 | 批处理大小 | 批次 | 完成 | 数据 |
---|---|---|---|---|---|---|---|---|---|---|---|
"movies.csv" |
"database: nodes(8), rels(7)" |
"csv" |
8 |
7 |
21 |
39 |
15 |
20000 |
1 |
TRUE |
NULL |
movies.csv
的内容如下所示:
"_id","_labels","born","name","released","tagline","title","_start","_end","_type","roles"
"188",":Movie","","","1999","Welcome to the Real World","The Matrix",,,,
"189",":Person","1964","Keanu Reeves","","","",,,,
"190",":Person","1967","Carrie-Anne Moss","","","",,,,
"191",":Person","1961","Laurence Fishburne","","","",,,,
"192",":Person","1960","Hugo Weaving","","","",,,,
"193",":Person","1967","Lilly Wachowski","","","",,,,
"194",":Person","1965","Lana Wachowski","","","",,,,
"195",":Person","1952","Joel Silver","","","",,,,
,,,,,,,"189","188","ACTED_IN","[""Neo""]"
,,,,,,,"190","188","ACTED_IN","[""Trinity""]"
,,,,,,,"191","188","ACTED_IN","[""Morpheus""]"
,,,,,,,"192","188","ACTED_IN","[""Agent Smith""]"
,,,,,,,"193","188","DIRECTED",""
,,,,,,,"194","188","DIRECTED",""
,,,,,,,"195","188","PRODUCED",""
data
列中返回整个数据库的流:CALL apoc.export.csv.all(null, {stream:true})
YIELD file, nodes, relationships, properties, data
RETURN file, nodes, relationships, properties, data
文件 | 节点 | 关系 | 属性 | 数据 |
---|---|---|---|---|
|
|
|
|
"\"_id\",\"_labels\",\"born\",\"name\",\"released\",\"tagline\",\"title\",\"_start\",\"_end\",\"_type\",\"roles\" \"188\",\":Movie\",\"\",\"\",\"1999\",\"Welcome to the Real World\",\"The Matrix\",,,, \"189\",\":Person\",\"1964\",\"Keanu Reeves\",\"\",\"\",\"\",,,, \"190\",\":Person\",\"1967\",\"Carrie-Anne Moss\",\"\",\"\",\"\",,,, \"191\",\":Person\",\"1961\",\"Laurence Fishburne\",\"\",\"\",\"\",,,, \"192\",\":Person\",\"1960\",\"Hugo Weaving\",\"\",\"\",\"\",,,, \"193\",\":Person\",\"1967\",\"Lilly Wachowski\",\"\",\"\",\"\",,,, \"194\",\":Person\",\"1965\",\"Lana Wachowski\",\"\",\"\",\"\",,,, \"195\",\":Person\",\"1952\",\"Joel Silver\",\"\",\"\",\"\",,,, ,,,,,,,\"189\",\"188\",\"ACTED_IN\",\"[\"\"Neo\"\"]\" ,,,,,,,\"190\",\"188\",\"ACTED_IN\",\"[\"\"Trinity\"\"]\" ,,,,,,,\"191\",\"188\",\"ACTED_IN\",\"[\"\"Morpheus\"\"]\" ,,,,,,,\"192\",\"188\",\"ACTED_IN\",\"[\"\"Agent Smith\"\"]\" ,,,,,,,\"193\",\"188\",\"DIRECTED\",\"\" ,,,,,,,\"194\",\"188\",\"DIRECTED\",\"\" ,,,,,,,\"195\",\"188\",\"PRODUCED\",\"\" " |
将指定节点和关系导出到 CSV
apoc.export.csv.data
过程将指定节点和关系导出到 CSV 文件或作为流。
:Person
标签且 name
属性以 L
开头的节点导出到文件 movies-l.csv
:MATCH (person:Person)
WHERE person.name STARTS WITH "L"
WITH collect(person) AS people
CALL apoc.export.csv.data(people, [], "movies-l.csv", {})
YIELD file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
RETURN file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
文件 | 来源 | 格式 | 节点 | 关系 | 属性 | 时间 | 行 | 批处理大小 | 批次 | 完成 | 数据 |
---|---|---|---|---|---|---|---|---|---|---|---|
"movies-l.csv" |
"data: nodes(3), rels(0)" |
"csv" |
3 |
0 |
6 |
2 |
3 |
20000 |
1 |
TRUE |
NULL |
movies-l.csv
的内容如下所示:
"_id","_labels","born","name","_start","_end","_type"
"191",":Person","1961","Laurence Fishburne",,,
"193",":Person","1967","Lilly Wachowski",,,
"194",":Person","1965","Lana Wachowski",,,
ACTED_IN
关系以及该关系两侧带有 Person
和 Movie
标签的节点导出到文件 movies-actedIn.csv
:MATCH (person:Person)-[actedIn:ACTED_IN]->(movie:Movie)
WITH collect(DISTINCT person) AS people, collect(DISTINCT movie) AS movies, collect(actedIn) AS actedInRels
CALL apoc.export.csv.data(people + movies, actedInRels, "movies-actedIn.csv", {})
YIELD file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
RETURN file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
文件 | 来源 | 格式 | 节点 | 关系 | 属性 | 时间 | 行 | 批处理大小 | 批次 | 完成 | 数据 |
---|---|---|---|---|---|---|---|---|---|---|---|
"movies-actedIn.csv" |
"data: nodes(5), rels(4)" |
"csv" |
5 |
4 |
15 |
2 |
9 |
20000 |
1 |
TRUE |
NULL |
movies-actedIn.csv
的内容如下所示:
"_id","_labels","born","name","released","tagline","title","_start","_end","_type","roles"
"189",":Person","1964","Keanu Reeves","","","",,,,
"190",":Person","1967","Carrie-Anne Moss","","","",,,,
"191",":Person","1961","Laurence Fishburne","","","",,,,
"192",":Person","1960","Hugo Weaving","","","",,,,
"188",":Movie","","","1999","Welcome to the Real World","The Matrix",,,,
,,,,,,,"189","188","ACTED_IN","[""Neo""]"
,,,,,,,"190","188","ACTED_IN","[""Trinity""]"
,,,,,,,"191","188","ACTED_IN","[""Morpheus""]"
,,,,,,,"192","188","ACTED_IN","[""Agent Smith""]"
data
列中返回所有 ACTED_IN
关系以及该关系两侧带有 Person
和 Movie
标签的节点流:MATCH (person:Person)-[actedIn:ACTED_IN]->(movie:Movie)
WITH collect(DISTINCT person) AS people, collect(DISTINCT movie) AS movies, collect(actedIn) AS actedInRels
CALL apoc.export.csv.data(people + movies, actedInRels, null, {stream: true})
YIELD file, nodes, relationships, properties, data
RETURN file, nodes, relationships, properties, data
文件 | 节点 | 关系 | 属性 | 数据 |
---|---|---|---|---|
|
|
|
|
"\"_id\",\"_labels\",\"born\",\"name\",\"released\",\"tagline\",\"title\",\"_start\",\"_end\",\"_type\",\"roles\" \"190\",\":Person\",\"1967\",\"Carrie-Anne Moss\",\"\",\"\",\"\",,,, \"189\",\":Person\",\"1964\",\"Keanu Reeves\",\"\",\"\",\"\",,,, \"191\",\":Person\",\"1961\",\"Laurence Fishburne\",\"\",\"\",\"\",,,, \"192\",\":Person\",\"1960\",\"Hugo Weaving\",\"\",\"\",\"\",,,, \"188\",\":Movie\",\"\",\"\",\"1999\",\"Welcome to the Real World\",\"The Matrix\",,,, ,,,,,,,\"189\",\"188\",\"ACTED_IN\",\"[\"\"Neo\"\"]\" ,,,,,,,\"190\",\"188\",\"ACTED_IN\",\"[\"\"Trinity\"\"]\" ,,,,,,,\"191\",\"188\",\"ACTED_IN\",\"[\"\"Morpheus\"\"]\" ,,,,,,,\"192\",\"188\",\"ACTED_IN\",\"[\"\"Agent Smith\"\"]\" " |
将 Cypher 查询结果导出到 CSV
apoc.export.csv.query
过程将 Cypher 查询的结果导出到 CSV 文件或作为流。
DIRECTED
关系以及该关系两侧带有 Person
和 Movie
标签的节点导出到文件 movies-directed.csv
:WITH "MATCH path = (person:Person)-[:DIRECTED]->(movie)
RETURN person.name AS name, person.born AS born,
movie.title AS title, movie.tagline AS tagline, movie.released AS released" AS query
CALL apoc.export.csv.query(query, "movies-directed.csv", {})
YIELD file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
RETURN file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data;
文件 | 来源 | 格式 | 节点 | 关系 | 属性 | 时间 | 行 | 批处理大小 | 批次 | 完成 | 数据 |
---|---|---|---|---|---|---|---|---|---|---|---|
"movies-directed.csv" |
"statement: cols(5)" |
"csv" |
0 |
0 |
10 |
3 |
2 |
20000 |
1 |
TRUE |
NULL |
movies-directed.csv
的内容如下所示:
"name","born","role","title","tagline","released"
"Lilly Wachowski","1967","","The Matrix","Welcome to the Real World","1999"
"Lana Wachowski","1965","","The Matrix","Welcome to the Real World","1999"
DIRECTED
关系以及该关系两侧带有 Person
和 Movie
标签的节点流:WITH "MATCH path = (person:Person)-[:DIRECTED]->(movie)
RETURN person.name AS name, person.born AS born,
movie.title AS title, movie.tagline AS tagline, movie.released AS released" AS query
CALL apoc.export.csv.query(query, null, {stream: true})
YIELD file, nodes, relationships, properties, data
RETURN file, nodes, relationships, properties, data;
文件 | 节点 | 关系 | 属性 | 数据 |
---|---|---|---|---|
|
|
|
|
"\"name\",\"born\",\"title\",\"tagline\",\"released\" \"Lilly Wachowski\",\"1967\",\"The Matrix\",\"Welcome to the Real World\",\"1999\" \"Lana Wachowski\",\"1965\",\"The Matrix\",\"Welcome to the Real World\",\"1999\" " |
您也可以压缩要导出的文件。有关更多信息,请参见此处 |
当启用 bulkImport
配置时,它会创建一个文件列表,可用于 Neo4j 批量导入。
此配置仅可与 apoc.export.csv.all
和 apoc.export.csv.graph
一起使用。
所有创建的文件都按以下方式命名:
-
节点文件以输入文件名为基础,并附加
.nodes.[LABEL_NAME].csv
-
关系文件以输入文件名为基础,并附加
.relationships.[TYPE_NAME].csv
如果节点或关系有多个标签/类型,它将为每个标签/类型创建一个文件。
配置参数
这些过程支持以下配置参数:
参数 | 默认值 | 描述 |
---|---|---|
批处理大小 |
20000 |
批处理大小。 |
分隔符 |
"," |
CSV 文件的分隔符字符。 |
数组分隔符 |
";" |
用于数组的分隔符字符(用于批量导入)。 |
引号 |
'always' |
CSV 使用的引用字符,可能的值有:
|
区分空值 |
false |
区分空字符串和空值。其行为受
|
使用类型 |
false |
在文件头中添加类型。 |
批量导入 |
true |
为 Neo4j Admin 导入创建文件。 |
超时秒数 |
100 |
查询在超时前应运行的最长时间(秒)。 |
独立标题 |
false |
创建两个文件:一个用于标题,一个用于数据。 |
流式语句 |
false |
通过配置 |
流 |
false |
等同于 |
引号配置示例
可以使用不同的引用策略将图导出到 CSV 文件。
以下查询在一个包含一个节点的图上执行:
CREATE (:Quote {name: 'foo,bar,baz', array:["a","b","c"], other: 123})
不带引号导出
CALL apoc.export.csv.all("fileNoQuote.csv", {quotes: 'none'})
返回以下 CSV 文件:
_id,_labels,array,name,other,_start,_end,_type
5,:Quote,["a","b","c"],foo,bar,baz,123,,,
区分空值配置示例
默认情况下,空字符串和空值被视为相同,但可以使用 differentiateNulls
配置项更改此行为。
以下查询在包含两个节点的图上执行:
CREATE (:Quote {value1: "Hello", value2: ""})
CREATE (:Quote {value1: "I have a \"null\" value", value2: null})
带引号并区分空值导出
CALL apoc.export.csv.all("fileAlwaysQuote.csv", {quotes: 'always', differentiateNulls: true})
返回以下 CSV 文件:
"_id","_labels","value1","value2","_start","_end","_type"
"0",":Quote","Hello","",,,
"1",":Quote","I have a ""null"" value",,,,
value2
的空字符串值用引号 ""
表示,而空值则留空。
仅在需要时带引号并区分空值导出
CALL apoc.export.csv.all("fileIfNeededQuote.csv", {quotes: 'ifNeeded', differentiateNulls: true})
返回以下 CSV 文件:
_id,_labels,value1,value2,_start,_end,_type
0,:Quote,Hello,"",,,
1,:Quote,"I have a ""null"" value",,,,
字符串 Hello
不带引号,因为它不包含任何特殊字符。字符串 I have a "null" value
带引号,因为它包含特殊字符。空字符串带引号是为了将其与空值区分开来。空值为空。