Database compaction in 4.0 using neo4j-admin copy
This article demonstrates how to use the neo4j-admin copy tool to reclaim unused space held by the Neo4j store files.
1). Add 100k nodes: foreach (x in range (1,100000) | create (n:testnode1 {id:x}))
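2). Check the allocated ID range: MATCH (n:testnode1) RETURN ID(n) as ID order by ID limit 5
Ordered ascending, the IDs are 0, 1, 2, 3, 4; ordered descending (the same query with order by ID desc), they are 99999, 99998, 99997, 99996, 99995.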
3). Run :sysinfo:
Total store size = 18.6 MiB; ID allocation: Node ID 100000, Property ID 100000.
4). Delete the nodes created above with Match (n) detach delete n
5). Run :sysinfo again. It reports the same total store size as before the delete:
Total store size = 18.6 MiB; ID allocation: Node ID 100000, Property ID 100000.
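6). Next, run a full neo4j-admin backup (https://neo4j.ac.cn/docs/operations-manual/current/backup-restore/online-backup/) to take an online backup. By default the backup performs a checkpoint, which flushes any cached updates in the page cache to the store files.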
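7). After step 6, it appears that the allocated IDs are unchanged and that the store size has not shrunk, even though the nodes were deleted. In this situation, or in a production database that sees frequent large load/delete operations, the store files can end up holding a large amount of unused space. To reclaim it, use the neo4j-admin copy tool (https://neo4j.ac.cn/docs/operations-manual/current/tools/neo4j-admin/#neo4j-admin-syntax-and-commands), which was introduced in 4.0 and is effectively store-utils merged into the product. The tool can be run against the backup taken in step 6. Note that neo4j-admin copy can only be run on an offline database or on a backup.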
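To make the behaviour in steps 3 to 7 more concrete, here is a minimal sketch of a fixed-size record file in which deleting a record only clears its in-use flag. It is a toy model under stated assumptions, not Neo4j's actual storage engine: `ToyRecordStore`, `RECORD_SIZE` and `compact_copy` are hypothetical names invented for this illustration, and ID reuse is deliberately left out.

```python
# Toy model only: NOT Neo4j's real storage engine.
RECORD_SIZE = 15  # bytes per record; an illustrative value chosen for this sketch


class ToyRecordStore:
    """Fixed-size record file where a record's ID is its slot index."""

    def __init__(self):
        self.in_use = []  # in_use[i] is True while record i holds live data

    def create(self):
        # The toy always appends a new slot; ID reuse is left out on purpose.
        self.in_use.append(True)
        return len(self.in_use) - 1

    def delete(self, record_id):
        # Deleting only flips the in-use flag; the slot stays in the file.
        self.in_use[record_id] = False

    def high_id(self):
        # Number of slots ever allocated, whether live or deleted.
        return len(self.in_use)

    def size_bytes(self):
        return len(self.in_use) * RECORD_SIZE

    def compact_copy(self):
        # Analogous to a store copy: only live records reach the fresh store.
        fresh = ToyRecordStore()
        for used in self.in_use:
            if used:
                fresh.create()
        return fresh


store = ToyRecordStore()
ids = [store.create() for _ in range(100_000)]
size_before_delete = store.size_bytes()

for record_id in ids:
    store.delete(record_id)
size_after_delete = store.size_bytes()

compacted = store.compact_copy()
print(size_before_delete, size_after_delete, compacted.size_bytes())
# prints: 1500000 1500000 0
```

In this model the file size and the high ID are identical before and after the delete, which mirrors the unchanged :sysinfo figures in steps 3 and 5. Only the copy into a fresh store drops the unused slots.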
8). Run neo4j-admin copy, for example:
$ ./bin/neo4j-admin copy --from-database=neo4j --to-database=1/backups/copy
Starting to copy store, output will be saved to: /$neo4j_home/logs/neo4j-admin-copy-2020-01-16.12.06.38.log
2020-01-16 12:06:38.777+0000 INFO [StoreCopy] ### Copy Data ###
2020-01-16 12:06:38.778+0000 INFO [StoreCopy] Source: /Users/um/neo4j/4.0/cc/1/data/databases/neo4j
2020-01-16 12:06:38.778+0000 INFO [StoreCopy] Target: /Users/um/neo4j/4.0/cc/1/data/databases/1/backups/copy
2020-01-16 12:06:38.779+0000 INFO [StoreCopy] Empty database created, will start importing readable data from the source.
2020-01-16 12:06:40.159+0000 INFO [o.n.i.b.ImportLogic] Import starting
Import starting 2020-01-16 12:06:40.227+0000
Estimated number of nodes: 0.00
Estimated number of node properties: 0.00
Estimated number of relationships: 0.00
Estimated number of relationship properties: 0.00
Estimated disk space usage: 3.922MiB
Estimated required memory usage: 7.969MiB
(1/4) Node import 2020-01-16 12:06:40.604+0000
Estimated number of nodes: 0.00
Estimated disk space usage: 1.961MiB
Estimated required memory usage: 7.969MiB
(2/4) Relationship import 2020-01-16 12:06:42.804+0000
Estimated number of relationships: 0.00
Estimated disk space usage: 1.961MiB
Estimated required memory usage: 7.969MiB
(3/4) Relationship linking 2020-01-16 12:06:43.046+0000
Estimated required memory usage: 7.969MiB
(4/4) Post processing 2020-01-16 12:06:43.461+0000
Estimated required memory usage: 7.969MiB
-......... .......... .......... .......... .......... 5% ∆226ms
.......... .......... .......... .......... .......... 10% ∆1ms
.......... .......... .......... .......... .......... 15% ∆1ms
.......... .......... .......... .......... .......... 20% ∆1ms
.......... .......... .......... .......... .......... 25% ∆0ms
.......... .......... .......... .......... .......... 30% ∆1ms
.......... .......... .......... .......... .......... 35% ∆0ms
.......... .......... .......... .......... .......... 40% ∆1ms
.......... .......... .......... .......... .......... 45% ∆0ms
.......... .......... .......... .......... .......... 50% ∆1ms
.......... .......... .......... .......... .......... 55% ∆0ms
.......... .......... .......... .......... .......... 60% ∆0ms
.......... .......... .......... .......... .......... 65% ∆1ms
.......... .......... .......... .......... .......... 70% ∆0ms
.......... .......... .......... .......... .......... 75% ∆1ms
.......... .......... .......... .......... .......... 80% ∆0ms
.......... .......... .......... .......... .......... 85% ∆0ms
.......... .......... .......... .......... .......... 90% ∆1ms
.......... .......... .......... .......... .......... 95% ∆0ms
.......... .......... .......... .......... .......... 100% ∆1ms
IMPORT DONE in 3s 860ms.
Imported:
0 nodes
0 relationships
0 properties
Peak memory usage: 7.969MiB
2020-01-16 12:06:44.031+0000 INFO [o.n.i.b.ImportLogic] Import completed successfully, took 3s 860ms. Imported:
0 nodes
0 relationships
0 properties
2020-01-16 12:06:44.318+0000 INFO [StoreCopy] Import summary: Copying of 200622 records took 5 seconds (40124 rec/s). Unused Records 200622 (100%) Removed Records 0 (0%)
2020-01-16 12:06:44.318+0000 INFO [StoreCopy] ### Extracting schema ###
2020-01-16 12:06:44.319+0000 INFO [StoreCopy] Trying to extract schema...
2020-01-16 12:06:44.330+0000 INFO [StoreCopy] ... found 0 schema definition. The following can be used to recreate the schema:
2020-01-16 12:06:44.332+0000 INFO [StoreCopy]
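The example above completed in about 6 seconds and produced a compact, consistent store (any inconsistent nodes, properties, or relationships are not copied to the newly created store). Also note that "/copy" above was created in $neo4j_home/data/databases/1/backups/copy, as the Target line in the log shows, and not in /current-directory/1/backups/copy, because the copy tool prepends $neo4j_home/data/databases/ to the specified name.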
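The "Import summary" line is the quickest way to see how much of the source store was dead space. The following is a small sketch that pulls the counts out of that line. The regular expression is a hypothetical pattern written for the exact line shown in the log above, so treat it as an assumption that may need adjusting for other versions of the tool.

```python
import re

# Summary line copied verbatim from the log in step 8.
SUMMARY = (
    "2020-01-16 12:06:44.318+0000 INFO [StoreCopy] Import summary: "
    "Copying of 200622 records took 5 seconds (40124 rec/s). "
    "Unused Records 200622 (100%) Removed Records 0 (0%)"
)

# Hypothetical pattern for this one log line; other versions may format it differently.
PATTERN = re.compile(
    r"Copying of (?P<total>\d+) records took (?P<seconds>\d+) seconds.*?"
    r"Unused Records (?P<unused>\d+) \(\d+%\) "
    r"Removed Records (?P<removed>\d+) \(\d+%\)"
)


def parse_summary(line):
    """Return the record counts from a StoreCopy summary line as integers."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like a StoreCopy import summary")
    return {name: int(value) for name, value in match.groupdict().items()}


stats = parse_summary(SUMMARY)
unused_share = stats["unused"] / stats["total"]
print(stats, f"{unused_share:.0%} unused")
```

For this run every one of the 200622 scanned records was unused, which is consistent with the "Imported: 0 nodes, 0 relationships, 0 properties" lines earlier in the log.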
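9). We can then restore the copy above, as on a standalone Neo4j 4.0 instance, and compare the resulting store size with the 18.6 MiB reported in steps 3 and 5. Run ./sa/bin/neo4j-admin restore --from=cc/1/data/databases/1/backups/copy --verbose --database=sa/data/databases/neo4j --force
Note that the restored neo4j database is placed in $neo4j_home/data/databases/sa/data/databases, again because the specified directory is combined with $neo4j_home/data/databases.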
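The path notes in steps 8 and 9 describe the same rule: the value passed on the command line is treated as relative to the databases directory of the Neo4j home, not to the current working directory. Here is a minimal sketch of that rule, checked against the Target line of the copy log. `resolve_database_dir` is a hypothetical helper that mirrors the behaviour described in this article, not code taken from the tool.

```python
from pathlib import PurePosixPath


def resolve_database_dir(neo4j_home, database_name):
    """Join the databases root and the name given on the command line.

    Assumption: this mirrors the behaviour described in the article,
    not the tool's actual source code.
    """
    return PurePosixPath(neo4j_home) / "data" / "databases" / database_name


# Neo4j home inferred from the "Source:" line of the log in step 8.
home = "/Users/um/neo4j/4.0/cc/1"
target = resolve_database_dir(home, "1/backups/copy")
print(target)
# prints: /Users/um/neo4j/4.0/cc/1/data/databases/1/backups/copy
```

The printed path matches the "Target:" line in the log, which is why the copy did not appear under the directory the command was run from.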
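10). Finally, compare the total store size now (after compaction) with the earlier value.
:sysinfo on the restored database now shows, for this example, Total store size = 800.00 KiB.
This shows that the neo4j-admin copy tool has compacted the store and that the operating system has reclaimed the space the ID store had kept reserved for future ID creation.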
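The two :sysinfo figures use different units, so a short calculation helps put the saving in one unit. This sketch uses only the numbers reported in this article (18.6 MiB in steps 3 and 5, 800.00 KiB in step 10); the variable names are illustrative.

```python
KIB_PER_MIB = 1024

before_mib = 18.6   # total store size from steps 3 and 5
after_kib = 800.0   # total store size from step 10

after_mib = after_kib / KIB_PER_MIB
saved_mib = before_mib - after_mib
reduction = saved_mib / before_mib

print(f"after: {after_mib:.2f} MiB, saved: {saved_mib:.2f} MiB ({reduction:.1%})")
# prints: after: 0.78 MiB, saved: 17.82 MiB (95.8%)
```

The size of the reduction depends on how much of the store was unused. Here the database had been emptied completely, so nearly all of the space was reclaimable.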