Database Compaction in 4.0 Using neo4j-admin copy
This article demonstrates how to use the neo4j-admin copy tool to reclaim unused space held by Neo4j store files.
1). Add 100,000 nodes: foreach (x in range (1,100000) | create (n:testnode1 {id:x}))
2). Check the allocated ID range: MATCH (n:testnode1) RETURN ID(n) as ID order by ID limit 5
In ascending order the IDs are 0, 1, 2, 3, 4; in descending order (order by ID desc) they are 99999, 99998, 99997, 99996, 99995.
3). Run :sysinfo, which reports: Total Store Size = 18.6 MiB; ID Allocation: Node ID 100000, Property ID 100000.
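4). Delete the nodes created above with: Match (n) detach delete n
Note that this statement deletes every node in the database, not only the testnode1 nodes.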
5). Run :sysinfo again. It still reports: Total Store Size = 18.6 MiB; ID Allocation: Node ID 100000, Property ID 100000.
6). Next, take a full online backup with neo4j-admin backup (https://neo4j.ac.cn/docs/operations-manual/current/backup-restore/online-backup/). By default the backup performs a checkpoint, which flushes any updates cached in the page cache to the store files.
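7). As the steps above show, the allocated IDs remain unchanged and the store size has not changed despite the deletion. At this point, or on a production database where frequent bulk loads and deletes leave the store files holding a large amount of unused space, we can use the neo4j-admin copy tool introduced in 4.0 (essentially store-utils merged into the product) (https://neo4j.ac.cn/docs/operations-manual/current/tools/neo4j-admin/#neo4j-admin-syntax-and-commands). We can then run neo4j-admin copy against the backup taken in step 6. Note that neo4j-admin copy can only be run against an offline database or a backup.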
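To illustrate why deleting data does not shrink the files, here is a toy model in Python. It is only a sketch under simplifying assumptions and not Neo4j's actual storage code: the ToyStore class, the RECORD_SIZE value, and the method names are all hypothetical. It treats a store file as a run of fixed-size record slots, where a delete only marks a slot as unused and a copy writes just the in-use records into a fresh store.

```python
from dataclasses import dataclass, field
from typing import List, Optional

RECORD_SIZE = 16  # hypothetical slot size in bytes, chosen only for illustration


@dataclass
class ToyStore:
    """Toy fixed-size record store; not Neo4j's real on-disk format."""
    slots: List[Optional[dict]] = field(default_factory=list)

    def create(self, record: dict) -> int:
        # Append a record; its ID is simply its slot position.
        self.slots.append(record)
        return len(self.slots) - 1

    def delete(self, record_id: int) -> None:
        # Mark the slot unused; the slot itself stays in the file.
        self.slots[record_id] = None

    def size_bytes(self) -> int:
        # File size follows the number of allocated slots, not live records.
        return len(self.slots) * RECORD_SIZE

    def compact_copy(self) -> "ToyStore":
        # Copy only in-use records into a new store, skipping unused slots.
        fresh = ToyStore()
        for record in self.slots:
            if record is not None:
                fresh.create(record)
        return fresh


store = ToyStore()
for x in range(1, 100001):
    store.create({"id": x})
size_before_delete = store.size_bytes()

for record_id in range(len(store.slots)):
    store.delete(record_id)
size_after_delete = store.size_bytes()

compacted = store.compact_copy()
size_after_copy = compacted.size_bytes()

print(size_before_delete, size_after_delete, size_after_copy)
```

In this toy model the size is unchanged by the deletes and drops only after the copy, which mirrors the :sysinfo readings in steps 3 and 5.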
8). Run the neo4j-admin copy command, for example:
$ ./bin/neo4j-admin copy --from-database=neo4j --to-database=1/backups/copy
Starting to copy store, output will be saved to: /$neo4j_home/logs/neo4j-admin-copy-2020-01-16.12.06.38.log
2020-01-16 12:06:38.777+0000 INFO [StoreCopy] ### Copy Data ###
2020-01-16 12:06:38.778+0000 INFO [StoreCopy] Source: /Users/um/neo4j/4.0/cc/1/data/databases/neo4j
2020-01-16 12:06:38.778+0000 INFO [StoreCopy] Target: /Users/um/neo4j/4.0/cc/1/data/databases/1/backups/copy
2020-01-16 12:06:38.779+0000 INFO [StoreCopy] Empty database created, will start importing readable data from the source.
2020-01-16 12:06:40.159+0000 INFO [o.n.i.b.ImportLogic] Import starting
Import starting 2020-01-16 12:06:40.227+0000
Estimated number of nodes: 0.00
Estimated number of node properties: 0.00
Estimated number of relationships: 0.00
Estimated number of relationship properties: 0.00
Estimated disk space usage: 3.922MiB
Estimated required memory usage: 7.969MiB
(1/4) Node import 2020-01-16 12:06:40.604+0000
Estimated number of nodes: 0.00
Estimated disk space usage: 1.961MiB
Estimated required memory usage: 7.969MiB
(2/4) Relationship import 2020-01-16 12:06:42.804+0000
Estimated number of relationships: 0.00
Estimated disk space usage: 1.961MiB
Estimated required memory usage: 7.969MiB
(3/4) Relationship linking 2020-01-16 12:06:43.046+0000
Estimated required memory usage: 7.969MiB
(4/4) Post processing 2020-01-16 12:06:43.461+0000
Estimated required memory usage: 7.969MiB
-......... .......... .......... .......... .......... 5% ∆226ms
.......... .......... .......... .......... .......... 10% ∆1ms
.......... .......... .......... .......... .......... 15% ∆1ms
.......... .......... .......... .......... .......... 20% ∆1ms
.......... .......... .......... .......... .......... 25% ∆0ms
.......... .......... .......... .......... .......... 30% ∆1ms
.......... .......... .......... .......... .......... 35% ∆0ms
.......... .......... .......... .......... .......... 40% ∆1ms
.......... .......... .......... .......... .......... 45% ∆0ms
.......... .......... .......... .......... .......... 50% ∆1ms
.......... .......... .......... .......... .......... 55% ∆0ms
.......... .......... .......... .......... .......... 60% ∆0ms
.......... .......... .......... .......... .......... 65% ∆1ms
.......... .......... .......... .......... .......... 70% ∆0ms
.......... .......... .......... .......... .......... 75% ∆1ms
.......... .......... .......... .......... .......... 80% ∆0ms
.......... .......... .......... .......... .......... 85% ∆0ms
.......... .......... .......... .......... .......... 90% ∆1ms
.......... .......... .......... .......... .......... 95% ∆0ms
.......... .......... .......... .......... .......... 100% ∆1ms
IMPORT DONE in 3s 860ms.
Imported:
0 nodes
0 relationships
0 properties
Peak memory usage: 7.969MiB
2020-01-16 12:06:44.031+0000 INFO [o.n.i.b.ImportLogic] Import completed successfully, took 3s 860ms. Imported:
0 nodes
0 relationships
0 properties
2020-01-16 12:06:44.318+0000 INFO [StoreCopy] Import summary: Copying of 200622 records took 5 seconds (40124 rec/s). Unused Records 200622 (100%) Removed Records 0 (0%)
2020-01-16 12:06:44.318+0000 INFO [StoreCopy] ### Extracting schema ###
2020-01-16 12:06:44.319+0000 INFO [StoreCopy] Trying to extract schema...
2020-01-16 12:06:44.330+0000 INFO [StoreCopy] ... found 0 schema definition. The following can be used to recreate the schema:
2020-01-16 12:06:44.332+0000 INFO [StoreCopy]
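The example above completed in about 6 seconds and produced a compact, consistent store (any inconsistent nodes, properties, or relationships are not copied to the newly created store). Also note that, as the Target line in the log shows, 'copy' was created under $neo4j_home/data/databases/1/backups/copy and not under /current-directory/1/backups/copy, because the copy tool resolves the destination as $neo4j_home/data/databases/<database_name>, where <database_name> is the value given to --to-database.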
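The import summary line in the log is a quick way to gauge how much of the store was reclaimable. The following is a minimal Python sketch, assuming the line keeps the exact wording shown in the log above. The regular expression and the parse_summary helper are illustrative only and are not part of any Neo4j tooling.

```python
import re

SUMMARY = (
    "2020-01-16 12:06:44.318+0000 INFO [StoreCopy] Import summary: "
    "Copying of 200622 records took 5 seconds (40124 rec/s). "
    "Unused Records 200622 (100%) Removed Records 0 (0%)"
)

# Pattern written against the sample line above; other versions may log differently.
PATTERN = re.compile(
    r"Copying of (?P<total>\d+) records took (?P<seconds>\d+) seconds "
    r"\((?P<rate>\d+) rec/s\)\. "
    r"Unused Records (?P<unused>\d+) \(\d+%\) "
    r"Removed Records (?P<removed>\d+) \(\d+%\)"
)


def parse_summary(line: str) -> dict:
    """Return the integer counters from a StoreCopy import summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like an import summary")
    return {name: int(value) for name, value in match.groupdict().items()}


stats = parse_summary(SUMMARY)
# Share of scanned records that were unused (guard against an empty store).
unused_fraction = stats["unused"] / stats["total"] if stats["total"] else 0.0
print(stats, unused_fraction)
```

Here every scanned record was unused, which is what one would expect from a database whose contents were all deleted before the copy.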
9). We can then restore the copy onto a standalone Neo4j 4.0 instance and compare its store size with the earlier 61.6 MiB: ./sa/bin/neo4j-admin restore --from=cc/1/data/databases/1/backups/copy --verbose --database=sa/data/databases/neo4j --force
Note that the neo4j database is restored to $neo4j_home/data/databases/sa/data/databases, because the specified destination is again prefixed with $neo4j_home/data/databases.
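10). Finally, compare the total store size now (after compaction) with the earlier value.
In this example, sysinfo on the restored database shows Total Store Size = 800.00 KiB.
This shows that the neo4j-admin copy tool compacted the store and that the space the ID stores had reserved for future ID creation was returned to the operating system.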
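The size of the saving can be worked out from the two sysinfo readings. This short Python sketch assumes the 18.6 MiB figure from steps 3 and 5 as the "before" size and 800.00 KiB as the "after" size; the variable names are illustrative, and the figures should be replaced with the readings from your own database.

```python
KIB_PER_MIB = 1024

size_before_kib = 18.6 * KIB_PER_MIB  # 18.6 MiB reported by :sysinfo in steps 3 and 5
size_after_kib = 800.00               # 800.00 KiB reported on the restored database

# Percentage of the original store size that was reclaimed.
reduction_pct = (1 - size_after_kib / size_before_kib) * 100
# How many times smaller the compacted store is.
shrink_factor = size_before_kib / size_after_kib

print(f"reduction: {reduction_pct:.1f}%, factor: {shrink_factor:.1f}x")
```

With these figures the compacted store is roughly 24 times smaller than the original, a reduction of about 96%.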