Tutorial: Back up and copy a single database in a running cluster
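This tutorial provides a detailed example of how to back up a single database (version 3.5 in this example) and use the neo4j-admin copy command to copy it into a running 4.x Neo4j cluster.
The neo4j-admin copy command can be used to clean up database inconsistencies, compact stores, and upgrade/migrate a database (from Community or Enterprise Edition) to a later version of Neo4j Enterprise Edition. Because neo4j-admin copy does not copy the schema store, the intermediate steps of the sequential path are not needed. If a schema is defined, the commands that the neo4j-admin copy operation outputs can be used to create the new schema.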
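Keep in mind that neo4j-admin copy does not preserve relationship IDs. Therefore, if you want to preserve the relationship IDs, or to upgrade the whole DBMS, you should follow the sequential path.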
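It is worth noting that neo4j-admin copy is an IOPS-intensive process, so its running time can be estimated from the IOPS rating of the disk.
For example, if your disk manufacturer specifies a maximum of 5000 IOPS, you can reasonably expect up to 5000 page operations per second. The maximum theoretical throughput you can expect on that disk is therefore 40 MB/s (or 144 GB/hour). You can then assume that, in the best case, running neo4j-admin copy on that 5000 IOPS disk takes at least 1 hour to process a 144 GB database.
However, it is important to remember that the process must read 144 GB from the source database and must also write to the target store (assuming the target store is of comparable size). In addition, internal processes during the copy read, modify, and write the store multiple times. Therefore, with an additional 144 GB of both reads and writes, the best case for running neo4j-admin copy on a 5000 IOPS disk is at least 3 hours to process a 144 GB database.
Finally, it is also important to consider that in almost all cloud environments the published IOPS value may differ from the actual value, or the maximum possible IOPS may not be sustained continuously. The actual processing time for this example could be well above the 3-hour estimate.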
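The estimate above can be reproduced with a few lines of shell arithmetic. This is only a sketch: it uses the tutorial's formula MB/s = (IOPS * B) ÷ 10^6 with B = 8000 bytes, and the function names and the "passes" parameter (how many times the store volume is moved through the disk) are hypothetical simplifications introduced for illustration.

```shell
#!/usr/bin/env bash
# Rough best-case estimate of neo4j-admin copy run time from disk IOPS.
# Integer arithmetic only; function names are illustrative, not part of Neo4j.

# throughput_mb_s IOPS BLOCK_BYTES  ->  MB/s = (IOPS * B) / 10^6
throughput_mb_s() {
  echo $(( $1 * $2 / 1000000 ))
}

# throughput_gb_h MB_PER_S  ->  GB/hour = (MB/s * 3600) / 1000
throughput_gb_h() {
  echo $(( $1 * 3600 / 1000 ))
}

# estimate_hours STORE_GB GB_PER_HOUR PASSES  ->  whole hours, rounded up
# PASSES is an assumed multiplier for the total I/O volume.
estimate_hours() {
  local total=$(( $1 * $3 ))
  echo $(( (total + $2 - 1) / $2 ))
}

mb_s=$(throughput_mb_s 5000 8000)   # 40
gb_h=$(throughput_gb_h "$mb_s")     # 144
echo "throughput: ${mb_s} MB/s (${gb_h} GB/hour)"
echo "144 GB, one pass:     $(estimate_hours 144 "$gb_h" 1) h"
echo "144 GB, three passes: $(estimate_hours 144 "$gb_h" 3) h"
```

With the example values this prints 40 MB/s, 144 GB/hour, 1 hour for a single pass, and 3 hours for three passes, matching the figures in the text. Real disks, especially in cloud environments, may not sustain their published IOPS, so treat the result as a lower bound.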
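This tutorial walks through the basics of checking the database store usage (version 3.5 in this example), performing a backup, compacting the database backup (using neo4j-admin copy), and creating it in a running Neo4j 4.x cluster.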
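Check the 3.5 database store usage
Before backing up and copying the 3.5 database, look at the database store usage and see how it changes when you load, delete, and then reload data.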
- Log in to Neo4j Browser of the running 3.5 Neo4j standalone instance and add 100k nodes to the graph.db database using the following command:
  FOREACH (x IN RANGE (1,100000) | CREATE (n:Person {name:x}))
- Create an index on the name property of the Person node:
  CREATE INDEX ON :Person(name)
- Use the dbms.checkpoint() procedure to flush all cached updates from the page cache to the store files:
  CALL dbms.checkpoint()
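- In your terminal, navigate to the graph.db database ($NEO4J_HOME/data/databases/graph.db) and run the following command to check the store size of the loaded nodes and properties:
  ls -alh
  ...
  -rw-r--r-- 1 username staff 1.4M 26 Nov 15:51 neostore.nodestore.db
  -rw-r--r-- 1 username staff 3.9M 26 Nov 15:51 neostore.propertystore.db
  ...
  The output reports that the node store (neostore.nodestore.db) and the property store (neostore.propertystore.db) occupy 1.4M and 3.9M, respectively.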
在 Neo4j Browser 中,删除上面创建的节点,并再次运行
CALL dbms.checkpoint
以强制执行检查点。MATCH (n) DETACH DELETE n
CALL dbms.checkpoint()
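- Now add just one node, force a checkpoint, and repeat step 4 (the ls -alh check) to see whether the store size has changed:
  CREATE (n:Person {name:"John"})
  CALL dbms.checkpoint()
  If you check the size of the node store and the property store now, they are still 1.4M and 3.9M, even though the database contains only one node and one property. Neo4j does not shrink the store files on disk.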
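In a production database where many load/delete operations are performed, the result is a significant amount of unused space occupied by the store files.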
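To compare the store files before and after the delete/reload cycle without reading the ls listing by eye, a small helper can print the two sizes in bytes. This is a minimal sketch: the function name is hypothetical, and it only assumes the two store file names shown in the listing above.

```shell
#!/usr/bin/env bash
# print_store_sizes DB_DIR
# Prints the size in bytes of the node store and the property store in DB_DIR.
# The function name is illustrative; only the two file names come from the tutorial.
print_store_sizes() {
  local dir=$1 f
  for f in neostore.nodestore.db neostore.propertystore.db; do
    if [ -f "$dir/$f" ]; then
      # wc -c reports the byte count; tr strips the padding some platforms add
      printf '%s %s\n' "$f" "$(wc -c < "$dir/$f" | tr -d ' ')"
    else
      printf '%s missing\n' "$f"
    fi
  done
}
```

For example, running print_store_sizes "$NEO4J_HOME/data/databases/graph.db" after each checkpoint should show the same two numbers before and after the delete, since the store files are not shrunk.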
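Back up the 3.5 database
Navigate to the /bin folder and run the following command to back up the database to a target folder. If the folder where the backup is to be stored does not exist, you must create it. In this example, the folder is /tmp/3.5.24.
./neo4j-admin backup --backup-dir=/tmp/3.5.24 --name=graphdbbackup
For details on performing a backup and the available command options, see Operations Manual → Perform a backup.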
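Because the backup folder must exist before the command runs, the two actions can be wrapped together. The following is a sketch only: run_backup and the NEO4J_ADMIN override are hypothetical names introduced here (the override allows a dry run with echo), while the backup command and its two options are the ones used in this tutorial.

```shell
#!/usr/bin/env bash
# run_backup BACKUP_DIR NAME
# Creates BACKUP_DIR if it does not exist, then runs the 3.5 backup command.
# NEO4J_ADMIN is an assumed override for dry runs (e.g. NEO4J_ADMIN=echo);
# by default the function calls ./neo4j-admin from the current (bin) folder.
run_backup() {
  local backup_dir=$1 name=$2
  mkdir -p "$backup_dir" || return 1
  "${NEO4J_ADMIN:-./neo4j-admin}" backup --backup-dir="$backup_dir" --name="$name"
}

# Example from the tutorial (run from the bin folder of the 3.5 instance):
#   run_backup /tmp/3.5.24 graphdbbackup
```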
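Copy the 3.5 database backup to the 4.x Neo4j cluster
You can use the neo4j-admin copy command to reclaim the unused space and create a defragmented copy of your database backup in the 4.x cluster.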
To speed up the copy operation, you can use …
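- On each cluster member, navigate to the /bin folder and run the following command to create a compacted store copy of the 3.5 database backup. Any inconsistent nodes, properties, and relationships are not copied to the newly created store.
  ./neo4j-admin copy --from-path=/private/tmp/3.5.24/graphdbbackup --to-database=compactdb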
  Selecting JVM - Version:11.0.6+8-LTS, Name:Java HotSpot(TM) 64-Bit Server VM, Vendor:Oracle Corporation
  Starting to copy store, output will be saved to: /Users/renetapopova/neo4j/cc-4.4.0/core1/logs/neo4j-admin-copy-2022-02-07.11.13.05.log
  2022-02-07 11:13:06.920+0000 INFO [StoreCopy] ### Copy Data ###
  2022-02-07 11:13:06.923+0000 INFO [StoreCopy] Source: /private/tmp/3.5.24/graphdbbackup (page cache 8m)
  2022-02-07 11:13:06.924+0000 INFO [StoreCopy] Target: /Users/renetapopova/neo4j/cc-4.4.0/core1/data/databases/compactdb
  2022-02-07 11:13:06.924+0000 INFO [StoreCopy] Empty database created, will start importing readable data from the source.
  2022-02-07 11:13:09.911+0000 INFO [o.n.i.b.ImportLogic] Import starting
  Import starting 2022-02-07 11:13:09.963+0000
  Estimated number of nodes: 50.00 k
  Estimated number of node properties: 50.00 k
  Estimated number of relationships: 0.00
  Estimated number of relationship properties: 50.00 k
  Estimated disk space usage: 2.680MiB
  Estimated required memory usage: 36.71MiB
  (1/4) Node import 2022-02-07 11:13:11.069+0000
  Estimated number of nodes: 50.00 k
  Estimated disk space usage: 1.698MiB
  Estimated required memory usage: 36.71MiB
  .......... .......... .......... .......... .......... 5% ∆236ms
  .......... .......... .......... .......... .......... 10% ∆24ms
  .......... .......... .......... .......... .......... 15% ∆3ms
  .......... .......... .......... .......... .......... 20% ∆2ms
  .......... .......... .......... .......... .......... 25% ∆1ms
  .......... .......... .......... .......... .......... 30% ∆0ms
  .......... .......... .......... .......... .......... 35% ∆0ms
  .......... .......... .......... .......... .......... 40% ∆3ms
  .......... .......... .......... .......... .......... 45% ∆2ms
  .......... .......... .......... .......... .......... 50% ∆1ms
  .......... .......... .......... .......... .......... 55% ∆0ms
  .......... .......... .......... .......... .........- 60% ∆77ms
  .......... .......... .......... .......... .......... 65% ∆2ms
  .......... .......... .......... .......... .......... 70% ∆0ms
  .......... .......... .......... .......... .......... 75% ∆1ms
  .......... .......... .......... .......... .......... 80% ∆0ms
  .......... .......... .......... .......... .......... 85% ∆0ms
  .......... .......... .......... .......... .......... 90% ∆0ms
  .......... .......... .......... .......... .......... 95% ∆0ms
  .......... .......... .......... .......... .......... 100% ∆0ms
  Node import COMPLETED in 458ms
  (2/4) Relationship import 2022-02-07 11:13:11.528+0000
  Estimated number of relationships: 0.00
  Estimated disk space usage: 1006KiB
  Estimated required memory usage: 43.90MiB
  Relationship import COMPLETED in 571ms
  (3/4) Relationship linking 2022-02-07 11:13:12.100+0000
  Estimated required memory usage: 36.08MiB
  Relationship linking COMPLETED in 645ms
  (4/4) Post processing 2022-02-07 11:13:12.745+0000
  Estimated required memory usage: 36.08MiB
  -......... .......... .......... .......... .......... 5% ∆717ms
  .......... .......... .......... .......... .......... 10% ∆1ms
  .......... .......... .......... .......... .......... 15% ∆0ms
  .......... .......... .......... .......... .......... 20% ∆1ms
  .......... .......... .......... .......... .......... 25% ∆1ms
  .......... .......... .......... .......... .......... 30% ∆0ms
  .......... .......... .......... .......... .......... 35% ∆0ms
  .......... .......... .......... .......... .......... 40% ∆0ms
  .......... .......... .......... .......... .......... 45% ∆0ms
  .......... .......... .......... .......... .......... 50% ∆1ms
  .......... .......... .......... .......... .......... 55% ∆0ms
  .......... .......... .......... .......... .......... 60% ∆0ms
  .......... .......... .......... .......... .......... 65% ∆0ms
  .......... .......... .......... .......... .......... 70% ∆0ms
  .......... .......... .......... .......... .......... 75% ∆0ms
  .......... .......... .......... .......... .......... 80% ∆1ms
  .......... .......... .......... .......... .......... 85% ∆0ms
  .......... .......... .......... .......... .......... 90% ∆0ms
  .......... .......... .......... .......... .......... 95% ∆0ms
  .......... .......... .......... .......... .......... 100% ∆0ms
  Post processing COMPLETED in 1s 781ms
  IMPORT DONE in 4s 606ms.
  Imported:
  1 nodes
  0 relationships
  1 properties
  Peak memory usage: 43.90MiB
  2022-02-07 11:13:14.527+0000 INFO [o.n.i.b.ImportLogic] Import completed successfully, took 4s 606ms. Imported:
  1 nodes
  0 relationships
  1 properties
  2022-02-07 11:13:15.484+0000 INFO [StoreCopy] Import summary: Copying of 100704 records took 8 seconds (12588 rec/s). Unused Records 100703 (99%) Removed Records 0 (0%)
  2022-02-07 11:13:15.485+0000 INFO [StoreCopy] ### Extracting schema ###
  2022-02-07 11:13:15.485+0000 INFO [StoreCopy] Trying to extract schema...
  2022-02-07 11:13:15.606+0000 INFO [StoreCopy] ... found 1 readable schema definitions. The following can be used to recreate the schema:
  2022-02-07 11:13:15.606+0000 INFO [StoreCopy] CREATE BTREE INDEX `index_5c0607ad` FOR (n:`Person`) ON (n.`name`) OPTIONS {indexProvider: 'native-btree-1.0', indexConfig: {`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0, -1000000.0], `spatial.cartesian.min`: [-1000000.0, -1000000.0], `spatial.wgs-84.min`: [-180.0, -90.0], `spatial.cartesian-3d.max`: [1000000.0, 1000000.0, 1000000.0], `spatial.cartesian.max`: [1000000.0, 1000000.0], `spatial.wgs-84-3d.min`: [-180.0, -90.0, -1000000.0], `spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0], `spatial.wgs-84.max`: [180.0, 90.0]}}
  2022-02-07 11:13:15.606+0000 INFO [StoreCopy] You have to manually apply the above commands to the database when it is started to recreate the indexes and constraints. The commands are saved to /Users/renetapopova/neo4j/cc-4.4.0/core1/logs/neo4j-admin-copy-2022-02-07.11.13.05.log as well for reference.
- On each cluster member, run the following command to verify that the database has been copied successfully:
  ls -al ../data/databases
  total 0
  drwxr-xr-x@  6 renetapopova staff  192 Feb 7 11:11 .
  drwxr-xr-x@  8 renetapopova staff  256 Feb 7 10:36 ..
  drwxr-xr-x  34 renetapopova staff 1088 Feb 7 11:12 compactdb
  drwxr-xr-x  38 renetapopova staff 1216 Feb 7 10:39 neo4j
  -rw-r--r--   1 renetapopova staff    0 Feb 7 10:36 store_lock
  drwxr-xr-x  39 renetapopova staff 1248 Feb 7 10:39 system
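When all cluster members run on one machine, as the localhost addresses in this tutorial suggest, the per-member check can be looped. This is a sketch under that assumption: check_copied is a hypothetical helper, and the member home directories passed to it are placeholders you would replace with your own.

```shell
#!/usr/bin/env bash
# check_copied DB_NAME HOME...
# Reports, for each Neo4j home directory given, whether data/databases/DB_NAME exists.
# Returns 1 if the store is missing on any member. Names here are illustrative.
check_copied() {
  local db=$1 home rc=0
  shift
  for home in "$@"; do
    if [ -d "$home/data/databases/$db" ]; then
      echo "OK      $home"
    else
      echo "MISSING $home"
      rc=1
    fi
  done
  return "$rc"
}

# Hypothetical usage with three member homes:
#   check_copied compactdb "$core1_home" "$core2_home" "$core3_home"
```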
Copying a database does not automatically create it. Therefore, it is not visible if you run SHOW DATABASES in Cypher® Shell or Neo4j Browser.
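The copy output states that the schema commands are also saved to the neo4j-admin copy log file. A sketch for pulling them out of that file follows. It assumes each command sits on one log line in the same "... INFO [StoreCopy] CREATE ..." form as the console output shown above, which may not hold for every version; extract_schema is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# extract_schema LOG_FILE
# Prints every schema statement found in a neo4j-admin copy log, one per line,
# with a terminating semicolon so it can be pasted into Cypher Shell.
# Assumption: statements appear as "<timestamp> INFO [StoreCopy] CREATE ...".
extract_schema() {
  sed -n 's/^.*\[StoreCopy\] \(CREATE .*\)$/\1;/p' "$1"
}

# Hypothetical usage:
#   extract_schema ../logs/neo4j-admin-copy-2022-02-07.11.13.05.log
```

Review the extracted statements before applying them; they only become useful once the database has been created and started.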
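Create the compacted backup on one of the cluster members
You create the database copy with the CREATE DATABASE command on only one of the cluster members. The command is automatically routed to the leader, and from there to the other cluster members.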
- On one of the cluster members, navigate to the /bin folder and run the following command to log in to the Cypher Shell command-line console:
  ./cypher-shell -u neo4j -p password
- Change the active database to system:
  :USE system;
- Create the compactdb database:
  CREATE DATABASE compactdb;
  0 rows available after 145 ms, consumed after another 0 ms
- Verify that the compactdb database is online:
  SHOW DATABASES;
  +----------------------------------------------------------------------------------------------------------------------------------+
  | name        | aliases | access       | address          | role       | requestedStatus | currentStatus | error | default | home  |
  +----------------------------------------------------------------------------------------------------------------------------------+
  | "compactdb" | []      | "read-write" | "localhost:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
  | "compactdb" | []      | "read-write" | "localhost:7688" | "leader"   | "online"        | "online"      | ""    | FALSE   | FALSE |
  | "compactdb" | []      | "read-write" | "localhost:7689" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
  | "neo4j"     | []      | "read-write" | "localhost:7687" | "leader"   | "online"        | "online"      | ""    | TRUE    | TRUE  |
  | "neo4j"     | []      | "read-write" | "localhost:7688" | "follower" | "online"        | "online"      | ""    | TRUE    | TRUE  |
  | "neo4j"     | []      | "read-write" | "localhost:7689" | "follower" | "online"        | "online"      | ""    | TRUE    | TRUE  |
  | "system"    | []      | "read-write" | "localhost:7687" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
  | "system"    | []      | "read-write" | "localhost:7688" | "follower" | "online"        | "online"      | ""    | FALSE   | FALSE |
  | "system"    | []      | "read-write" | "localhost:7689" | "leader"   | "online"        | "online"      | ""    | FALSE   | FALSE |
  +----------------------------------------------------------------------------------------------------------------------------------+
  9 rows ready to start consuming query after 21 ms, results consumed after another 29 ms
- On one of the cluster members, change the active database to compactdb and use the output of the neo4j-admin copy command to recreate the schema:
  CREATE BTREE INDEX `index_5c0607ad` FOR (n:`Person`) ON (n.`name`) OPTIONS {indexProvider: 'native-btree-1.0', indexConfig: {`spatial.cartesian-3d.min`: [-1000000.0, -1000000.0, -1000000.0], `spatial.cartesian.min`: [-1000000.0, -1000000.0], `spatial.wgs-84.min`: [-180.0, -90.0], `spatial.cartesian-3d.max`: [1000000.0, 1000000.0, 1000000.0], `spatial.cartesian.max`: [1000000.0, 1000000.0], `spatial.wgs-84-3d.min`: [-180.0, -90.0, -1000000.0], `spatial.wgs-84-3d.max`: [180.0, 90.0, 1000000.0], `spatial.wgs-84.max`: [180.0, 90.0]}};
  0 rows ready to start consuming query after 95 ms, results consumed after another 0 ms
  Added 1 indexes
- On each cluster member, log in to the Cypher Shell command-line console, change the active database to compactdb, and verify that the index has been created successfully:
  CALL db.indexes;
  +----------------------------------------------------------------------------------------------------------------------------------------------+
  | id | name             | state    | populationPercent | uniqueness  | type     | entityType | labelsOrTypes | properties | provider           |
  +----------------------------------------------------------------------------------------------------------------------------------------------+
  | 1  | "index_343aff4e" | "ONLINE" | 100.0             | "NONUNIQUE" | "LOOKUP" | "NODE"     | []            | []         | "token-lookup-1.0" |
  | 2  | "index_5c0607ad" | "ONLINE" | 100.0             | "NONUNIQUE" | "BTREE"  | "NODE"     | ["Person"]    | ["name"]   | "native-btree-1.0" |
  +----------------------------------------------------------------------------------------------------------------------------------------------+
  2 rows ready to start consuming query after 31 ms, results consumed after another 5 ms
- Verify that all the data has been copied successfully. In this example, there should be one node.
  MATCH (n) RETURN n.name;
  +--------+
  | n.name |
  +--------+
  | "John" |
  +--------+
  1 row available after 106 ms, consumed after another 2 ms
- Exit the Cypher Shell command-line console:
  :exit;
  Bye!
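The two verification queries can also be run non-interactively against each member instead of logging in to every console. The sketch below makes several assumptions that you should check against your setup: that your cypher-shell version accepts the -a (address) and -d (database) options and a query as the last argument, that the members listen on the localhost ports shown in the SHOW DATABASES output, and that a direct bolt:// connection to a follower is acceptable for these read queries. verify_member and the CYPHER_SHELL override are hypothetical names.

```shell
#!/usr/bin/env bash
# verify_member ADDRESS
# Runs the index check and the data check against one cluster member.
# CYPHER_SHELL is an assumed override for dry runs (e.g. CYPHER_SHELL=echo);
# credentials and the database name are the ones used in this tutorial.
verify_member() {
  local addr=$1 query
  for query in 'CALL db.indexes;' 'MATCH (n) RETURN n.name;'; do
    "${CYPHER_SHELL:-./cypher-shell}" -a "bolt://$addr" -u neo4j -p password \
      -d compactdb "$query" || return 1
  done
}

# Hypothetical usage from the bin folder:
#   for addr in localhost:7687 localhost:7688 localhost:7689; do
#     verify_member "$addr"
#   done
```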
You can now compare the store sizes with those of the backed-up database.
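- On one of the cluster members, navigate to the compactdb database ($core1_home/data/databases/compactdb) and check the store size of the copied nodes and properties:
  ls -alh
  ...
  -rw-r--r-- 1 username staff 736K Feb 7 16:00 neostore.nodestore.db
  -rw-r--r-- 1 username staff 16K Feb 7 16:00 neostore.propertystore.db
  ...
  The output reports that the node store and the property store now occupy only 736K and 16K, respectively, compared with 1.4M and 3.9M previously.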
The throughput calculation is based on MB/s = (IOPS * B) ÷ 10^6, where B is the block size in bytes; in the case of Neo4j, this value is 8000. GB/hour can then be calculated as (MB/s * 3600) ÷ 1000.