Azure Synapse Analytics
Azure Synapse Analytics (formerly SQL Data Warehouse) is a cloud-based enterprise data warehouse that uses massively parallel processing (MPP) to quickly run complex queries across petabytes of data.
Prerequisites
You need a running Azure Synapse Analytics instance. If you don't have one, you can create one from here.
Authentication
The Azure Synapse connector uses three types of network connections:

- Spark driver to Azure Synapse
- Spark driver and executors to the Azure storage account
- Azure Synapse to the Azure storage account
To choose the authentication method that best fits your use case, we recommend reviewing the official Azure Synapse documentation.
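As a sketch of how those three connection paths surface in the connector configuration, the snippet below collects the options that cover them. It assumes the option names documented for the Databricks `com.databricks.spark.sqldw` connector (`tempDir`, `forwardSparkAzureStorageCredentials`); all angle-bracket values are placeholders, not real endpoints:

```python
# Sketch only: option names assumed from the Databricks Azure Synapse
# connector documentation; every <...> value is a placeholder to fill in.
synapse_options = {
    # Connection 1: Spark driver -> Azure Synapse, over JDBC
    "url": "jdbc:sqlserver://<the-rest-of-the-connection-string>",
    # Connections 2 and 3: a staging directory in an Azure storage account,
    # read and written by both Spark and Synapse during data transfer
    "tempDir": "abfss://<container>@<storage-account>.dfs.core.windows.net/<path>",
    # Forward the storage credentials configured in the Spark session to
    # Synapse, so it can reach the same staging directory
    "forwardSparkAzureStorageCredentials": "true",
    # The Synapse table to read from or write to
    "dbTable": "CUSTOMER",
}

# These options would then be applied in one call, e.g.:
# spark.read.format("com.databricks.spark.sqldw").options(**synapse_options).load()
```

Other authentication methods (for example, a service principal) replace the credential-forwarding option; see the Azure Synapse documentation linked above for the full matrix.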
From Azure Synapse Analytics to Neo4j
The following example shows how to import data from an Azure Synapse Analytics table into Neo4j as nodes; adapt the connection options to the authentication method you selected:
Scala
// Step (1)
// Load a table into a Spark DataFrame
import org.apache.spark.sql.{DataFrame, SaveMode}

val azureDF: DataFrame = spark.read
.format("com.databricks.spark.sqldw")
.option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>")
.option("dbTable", "CUSTOMER")
.load()
// Step (2)
// Save the `azureDF` as nodes with labels `Person` and `Customer` into Neo4j
azureDF.write
.format("org.neo4j.spark.DataSource")
.mode(SaveMode.ErrorIfExists)
.option("url", "neo4j://<host>:<port>")
.option("labels", ":Person:Customer")
.save()
Python
# Step (1)
# Load a table into a Spark DataFrame
azureDF = (spark.read
.format("com.databricks.spark.sqldw")
.option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>")
.option("dbTable", "CUSTOMER")
.load())
# Step (2)
# Save the `azureDF` as nodes with labels `Person` and `Customer` into Neo4j
(azureDF.write
.format("org.neo4j.spark.DataSource")
.mode("ErrorIfExists")
.option("url", "neo4j://<host>:<port>")
.option("labels", ":Person:Customer")
.save())
From Neo4j to Azure Synapse Analytics
The following example shows how to export data from Neo4j into an Azure Synapse Analytics table; adapt the connection options to the authentication method you selected:
Scala
// Step (1)
// Load `:Person:Customer` nodes as a DataFrame
import org.apache.spark.sql.DataFrame

val neo4jDF: DataFrame = spark.read.format("org.neo4j.spark.DataSource")
.option("url", "neo4j://<host>:<port>")
.option("labels", ":Person:Customer")
.load()
// Step (2)
// Save the `neo4jDF` as table CUSTOMER into Azure Synapse Analytics
neo4jDF.write
.format("com.databricks.spark.sqldw")
.option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>")
.option("dbTable", "CUSTOMER")
.save()
Python
# Step (1)
# Load `:Person:Customer` nodes as a DataFrame
neo4jDF = (spark.read.format("org.neo4j.spark.DataSource")
.option("url", "neo4j://<host>:<port>")
.option("labels", ":Person:Customer")
.load())
# Step (2)
# Save the `neo4jDF` as table CUSTOMER into Azure Synapse Analytics
(neo4jDF.write
.format("com.databricks.spark.sqldw")
.option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>")
.option("dbTable", "CUSTOMER")
.save())