Azure Synapse Analytics

Azure Synapse Analytics (formerly SQL Data Warehouse) is a cloud-based enterprise data warehouse that uses massively parallel processing (MPP) to quickly run complex queries across petabytes of data.

Prerequisites

You need a running Azure Synapse Analytics instance. If you do not have one, you can create one here.

Dependencies

Azure Synapse Analytics works with Spark only on the Databricks Runtime, because the required connector is not publicly available.

Authentication

The Azure Synapse connector uses three types of network connections:

  • Spark driver to Azure Synapse

  • Spark driver and executors to the Azure storage account

  • Azure Synapse to the Azure storage account

To choose the authentication method that best fits your use case, we recommend reviewing the official Azure Synapse documentation.
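The examples below leave the JDBC connection string as a placeholder. As a hedged illustration only, the string can be assembled from its parts as sketched here; the `<server>.sql.azuresynapse.net` host format, port, and option keys follow common SQL Server JDBC conventions and are assumptions, not values taken from this page, so verify them against your own workspace:

```python
def synapse_jdbc_url(server: str, database: str, user: str, password: str) -> str:
    """Assemble a SQL Server-style JDBC URL for a Synapse dedicated SQL pool.

    The host suffix, port, and properties below follow typical SQL Server
    JDBC syntax and should be checked against your workspace settings.
    """
    return (
        f"jdbc:sqlserver://{server}.sql.azuresynapse.net:1433;"
        f"database={database};user={user};password={password};"
        "encrypt=true;trustServerCertificate=false;loginTimeout=30;"
    )

# Hypothetical workspace, database, and credentials for illustration.
url = synapse_jdbc_url("myworkspace", "mydb", "loader", "s3cret")
```

The resulting string is what you would pass to the `url` option of the reads and writes shown below.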

From Azure Synapse Analytics to Neo4j

Depending on the authentication method you choose, the following example shows how to import data from an Azure Synapse Analytics table into Neo4j as nodes:

Scala:

// Step (1)
// Load a table into a Spark DataFrame
import org.apache.spark.sql.{DataFrame, SaveMode}

val azureDF: DataFrame = spark.read
  .format("com.databricks.spark.sqldw")
  .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>")
  .option("dbTable", "CUSTOMER")
  .load()

// Step (2)
// Save the `azureDF` as nodes with labels `Person` and `Customer` into Neo4j
azureDF.write
  .format("org.neo4j.spark.DataSource")
  .mode(SaveMode.ErrorIfExists)
  .option("url", "neo4j://<host>:<port>")
  .option("labels", ":Person:Customer")
  .save()
Python:

# Step (1)
# Load a table into a Spark DataFrame
azureDF = (spark.read
  .format("com.databricks.spark.sqldw")
  .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>")
  .option("dbTable", "CUSTOMER")
  .load())

# Step (2)
# Save the `azureDF` as nodes with labels `Person` and `Customer` into Neo4j
(azureDF.write
  .format("org.neo4j.spark.DataSource")
  .mode("ErrorIfExists")
  .option("url", "neo4j://<host>:<port>")
  .option("labels", ":Person:Customer")
  .save())
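With `ErrorIfExists`, re-running the job fails once the nodes already exist. A minimal sketch of an idempotent variant, assuming a live Spark session, the same `azureDF` as above, and the connector's `Overwrite` save mode with the `node.keys` option (the key column name `customer_id` is hypothetical):

```python
# Fragment only: requires a running Spark session and Neo4j instance.
# `node.keys` names the DataFrame columns used to match existing nodes,
# so repeated runs merge rather than duplicate.
(azureDF.write
  .format("org.neo4j.spark.DataSource")
  .mode("Overwrite")
  .option("url", "neo4j://<host>:<port>")
  .option("labels", ":Person:Customer")
  .option("node.keys", "customer_id")  # hypothetical key column
  .save())
```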

From Neo4j to Azure Synapse Analytics

Depending on the authentication method you choose, the following example shows how to import data from Neo4j into an Azure Synapse Analytics table:

Scala:

// Step (1)
// Load `:Person:Customer` nodes as DataFrame
import org.apache.spark.sql.DataFrame

val neo4jDF: DataFrame = spark.read.format("org.neo4j.spark.DataSource")
  .option("url", "neo4j://<host>:<port>")
  .option("labels", ":Person:Customer")
  .load()

// Step (2)
// Save the `neo4jDF` as table CUSTOMER into Azure Synapse Analytics
neo4jDF.write
  .format("com.databricks.spark.sqldw")
  .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>")
  .option("dbTable", "CUSTOMER")
  .save()
Python:

# Step (1)
# Load `:Person:Customer` nodes as DataFrame
neo4jDF = (spark.read.format("org.neo4j.spark.DataSource")
  .option("url", "neo4j://<host>:<port>")
  .option("labels", ":Person:Customer")
  .load())

# Step (2)
# Save the `neo4jDF` as table CUSTOMER into Azure Synapse Analytics
(neo4jDF.write
  .format("com.databricks.spark.sqldw")
  .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>")
  .option("dbTable", "CUSTOMER")
  .save())