Neo4j Vector Index and Search


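The Neo4j vector index implements the HNSW (Hierarchical Navigable Small World) algorithm, which builds layers of k-nearest neighbors to enable efficient and robust approximate nearest neighbor search. The index is designed to work with embeddings, such as those generated by machine learning models, and can be used to find similar nodes in a graph based on their embeddings.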

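Features include:

  • Create vector indexes for nodes and relationships with a specified dimension and similarity function (Euclidean, cosine)

  • Query a vector index with an embedding and a top-k value, returning nodes and similarity scores

  • Functions and procedures in the GenAI plugin to compute text embeddings with OpenAI, Azure OpenAI, Google Vertex AI, Amazon Bedrock, and other supported providers

  • Vector similarity functions to compute cosine similarity and Euclidean distance between vectors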
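To make the two similarity functions concrete, here is a minimal client-side sketch in Python. It assumes the scores are normalized to the range 0 to 1, with cosine rescaled as (1 + cos) / 2 and Euclidean mapped to 1 / (1 + d²); that normalization is an assumption to verify against the Neo4j documentation for your version, and the function names `cosine_score` and `euclidean_score` are illustrative, not part of any Neo4j API.

```python
import math


def cosine_score(a, b):
    """Cosine similarity rescaled from [-1, 1] to [0, 1] (assumed normalization)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return (1 + dot / (norm_a * norm_b)) / 2


def euclidean_score(a, b):
    """Score that shrinks as squared Euclidean distance grows: 1 / (1 + d^2)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 / (1 + squared_distance)
```

Under this assumed scaling, identical vectors score 1.0 with either function, and the score decreases as vectors point in different directions (cosine) or move apart (Euclidean).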


Usage

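This page applies to Neo4j 5.11+. The examples below separate the currently recommended Cypher 25 approach from the compatibility approach that uses Cypher 5 for earlier Neo4j 5.x releases. For current Neo4j releases, the Cypher 25 approach is recommended: use the SEARCH clause for vector search, and ai.text.embed() or ai.text.embedBatch() to generate embeddings. In both approaches, set vector.dimensions explicitly to match the embedding model in use. This ensures that only embeddings of the expected size are indexed and that queries with mismatched dimensions fail explicitly.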
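Because a dimension mismatch only surfaces once the data reaches the database, it can help to check embeddings on the client before passing them as the `$embedding` parameter. The following is a minimal sketch of such a check; `EXPECTED_DIMENSIONS` and `check_embedding` are hypothetical names, and the value 1536 simply mirrors the `vector.dimensions` setting used in the index examples on this page.

```python
import math

# Assumption: must match `vector.dimensions` in the index definition.
EXPECTED_DIMENSIONS = 1536


def check_embedding(embedding, expected=EXPECTED_DIMENSIONS):
    """Return the embedding as a list of floats, or raise if it cannot be indexed as expected."""
    values = [float(x) for x in embedding]
    if len(values) != expected:
        raise ValueError(
            f"embedding has {len(values)} dimensions, index expects {expected}"
        )
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinite values")
    return values
```

A caller would run `check_embedding(vector)` before sending the vector to the database, so that a model or configuration change is caught early rather than producing nodes that are silently left out of the index.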

Cypher 25 (current Neo4j releases)

// create vector index
CREATE VECTOR INDEX `abstract-embeddings`
FOR (n:Abstract) ON (n.embedding)
OPTIONS {indexConfig: {
 `vector.dimensions`: 1536,
 `vector.similarity_function`: 'cosine'
}};


// set embedding as parameter
MATCH (a:Abstract {id: $id})
CALL db.create.setNodeVectorProperty(a, 'embedding', $embedding);


// generate an embedding with the GenAI plugin
MATCH (a:Abstract {id: $id})
WITH a, ai.text.embed(a.text, "OpenAI", { token: $token, model: 'text-embedding-3-small' }) AS embedding
CALL db.create.setNodeVectorProperty(a, 'embedding', toFloatList(embedding));


// query the vector index in Cypher 25
CYPHER 25
MATCH (title:Title)<--(:Paper)-->(abstract:Abstract)
WHERE title.text CONTAINS 'hierarchical navigable small world graph'
WITH abstract.embedding AS queryVector
MATCH (similarAbstract:Abstract)
  SEARCH similarAbstract IN (
    VECTOR INDEX `abstract-embeddings`
    FOR queryVector
    LIMIT 10
  )
  SCORE score
MATCH (similarAbstract)<--(:Paper)-->(similarTitle:Title)
RETURN similarTitle.text AS title, score;


// use cosine similarity for exact nearest neighbor search
// pre-filter vector search
MATCH (venue:Venue)<--(paper:Paper)-->(abstract:Abstract)
WHERE venue.name CONTAINS 'NETWORK'

WITH abstract, paper,
     vector.similarity.cosine(abstract.embedding, $embedding) AS score
WHERE score > 0.9

RETURN paper.title AS title, abstract.text, score
ORDER BY score DESC LIMIT 10;


Cypher 5 (earlier Neo4j 5.x releases)

// create vector index
CREATE VECTOR INDEX `abstract-embeddings`
FOR (n:Abstract) ON (n.embedding)
OPTIONS {indexConfig: {
 `vector.dimensions`: 1536,
 `vector.similarity_function`: 'cosine'
}};


// set embedding as parameter
MATCH (a:Abstract {id: $id})
CALL db.create.setNodeVectorProperty(a, 'embedding', $embedding);


// use the GenAI plugin to compute the embedding
MATCH (a:Abstract {id: $id})
WITH a, genai.vector.encode(a.text, "OpenAI", { token: $token }) AS embedding
CALL db.create.setNodeVectorProperty(a, 'embedding', embedding);


// query vector index for similar abstracts
MATCH (title:Title)<--(:Paper)-->(abstract:Abstract)
WHERE title.text CONTAINS 'hierarchical navigable small world graph'
CALL db.index.vector.queryNodes('abstract-embeddings', 10, abstract.embedding)
YIELD node AS similarAbstract, score

MATCH (similarAbstract)<--(:Paper)-->(similarTitle:Title)
RETURN similarTitle.text AS title, score;


// use cosine similarity for exact nearest neighbor search
// pre-filter vector search
MATCH (venue:Venue)<--(paper:Paper)-->(abstract:Abstract)
WHERE venue.name CONTAINS 'NETWORK'

WITH abstract, paper,
     vector.similarity.cosine(abstract.embedding, $embedding) AS score
WHERE score > 0.9

RETURN paper.title AS title, abstract.text, score
ORDER BY score DESC LIMIT 10;
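The pre-filtered query above performs an exact search: every abstract that passes the filter is scored, then thresholded, sorted, and limited. The same logic can be sketched outside the database in Python. This is an illustration under assumptions rather than how Neo4j executes the query: `exact_top_k` is a hypothetical helper, the candidates are assumed to have already passed the venue filter, and the score uses the assumed (1 + cos) / 2 normalization.

```python
import math


def normalized_cosine(a, b):
    """Cosine similarity mapped to [0, 1] (assumed to match the database score)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return (1 + dot / norm) / 2


def exact_top_k(query, candidates, k=10, min_score=0.9):
    """Score every pre-filtered candidate, keep those above min_score, return the k best.

    candidates: iterable of (item_id, embedding) pairs.
    """
    scored = [(item_id, normalized_cosine(query, emb)) for item_id, emb in candidates]
    kept = [pair for pair in scored if pair[1] > min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]
```

Because every candidate is compared against the query, the result is exact, but the cost grows linearly with the number of candidates. This is why the approach suits a small pre-filtered set, while the vector index is the better fit for searching the whole graph.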

Authors

Neo4j Engineering

Community Support

Neo4j Online Community

Repository

GitHub

Issues

https://github.com/neo4j/neo4j/issues

Videos & Tutorials

© . This site is unofficial and not affiliated with Neo4j, Inc.