Recipes Guide - BBC GoodFood Graph

Recipes Overview

This guide shows how to use Neo4j to explore the BBC GoodFood recipe data.

Importing the data

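First, let's import the data. We can do this by running the next few queries. The import queries call apoc.load.json, so the APOC library must be installed on the database.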

//set up indexes for query performance
CREATE INDEX ON :Recipe(id);
CREATE INDEX ON :Ingredient(name);
CREATE INDEX ON :Keyword(name);
CREATE INDEX ON :DietType(name);
CREATE INDEX ON :Author(name);
CREATE INDEX ON :Collection(name);
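
The `CREATE INDEX ON :Label(property)` form above is the older syntax; it is deprecated in Neo4j 4.x and no longer accepted in Neo4j 5. As a sketch, assuming a recent 4.x or 5.x database, the same indexes could be created as follows. The index names (`recipe_id` and so on) are hypothetical and chosen only for illustration.

```cypher
// Assumption: Neo4j 4.x (recent) or 5.x; index names are illustrative only
CREATE INDEX recipe_id IF NOT EXISTS FOR (r:Recipe) ON (r.id);
CREATE INDEX ingredient_name IF NOT EXISTS FOR (i:Ingredient) ON (i.name);
CREATE INDEX keyword_name IF NOT EXISTS FOR (k:Keyword) ON (k.name);
CREATE INDEX diet_type_name IF NOT EXISTS FOR (d:DietType) ON (d.name);
CREATE INDEX author_name IF NOT EXISTS FOR (a:Author) ON (a.name);
CREATE INDEX collection_name IF NOT EXISTS FOR (c:Collection) ON (c.name);
```

`IF NOT EXISTS` makes the statements safe to re-run if the guide is replayed against the same database.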

Importing the data, Part 2

//import recipes to the graph
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.title AS title,
       value.page.article.description AS description,
       value.page.recipe.cooking_time AS cookingTime,
       value.page.recipe.prep_time AS preparationTime,
       value.page.recipe.skill_level AS skillLevel
MERGE (r:Recipe {id: id})
SET r.cookingTime = cookingTime,
    r.preparationTime = preparationTime,
    r.name = title,
    r.description = description,
    r.skillLevel = skillLevel;

Importing the data, Part 3

//import authors and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.article.author AS author
MERGE (a:Author {name: author})
WITH a,id
MATCH (r:Recipe {id:id})
MERGE (a)-[:WROTE]->(r);
//import ingredients and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.recipe.ingredients AS ingredients
MATCH (r:Recipe {id:id})
FOREACH (ingredient IN ingredients |
  MERGE (i:Ingredient {name: ingredient})
  MERGE (r)-[:CONTAINS_INGREDIENT]->(i)
);

Importing the data, Part 4

//import keywords and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.recipe.keywords AS keywords
MATCH (r:Recipe {id:id})
FOREACH (keyword IN keywords |
  MERGE (k:Keyword {name: keyword})
  MERGE (r)-[:KEYWORD]->(k)
);
//import dietTypes and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.recipe.diet_types AS dietTypes
MATCH (r:Recipe {id:id})
FOREACH (dietType IN dietTypes |
  MERGE (d:DietType {name: dietType})
  MERGE (r)-[:DIET_TYPE]->(d)
);

Importing the data, Part 5

//import collections and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.recipe.collections AS collections
MATCH (r:Recipe {id:id})
FOREACH (collection IN collections |
  MERGE (c:Collection {name: collection})
  MERGE (r)-[:COLLECTION]->(c)
);
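
Once all import steps have run, a quick sanity check can confirm that every label was populated. The query below is a minimal sketch that is not part of the original guide; it only counts nodes per label and makes no assumption about the expected totals.

```cypher
// Count nodes per label to confirm the import created each entity type
MATCH (n)
UNWIND labels(n) AS label
RETURN label, count(*) AS nodes
ORDER BY nodes DESC;
```

If a label such as Ingredient or Author is missing from the result, the corresponding import query most likely did not run or failed to reach the JSON file.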

Graph schema

Now let's review the meta-graph to see the node labels and relationship types we will be working with.

CALL db.schema.visualization()

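We can see that the graph revolves around recipes, which connect to several other entities. A recipe contains ingredients, can be part of a collection, is written by an author, can belong to a diet type, and has specific keywords.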
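
The same structure can also be read off as a table. The following sketch, which is an addition to the guide, counts how often each relationship type connects a recipe to another kind of node. The pattern is undirected because WROTE points from the author to the recipe, while the other relationships point away from the recipe.

```cypher
// Tabular view of the schema: relationship types around Recipe nodes
MATCH (:Recipe)-[rel]-(other)
RETURN type(rel) AS relationshipType,
       labels(other) AS connectedLabels,
       count(*) AS total
ORDER BY total DESC;
```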

Most common ingredients

Which ingredients are the most popular, and how many recipes use each of them?

MATCH (i:Ingredient)<-[rel:CONTAINS_INGREDIENT]-(r:Recipe)
RETURN i.name, count(rel) as recipes
ORDER BY recipes DESC

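The items at the top of the list are not that surprising: olive oil, butter, and garlic! Further down, we can see some ingredients that are probably used to make cakes: sugar, milk, and self-raising flour.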

I want some chocolate cake!

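The dataset also contains collections, and one of the tastiest-looking is the chocolate cake collection. The following query returns the recipes in this collection: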

MATCH (:Collection {name: 'Chocolate cake'})<-[:COLLECTION]-(recipe)
RETURN recipe.id, recipe.name, recipe.description

It's a mouth-watering list, but let's not be greedy. We'll focus on the rich chocolate cake.

Rich chocolate cake

We'll start with the following query, which returns a graph of the recipe and its ingredients:

MATCH path = (r:Recipe {id:'97123'})-[:CONTAINS_INGREDIENT]->(i:Ingredient)
RETURN path

Are there any similar cakes?

OK, we've now baked this cake a few times, and although it's delicious, we'd like to try other recipes. Are there any similar cakes?

MATCH (r:Recipe {id:'97123'})-[:CONTAINS_INGREDIENT]->(i:Ingredient)<-[:CONTAINS_INGREDIENT]-(rec:Recipe)
RETURN rec.id, rec.name, collect(i.name) AS commonIngredients
ORDER BY size(commonIngredients) DESC
LIMIT 10

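The query above:

  • Finds all the ingredients in the rich chocolate cake

  • Finds other recipes that also contain those ingredients

  • Returns the recipes that have the most ingredients in common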
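
The three steps listed above can be made more explicit by returning the overlap as a number. This is a sketch of a variant, not the guide's own query; the alias `overlap` is introduced here for illustration.

```cypher
// Steps 1 and 2: ingredients of recipe 97123 and other recipes sharing them
MATCH (r:Recipe {id:'97123'})-[:CONTAINS_INGREDIENT]->(i:Ingredient)
      <-[:CONTAINS_INGREDIENT]-(rec:Recipe)
WITH rec, collect(i.name) AS commonIngredients
// Step 3: rank by the number of shared ingredients
RETURN rec.id, rec.name,
       size(commonIngredients) AS overlap,
       commonIngredients
ORDER BY overlap DESC
LIMIT 10;
```

Because a single MATCH pattern never traverses the same relationship twice, the starting recipe should not appear in its own list of similar recipes.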

What other recipes has the author published?

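Another kind of recommendation query finds other recipes published by the author of the rich chocolate cake. The following query does this: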

MATCH (rec:Recipe)<-[:WROTE]-(a:Author)-[:WROTE]->(r:Recipe {id:'97123'})
RETURN rec.id, rec.name, rec.description
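
The two recommendation ideas can be combined. As a sketch that goes slightly beyond the original guide, the query below lists the same author's other recipes and ranks them by how many ingredients they share with recipe 97123. The alias `sharedIngredients` is introduced here for illustration.

```cypher
// Other recipes by the same author, ranked by ingredient overlap
MATCH (a:Author)-[:WROTE]->(r:Recipe {id:'97123'})
MATCH (a)-[:WROTE]->(rec:Recipe)
WHERE rec <> r
// OPTIONAL MATCH keeps recipes that share no ingredients (count is 0)
OPTIONAL MATCH (r)-[:CONTAINS_INGREDIENT]->(i:Ingredient)
               <-[:CONTAINS_INGREDIENT]-(rec)
RETURN rec.id, rec.name, count(i) AS sharedIngredients
ORDER BY sharedIngredients DESC;
```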

What can I make with the ingredients in my kitchen?

Show me the chilli

MATCH (r:Recipe)
WHERE (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: 'chilli'})
RETURN r.name AS recipe,
       [(r)-[:CONTAINS_INGREDIENT]->(i) | i.name]
       AS ingredients

What can I make with the ingredients in my kitchen?

Recipes containing multiple ingredients (Part 1)

MATCH (r:Recipe)
WHERE (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: 'chilli'})
AND   (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: 'prawn'})
RETURN r.name AS recipe,
       [(r)-[:CONTAINS_INGREDIENT]->(i) | i.name]
       AS ingredients
LIMIT 20

What can I make with the ingredients in my kitchen?

Recipes containing multiple ingredients (Part 2)

:param ingredients => ['chilli', 'prawn'];
MATCH (r:Recipe)
WHERE all(i in $ingredients WHERE exists(
  (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: i})))
RETURN r.name AS recipe,
       [(r)-[:CONTAINS_INGREDIENT]->(i) | i.name]
       AS ingredients
ORDER BY size(ingredients)
LIMIT 20
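
An alternative way to express "contains every ingredient in the list" is to count the matches instead of testing each pattern. The sketch below is not from the original guide. It assumes `$ingredients` has been set as above and contains no duplicate names; the alias `matched` is introduced for illustration.

```cypher
// Find recipes that match every name in $ingredients by counting hits
MATCH (r:Recipe)-[:CONTAINS_INGREDIENT]->(i:Ingredient)
WHERE i.name IN $ingredients
WITH r, count(DISTINCT i) AS matched
WHERE matched = size($ingredients)
RETURN r.name AS recipe,
       [(r)-[:CONTAINS_INGREDIENT]->(x) | x.name] AS ingredients
ORDER BY size(ingredients)
LIMIT 20;
```

This form starts from the named Ingredient nodes, so it can use the index on Ingredient(name) rather than checking every recipe in turn.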

Mark is allergic to everything

:param allergens =>   ['egg', 'milk'];
:param ingredients => ['coconut milk', 'rice'];
MATCH (r:Recipe)

WHERE all(i in $ingredients WHERE exists(
  (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: i})))
AND none(i in $allergens WHERE exists(
  (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: i})))

RETURN r.name AS recipe,
       [(r)-[:CONTAINS_INGREDIENT]->(i) | i.name]
       AS ingredients
ORDER BY size(ingredients)
LIMIT 20