Recipes Guide - BBC GoodFood Graph

Recipes overview

This guide shows how to use Neo4j to explore the BBC GoodFood recipe data.

Data import

First, let's import the data. We can do this by running the next few queries:

//set up indexes for query performance
CREATE INDEX ON :Recipe(id);
CREATE INDEX ON :Ingredient(name);
CREATE INDEX ON :Keyword(name);
CREATE INDEX ON :DietType(name);
CREATE INDEX ON :Author(name);
CREATE INDEX ON :Collection(name);
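
The `CREATE INDEX ON :Label(property)` form above is the older index syntax. As a rough sketch for newer Neo4j releases (assumed here to be 4.4 or later), the same indexes can be declared with the `FOR ... ON` form. The index names (`recipe_id` and so on) are illustrative and are not part of the original guide.

```cypher
// Assumption: Neo4j 4.4 or later. Index names are hypothetical.
// IF NOT EXISTS makes the statements safe to re-run.
CREATE INDEX recipe_id       IF NOT EXISTS FOR (r:Recipe)     ON (r.id);
CREATE INDEX ingredient_name IF NOT EXISTS FOR (i:Ingredient) ON (i.name);
CREATE INDEX keyword_name    IF NOT EXISTS FOR (k:Keyword)    ON (k.name);
CREATE INDEX diet_type_name  IF NOT EXISTS FOR (d:DietType)   ON (d.name);
CREATE INDEX author_name     IF NOT EXISTS FOR (a:Author)     ON (a.name);
CREATE INDEX collection_name IF NOT EXISTS FOR (c:Collection) ON (c.name);
```

Run either this block or the original one, not both, so that each property is indexed only once.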

Data import, part 2

//import recipes to the graph
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.title AS title,
       value.page.article.description AS description,
       value.page.recipe.cooking_time AS cookingTime,
       value.page.recipe.prep_time AS preparationTime,
       value.page.recipe.skill_level AS skillLevel
MERGE (r:Recipe {id: id})
SET r.cookingTime = cookingTime,
    r.preparationTime = preparationTime,
    r.name = title,
    r.description = description,
    r.skillLevel = skillLevel;

Data import, part 3

//import authors and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.article.author AS author
MERGE (a:Author {name: author})
WITH a,id
MATCH (r:Recipe {id:id})
MERGE (a)-[:WROTE]->(r);
//import ingredients and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.recipe.ingredients AS ingredients
MATCH (r:Recipe {id:id})
FOREACH (ingredient IN ingredients |
  MERGE (i:Ingredient {name: ingredient})
  MERGE (r)-[:CONTAINS_INGREDIENT]->(i)
);

Data import, part 4

//import keywords and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.recipe.keywords AS keywords
MATCH (r:Recipe {id:id})
FOREACH (keyword IN keywords |
  MERGE (k:Keyword {name: keyword})
  MERGE (r)-[:KEYWORD]->(k)
);
//import dietTypes and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.recipe.diet_types AS dietTypes
MATCH (r:Recipe {id:id})
FOREACH (dietType IN dietTypes |
  MERGE (d:DietType {name: dietType})
  MERGE (r)-[:DIET_TYPE]->(d)
);

Data import, part 5

//import collections and connect to recipes
CALL apoc.load.json('https://raw.githubusercontent.com/neo4j-examples/graphgists/master/browser-guides/data/stream_clean.json') YIELD value
WITH value.page.article.id AS id,
       value.page.recipe.collections AS collections
MATCH (r:Recipe {id:id})
FOREACH (collection IN collections |
  MERGE (c:Collection {name: collection})
  MERGE (r)-[:COLLECTION]->(c)
);
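
Once all the import queries have run, a quick sanity check can confirm that every node type was loaded. The sketch below assumes each node carries exactly one label, which holds for the import statements in this guide; no expected counts are given because they depend on the dataset at load time.

```cypher
// Count the imported nodes per label.
// Assumption: every node has a single label, so labels(n)[0] identifies it.
MATCH (n)
RETURN labels(n)[0] AS label, count(*) AS nodes
ORDER BY nodes DESC;
```

If a label is missing from the result, the corresponding import query probably did not run or matched no recipes.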

Graph schema

Now let's review the meta graph to see the node labels and relationship types we'll be working with.

CALL db.schema.visualization()

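We can see that the graph is centred on recipes, which connect to several other entities. A recipe contains ingredients, can belong to a collection, is written by an author, can belong to a diet type, and has keywords.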
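
To put numbers on the schema, the relationships can be counted by type. This is only an illustrative sketch; it scans every relationship, which is fine for a dataset of this size but may be slow on a much larger graph.

```cypher
// Count relationships by type to see how densely each entity connects to recipes.
MATCH ()-[rel]->()
RETURN type(rel) AS relationshipType, count(*) AS total
ORDER BY total DESC;
```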

Most common ingredients

Which ingredients are the most popular, and how many recipes use each of them?

MATCH (i:Ingredient)<-[rel:CONTAINS_INGREDIENT]-(r:Recipe)
RETURN i.name, count(rel) as recipes
ORDER BY recipes DESC

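The top of the list holds no surprises: olive oil, butter, and garlic! Further down we can see ingredients that are probably used in cakes: sugar, milk, and baking powder.

I want chocolate cake!

The dataset also contains collections, and one of the tastiest-looking is the chocolate cake collection. The following query returns the recipes in that collection: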

MATCH (:Collection {name: 'Chocolate cake'})<-[:COLLECTION]-(recipe)
RETURN recipe.id, recipe.name, recipe.description

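A mouth-watering list, but let's not be greedy. We'll focus on the super-rich chocolate cake.

The super-rich chocolate cake

We'll start with the following query, which returns a graph of the recipe and its ingredients: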

MATCH path = (r:Recipe {id:'97123'})-[:CONTAINS_INGREDIENT]->(i:Ingredient)
RETURN path

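What other cakes are similar?

OK, so we've now made this cake a few times, and although it's delicious, we'd like to try some other recipes. Which other cakes are similar to this one?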

MATCH (r:Recipe {id:'97123'})-[:CONTAINS_INGREDIENT]->(i:Ingredient)<-[:CONTAINS_INGREDIENT]-(rec:Recipe)
RETURN rec.id, rec.name, collect(i.name) AS commonIngredients
ORDER BY size(commonIngredients) DESC
LIMIT 10

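The query above:

  • Finds all the ingredients in the super-rich chocolate cake

  • Finds other recipes that also contain those ingredients

  • Returns the recipes that share the most ingredients with it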
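
Ranking by the raw number of shared ingredients tends to favour recipes with long ingredient lists. One possible refinement, which is not part of the original guide, is to normalise the overlap with Jaccard similarity: shared ingredients divided by the size of the union of both ingredient lists. The aliases `shared`, `sourceTotal`, `candidateTotal`, and `jaccard` are illustrative names.

```cypher
// Sketch: rank similar recipes by Jaccard similarity instead of raw overlap.
MATCH (r:Recipe {id: '97123'})-[:CONTAINS_INGREDIENT]->(i:Ingredient)
      <-[:CONTAINS_INGREDIENT]-(rec:Recipe)
WITH r, rec, count(i) AS shared
// Count each recipe's own ingredients with a pattern comprehension.
WITH rec, shared,
     size([(r)-[:CONTAINS_INGREDIENT]->(x) | x])   AS sourceTotal,
     size([(rec)-[:CONTAINS_INGREDIENT]->(x) | x]) AS candidateTotal
// Jaccard = |intersection| / |union|
RETURN rec.id, rec.name, shared,
       toFloat(shared) / (sourceTotal + candidateTotal - shared) AS jaccard
ORDER BY jaccard DESC
LIMIT 10
```

A score of 1.0 would mean the two recipes use exactly the same ingredients.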

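What else has the author published?

Another kind of recommendation query finds other recipes published by the author of the super-rich chocolate cake. The following query does this: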

MATCH (rec:Recipe)<-[:WROTE]-(a:Author)-[:WROTE]->(r:Recipe {id:'97123'})
RETURN rec.id, rec.name, rec.description

What can I make with the ingredients in my kitchen?

Show me the chilli

MATCH (r:Recipe)
WHERE (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: 'chilli'})
RETURN r.name AS recipe,
       [(r)-[:CONTAINS_INGREDIENT]->(i) | i.name]
       AS ingredients
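
The query above matches only ingredients named exactly 'chilli'. If the data also holds variants (for example a hypothetical 'red chilli', which has not been verified against this dataset), a substring match can catch them. This is a sketch, and the `CONTAINS` filter on `toLower(i.name)` will not use the index on `:Ingredient(name)`.

```cypher
// Sketch: find recipes whose ingredient names contain 'chilli', ignoring case.
MATCH (r:Recipe)-[:CONTAINS_INGREDIENT]->(i:Ingredient)
WHERE toLower(i.name) CONTAINS 'chilli'
RETURN r.name AS recipe, collect(i.name) AS matchingIngredients
LIMIT 20
```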

What can I make with the ingredients in my kitchen?

Recipes with multiple ingredients (part 1)

MATCH (r:Recipe)
WHERE (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: 'chilli'})
AND   (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: 'prawn'})
RETURN r.name AS recipe,
       [(r)-[:CONTAINS_INGREDIENT]->(i) | i.name]
       AS ingredients
LIMIT 20

What can I make with the ingredients in my kitchen?

Recipes with multiple ingredients (part 2)

:param ingredients => ['chilli', 'prawn'];
MATCH (r:Recipe)
WHERE all(i in $ingredients WHERE exists(
  (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: i})))
RETURN r.name AS recipe,
       [(r)-[:CONTAINS_INGREDIENT]->(i) | i.name]
       AS ingredients
ORDER BY size(ingredients)
LIMIT 20

Mark is allergic to everything

:param allergens =>   ['egg', 'milk'];
:param ingredients => ['coconut milk', 'rice'];
MATCH (r:Recipe)

WHERE all(i in $ingredients WHERE exists(
  (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: i})))
AND none(i in $allergens WHERE exists(
  (r)-[:CONTAINS_INGREDIENT]->(:Ingredient {name: i})))

RETURN r.name AS recipe,
       [(r)-[:CONTAINS_INGREDIENT]->(i) | i.name]
       AS ingredients
ORDER BY size(ingredients)
LIMIT 20
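
The `exists(...)` pattern form used above belongs to older Neo4j versions; newer releases replace it with an `EXISTS { ... }` subquery. As a sketch that avoids both forms, the ingredient names can be collected once per recipe and then filtered with plain list predicates. It assumes the `$ingredients` and `$allergens` parameters have already been set with `:param` as shown above; the alias `names` is illustrative.

```cypher
// Sketch: collect each recipe's ingredient names, then filter the list.
MATCH (r:Recipe)-[:CONTAINS_INGREDIENT]->(i:Ingredient)
WITH r, collect(i.name) AS names
// Keep recipes that contain every wanted ingredient and none of the allergens.
WHERE all(x IN $ingredients WHERE x IN names)
  AND none(x IN $allergens WHERE x IN names)
RETURN r.name AS recipe, names AS ingredients
ORDER BY size(names)
LIMIT 20
```

Unlike the original query, this version starts from recipes that have at least one ingredient, so a recipe with no ingredients never appears in the result.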