Recipe for Success - Machine Learning
cs229.stanford.edu/proj2017/final-posters/5147978.pdf



Dataset and Features

Example recipe: “Banana Nut Brownies”
Rating: 4.39
Reviews: 1322
Servings: 20
Ingredients: [2 cups white sugar, 1 cup butter, 1 1/2 cups all-purpose flour, ... 1 ripe banana, mashed]

Finding a healthy recipe online that aligns with your personal preferences and dietary restrictions can be a daunting task. Using data from allrecipes.com, we constructed machine learning models that map recipes (as captured by their constituent ingredients) to success as measured by online ratings. Using vector representations of our ingredients, we develop a methodology for detecting logical ingredient substitutions.

Model: We apply the multinomial Naive Bayes model with Laplace smoothing. A flexible bucketing scheme discretizes the continuous range of ratings, from one to five stars, into classes.
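As a rough illustration of the model described above, here is a minimal sketch of multinomial Naive Bayes with Laplace (add-one) smoothing over ingredient lists. The bucket cut points in `bucket`, the function names, and the toy recipes are assumptions made for the example; the poster does not specify its bucketing scheme or implementation.

```python
import math
from collections import Counter, defaultdict


def bucket(rating, edges=(3.5, 4.0, 4.5)):
    """Map a continuous star rating to a class index (cut points are assumed)."""
    return sum(rating >= e for e in edges)


def train_nb(recipes, labels, vocab):
    """Fit multinomial Naive Bayes with Laplace (add-one) smoothing."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    for ingredients, y in zip(recipes, labels):
        token_counts[y].update(i for i in ingredients if i in vocab)
    n = len(labels)
    log_prior = {y: math.log(c / n) for y, c in class_counts.items()}
    log_lik = {}
    for y in class_counts:
        total = sum(token_counts[y].values())
        # Add one to every count so unseen ingredients keep nonzero probability.
        log_lik[y] = {
            w: math.log((token_counts[y][w] + 1) / (total + len(vocab)))
            for w in vocab
        }
    return log_prior, log_lik


def predict_nb(model, ingredients):
    """Return the class with the highest posterior log-score."""
    log_prior, log_lik = model
    scores = {
        y: log_prior[y] + sum(log_lik[y][i] for i in ingredients if i in log_lik[y])
        for y in log_prior
    }
    return max(scores, key=scores.get)


# Toy data, invented for illustration only.
recipes = [
    ["butter", "white sugar", "peanut butter"],
    ["butter", "semi-sweet chocolate", "white sugar"],
    ["almond milk", "coconut sugar", "anise oil"],
    ["almond milk", "orange extract", "coconut sugar"],
]
ratings = [4.6, 4.7, 3.1, 2.9]
labels = [bucket(r) for r in ratings]
vocab = {i for r in recipes for i in r}
model = train_nb(recipes, labels, vocab)
```

Calling `predict_nb(model, ["peanut butter", "white sugar"])` on this toy model returns the top bucket, since those ingredients only occur in the highly rated examples.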

Data and Features: We filter the recipes to include only ingredients that appear in at least 10 meals. Recipes are randomly partitioned into training (80%) and test (20%) sets.
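The filtering and partitioning steps can be sketched as follows. This is an illustrative version only: the helper names, the fixed seed, and the toy corpus are assumptions, and the threshold is counted here as the number of recipes an ingredient appears in.

```python
import random
from collections import Counter


def filter_rare(recipes, min_recipes=10):
    """Drop ingredients that occur in fewer than `min_recipes` recipes."""
    doc_freq = Counter(i for r in recipes for i in set(r))
    return [[i for i in r if doc_freq[i] >= min_recipes] for r in recipes]


def train_test_split(items, train_frac=0.8, seed=0):
    """Shuffle a copy of `items` and cut it into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Toy corpus: "butter" is in all 12 recipes, "anise oil" in only one.
recipes = [["butter", "white sugar"] for _ in range(11)] + [["butter", "anise oil"]]
filtered = filter_rare(recipes, min_recipes=10)
train, test = train_test_split(filtered, train_frac=0.8)
```

Passing `train_frac=0.7` gives the 70/30 partition used for the neural networks.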

Classification Accuracy: 53%

Best Cookie Ingredients

• “cream cheese” (3.39)
• “peanut butter” (3.47)
• “semi-sweet chocolate” (3.51)
• “marshmallow” (3.54)
• “peanut butter cup” (3.76)

Worst Cookie Ingredients

• “coconut sugar” (1.92)
• “peppermint extract” (1.94)
• “anise oil” (1.95)
• “orange extract” (1.95)
• “almond milk” (1.95)

• 2 tbsp vegetable oil
• 2 tbsp condensed milk
• 2 tbsp marshmallow
• 2 tbsp brown sugar
• 2 tbsp walnut
• 2 tbsp shortening
• 1 1/3 cups all-purpose flour
• 1/2 cup butter
• 1 fl oz vanilla extract
• 3/4 cups of white sugar
• 2 tbsp almond
• 4 tbsp of chocolate cake mix
• 2 tbsp cocoa powder
• 2 tbsp raisin
• 2 tbsp pumpkin
• 2 tbsp confectioners' sugar
• 2 tbsp water
• 3/4 cups semi-sweet chocolate
• 3 tbsp of peanut butter cup
• 3 eggs

Substitutes for “chocolate cake mix”

Best Ingredients

• “devil's food cake mix”
• “lemon cake mix”
• “coconut”
• “chocolate pudding mix”
• “marshmallow”

Worst Ingredients

• “semi-sweet chocolate”
• “white sugar”
• “brown sugar”
• “peanut butter cup”
• “all-purpose flour”

Model: We implemented two neural networks, one for classification and one for regression. Both use one hidden layer and sigmoid activation.
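To make the architecture concrete, here is a minimal pure-Python sketch of a network with one sigmoid hidden layer, trained by stochastic gradient descent on squared error. The sigmoid output unit, the layer sizes, the learning rate, and the rescaling of ratings to [0, 1] (for example `(rating - 1) / 4`) are all assumptions; the poster gives no details beyond the hidden layer and activation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class OneHiddenLayerNet:
    """One sigmoid hidden layer feeding a single sigmoid output unit.

    With targets scaled to [0, 1] the output can serve as a regressor;
    thresholding it gives a binary classifier. Both uses are assumptions.
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return hidden activations and the network output for input x."""
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def sgd_step(self, x, target, lr=0.5):
        """One gradient step on the squared error 0.5 * (y - target)**2."""
        h, y = self.forward(x)
        delta_out = (y - target) * y * (1.0 - y)
        for j, hj in enumerate(h):
            # Backpropagate through the old output weight before updating it.
            delta_h = delta_out * self.w2[j] * hj * (1.0 - hj)
            self.w2[j] -= lr * delta_out * hj
            self.b1[j] -= lr * delta_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
        self.b2 -= lr * delta_out
        return 0.5 * (y - target) ** 2


# Toy inputs: ingredient-quantity vectors with ratings rescaled to [0, 1].
net = OneHiddenLayerNet(n_in=3, n_hidden=4)
data = [([1.0, 0.0, 1.0], 0.9), ([0.0, 1.0, 0.0], 0.2)]
first = sum(net.sgd_step(x, t) for x, t in data)
for _ in range(2000):
    last = sum(net.sgd_step(x, t) for x, t in data)
```

After training, `net.forward(x)[1]` gives the predicted (rescaled) rating for an ingredient vector `x`.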

Data and Features: Again, we filter the recipes to track only ingredients that appear in at least 10 recipes. Recipes are randomly partitioned into training (70%) and test (30%) sets.

Future work would involve integrating our models for rating prediction and ingredient substitution into one tool to generate highly-rated recipes given a set of dietary constraints. Doing this would likely require more data for many types of meals. There are a number of more nuanced approaches that can be further explored, such as reverse-engineering our rating network. The tools we’ve explored should make this process fairly straightforward.

Model: We adapted Mikolov et al.’s word2vec model for generating vector representations of features, allowing us to represent ingredients as vectors in a high-dimensional space.
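One common way to read candidate substitutes off such embeddings is a nearest-neighbor lookup by cosine similarity. The sketch below assumes that approach; the poster does not state how its substitutes were ranked, and the 3-dimensional vectors are invented placeholders rather than trained embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(query, embeddings, k=2):
    """Rank all other ingredients by cosine similarity to `query`."""
    q = embeddings[query]
    scored = [(cosine(q, vec), name)
              for name, vec in embeddings.items() if name != query]
    return [name for _, name in sorted(scored, reverse=True)[:k]]


# Hypothetical vectors; real ones would come from the trained word2vec model.
embeddings = {
    "chocolate cake mix": [0.9, 0.1, 0.0],
    "devil's food cake mix": [0.8, 0.2, 0.1],
    "lemon cake mix": [0.7, 0.0, 0.4],
    "all-purpose flour": [0.0, 1.0, 0.1],
    "white sugar": [0.1, 0.2, 0.9],
}
substitutes = nearest("chocolate cake mix", embeddings)
```

With these placeholder vectors the two closest neighbors of “chocolate cake mix” are the other cake mixes, while flour and sugar rank last.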

Servings: 20

[ “2 cups white sugar”, “1 cup butter”, “1 1/2 cups all-purpose flour”, ... “1 ripe banana, mashed” ]

Lasagna: 259 Brownies: 383 Cookies: 4703

[ [0.67, “sugar”], [0.33, “butter”], [0.5, “flour”], ... [0.83, “banana”] ]
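The first step in turning raw ingredient strings into numeric features is separating the leading quantity from the rest of the text. The sketch below shows only that step, under the assumption that quantities are written as whole numbers and fractions; `parse_quantity` is a hypothetical helper, and the unit handling and normalization that produce the values shown above are not described in the poster and are not reproduced here.

```python
from fractions import Fraction


def parse_quantity(text):
    """Split '1 1/2 cups all-purpose flour' into (1.5, 'cups all-purpose flour')."""
    tokens = text.split()
    qty = Fraction(0)
    used = 0
    for tok in tokens:
        try:
            # Accumulate leading numeric tokens such as '1' and '1/2'.
            qty += Fraction(tok)
        except ValueError:
            break
        used += 1
    return float(qty), " ".join(tokens[used:])


raw = ["2 cups white sugar", "1 cup butter", "1 1/2 cups all-purpose flour"]
parsed = [parse_quantity(item) for item in raw]
```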

Poster sections: Introduction; Recipe Rating Prediction (Naive Bayes, Rating Neural Network); Ingredient Substitution (Word2vec Neural Network); Future Directions

Embeddings of “cake mix”: white cake mix, yellow cake mix, devil’s food cake mix, lemon cake mix, chocolate cake mix, spice cake mix

Embeddings of “flour”: pastry flour, cake flour, all-purpose flour, self-rising flour, rice flour, whole wheat flour, almond flour

[Figure: word2vec network diagram. Labels: recipe, context vector, target vector, W1, W2, loss]

[Figure panels: “1st / 2nd Guess Accuracy”; “Sample Generated 5-star Recipe”. Axes: Label, Prediction]

Recipe for Success
Optimizing meals under dietary constraints
Benjamin Share (benshare), James Ordner (jordner), and Zack Cinquini (icinquin)