2. Requirements
- Classify a style based on ingredients and preparation
- The brewer has some ingredients at hand and/or options in mind: what style do these ingredients and preparation
options fit with?
- For a given style suggest ingredients, ingredient amounts and preparation to create a new recipe
- Typical
- Complementary ingredients
- Also suggest amount of ingredients
- Non-typical
- Non-complementary ingredients
- Also suggest amount of ingredients
3. Outline
- Section 1: The data set
- Section 2: Classification
- Section 3: Ingredient combination
- Section 4: Ingredient amount
- Section 5: Other considerations
- Conclusion
- Reference
5. The data set
- The data is randomly sampled into:
- Training set containing 60% of the data
- Validation set containing 20% of the data
- Test set containing the remaining 20% of the data
- Training set used to train the model
- Validation set used to assess model performance on unseen data and choose between models
and their tuning parameters
- Test set used to assess how our final selected model performs on unseen data
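The split above can be sketched as follows. This is a minimal stdlib-Python illustration; the function name and the recipe-list representation are assumptions, not part of the data set description:

```python
import random

def split_dataset(recipes, seed=42):
    """Randomly split recipes into 60% train, 20% validation, 20% test."""
    rng = random.Random(seed)
    shuffled = recipes[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.6 * n)
    n_val = int(0.2 * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

Fixing the seed makes the split reproducible across model-comparison runs.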
6. The data set
Approach 1:
- Training set
- Validation set
- Test set
If there are large imbalances in the styles:
Approach 2:
- Training set subsampled to have an equal amount of each style
- Validation set – same set as used in approach 1
- Test set – same set as used in approach 1
Use both approaches and see which performs best on the validation and test data
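The subsampling in Approach 2 could look like this (a stdlib sketch, assuming recipes and their style labels arrive as parallel lists; every style is downsampled to the size of the rarest one):

```python
import random
from collections import defaultdict

def balanced_subsample(recipes, styles, seed=42):
    """Subsample the training set so every style has equally many
    recipes (the count of the rarest style)."""
    rng = random.Random(seed)
    by_style = defaultdict(list)
    for recipe, style in zip(recipes, styles):
        by_style[style].append(recipe)
    n_min = min(len(items) for items in by_style.values())
    sampled = []
    for style, items in by_style.items():
        # draw n_min recipes without replacement from each style
        sampled.extend((r, style) for r in rng.sample(items, n_min))
    return sampled
```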
7. The data set
- A full Exploratory Data Analysis (EDA) is needed before moving forward with any modelling effort
- The EDA will assist in identifying the scope of the data cleansing requirements
- Transformation of features can be explored
- Feature engineering (creating new features from existing features) can be explored
- If there are missing values, decisions will need to be made between:
- Removing recipes with missing values from the data set
- Imputing values (mean/median/mode)
- Building models to predict the missing values
- The data set does not contain instructions. It does contain information on boiling time, but no sequence of actions. The
assumption is therefore that instructions are not needed for this requirement: the person brewing the beer needs only the
ingredient information and boiling time, and knows how to put it all together.
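The simplest of the missing-value options, mean/median/mode imputation, fits in a few lines (a sketch; the helper name and the use of None for missing entries are assumptions):

```python
from statistics import mean, median, mode

def impute(values, strategy="median"):
    """Fill None entries with the mean/median/mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]
```

Mean/median suit numeric features such as boil time; mode suits categorical ones such as yeast strain.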
9. Classification
- Classification problem
- Supervised learning
- Specifically, by treating recipes as instances, ingredients and preparation (like boiling time) as
features, and style as class labels, the aim is to build a classifier model to predict the styles of
recipes.
- Using unsupervised learning may also be helpful
- Visual depiction of the similarities and differences between the styles
- Possibly provide some insight into which features are useful in defining styles
10. Classification
- A range of classification techniques should be considered to see what works best
- Supervised
- Logistic Regression
- Linear Discriminant Analysis (LDA)*
- Quadratic Discriminant Analysis (QDA)*
- Generalised Additive Model (GAM)
- Random Forest
- Gradient Boosting
- Support Vector Machine (SVM)
- Neural Networks
- K-Nearest Neighbours (KNN)
- Unsupervised
- Principal Component Analysis (PCA) - derive variables for use in supervised learning
- K-Means Clustering
- Hierarchical Clustering
- Ensemble consisting of any number of the above
* Strictly speaking, LDA and QDA should not be used with qualitative predictors, but in practice they often are if the goal is simply to find a good predictive model
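Most of the listed techniques require a library, but one of the simpler candidates, K-Nearest Neighbours, can be sketched in stdlib Python to make the idea concrete (the function name and the numeric feature encoding of recipes are assumptions):

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify recipe x by majority vote among its k nearest training
    recipes, using Euclidean distance over numeric feature vectors."""
    dists = sorted(
        (math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

The choice of k would be tuned on the validation set, like the other techniques' parameters.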
11. Classification
Assessing classification performance between the various techniques on the validation data:
- There is a plethora of performance measures
- The initial thought is to compare Cohen’s kappa across the various (supervised) techniques (Ben-David, 2007)
- It may be worth investigating other performance measures (Valverde-Albacete, Peláez-Moreno, 2010)
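Cohen's kappa measures agreement between predictions and labels corrected for chance agreement, which matters here because style frequencies may be imbalanced. A minimal sketch (function name is an assumption):

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """Kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the agreement two random labellings with these
    marginal class frequencies would reach by chance."""
    n = len(y_true)
    p_observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_expected = sum(
        true_counts[c] * pred_counts[c] for c in true_counts
    ) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)
```

Kappa is 1 for perfect agreement and 0 for agreement no better than chance.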
13. Ingredient combination
For a given style suggest ingredients where “suggest” is a tuneable parameter
- Typical - uses complementary ingredients
- Non-typical - uses non-complementary ingredients
Calculation of complementary ingredients:
- Pairing - Pairwise Bayesian probabilities
- Ingredient Network
Or
Learn a generative probabilistic model from the ingredient data and then randomly sample it and observe the resulting ingredient
combinations:
- Deep Belief Network (DBN)
Or
Creation of ingredient clusters:
- Principal Component Analysis (PCA)
14. Pairing within a style:
Calculate pairwise probabilities of ingredients from the training data by counting how many times each pair of
ingredients appears in the set of recipes within a style.
It would be ideal to maximize the joint probability over the entire ingredient set; however, this would entail a large
search space. The approach could therefore be to start with a set specified by the brewer and iteratively add new
ingredients to the set by taking the most probable (in the case of a “typical” recipe) of the remaining ingredients,
using only the pairwise probability of the new ingredient with the last-added one.
Stop adding new ingredients once the probability of adding a new one falls below a certain threshold.
(Naik, Polamreddi, 2015)
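The counting and greedy-growth steps described above can be sketched as follows (stdlib Python; the function names, the threshold default, and the list-of-ingredient-lists representation are assumptions):

```python
from collections import Counter
from itertools import combinations

def pairwise_counts(recipes):
    """Count how often each ingredient pair co-occurs within a style's recipes."""
    counts = Counter()
    for ingredients in recipes:
        for pair in combinations(sorted(set(ingredients)), 2):
            counts[pair] += 1
    return counts

def grow_recipe(start, candidates, counts, n_recipes, threshold=0.1):
    """Greedily extend `start`: at each step add the candidate with the
    highest pairwise probability against the last-added ingredient,
    stopping once that probability drops below the threshold."""
    chosen = list(start)
    remaining = set(candidates) - set(chosen)
    while remaining:
        last = chosen[-1]
        # sorted() makes tie-breaking deterministic
        best = max(
            sorted(remaining),
            key=lambda c: counts[tuple(sorted((last, c)))] / n_recipes,
        )
        if counts[tuple(sorted((last, best)))] / n_recipes < threshold:
            break
        chosen.append(best)
        remaining.discard(best)
    return chosen
```

For a "non-typical" recipe the same loop could instead pick the *least* probable remaining candidate.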
15. Ingredient combination
Ingredient network within a style:
Another approach could be to project the bipartite ingredient–recipe network onto the ingredients: two nodes
(ingredients) are connected if they share at least one recipe in common, and the weight of each link is the number of
shared recipes, turning the ingredient network into a weighted network.
The approach could be to start with a set specified by the brewer and iteratively add new ingredients to the set by
choosing (in the case of a “typical” recipe) the remaining ingredient with the highest link weight to the last-added
one, moving along the network and tracking the highest weight at each step.
Stop adding new ingredients once the weight falls below a certain threshold.
(Ahn, Ahnert, Bagrow, Barabasi, 2011)
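Building the weighted network and walking to the heaviest neighbour can be sketched as follows (stdlib Python; the function names are assumptions, and edges are stored as sorted ingredient pairs):

```python
from collections import Counter
from itertools import combinations

def ingredient_network(recipes):
    """Weighted ingredient network: the weight of edge (a, b) is the
    number of recipes in which a and b appear together."""
    weights = Counter()
    for ingredients in recipes:
        for pair in combinations(sorted(set(ingredients)), 2):
            weights[pair] += 1
    return weights

def heaviest_neighbour(weights, ingredient, exclude):
    """Return (neighbour, weight) for the not-yet-used ingredient with
    the highest edge weight to the given ingredient."""
    best, best_w = None, 0
    for (a, b), w in sorted(weights.items()):
        if ingredient in (a, b):
            other = b if a == ingredient else a
            if other not in exclude and w > best_w:
                best, best_w = other, w
    return best, best_w
```

Repeatedly calling `heaviest_neighbour` with the last-added ingredient (and the chosen set as `exclude`) implements the walk described above.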
16. Ingredient combination
Deep Belief Network within a style:
A Deep Belief Network (DBN) could be used to learn generative models of ingredient distributions within each style.
We could then randomly sample it and observe the resulting ingredient combinations.
Changing the parameters of the DBN (the network shape) could lead to different results, giving new combinations of
ingredients, varying ingredient lists, etc.
(Nedovic, 2013)
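A DBN itself is far beyond a short sketch, but the fit-then-sample workflow it enables can be illustrated with the simplest possible generative model: independent per-ingredient Bernoulli probabilities (a deliberately reduced stand-in; unlike a DBN it captures no dependencies between ingredients, and all names here are assumptions):

```python
import random

def fit_bernoulli(recipes, vocabulary):
    """Estimate, per ingredient, the probability of appearing in a
    recipe of this style. A DBN would additionally model ingredient
    interactions; this sketch only mirrors the sampling workflow."""
    n = len(recipes)
    return {ing: sum(ing in r for r in recipes) / n for ing in vocabulary}

def sample_combination(probs, seed=None):
    """Draw one ingredient combination by sampling each ingredient
    independently from its fitted probability."""
    rng = random.Random(seed)
    return [ing for ing, p in sorted(probs.items()) if rng.random() < p]
```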
17. Ingredient combination
Principal Component Analysis within a style:
Ingredients can be clustered based on their use in recipes. Within a cluster, ingredients can be suggested to and
selected by the brewer.
(De Clercq, 2014)
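For clustering, each ingredient would be represented as a row of a binary ingredient-by-recipe matrix and projected onto its principal components. The PCA step can be sketched without any library via power iteration on the covariance matrix (a stdlib sketch for the first component only; function name is an assumption):

```python
import math

def first_principal_component(rows, iters=200):
    """First principal component of a small data matrix, found by
    power iteration on the sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in rows]  # centre
    cov = [[sum(X[i][a] * X[i][b] for i in range(n)) / (n - 1)
            for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]   # converges to the top eigenvector
    return v
```

In practice a library routine would compute several components at once, with clustering (e.g. k-means) run on the projected coordinates.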
19. Ingredient amount
Given n ingredients and n − 1 amounts, the brewer wants to find the nth amount. Candidate techniques:
- Clustering
- Dimension reduction
- Regression
(Safreno, Deng, 2013)
21. Other considerations
Ingredient-instruction dependency tree representation
Simplified Ingredient Merging Map in Recipes (SIMMR)
SIMMR represents a recipe as a dependency tree whose leaves (terminal nodes) are the recipe ingredients and whose
internal nodes are the recipe instructions. The SIMMR representation captures the high-level flow of ingredients
without modelling the semantics of each individual instruction.
(Jermsurawong, Habash, 2015)
22. Other considerations
Once the style, ingredients and amounts have been selected, generate Instructions using pairwise
Bayesian probabilities:
Instructions for a recipe are a sequence of actions, each of which is a tuple of verb and ingredient.
Action-Ingredient-Verb Probabilities:
First choose an ingredient to work on, given the previous action performed. Then, a verb is predicted conditioned on
both the previous complete action and the new ingredient chosen. Thus, this model assumes a logical ordering of
ingredients that we work on during a particular preparation and a logical set of verbs that can possibly be performed on
a given ingredient.
(Naik, Polamreddi, 2015)
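The two-stage generation described above can be sketched as count-based conditional models (stdlib Python; this greedy most-likely version and all names are assumptions — the cited approach samples from the probabilities rather than taking the argmax):

```python
from collections import Counter, defaultdict

def train_action_model(instruction_sequences):
    """Count (previous action -> next ingredient) and
    ((previous action, ingredient) -> verb) transitions.
    Each action is a (verb, ingredient) tuple; sequences start from None."""
    ing_given_prev = defaultdict(Counter)
    verb_given_prev_ing = defaultdict(Counter)
    for seq in instruction_sequences:
        prev = None
        for verb, ing in seq:
            ing_given_prev[prev][ing] += 1
            verb_given_prev_ing[(prev, ing)][verb] += 1
            prev = (verb, ing)
    return ing_given_prev, verb_given_prev_ing

def generate(ing_given_prev, verb_given_prev_ing, steps):
    """Greedily emit actions: first the most likely ingredient given the
    previous action, then the most likely verb given both."""
    actions, prev = [], None
    for _ in range(steps):
        if prev not in ing_given_prev:
            break
        ing = ing_given_prev[prev].most_common(1)[0][0]
        verb = verb_given_prev_ing[(prev, ing)].most_common(1)[0][0]
        actions.append((verb, ing))
        prev = (verb, ing)
    return actions
```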
23. Other considerations
Data as a graph
If the data is modelled as graphs, we could use the frequent-subgraph mining algorithm FSG (Frequent Subgraph
Discovery) and then compute a recipe similarity measure. Using this method, the brewer can perform similarity search
over the graph structure, shared characteristics, and distinct characteristics of each recipe.
(Wang, Li, Li, Dong, Yang, 2008)
24. Other considerations
Visual mapping with t-SNE
To visualize the data and obtain some insight into its structure, t-Distributed Stochastic Neighbor Embedding (t-SNE)
can be used.
Also worth considering is a parametric version of t-SNE that allows for generalization to held-out validation data by
using the t-SNE objective function to train a neural network that provides an explicit mapping to a low-dimensional space.
(van der Maaten, Hinton, 2008)
25. Other considerations
Ingredient complement network:
Construct an ingredient complement network based on pointwise mutual information (PMI) defined on pairs of
ingredients. The PMI compares the probability that two ingredients occur together against the probability that they
would occur together if they appeared independently.
(Teng, Lin, Adamic, 2012)
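The PMI computation can be sketched directly from co-occurrence counts (stdlib Python; function name is an assumption — a positive PMI marks complements, a negative one marks ingredients that avoid each other):

```python
import math
from collections import Counter
from itertools import combinations

def pmi_network(recipes):
    """PMI(a, b) = log( p(a, b) / (p(a) * p(b)) ), with probabilities
    estimated from how often ingredients (co-)occur across recipes."""
    n = len(recipes)
    single, pair = Counter(), Counter()
    for ingredients in recipes:
        items = sorted(set(ingredients))
        single.update(items)
        pair.update(combinations(items, 2))
    return {
        (a, b): math.log((pair[(a, b)] / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b) in pair
    }
```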
27. Conclusion
If time and resources were severely limited:
- Requirement: Classify a style based on ingredients and preparation
- Pursue Gradient Boosting and Neural Networks for this requirement
- Requirement: For a given style suggest ingredients, ingredient amounts and preparation to create a new recipe
- Use a Deep Belief Network (DBN) and randomly sample it to derive ingredient combinations
29. Ahn, Y. Ahnert, S. Bagrow, J. Barabasi, A. (2011). Flavor network and the principles of food pairing. [pdf].
Retrieved from http://www.nature.com
Ben-David, A. (2007). A lot of randomness is hiding in accuracy. [pdf]. Retrieved from
http://www.sciencedirect.com
De Clercq, M. (2014). Prediction of Ingredient Combinations using Machine Learning Techniques. [pdf].
Retrieved from http://lib.ugent.be/fulltxt/RUG01/002/166/653/RUG01-002166653_2014_0001_AC.pdf
Jermsurawong, J. Habash, N. (2015). Predicting the Structure of Cooking Recipes. [pdf]. Retrieved from
http://www.aclweb.org/anthology/D15-1090
Naik, J. Polamreddi, V. (2015). Cuisine Classification and Recipe Generation. [pdf]. Retrieved from
http://cs229.stanford.edu/proj2015/233_report.pdf
Nedovic, V. (2013). Learning recipe ingredient space using generative probabilistic models. [pdf]. Retrieved
from http://liris.cnrs.fr/cwc/papers/cwc2013_submission_2.pdf
Safreno, D. Deng, Y. (2013). The Recipe Learner. [pdf]. Retrieved from
http://cs229.stanford.edu/proj2013/DengSafreno-TheRecipeLearner.pdf
Teng, C. Lin, Y. Adamic, L. (2012). Recipe recommendation using ingredient networks. [pdf]. Retrieved from
https://arxiv.org/pdf/1111.3919.pdf
Valverde-Albacete, F. Peláez-Moreno, C. (2010). Two information-theoretic tools to assess the performance of
multi-class classifiers. [pdf]. Retrieved from http://www.sciencedirect.com
van der Maaten, L. Hinton, G. (2008). Visualizing Data using t-SNE. [pdf]. Retrieved from
http://www.jmlr.org/papers/v9/vandermaaten08a.html
Wang, L. Li, Q. Li, N. Dong, G. Yang, Y. (2008). Substructure Similarity Measurement in Chinese Recipes.
[pdf]. Retrieved from http://wwwconference.org/www2008/papers/pdf/p979-wang.pdf