Operation Brewster
Machine learning strategy for brewing beer
Draft overview of approaches by Gregg Barrett
Requirement
- Classify a style based on ingredients and preparation
- Brewer has some ingredients at hand and/or options in mind. What style do these ingredients and preparation
options fit with?
- For a given style suggest ingredients, ingredient amounts and preparation to create a new recipe
- Typical
- Complementary ingredients
- Also suggest amount of ingredients
- Non-typical
- Non-complementary ingredients
- Also suggest amount of ingredients
Outline
- Section 1: The data set
- Section 2: Classification
- Section 3: Ingredient combination
- Section 4: Ingredient amount
- Section 5: Other considerations
- Conclusion
- Reference
Section 1
The data set
- Data is randomly sampled into:
- Training set containing 60% of the data
- Validation set containing 20% of the data
- Test set containing the remaining 20% of the data
- Training set used to train the model
- Validation set used to assess model performance on unseen data and choose between models
and their tuning parameters
- Test set used to assess how our final selected model performs on unseen data
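The 60/20/20 split described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `split_recipes`, the fixed seed and the integer recipe identifiers are assumptions made for the example, not part of the original plan.

```python
import random

def split_recipes(recipes, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle the recipes and cut them into train/validation/test lists."""
    shuffled = list(recipes)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains (about 20%)
    return train, val, test

# Hypothetical data: 100 recipe identifiers.
train, val, test = split_recipes(range(100))
print(len(train), len(val), len(test))
```

The test set is produced once and left untouched until a final model has been chosen on the validation set.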
The data set
Approach 1:
- Training set
- Validation set
- Test set
If there are large imbalances in the styles:
Approach 2:
- Training set subsampled to have an equal number of recipes for each style
- Validation set – same set as used in approach 1
- Test set – same set as used in approach 1
Use both approaches, compare them on the validation data, and confirm the chosen approach on the test data
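One possible way to build the balanced training set of Approach 2 is to downsample every style to the size of the rarest one. The sketch below assumes each recipe is a dictionary with a `"style"` key; that layout, the function name `balance_by_style` and the toy data are illustrative assumptions.

```python
import random
from collections import Counter, defaultdict

def balance_by_style(recipes, seed=0):
    """Subsample so every style has as many recipes as the rarest style."""
    by_style = defaultdict(list)
    for recipe in recipes:
        by_style[recipe["style"]].append(recipe)
    n_min = min(len(group) for group in by_style.values())
    rng = random.Random(seed)
    balanced = []
    for style in sorted(by_style):
        # Draw n_min recipes without replacement from each style.
        balanced.extend(rng.sample(by_style[style], n_min))
    rng.shuffle(balanced)
    return balanced

# Hypothetical imbalanced training set: 8 IPA recipes and 3 Stout recipes.
training = ([{"style": "IPA", "id": i} for i in range(8)]
            + [{"style": "Stout", "id": i} for i in range(3)])
balanced = balance_by_style(training)
style_counts = Counter(recipe["style"] for recipe in balanced)
print(style_counts)
```

Only the training set is rebalanced; the validation and test sets keep the original style proportions so that performance estimates reflect the real data.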
The data set
- A full Exploratory Data Analysis (EDA) is needed before moving forward with any modelling effort
- The EDA will assist in identifying the scope of the data cleansing requirements
- Transformation of features can be explored
- Feature engineering (creating new features from existing features) can be explored
- If there are missing values decisions will need to be made:
- Removing recipes with missing values from the data set
- Imputing values (mean/median/mode)
- Building models to predict the missing values
- The data set does not contain instructions. It does contain information on boiling time, but no sequence of actions. The
assumption is therefore that instructions are not needed for this requirement – the person brewing the beer only needs
ingredient information and boiling time – and knows how to put it all together.
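The simplest of the missing-value options listed above, imputing the median or mode, might look like the sketch below. It assumes missing values are stored as `None` and that recipes are dictionaries; the field name `boil_time` and the function name `impute` are hypothetical.

```python
from statistics import median, mode

def impute(recipes, field, strategy=median):
    """Fill missing (None) values of `field` with a statistic of the observed values."""
    observed = [r[field] for r in recipes if r[field] is not None]
    fill = strategy(observed)
    # Return copies so the original records are left unchanged.
    return [dict(r, **{field: fill}) if r[field] is None else dict(r)
            for r in recipes]

# Hypothetical recipes with one missing boiling time (minutes).
recipes = [{"boil_time": 60}, {"boil_time": None},
           {"boil_time": 90}, {"boil_time": 60}]
imputed = impute(recipes, "boil_time")               # median of 60, 90, 60
imputed_mode = impute(recipes, "boil_time", mode)    # most common value
print(imputed[1]["boil_time"], imputed_mode[1]["boil_time"])
```

To avoid leaking information, the fill value would normally be computed on the training set only and then applied to the validation and test sets.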
Section 2
Classification
- Classification problem
- Supervised learning
- Specifically, by treating recipes as instances, ingredients and preparation (like boiling time) as
features, and style as class labels, the aim is to build a classifier model to predict the styles of
recipes.
- Using unsupervised learning may also be helpful
- Visual depiction of the similarities and differences between the styles
- Possibly provide some insight into which features are useful in defining styles
Classification
- Range of classification techniques should be considered to see what works best
- Supervised
- Logistic Regression
- Linear Discriminant Analysis (LDA)*
- Quadratic Discriminant Analysis (QDA)*
- Generalised Additive Model (GAM)
- Random Forest
- Gradient Boosting
- Support Vector Machine (SVM)
- Neural Networks
- K-Nearest Neighbours (KNN)
- Unsupervised
- Principal Component Analysis (PCA) - derive variables for use in supervised learning
- K-Means Clustering
- Hierarchical Clustering
- Ensemble consisting of any number of the above
* Strictly speaking, LDA and QDA should not be used with qualitative predictors, but in practice they often are if the goal is simply to find a good predictive model
Classification
Assessing classification performance between the various techniques on the validation data:
- A plethora of measures of performance
- Initial thoughts are to compare Cohen’s kappa for the various (supervised) techniques on the validation data (Ben-David, 2007).
- It may be worth investigating other measures of performance (Valverde-Albacete, Peláez-Moreno, 2010).
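Cohen's kappa corrects raw accuracy for the agreement expected by chance: kappa = (p_observed - p_chance) / (1 - p_chance). A minimal sketch of the calculation follows; the function name and the toy labels are assumptions for illustration.

```python
from collections import Counter

def cohens_kappa(actual, predicted):
    """Cohen's kappa for two equally long sequences of class labels."""
    n = len(actual)
    # Observed agreement: the share of recipes whose style was predicted correctly.
    p_observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_freq = Counter(actual)
    predicted_freq = Counter(predicted)
    # Chance agreement: sum over styles of P(actual = c) * P(predicted = c).
    p_chance = sum(actual_freq[c] * predicted_freq[c] for c in actual_freq) / (n * n)
    # Undefined (division by zero) when p_chance is exactly 1.
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical validation labels and predictions.
actual = ["IPA", "IPA", "Stout", "Stout"]
predicted = ["IPA", "Stout", "Stout", "Stout"]
kappa = cohens_kappa(actual, predicted)
print(kappa)
```

In this toy case accuracy is 0.75 but chance agreement is 0.5, so kappa is 0.5. With imbalanced styles the gap between accuracy and kappa can be much larger, which is the motivation for preferring kappa here.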
Section 3
Ingredient combination
For a given style suggest ingredients where “suggest” is a tuneable parameter
- Typical - uses complementary ingredients
- Non-typical - uses non-complementary ingredients
Calculation of complementary ingredients:
- Pairing - Pairwise Bayesian probabilities
- Ingredient Network
Or
Learn a generative probabilistic model from the ingredient data and then randomly sample it and observe the resulting ingredient
combinations:
- Deep Belief Network (DBN)
Or
Creation of ingredient clusters:
- Principal Component Analysis (PCA)
Pairing within a style:
Calculate pairwise probabilities of ingredients from the training data by counting how many times each pair of
ingredients appears in the set of recipes within a style.
It would be ideal to maximize the probability over the entire ingredient subset. However, this would entail a very large search space.
The approach could therefore be to start with a set specified by the brewer and iteratively add new ingredients to the
set, each time taking from the remaining ingredients the one with the highest joint probability with the last added
ingredient (in the case of a “typical” recipe).
Stop adding new ingredients once the probability of the best remaining candidate falls below a certain threshold.
(Naik, Polamreddi, 2015)
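The greedy pairing procedure described above can be sketched as follows. This is an illustrative reading of the idea, not the implementation from the cited report: recipes are assumed to be lists of ingredient names from a single style, the joint probability is estimated as the share of recipes containing both ingredients, and the names `pair_probabilities`, `extend_recipe` and the threshold of 0.3 are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_probabilities(recipes):
    """Estimate P(a, b) as the share of recipes that contain both a and b."""
    counts = Counter()
    for ingredients in recipes:
        counts.update(combinations(sorted(set(ingredients)), 2))
    return {pair: c / len(recipes) for pair, c in counts.items()}

def extend_recipe(start, recipes, threshold=0.3):
    """Greedily add the ingredient most likely to co-occur with the last one added."""
    probs = pair_probabilities(recipes)
    chosen = list(start)  # the brewer's starting set (must not be empty)
    candidates = {i for r in recipes for i in r} - set(chosen)

    def joint(candidate):
        return probs.get(tuple(sorted((chosen[-1], candidate))), 0.0)

    while candidates:
        best = max(sorted(candidates), key=joint)  # sorted() makes ties deterministic
        if joint(best) < threshold:
            break  # no remaining ingredient is probable enough
        chosen.append(best)
        candidates.remove(best)
    return chosen

# Hypothetical recipes within one style.
stout_recipes = [
    ["pale malt", "roasted barley", "fuggles"],
    ["pale malt", "roasted barley", "chocolate malt"],
    ["pale malt", "roasted barley", "fuggles", "oats"],
    ["pale malt", "cascade"],
]
suggestion = extend_recipe(["pale malt"], stout_recipes)
print(suggestion)
```

For a "non-typical" recipe the same loop could select low-probability pairs instead, for example by replacing `max` with `min` and adjusting the stopping rule.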
Ingredient combination
Ingredient Network (bipartite) within a style:
Another approach could be to use an ingredient network in which two nodes (ingredients) are connected if they share
at least one recipe in common. The weight of each link represents the number of recipes the two ingredients share,
turning the ingredient network into a weighted network.
The approach could be to start with a set specified by the brewer and iteratively add new ingredients to the set by
taking (in the case of a “typical” recipe) the remaining ingredient whose link to the last added ingredient has the
highest weight, moving along the network in this way for each new ingredient.
Stop adding new ingredients once the weight falls below a certain threshold.
(Ahn, Ahnert, Bagrow, Barabasi, 2011)
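A minimal sketch of the weighted ingredient network and the walk along its heaviest links is given below. It assumes the link weight counts the recipes in which both ingredients appear; the function names, the `min_weight` threshold and the toy recipes are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations

def build_network(recipes):
    """Weighted network: weight[a][b] = number of recipes containing both a and b."""
    weight = defaultdict(lambda: defaultdict(int))
    for ingredients in recipes:
        for a, b in combinations(set(ingredients), 2):
            weight[a][b] += 1
            weight[b][a] += 1
    return weight

def walk_network(weight, start, min_weight=2):
    """From the last ingredient, follow the heaviest link to an unused ingredient."""
    path = list(start)  # the brewer's starting set (must not be empty)
    while True:
        neighbours = {n: w for n, w in weight[path[-1]].items() if n not in path}
        if not neighbours:
            break
        best = max(sorted(neighbours), key=neighbours.get)
        if neighbours[best] < min_weight:
            break  # the strongest remaining link is too weak
        path.append(best)
    return path

# Hypothetical recipes within one style.
stout_recipes = [
    ["pale malt", "roasted barley", "fuggles"],
    ["pale malt", "roasted barley", "chocolate malt"],
    ["pale malt", "roasted barley", "fuggles", "oats"],
    ["pale malt", "cascade"],
]
network = build_network(stout_recipes)
path = walk_network(network, ["pale malt"])
print(path)
```

Raising `min_weight` yields shorter, more conventional ingredient lists; lowering it lets the walk reach rarer combinations.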
Ingredient combination
Deep Belief Network within a style:
A Deep Belief Network (DBN) could be used to learn generative models of ingredient distributions within each style.
We could then randomly sample it and observe the resulting ingredient combinations.
Changing the parameters of the DBN (the network shape) could lead to different results, giving new combinations of
ingredients, varying ingredient lists, etc.
(Nedovic, 2013)
Ingredient combination
Principal Component Analysis within a style:
Ingredients can be clustered based on their use in recipes. Within-cluster ingredients can be suggested and selected by
the brewer.
(De Clercq, 2014)
Section 4
Ingredient amount
In particular, given n ingredients and n − 1 amounts, the brewer wants to find the nth amount
- Clustering
- Dimension reduction
- Regression
(Safreno, Deng, 2013)
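As one illustration of the regression option, the missing nth amount could be estimated with a k-nearest-neighbours regression over recipes that use the same ingredients. This is a swapped-in, simple technique chosen for the sketch and not necessarily the method of the cited report; the function name, the amounts and their units (kg) are hypothetical.

```python
import math

def predict_amount(known, target, recipes, k=2):
    """Estimate the amount of `target` from the k recipes whose known amounts are closest."""
    def distance(recipe):
        # Euclidean distance over the n - 1 ingredients with known amounts.
        return math.sqrt(sum((recipe[name] - amount) ** 2
                             for name, amount in known.items()))

    # Only recipes that contain the target and every known ingredient are usable.
    usable = [r for r in recipes
              if target in r and all(name in r for name in known)]
    nearest = sorted(usable, key=distance)[:k]
    return sum(r[target] for r in nearest) / len(nearest)

# Hypothetical recipes: ingredient -> amount in kg.
recipes = [
    {"pale malt": 4.0, "crystal malt": 0.5, "cascade": 0.05},
    {"pale malt": 5.0, "crystal malt": 0.5, "cascade": 0.06},
    {"pale malt": 9.0, "crystal malt": 1.0, "cascade": 0.12},
]
known = {"pale malt": 4.5, "crystal malt": 0.5}
estimate = predict_amount(known, "cascade", recipes)
print(estimate)
```

In practice the amounts would likely need scaling (for example to proportions of the batch) before distances are compared, since ingredients are used in very different quantities.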
Section 5
Other considerations
Ingredient-instruction dependency tree representation
Simplified Ingredient Merging Map in Recipes (SIMMR)
SIMMR represents a recipe as a dependency tree whose leaves (terminal nodes) are the recipe ingredients, and whose
internal nodes are the recipe instructions. The SIMMR representation captures the high-level flow of ingredients
without modelling the semantics of each individual instruction.
(Jermsurawong, Habash, 2015)
Other considerations
Once the style, ingredients and amounts have been selected, generate instructions using pairwise
Bayesian probabilities:
Instructions for a recipe are a sequence of actions, each of which is a tuple of verb and ingredient.
Action-Ingredient-Verb Probabilities:
First choose an ingredient to work on, given the previous action performed. Then, a verb is predicted conditioned on
both the previous complete action and the new ingredient chosen. Thus, this model assumes a logical ordering of
ingredients that we work on during a particular preparation and a logical set of verbs that can possibly be performed on
a given ingredient.
(Naik, Polamreddi, 2015)
Other considerations
Data as a graph
If the recipes are modelled as graphs, we could use a frequent subgraph mining algorithm such as FSG (frequent subgraph
discovery) and then compute a recipe similarity measure. Using this method, the brewer can perform similarity search over
the graph structure, shared characteristics, and distinct characteristics of each recipe.
(Wang, Li, Li, Dong, Yang, 2008)
Other considerations
Visual mapping with t-SNE
To visualize the data and obtain some insight into its structure, t-Distributed Stochastic Neighbor Embedding (t-SNE)
can be used.
Also of consideration is a parametric version of t-SNE that allows for generalization to held-out validation data by using
the t-SNE objective function to train a neural network that provides an explicit mapping to a low-dimensional space.
(van der Maaten, Hinton, 2008)
Other considerations
Ingredient complement network:
Construct an ingredient complement network based on pointwise mutual information (PMI) defined on pairs of
ingredients. The PMI compares the probability that two ingredients occur together with the probability expected if
they occurred independently: PMI(a, b) = log( p(a, b) / (p(a) p(b)) ). Positive values indicate complementary ingredients.
(Teng, Lin, Adamic, 2012)
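The PMI of every ingredient pair can be estimated directly from recipe counts, using PMI(a, b) = log( p(a, b) / (p(a) p(b)) ). The sketch below assumes recipes are lists of ingredient names; the function name `pmi_table` and the toy recipes are illustrative.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_table(recipes):
    """PMI for every ingredient pair that co-occurs in at least one recipe."""
    n = len(recipes)
    single = Counter()
    pair = Counter()
    for ingredients in recipes:
        items = sorted(set(ingredients))
        single.update(items)                  # recipes containing each ingredient
        pair.update(combinations(items, 2))   # recipes containing each pair
    return {
        (a, b): math.log((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pair.items()
    }

# Hypothetical recipes.
stout_recipes = [
    ["pale malt", "roasted barley", "fuggles"],
    ["pale malt", "roasted barley", "chocolate malt"],
    ["pale malt", "roasted barley", "fuggles", "oats"],
    ["pale malt", "cascade"],
]
pmi = pmi_table(stout_recipes)
print(pmi[("fuggles", "roasted barley")])
```

Here pale malt appears in every recipe, so its PMI with any partner is 0: it carries no information about what it is paired with. Fuggles and roasted barley co-occur more often than independence would predict, so their PMI is positive. Pairs that never co-occur are omitted because their PMI would be minus infinity.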
Conclusion
If time and resources were severely limited:
- Requirement: Classify a style based on ingredients and preparation
- Pursue Boosting and Neural Networks for this requirement
- Requirement: For a given style suggest ingredients, ingredient amounts and preparation to create a new recipe
- Use a Deep Belief Network (DBN) and randomly sample it to derive ingredient combinations
Reference
Ahn, Y. Ahnert, S. Bagrow, J. Barabasi, A. (2011). Flavor network and the principles of food pairing. [pdf].
Retrieved from http://www.nature.com
Ben-David, A. (2007). A lot of randomness is hiding in accuracy. [pdf]. Retrieved from
http://www.sciencedirect.com
De Clercq, M. (2014). Prediction of Ingredient Combinations using Machine Learning Techniques. [pdf].
Retrieved from http://lib.ugent.be/fulltxt/RUG01/002/166/653/RUG01-002166653_2014_0001_AC.pdf
Jermsurawong, J. Habash, N. (2015). Predicting the Structure of Cooking Recipes. [pdf]. Retrieved from
http://www.aclweb.org/anthology/D15-1090
Naik, J. Polamreddi, V. (2015). Cuisine Classification and Recipe Generation. [pdf]. Retrieved from
http://cs229.stanford.edu/proj2015/233_report.pdf
Nedovic, V. (2013). Learning recipe ingredient space using generative probabilistic models. [pdf]. Retrieved
from http://liris.cnrs.fr/cwc/papers/cwc2013_submission_2.pdf
Safreno, D. Deng, Y. (2013). The Recipe Learner. [pdf]. Retrieved from
http://cs229.stanford.edu/proj2013/DengSafreno-TheRecipeLearner.pdf
Teng, C. Lin, Y. Adamic, L. (2012). Recipe recommendation using ingredient networks. [pdf]. Retrieved from
https://arxiv.org/pdf/1111.3919.pdf
Valverde-Albacete, F. Peláez-Moreno, C. (2010). Two information-theoretic tools to assess the performance of
multi-class classifiers. [pdf]. Retrieved from http://www.sciencedirect.com
van der Maaten, L. Hinton, G. (2008). Visualizing Data using t-SNE. [pdf]. Retrieved from
http://www.jmlr.org/papers/v9/vandermaaten08a.html
Wang, L. Li, Q. Li, Na. Dong, G. Yang, Y. (2008). Substructure Similarity Measurement in Chinese Recipes.
[pdf]. Retrieved from http://wwwconference.org/www2008/papers/pdf/p979-wang.pdf
Machine Learning Approaches to Brewing Beer

  • 1. Operation Brewster Machine learning strategy for brewing beer Draft overview of approaches by Gregg Barrett
  • 2. Requirement - Classify a style based on ingredients and preparation - Brewer has some ingredients at hand and/or options in mind. What style do these ingredients and preparation options fit with? - For a given style suggest ingredients, ingredient amounts and preparation to create a new recipe - Typical - Complementary ingredients - Also suggest amount of ingredients - Non-typical - Non-complementary ingredients - Also suggest amount of ingredients
  • 3. Outline - Section 1: The data set - Section 2: Classification - Section 3: Ingredient combination - Section 4: Ingredient amount - Section 5: Other considerations - Conclusion - Reference
  • 5. The data set - Data is random sampled into: - Training set containing 60% of the data - Validation set containing 20% of the data - Test set containing the remaining 20% of the data - Training set used to train the model - Validation set used to assess model performance on unseen data and choose between models and their tuning parameters - Test set used to assess how our final selected model performs on unseen data
  • 6. The data set Approach 1: - Training set - Validation set - Test set If there are large imbalances in the styles: Approach 2: - Training set subsampled to have an equal amount of each style - Validation set – same set as used in approach 1 - Test set – same set as used in approach 1 Use both approaches and see which performs the best on validation and test data
  • 7. The data set - A full Exploratory Data Analysis (EDA) is needed before moving forward with any modelling effort - The EDA will assist in identifying the scope of the data cleansing requirements - Transformation of features can be explored - Feature engineering (creating new features from existing features) can be explored - If there are missing values decisions will need to be made: - Removing recipes with missing values from the data set - Imputing values (mean/median/mode) - Building models to predict the missing values - The data set does not contain instructions. It does contain information on boiling time, but no sequence of actions. The assumption is therefore that instructions are not needed for this requirement – the person brewing the beer only needs ingredient information and boiling time – and knows how to put it all together.
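The 60/20/20 split and the balanced-training-set variant (Approach 2) from the data set slides could be sketched as follows. This is a minimal illustration using only the Python standard library; the recipe records, the `style` field, and the function names are assumptions made for the example, not part of the original deck.

```python
import random
from collections import defaultdict

def split_60_20_20(recipes, seed=0):
    """Shuffle, then split into training (60%), validation (20%), test (rest)."""
    rng = random.Random(seed)
    shuffled = recipes[:]
    rng.shuffle(shuffled)
    n_train = int(0.6 * len(shuffled))
    n_val = int(0.2 * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def balance_by_style(train, seed=0):
    """Approach 2: subsample so each style has as many recipes as the rarest one."""
    rng = random.Random(seed)
    by_style = defaultdict(list)
    for recipe in train:
        by_style[recipe["style"]].append(recipe)
    smallest = min(len(group) for group in by_style.values())
    balanced = []
    for group in by_style.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced

# Toy, deliberately imbalanced data (invented for illustration).
recipes = [{"id": i, "style": "IPA" if i < 80 else "Stout"} for i in range(100)]
train, val, test = split_60_20_20(recipes)
balanced_train = balance_by_style(train)
```

Only the training set is rebalanced; the validation and test sets stay as drawn, so both approaches are scored on the same data.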
  • 9. Classification - Classification problem - Supervised learning - Specifically, by treating recipes as instances, ingredients and preparation (like boiling time) as features, and style as class labels, the aim is to build a classifier model to predict the styles of recipes. - Using unsupervised learning may also be helpful - Visual depiction of the similarities and differences between the styles - Possibly provide some insight into which features are useful in defining styles
  • 10. Classification - Range of classification techniques should be considered to see what works best - Supervised - Logistic Regression - Linear Discriminant Analysis (LDA)* - Quadratic Discriminant Analysis (QDA)* - Generalised Additive Model (GAM) - Random Forest - Gradient Boosting - Support Vector Machine (SVM) - Neural Networks - K-Nearest Neighbours (KNN) - Unsupervised - Principal Components Analysis (PCA) - derive variables for use in supervised learning - K-Means Clustering - Hierarchical Clustering - Ensemble consisting of any number of the above * Strictly speaking, LDA and QDA should not be used with qualitative predictors, but in practice they often are if the goal is simply to find a good predictive model
  • 11. Classification Assessing classification performance between the various techniques on the validation data: - A plethora of measures of performance - Initial thoughts are to compare Cohen’s kappa for the various (supervised) techniques on the validation data. (Ben-David, 2007) - It may be worth investigating other measures of performance (Valverde-Albacete, Peláez-Moreno, 2010).
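Cohen's kappa, the measure proposed above for comparing classifiers on the validation data, corrects the observed accuracy for the agreement expected by chance. A minimal sketch, assuming style labels are plain strings; the function name and toy labels are illustrative only.

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal label frequencies, summed over classes.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both label sets contain a single identical class;
        # returning 0.0 here is an arbitrary convention chosen for this sketch.
        return 0.0
    return (observed - expected) / (1 - expected)

# Toy validation labels (invented for illustration).
y_true = ["IPA", "IPA", "Stout", "Stout"]
y_pred = ["IPA", "Stout", "Stout", "Stout"]
kappa = cohens_kappa(y_true, y_pred)
```

Here the accuracy is 0.75 but the chance agreement is 0.5, so kappa is 0.5. That gap is the reason for preferring kappa over raw accuracy when styles are imbalanced.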
  • 13. Ingredient combination For a given style suggest ingredients where “suggest” is a tuneable parameter - Typical - uses complementary ingredients - Non-typical - uses non-complementary ingredients Calculation of complementary ingredients: - Pairing - Pairwise Bayesian probabilities - Ingredient Network Or Learn a generative probabilistic model from the ingredient data and then randomly sample it and observe the resulting ingredient combinations: - Deep Belief Network (DBN) Or Creation of ingredient clusters: - Principal Component Analysis (PCA)
  • 14. Ingredient combination Pairing within a style: Calculate pairwise probabilities of ingredients from the training data by counting how many times each pair of ingredients appears in the set of recipes within a style. It would be ideal to maximize the probability over the entire subset. However, this would entail a large search space. The approach could therefore be to start with a set specified by the brewer and iteratively add new ingredients to the set by taking the most feasible (in the case of a “typical” recipe) from the remaining ingredients using the joint probabilities of the new ingredient with only the last added one. Stop adding new ingredients once the probability of adding a new one goes below a certain threshold. (Naik, Polamreddi, 2015)
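The greedy pairing procedure for a "typical" recipe could look roughly like the sketch below. It assumes each recipe is a list of ingredient names from a single style and that the brewer's starting set is non-empty; the function names, the threshold value, and the toy recipes are all invented for illustration. Ties are broken alphabetically, which is an arbitrary choice.

```python
from collections import Counter
from itertools import combinations

def pair_probabilities(recipes):
    """Fraction of recipes (within one style) that contain each ingredient pair."""
    counts = Counter()
    for ingredients in recipes:
        for pair in combinations(sorted(set(ingredients)), 2):
            counts[pair] += 1
    return {pair: c / len(recipes) for pair, c in counts.items()}

def suggest_typical(start, recipes, threshold=0.3):
    """Greedily extend `start`, scoring candidates against the last added ingredient."""
    probs = pair_probabilities(recipes)
    chosen = list(start)
    candidates = {i for r in recipes for i in r} - set(chosen)
    while candidates:
        last = chosen[-1]
        def score(c):
            return probs.get(tuple(sorted((last, c))), 0.0)
        best = max(sorted(candidates), key=score)
        if score(best) < threshold:
            break  # no remaining ingredient pairs well enough with the last one
        chosen.append(best)
        candidates.remove(best)
    return chosen

# Toy recipes for one style (invented for illustration).
recipes = [
    ["pale malt", "cascade", "ale yeast"],
    ["pale malt", "cascade", "citra", "ale yeast"],
    ["pale malt", "citra", "ale yeast"],
    ["pale malt", "cascade", "crystal malt"],
]
suggestion = suggest_typical(["pale malt"], recipes)
```

Scoring only against the last added ingredient keeps each step linear in the number of candidates, at the cost of ignoring how well a candidate fits the rest of the set.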
  • 15. Ingredient combination Ingredient Network (bipartite) within a style: Another approach could be to use an ingredient network in which two nodes (ingredients) are connected if they share at least one recipe in common. The weight of each link represents the number of shared recipes, turning the ingredient network into a weighted network. The approach could be to start with a set specified by the brewer and iteratively add new ingredients to the set by taking (in the case of a “typical” recipe) from the remaining ingredients the one with the highest network weight, moving along the network and tracking the highest weight for each new ingredient. Stop adding new ingredients once the weight falls below a certain threshold. (Ahn, Ahnert, Bagrow, Barabasi, 2011)
  • 16. Ingredient combination Deep Belief Network within a style: A Deep Belief Network (DBN) could be used to learn generative models of ingredient distributions within each style. We could then randomly sample it and observe the resulting ingredient combinations. Changing the parameters of the DBN (the network shape) could lead to different results, giving new combinations of ingredients, varying ingredient lists, etc. (Nedovic, 2013)
  • 17. Ingredient combination Principal Component Analysis within a style: Ingredients can be clustered based on their use in recipes. Within-cluster ingredients can then be suggested to, and selected by, the brewer. (De Clercq, 2014)
  • 19. Ingredient amount In particular, given n ingredients and n − 1 amounts, the brewer wants to find the nth amount - Clustering - Dimension reduction - Regression (Safreno, Deng, 2013)
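As one simple stand-in for the regression option, the missing nth amount can be estimated with a k-nearest-neighbour regression: find the recipes whose known amounts are closest and average their value for the missing ingredient. This is a sketch of that swapped-in technique, not the method of Safreno and Deng; the function name, `k`, and the toy amounts (in grams) are assumptions.

```python
import math

def predict_missing_amount(known, target, recipes, k=2):
    """Estimate the amount of `target` from the k recipes nearest in the known amounts.

    known:   dict of ingredient -> amount for the n - 1 known ingredients
    target:  name of the ingredient whose amount is missing
    recipes: list of dicts holding amounts for all n ingredients
    """
    def distance(recipe):
        return math.sqrt(sum((recipe[name] - amount) ** 2
                             for name, amount in known.items()))
    neighbours = sorted(recipes, key=distance)[:k]
    return sum(r[target] for r in neighbours) / len(neighbours)

# Toy recipes, amounts in grams (invented for illustration).
recipes = [
    {"pale malt": 4000, "crystal malt": 500, "hops": 50},
    {"pale malt": 4200, "crystal malt": 400, "hops": 70},
    {"pale malt": 6000, "crystal malt": 1000, "hops": 100},
]
estimate = predict_missing_amount(
    {"pale malt": 4100, "crystal malt": 450}, "hops", recipes)
```

In practice the amounts would likely need scaling (for example to proportions of the batch) before distances are meaningful.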
  • 21. Other considerations Ingredient-instruction dependency tree representation Simplified Ingredient Merging Map in Recipes (SIMMR) SIMMR represents a recipe as a dependency tree whose leaves (terminal nodes) are the recipe ingredients, and whose internal nodes are the recipe instructions. The SIMMR representation captures the high-level flow of ingredients but without modelling the semantics in each individual instruction (Jermsurawong, Habash, 2015)
  • 22. Other considerations Once the style, ingredients and amounts have been selected, generate instructions using pairwise Bayesian probabilities: Instructions for a recipe are a sequence of actions, each of which is a tuple of verb and ingredient. Action-Ingredient-Verb Probabilities: First choose an ingredient to work on, given the previous action performed. Then, a verb is predicted conditioned on both the previous complete action and the new ingredient chosen. Thus, this model assumes a logical ordering of ingredients that we work on during a particular preparation and a logical set of verbs that can possibly be performed on a given ingredient. (Naik, Polamreddi, 2015)
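The action-ingredient-verb model described above can be illustrated with simple counts: one table for the next ingredient given the previous action, and one for the verb given the previous action and the chosen ingredient. The sketch below picks the most frequent option at each step rather than sampling. The instruction sequences are invented, since the data set itself contains no instructions, and all names are hypothetical.

```python
from collections import Counter, defaultdict

START = ("<start>", "<start>")  # placeholder "previous action" for the first step

def train(sequences):
    """Count next-ingredient and verb choices conditioned on the previous action."""
    ingredient_counts = defaultdict(Counter)  # prev action -> next ingredient
    verb_counts = defaultdict(Counter)        # (prev action, ingredient) -> verb
    for actions in sequences:
        prev = START
        for verb, ingredient in actions:
            ingredient_counts[prev][ingredient] += 1
            verb_counts[(prev, ingredient)][verb] += 1
            prev = (verb, ingredient)
    return ingredient_counts, verb_counts

def generate(ingredient_counts, verb_counts, max_steps=10):
    """Greedy decoding: most likely ingredient first, then most likely verb."""
    prev, actions = START, []
    for _ in range(max_steps):
        if prev not in ingredient_counts:
            break  # no action was ever observed after this one
        ingredient = ingredient_counts[prev].most_common(1)[0][0]
        verb = verb_counts[(prev, ingredient)].most_common(1)[0][0]
        prev = (verb, ingredient)
        actions.append(prev)
    return actions

# Toy instruction sequences as (verb, ingredient) tuples (invented for illustration).
sequences = [
    [("mash", "pale malt"), ("boil", "wort"), ("add", "hops"), ("pitch", "yeast")],
    [("mash", "pale malt"), ("boil", "wort"), ("add", "hops"), ("pitch", "yeast")],
    [("steep", "crystal malt"), ("boil", "wort"), ("add", "hops"), ("pitch", "yeast")],
]
ingredient_counts, verb_counts = train(sequences)
instructions = generate(ingredient_counts, verb_counts)
```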
  • 23. Other considerations Data as a graph If the data is modelled as graphs we could use a subgraph mining algorithm FSG (Frequent subgraph discovery) and then compute a recipe similarity measure. Using this method, the brewer can perform similarity search over the graph structure, shared characteristics, and distinct characteristics of each recipe. (Wang, Li, Li, Dong, Yang, 2008)
  • 24. Other considerations Visual mapping with t-SNE To visualize the data and obtain some insight into its structure t-Distributed Stochastic Neighbor Embedding can be used. Also of consideration is a parametric version of t-SNE that allows for generalization to held-out validation data by using the t-SNE objective function to train a neural network that provides an explicit mapping to a low-dimensional space. (van der Maaten, Hinton, 2008)
  • 25. Other considerations Ingredient complement network: Construct an ingredient complement network based on pointwise mutual information (PMI) defined on pairs of ingredients. The PMI compares the probability that two ingredients occur together with the probability expected if they occurred independently. (Teng, Lin, Adamic, 2012)
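For a pair of ingredients a and b, PMI is log(p(a, b) / (p(a) p(b))), with the probabilities estimated as fractions of recipes. A minimal sketch follows; the function name and the toy recipes are assumptions for illustration, and pairs that never co-occur are simply left out of the table.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_table(recipes):
    """PMI for every ingredient pair that co-occurs in at least one recipe."""
    n = len(recipes)
    single = Counter()
    pair = Counter()
    for ingredients in recipes:
        unique = sorted(set(ingredients))
        single.update(unique)
        pair.update(combinations(unique, 2))
    return {
        (a, b): math.log((count / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), count in pair.items()
    }

# Toy hop combinations (invented for illustration).
recipes = [
    ["cascade", "citra"],
    ["cascade", "citra"],
    ["cascade", "fuggle"],
    ["goldings", "fuggle"],
]
pmi = pmi_table(recipes)
```

A positive PMI marks a pair that appears together more often than independence would predict (a complement candidate); a negative PMI marks a pair that appears together less often than expected.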
  • 27. Conclusion If time and resources were severely limited: - Requirement: Classify a style based on ingredients and preparation - Pursue Boosting and Neural Networks for this requirement - Requirement: For a given style suggest ingredients, ingredient amounts and preparation to create a new recipe - Use a Deep Belief Network (DBN) and randomly sample it to derive ingredient combinations
  • 29. Reference Ahn, Y. Ahnert, S. Bagrow, J. Barabasi, A. (2011). Flavor network and the principles of food pairing. [pdf]. Retrieved from http://www.nature.com Ben-David, A. (2007). A lot of randomness is hiding in accuracy. [pdf]. Retrieved from http://www.sciencedirect.com De Clercq, M. (2014). Prediction of Ingredient Combinations using Machine Learning Techniques. [pdf]. Retrieved from http://lib.ugent.be/fulltxt/RUG01/002/166/653/RUG01-002166653_2014_0001_AC.pdf Jermsurawong, J. Habash, N. (2015). Predicting the Structure of Cooking Recipes. [pdf]. Retrieved from http://www.aclweb.org/anthology/D15-1090 Naik, J. Polamreddi, V. (2015). Cuisine Classification and Recipe Generation. [pdf]. Retrieved from http://cs229.stanford.edu/proj2015/233_report.pdf Nedovic, V. (2013). Learning recipe ingredient space using generative probabilistic models. [pdf]. Retrieved from http://liris.cnrs.fr/cwc/papers/cwc2013_submission_2.pdf Safreno, D. Deng, Y. (2013). The Recipe Learner. [pdf]. Retrieved from http://cs229.stanford.edu/proj2013/DengSafreno-TheRecipeLearner.pdf Teng, C. Lin, Y. Adamic, L. (2012). Recipe recommendation using ingredient networks. [pdf]. Retrieved from https://arxiv.org/pdf/1111.3919.pdf Valverde-Albacete, F. Peláez-Moreno, C. (2010). Two information-theoretic tools to assess the performance of multi-class classifiers. [pdf]. Retrieved from http://www.sciencedirect.com van der Maaten, L. Hinton, G. (2008). Visualizing Data using t-SNE. [pdf]. Retrieved from http://www.jmlr.org/papers/v9/vandermaaten08a.html Wang, L. Li, Q. Li, Na. Dong, G. Yang, Y. (2008). Substructure Similarity Measurement in Chinese Recipes. [pdf]. Retrieved from http://wwwconference.org/www2008/papers/pdf/p979-wang.pdf