Let’s Eat!
Brad Binder, Lesley Chapman,
Jon Froiland, David Lee
Introduction
History:
Since 1979 there have been services that review
and rank restaurants (Zagat)
Today:
According to Nielsen, Americans have on
average 41 apps on their smartphones, many of
which provide a recommendation service
Introduction
A variety of restaurant recommendation apps
have been created
Features include finding restaurants, making reservations,
and finding healthy options
A Restaurant Recommender would aim to help
users save money and time, and could help cure
buyer’s remorse
Problem Summary
We need a tool that resolves the challenge of
finding a restaurant in your area based upon
specific cuisine and menu item criteria
entered by the user
Hypothesis
Hypothesis: The Restaurant Recommender will recommend a
restaurant that suits the user better than one selected
by chance alone
H0 (null hypothesis): A user will find a restaurant that they like
based on chance alone
HA (alternative hypothesis): The Restaurant Recommender app
will provide a better restaurant suggestion to the user than
chance alone
Data Ingestion
• WORM Storage
–Stored HTML menu pages in one location
which could be read many times
• Parsed HTML with BeautifulSoup
–Built out a list of “Restaurant” objects
• GET requests to WMATA API to pull metro
station data
–JSON data parsed with pandas read_json()
function
Ingestion Wrangling Analysis Modeling Visualization
Wrangling and Munging
• Majority of time spent wrangling the data and
building restaurants
–Removing duplicate and incomplete
records
–Standardizing inconsistent fields (e.g. price)
–Aggregating and grouping
–Data types
• Merged restaurant and WMATA data using
Euclidean distance
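The merge step above can be sketched as a nearest-station lookup. This is a minimal illustration, not the team's code: the field names (`lat`, `lon`, `name`), the sample records, and the `nearest_station` helper are all assumptions, and Euclidean distance on raw latitude/longitude is only a rough proxy for ground distance within a single city.

```python
import math

def nearest_station(restaurant, stations):
    """Return (station, distance) for the station closest to the restaurant.

    Distance is plain Euclidean distance in latitude/longitude degrees.
    """
    def dist_to(station):
        return math.hypot(restaurant["lat"] - station["lat"],
                          restaurant["lon"] - station["lon"])

    best = min(stations, key=dist_to)
    return best, dist_to(best)

# Hypothetical sample records (approximate coordinates, for illustration only).
stations = [
    {"name": "Metro Center", "lat": 38.8983, "lon": -77.0281},
    {"name": "Dupont Circle", "lat": 38.9096, "lon": -77.0434},
]
restaurant = {"name": "Example Bistro", "lat": 38.9100, "lon": -77.0440}

station, dist = nearest_station(restaurant, stations)

# "Merge" the two sources by attaching the nearest station to the restaurant.
merged = dict(restaurant, nearest_station=station["name"], station_distance=dist)
```

For larger datasets a vectorized distance matrix would likely be preferable to this per-restaurant loop.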
Data Overview
964 Total Restaurants
115,517 Total Menu Items
• Restaurant data includes:
–Name
–Location (address, latitude, longitude)
–Type of cuisine
–Menu (item, price, description)
• WMATA data includes:
–Station name
–Location (latitude, longitude)
–Metro Line
Analysis
10 cities
964 Restaurants
115,517 Menu Items
Washington, D.C.
Feature Selection
• Four feature extraction pipelines using sklearn
–Chunking
–Cuisine Type
• TfidfVectorizer
–Extract keywords and assign significance score
– Tokenize and chunk parts of speech using nltk
• LabelBinarizer
–Convert cuisine types to binary features
• FeatureUnion
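A rough sketch of how the text and cuisine features listed above might be combined into one matrix. The sample menus and cuisine labels are invented, and `scipy.sparse.hstack` is used here as a stand-in for `FeatureUnion`, since `LabelBinarizer` operates on labels and would need a wrapper to sit inside a pipeline. The nltk chunking step is omitted.

```python
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelBinarizer

# Hypothetical data: one menu text and one cuisine label per restaurant.
menus = [
    "margherita pizza with fresh basil and mozzarella",
    "spicy tuna roll and salmon nigiri",
    "chicken tacos with salsa and guacamole",
]
cuisines = ["Italian", "Japanese", "Mexican"]

# Keyword extraction: each term gets a TF-IDF weight per restaurant.
tfidf = TfidfVectorizer(stop_words="english")
menu_features = tfidf.fit_transform(menus)  # sparse, shape (3, n_terms)

# Cuisine types become one binary indicator column per cuisine.
binarizer = LabelBinarizer()
cuisine_features = binarizer.fit_transform(cuisines)  # dense, shape (3, 3)

# Concatenate column-wise, which is what FeatureUnion does for its transformers.
features = hstack([menu_features, csr_matrix(cuisine_features)]).tocsr()
```

Each row of `features` is one restaurant; the last three columns are the cuisine indicators.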
Modeling and Prediction
• Transformation pipelines and transformed
feature vectors pickled
• K-means models fitted using training
restaurant data, then pickled
• User inputs entered via Flask are stored as a
training instance
• Relevant pipeline and model loaded to
transform and predict
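The fit, pickle, load, and predict cycle described above could look roughly like the following. The menu texts, the choice of two clusters, and the in-memory `pickle.dumps` round trip (instead of files on disk) are assumptions made to keep the sketch small; the Flask input handling is not shown.

```python
import pickle

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical training menus, one string per restaurant.
train_menus = [
    "margherita pizza mozzarella basil",
    "pepperoni pizza garlic bread",
    "four cheese pizza calzone",
    "salmon nigiri tuna roll",
    "spicy tuna roll miso soup",
    "eel roll salmon sashimi",
]

# Fit the transformation pipeline and the clustering model on training data.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(train_menus)
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Pickle both, as the slide describes; pickle.dump to a file works the same way.
blob = pickle.dumps((vectorizer, model))

# Later (e.g. inside the web app): load, transform the user input, predict.
loaded_vectorizer, loaded_model = pickle.loads(blob)
user_input = ["pepperoni pizza"]
cluster = int(loaded_model.predict(loaded_vectorizer.transform(user_input))[0])
```

The vectorizer has to be pickled along with the model so that user input is mapped into the same feature space the clusters were fitted in.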
K=15
Reporting and Visualization
• Restaurant recommendations are determined
by similarity within a matched cluster
–“Similarity” is measured with sklearn’s pairwise
Euclidean distance function: the training instances
closest to the test data in the feature space are
treated as the most similar
• Predictions are exported into an interactive
Tableau visualization
–Allows the user flexibility in making a selection
through filtering and visual indicators
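As a minimal sketch of the within-cluster similarity ranking described above: the `recommend` helper, the two-dimensional toy vectors, and the restaurant names are hypothetical. Only `euclidean_distances` from `sklearn.metrics.pairwise` is taken from the slide's description.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

def recommend(query_vec, train_vecs, train_labels, cluster, names, top_n=3):
    """Rank restaurants in the matched cluster by distance to the query."""
    # Indices of training restaurants that belong to the matched cluster.
    members = np.flatnonzero(train_labels == cluster)
    # Distance from the single query row to every cluster member.
    dists = euclidean_distances(query_vec.reshape(1, -1), train_vecs[members]).ravel()
    # Smallest distance first means most similar first.
    order = np.argsort(dists)[:top_n]
    return [(names[members[i]], float(dists[i])) for i in order]

# Toy feature vectors and cluster labels (illustration only).
train_vecs = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [0.2, 0.1]])
train_labels = np.array([0, 0, 1, 0])
names = ["A", "B", "C", "D"]

result = recommend(np.array([0.1, 0.1]), train_vecs, train_labels,
                   cluster=0, names=names, top_n=2)
```

Restricting the search to one cluster keeps the candidate set small, at the cost of missing close neighbors that fall just across a cluster boundary.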
Demo
Results
• Some predictions are good, others not so
good
–Some clusters still contain a “hodge podge”
• Removing the “cuisine type” feature helped to
eliminate what we saw as overfitting
• Different k values gave better results in some
cases, worse in others
• Additional features (price, ratings, metro)
would require more clusters and MORE DATA
Conclusions
• More data over a “better” model
• Might improve results using transformations
like Singular Value Decomposition (SVD) or
Latent Dirichlet Allocation (LDA)
– Better model analysis
• With more data, improve our tokenizer
– Incorporate stemming, improve chunking
• Incorporate user feedback into the prediction
model (e.g., via the Flask interface)
Additional Opportunities
• “Waiter-caller” function that would allow users to log in, use
the restaurant map search function, click on a restaurant, and
be matched with menu items based on keyword matches,
rather than reading through an entire menu to find
relevant items.
–Would have required more knowledge and implementation of
JavaScript, CSS, and Jinja in the Flask environment.
• Sentiment analyzer was developed but not integrated. It would
allow users to go to a restaurant and input a review. The review
would then be analyzed, giving back a recommended score (1-5)
to the user.
–Similar requirements
Sources
• Downey, Allen B. Think Bayes. O’Reilly Media; 1st Edition, 2013. Paperback.
• Downey, Allen B. Think Python. O’Reilly Media; 1st Edition, 2012. Paperback.
• Dwyer, Gareth. Flask by Example. Packt Publishing, 2016. Paperback.
• Harris, Harlan, Sean Murphy, and Marck Vaisman. Analyzing the Analyzers: An
Introspective Survey of Data Scientists and Their Work. O’Reilly Media; 1st Edition,
2013.
• Julian, David. Designing Machine Learning Systems with Python. Packt Publishing,
2016. Paperback.
• Kirk, Matthew. Thoughtful Machine Learning: A Test-Driven Approach. O’Reilly
Media; 1st Edition, 2014. Paperback.
• Kumar, Ashish. Learning Predictive Analytics with Python. Packt Publishing, 2016.
Paperback.
• McKinney, Wes. Python for Data Analysis: Data Wrangling with Pandas, NumPy,
and IPython. O’Reilly Media; 1st Edition, 2012. Paperback.
• Mitchell, Ryan. Web Scraping with Python: Collecting Data from the Modern Web.
O’Reilly Media; 1st Edition, 2015. Paperback.
• Raschka, Sebastian. Python Machine Learning. Packt Publishing, 2015. Paperback.
• Segaran, Toby. Programming Collective Intelligence: Building Smart Web 2.0
Applications. O’Reilly Media, 2007. Paperback.

Lets eat presentation_final_20160521
