Discover Your Latent Food Graph with this 1 Weird Trick
Grubhub Search Data Science
● Restaurant Recommendations
● Menu/Dish Recommendations
● Rest/Dish/Cuisine Search
● Cuisine Recommendations
Ecommerce Dilemma
● Our catalog grows every day
● Data is unstructured
● and unbounded
● How can we understand it to drive search & recommendations?
Use Cases
● Where can I get amazing Blueberry Pancakes? (semantic dish search)
● What are some synonyms for Pierogi? (query expansion)
● Show me French restaurants in Brooklyn (semantic cuisine search)
● What are the top-10 Asian noodle dishes near me? (semantic dish recs)
● Find me a new French restaurant that I’ll like (personalized restaurant recs)
Weird Trick: Representation Learning
1. Query2vec: understanding users
2. Rest2vec: understanding restaurants
3. FastMenu: understanding menus
Users + Restaurants + Menus = Grubhub Food Universe
query2vec
Query Understanding
● Language Normalization
● Intent Classification
Query Building
● Filtering
● Query Expansion
Candidate Selection
● Phrase/Term Matching
● Semantic Matching
Enrichment
● Pruning
● Hydration
● Pagination
Ranking
● Revenue
● Relevance
● Personalization
Query Expansion
Original Query:
● Dan Dan Noodles
Expanded Query:
● Dan Dan Noodles
● Spicy Noodles
● Chinese
● Japanese
● Asian
Increased Recall!
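As a rough illustration of how an expanded query could be produced from learned representations, the sketch below looks up the nearest neighbours of a query by cosine similarity. The vectors, the `expand_query` helper, and the `top_k` and `min_similarity` parameters are all made up for this example; real vectors would come from a trained model.

```python
import math

# Hypothetical 3-d query embeddings, hand-written for illustration only.
QUERY_VECTORS = {
    "dan dan noodles": [0.9, 0.8, 0.1],
    "spicy noodles": [0.8, 0.9, 0.2],
    "chinese": [0.7, 0.6, 0.3],
    "cheeseburger": [-0.5, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_query(query, vectors, top_k=2, min_similarity=0.8):
    """Return the query followed by up to top_k sufficiently similar queries."""
    target = vectors[query]
    scored = [(cosine(target, vec), other)
              for other, vec in vectors.items() if other != query]
    scored.sort(reverse=True)  # most similar first
    neighbours = [other for score, other in scored[:top_k]
                  if score >= min_similarity]
    return [query] + neighbours


expanded = expand_query("dan dan noodles", QUERY_VECTORS)
```

Searching with every term in `expanded` rather than the original query alone is what raises recall; the similarity threshold keeps unrelated queries out of the expansion.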
Classical Query Expansion
● Thesaurus/Synonyms: cranium, brain, noggin, thinker
● Knowledge Graph
Modern Query Expansion
● Representation Learning
○ Click Pattern Mining: cluster similar queries based on the restaurants they convert at
○ query2vec à la word2vec
“Dan Dan Noodles”
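# Fragment of a TF 1.x Estimator model_fn; x, mapped_labels, query_mapping,
# item_mapping, k, neg, learning_rate and mode are defined elsewhere.
import math
import tensorflow as tf

# Network weights
query_embeddings = tf.Variable(
    tf.random_uniform([len(query_mapping), k], -1.0, 1.0),
    name="query_embeddings")
softmax_weights = tf.Variable(
    tf.truncated_normal([len(item_mapping), k], stddev=1.0 / math.sqrt(k)),
    name="softmax_weights")
softmax_bias = tf.Variable(tf.zeros([len(item_mapping)]), name="softmax_bias")

# Select input items from the embedding table.
x_one_hot = tf.one_hot(x, len(query_mapping), name="one_hot_input")
h = tf.matmul(x_one_hot, query_embeddings, name="projection")  # [None, k]

# Select input labels.
batched_labels = tf.reshape(mapped_labels, [-1, 1])  # [None, 1], as nce_loss expects
logits = tf.matmul(h, tf.transpose(softmax_weights))

# Exact loss: full softmax over every item (labels must be rank 1 here).
full_softmax_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=logits, labels=tf.reshape(mapped_labels, [-1]))
# Approximate loss: noise-contrastive estimation with `neg` sampled negatives.
approx_softmax_loss = tf.nn.nce_loss(
    softmax_weights, softmax_bias, batched_labels, h, neg, len(item_mapping))

# The slide leaves `loss` and `mean_loss` undefined; the NCE loss is assumed here.
mean_loss = tf.reduce_mean(approx_softmax_loss)
train_op = tf.train.AdamOptimizer(learning_rate).minimize(
    mean_loss, global_step=tf.train.get_global_step())
return tf.estimator.EstimatorSpec(mode=mode, loss=mean_loss, train_op=train_op)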
Data
● Dataset: 1 year of (search query, restaurant_id) pairs
● Spark preprocessing: normalization on an EMR cluster
● 10 min/epoch on 1 GPU (AWS p2)
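To make the data preparation concrete, here is a minimal single-machine sketch of turning raw (search query, restaurant_id) pairs into the integer indices the model consumes. The names `query_mapping` and `item_mapping` follow the model code; the normalization rules, the helper functions, and the sample pairs are assumptions for illustration, and the talk describes doing this step in Spark rather than plain Python.

```python
import re
from collections import Counter


def normalize_query(raw):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", raw.lower())
    return " ".join(cleaned.split())


def build_mappings(pairs, min_count=1):
    """Assign a dense integer index to each normalized query and restaurant."""
    counts = Counter(normalize_query(q) for q, _ in pairs)
    queries = sorted(q for q, c in counts.items() if c >= min_count)
    items = sorted({r for _, r in pairs})
    query_mapping = {q: i for i, q in enumerate(queries)}
    item_mapping = {r: i for i, r in enumerate(items)}
    return query_mapping, item_mapping


def encode_pairs(pairs, query_mapping, item_mapping):
    """Convert raw pairs to (query_index, item_index), dropping rare queries."""
    encoded = []
    for q, r in pairs:
        key = normalize_query(q)
        if key in query_mapping:
            encoded.append((query_mapping[key], item_mapping[r]))
    return encoded


# Made-up sample pairs; the restaurant ids are placeholders.
raw_pairs = [
    ("Dan Dan Noodles", "rest_17"),
    ("dan-dan noodles!", "rest_42"),
    ("Pierogi", "rest_42"),
]
query_mapping, item_mapping = build_mappings(raw_pairs)
training_pairs = encode_pairs(raw_pairs, query_mapping, item_mapping)
```

Normalizing before indexing means spelling variants of the same query share one embedding row, which keeps the query vocabulary smaller.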
Rest2Vec
Creates a numerical vector representation of each restaurant from historical clickstream
data (users’ clicks and conversions)
● Helps to understand Restaurants
● Helps to power Discovery
● Helps to power Personalization
From Word2Vec to Rest2Vec (Data)
Distributional Hypothesis
A word is characterized by the company it keeps - Firth (1957)
From Word2Vec to Rest2Vec (Algorithm)
Word2Vec Rest2Vec
Training Data
● Number of Intentful Sessions ~ 60M
● Interactions per Session 4 to 8
● Number of Restaurants ~140K
● Sample Session
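One way to read the Word2Vec analogy is that a session plays the role of a sentence and each restaurant in it plays the role of a word. The sketch below generates skip-gram style (center, context) training pairs from a single session under that assumption; the `skipgram_pairs` helper, the window size, and the restaurant ids are hypothetical.

```python
def skipgram_pairs(session, window=2):
    """Pair each restaurant with the others within `window` positions of it."""
    pairs = []
    for i, center in enumerate(session):
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, session[j]))
    return pairs


# A made-up session of four restaurant interactions, in click order.
session = ["rest_a", "rest_b", "rest_c", "rest_d"]
pairs = skipgram_pairs(session, window=1)
```

Restaurants that diners tend to browse together end up as each other's context, so a model trained on these pairs places them close together in the embedding space.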
TensorBoard Visualization
● Each market has its own cluster
● Cluster size indicates how big the market is
Integration with Service
Learn more about Fast KNN lookup using Annoy here
FastMenu
Creates a numerical vector representation of each menu item from its associated text
rather than from diner behavior
● Helps to understand menus
● Helps to power semantic search
● Complete catalogue coverage
Menu Text Matching
Menu Item Description:
● mai fun
● blueberry pancake
String Matched Menu Items:
● mai fun, chow fun, shrimp mai fun
● blueberry smoothie, buttermilk pancake
Semantic Matched Menu Items
● stir fried noodles, thin rice noodles
● grand slam breakfast
● Increased recall
Static Sequence Embeddings
● fastText = sub-word embeddings
● Handles out-of-vocabulary words
● Example: “pizza” is split into <START>p, pi, iz, zz, za, a<END>
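The sub-word split can be sketched as a character n-gram function. The `char_ngrams` helper is written for this example; it marks word boundaries with "<" and ">" as fastText does, and its defaults of 3 to 6 characters mirror fastText's default n-gram lengths, while the call below uses length 2 to reproduce the bigram example.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return all character n-grams of the word wrapped in boundary markers."""
    marked = "<" + word + ">"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


bigrams = char_ngrams("pizza", 2, 2)
# bigrams == ["<p", "pi", "iz", "zz", "za", "a>"]
```

Because a word's vector is built from its n-gram vectors, a word never seen in training can still get a representation from the n-grams it shares with known words.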
Menu Item Feature
How do you characterize a unique menu item with text?
Text Source | Example | Use
Restaurant Name | San Gennaro’s | No - no semantic info
Name | margarita pizza | Yes
Description | Adorned simply in the colors of the Italian flag: green from basil, white from mozzarella, red from tomato sauce. | Yes
Menu Section | House Favorites | No - too noisy
Restaurant Cuisine | Pizza, Subs, Italian, American, Lunch Specials | Yes
Reviews | “3/5” | No
BUT: This content has no location awareness
TensorBoard Visualization
● Each market has its own cluster
● Cluster size indicates how big the market is
Geohashes:
● Cover the surface of the earth
● Each denotes a rectangular area
● Alphanumeric string
● ~32 bit lat-long specification
● Nested precision levels: dr contains dr5 and drh; dr5 contains dr5x and dr5z
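For readers unfamiliar with the encoding, here is a minimal sketch of the standard geohash algorithm: longitude and latitude ranges are halved alternately, and every five resulting bits become one base-32 character. The `geohash_encode` function name and the sample coordinates are chosen for this example and are not taken from the talk.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=5):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits, value = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            value = value << 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # five bits make one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


cell = geohash_encode(57.64911, 10.40744, precision=5)
# cell == "u4pru"
```

Each extra character subdivides the previous cell, so a shorter geohash is always a prefix of the longer geohash for the same point; that prefix property is the nesting of precision levels.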
Geohash Embedding
Geohashes: built from the same characters as language
● Location words: geohash (dr5ru)
● Sentence = geohashes within 40 mi:
“dr725 dr72h dr72j dr5rg dr5ru dr5rv dr5re dr5rs dr5rt”
● Concatenate the geohash sentence to the menu text:
margarita pizza adorned simply colors italian flag green from
basil white from mozzarella red from tomato sauce pizza subs
italian american lunch specials dr725 dr72h dr72j dr5rg dr5ru
dr5rv dr5re dr5rs dr5rt
● Expands the “word” vocabulary, but the alphabet is still 26 letters and 10 digits
● Menu item text now knows about location
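The concatenation step can be sketched as below. The `menu_item_document` helper, the tokenization rule, and the three-word stopword list are assumptions chosen so that the output reproduces the example text; the actual preprocessing is not described in the talk.

```python
import re

# Hypothetical stopword list, picked only to match the example text.
STOPWORDS = {"in", "the", "of"}


def tokens(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS]


def menu_item_document(name, description, cuisines, nearby_geohashes):
    """Build one training 'sentence': menu text followed by geohash words."""
    words = tokens(name) + tokens(description) + tokens(" ".join(cuisines))
    return " ".join(words + list(nearby_geohashes))


doc = menu_item_document(
    name="margarita pizza",
    description=("Adorned simply in the colors of the Italian flag: green from "
                 "basil, white from mozzarella, red from tomato sauce."),
    cuisines=["Pizza", "Subs", "Italian", "American", "Lunch Specials"],
    nearby_geohashes=["dr725", "dr72h", "dr72j", "dr5rg", "dr5ru",
                      "dr5rv", "dr5re", "dr5rs", "dr5rt"],
)
```

Since geohashes for nearby cells share prefixes, a sub-word model sees overlapping character n-grams for neighbouring locations, which is one plausible reason the location signal carries over into the item vectors.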
Data
● Menu items: ~10M
● Geohash radius: 40 mi (geohash precision 4)
● Embedding dimension: 30
● Vocabulary: 3k words account for 97% of all words used to describe menu items
Visualization: TensorBoard
● t-SNE: local variation (cuisine separable)
● PCA: global variation (geography separable)
[Figure: embedding projection with labeled clusters: Phoenix (mexican, asian, indian), Nashville (asian), Topeka (small market)]
Nearest Neighbors
Now we can answer the important questions
● AMAZING!!! blueberry pancakes
● Are pierogis really empanadas?!?
● 10 Asian noodles near you
● Try this French restaurant instead
Alex Egg: @eggie5
Emily Ray: eray1@grubhub.com
Parin Choganwala: pchoganwala@grubhub.com
FOR MORE INFO CHECK OUT:
https://bit.ly/32fmBwJ

Discover Your Latent Food Graph with this 1 Weird Trick -- PyData NYC 2019
