Machine Learning with GraphLab Create

Dato Confidential
Neel Kishan – Technical Sales Lead
neel@dato.com

Hello my name is
Neel Kishan
Technical Sales Lead
(former neuroscientist, GPU programmer,
Eagle Scout, Chicago sports fan)
neel@dato.com
Let’s Schedule a Time to Talk:
https://calendly.com/dato-neel

We empower developers to create intelligent applications with real-time machine learning services, quickly and easily.
[Diagram: Intelligent Applications; Dato Platform (GraphLab Create, Dato Predictive Services); Machine Learning Lifecycle]

Teams have found ways to build intelligent applications…
• Recommenders
• Lead Scoring
• Churn Prediction
• Multi-channel Targeting
• Auto-Summarization
• Fraud Detection
• Intrusion Detection
• Demand Forecasting
• Data Matching
• Failure Prediction

Why do these projects take so long?
• Lengthy code rewrites for scalable production services
• Mundane tasks to integrate libraries, transform data to
specific formats, fill in missing values, etc.
• Many tools are just slow

Challenges for developing intelligent apps
• Algorithm-centric APIs create confusion and a steep
learning curve
• Understanding models has been a craft passed only
through tribal knowledge
• Production services are hard to maintain and manage

Dato Machine Learning: built to rapidly deliver intelligent applications
• Intuitive APIs: easy to learn, with smart defaults so your first application comes together fast
• Deploy instantly as REST: eliminates the lengthy rewrites needed to integrate and serve live, at scale
• Integrated libraries for any data: deep learning, graphs, text, and images on a common scalable data structure eliminate the glue code and context switching

What makes Dato special?

The Dato Machine Learning Platform
[Diagram labels: Develop, Train, Deploy, Serve (REST API), Monitor; Models, Experiments, Feedback; GraphLab Create & Dato Distributed; Dato Predictive Services]
On your infrastructure:
GraphLab Create & Dato Distributed
• Creating models
• Data engineering
• Evaluation & visualization
Predictive Services
• Serving models
• Live experimentation
• Model management

Scalable Data Structures for Machine Learning
[Figure: example tables with column labels User, Com., Title, Body, Disc.]
• SFrame: on-disk, columnar, partitioned table
• SGraph: graph structure composed of multiple tables
• TimeSeries: table with a time index
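To make "columnar" concrete, here is a minimal sketch in plain Python. It is not the SFrame implementation or its API: `ColumnTable` and its methods are hypothetical names, and the data lives in memory rather than on disk. It only shows the idea that each column is stored as its own sequence, so a column-wise operation touches one column at a time.

```python
from typing import Any, Callable, Dict, List


class ColumnTable:
    """Toy columnar table: one Python list per column (hypothetical, not SFrame)."""

    def __init__(self, columns: Dict[str, List[Any]]) -> None:
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self.columns = {name: list(values) for name, values in columns.items()}

    def num_rows(self) -> int:
        # Every column has the same length, so any column gives the row count.
        return len(next(iter(self.columns.values()), []))

    def apply(self, column: str, fn: Callable[[Any], Any]) -> List[Any]:
        # A column-wise operation reads a single column and ignores the rest.
        return [fn(value) for value in self.columns[column]]

    def filter(self, column: str, predicate: Callable[[Any], bool]) -> "ColumnTable":
        # Decide which rows to keep from one column, then slice every column.
        keep = [i for i, value in enumerate(self.columns[column]) if predicate(value)]
        return ColumnTable({name: [values[i] for i in keep]
                            for name, values in self.columns.items()})


# Made-up example data, loosely following the column labels on the slide.
table = ColumnTable({
    "user": ["ann", "bob", "cy"],
    "title": ["Hello", "Graphs", "Time"],
    "rating": [5, 3, 4],
})
high = table.filter("rating", lambda r: r >= 4)
print(high.columns["user"])        # ['ann', 'cy']
print(table.apply("title", len))   # [5, 6, 4]
```

A real on-disk, partitioned table would add chunked storage and lazy evaluation on top of this layout; the sketch leaves both out.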

High performance machine learning
Faster algorithms accelerate teams
[Figure: benchmark panels for recommenders, deep learning & images, and graph analytics. One panel plots test error (0.60% to 0.85%) against time (0 to 12 hr), with a comparison labeled "H2O.ai: 10 machines/80 cores". Annotation: "Fails to complete on other systems!"]

Intuitive API – Easily create a live machine learning service
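import graphlab as gl

# Load the ratings data into an SFrame
data = gl.SFrame.read_csv('my_data.csv')

# Create a recommender; the toolkit selects a model automatically
model = gl.recommender.create(
    data,
    user_id='user',
    item_id='movie',
    target='rating')

# Get the top 5 recommendations
recommendations = model.recommend(k=5)

# Deploy the model as a live service
cluster = gl.deploy.load('s3://path')
cluster.add('servicename', model)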
Create a Recommender
5 lines of code
Toolkit w/auto selection
Deploy in minutes
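To show what the `user_id`, `item_id`, and `k` arguments above mean in practice, here is a minimal sketch in plain Python that runs without GraphLab Create. It is not the algorithm the toolkit selects: it is a simple item co-occurrence ranking, it ignores the rating value (the `target` column), and the function names and sample data are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(ratings):
    """Count how often each pair of items was rated by the same user."""
    items_by_user = defaultdict(set)
    for user, item, _rating in ratings:  # the rating value is not used here
        items_by_user[user].add(item)
    counts = defaultdict(int)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return items_by_user, counts


def recommend(ratings, user, k=5):
    """Return up to k unseen items, ranked by co-occurrence with the user's items."""
    items_by_user, counts = build_cooccurrence(ratings)
    seen = items_by_user.get(user, set())
    scores = defaultdict(int)
    for (a, b), n in counts.items():
        if a in seen and b not in seen:
            scores[b] += n
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:k]


# Rows of (user, movie, rating), mirroring the columns named on the slide.
ratings = [
    ("ann", "Alien", 5), ("ann", "Brazil", 4),
    ("bob", "Alien", 4), ("bob", "Brazil", 5), ("bob", "Casablanca", 3),
    ("cy", "Brazil", 2), ("cy", "Dune", 5),
]
print(recommend(ratings, "ann", k=2))  # ['Casablanca', 'Dune']
```

"Casablanca" ranks first for ann because it co-occurs with both of her rated movies, while "Dune" co-occurs with only one.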

Dato Machine Learning Toolkits
Applications
• recommender
• sentiment_analysis
• churn_predictor
• data_matching
• pattern_mining
• anomaly_detection
Fundamentals
• regression
• classifier
• nearest_neighbors
• clustering
• deeplearning
• text_analytics
• graph_analytics
Utilities
• model_parameter_search
• cross_validation
• evaluation
• comparison
• feature_engineering
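As an illustration of what utilities such as cross_validation and evaluation typically do, here is a minimal k-fold sketch in plain Python. It does not use or reproduce the GraphLab Create API; the function names, the toy majority-label model, and the data are assumptions made for this example, and the folds are taken in order without shuffling.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_indices, test_indices) for k roughly equal folds."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, test
        start += size


def cross_validate(xs, ys, fit, predict, k=5):
    """Average accuracy of a model over k held-out folds."""
    accuracies = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        correct = sum(predict(model, xs[i]) == ys[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# Toy model (hypothetical): always predict the most common training label.
def fit_majority(xs, ys):
    return max(sorted(set(ys)), key=ys.count)


def predict_majority(model, x):
    return model


xs = list(range(10))
ys = ["a"] * 7 + ["b"] * 3
print(cross_validate(xs, ys, fit_majority, predict_majority, k=5))
```

Each row is held out exactly once, so the averaged score reflects performance on data the model did not see during fitting.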
Join us April 7th for a webinar on Deep Learning: Image Similarity and Beyond

Demo of GraphLab Create (GLC) & Predictive Services (PS)

Deployment scenarios

Neel Kishan – Technical Sales Lead
neel@dato.com

Appendix and Supporting Material

Dato is becoming the backbone of intelligent applications for 80+ customers
• Commercialization of Carnegie Mellon ML Project founded by Professor
Carlos Guestrin in 2013
• Vibrant user community numbering 40,000+ from Coursera and open
source projects
• Major customers in retail, finance, media, and software

Appendix: Deployment Scenarios & Pricing

Machine Learning Deployment Options
• Dato Predictive Services
• Batch write of predictions
• Embedded process or script
• Export (e.g. PMML)
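Of the options above, "batch write of predictions" is the simplest to sketch: score every input row offline and write the results out for another system to consume. The example below uses only the Python standard library. The `score` function is a hypothetical stand-in for a trained model, and the column names and threshold are made up for illustration.

```python
import csv
import io


def score(row):
    """Hypothetical stand-in for a trained model: flags large amounts."""
    return 1 if float(row["amount"]) > 100.0 else 0


def batch_write_predictions(reader, writer, model=score):
    """Score every input row and write id + prediction; returns rows written."""
    out = csv.writer(writer)
    out.writerow(["id", "prediction"])
    written = 0
    for row in csv.DictReader(reader):
        out.writerow([row["id"], model(row)])
        written += 1
    return written


# In-memory files keep the example self-contained; real jobs would use open().
source = io.StringIO("id,amount\n1,20.5\n2,250.0\n3,99.9\n")
sink = io.StringIO()
count = batch_write_predictions(source, sink)
print(count)
print(sink.getvalue())
```

A batch job like this suits predictions that can be hours old; a live REST service fits requests that need an answer at query time.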

Pricing
• Subscription license, which includes support and upgrades
• Licensed by user for Create and by machine for production use
• Training & technical services also available

Use Cases

Our customers are leading the creation of intelligent applications

Quantifying the value: Fastest to Production & Reduced Operational Cost
• Built a 90% accurate sentiment analyzer for hotel reviews after 30 minutes of trying Dato’s GraphLab Create
• Created an efficient pipeline (40 mins in Dato vs. 33 days in R) with a 46% lift in accuracy
• “[Dato’s] GraphLab Create™ gives us easy access to some of the most advanced machine learning and this lets us iterate on our ideas faster”
• Simplified the process of developing and deploying internal services for Salesforce PDS and adjacent teams; reduced the hundreds of tools to manage, the complexity of the solution, and development time
• Achieved in 2 days with Dato’s GraphLab Create what took 2 weeks in R; dropped concept to deployment from months to minutes
• Replaced a heuristic-heavy job ranking system to improve job search relevance; developed in weeks, with a significant increase in clickthrough after years of no growth

Customer Success Story: Fraud Detection and Security
“Merchant intelligence for safer, more profitable commerce.”
Others like Alan & G2 Web Services:
WHO: Alan Krumholz, Principal Data Scientist
INSPIRATION: Score merchants based on their web presence and actions to help their banking customers identify fraudulent merchants.
VALUE: Accelerate business decisions, reducing the manual intervention required and minimizing false positives.
OUTCOME: Achieved in 2 days with GraphLab Create what took two weeks in R. Dropped deployment from months to minutes.

Data Matching
“Fast, free, thorough home search.”
Others like Nick & Zillow:
WHO: Nicholas McClure, Senior Data Scientist
INSPIRATION: Build a service that matches property listings across many inbound data feeds and collapses them into a single most accurate listing.
VALUE: Data & listing quality is critical to Zillow’s core product.
OUTCOME: Created an efficient pipeline (40 mins in GLC vs. 33 days in R) with much higher accuracy (95%, up from 65%).

Recommenders
They are the site for “Advice and support on pregnancy and parenting.”
Others like Shelley & BabyCenter:
WHO: Shelley Klopp, DBA & Chief Architect
INSPIRATION: Build and deploy their first recommender to increase session engagement by recommending relevant content.
VALUE: Initial model increased average session by multiple page views.
OUTCOME: First prototype built in < 1 week. Ongoing model experimentation is increasing engagement.

Sentiment and Text Analysis
“Get hired. Love your job.”
Others like Marcos & Glassdoor:
WHO: Marcos Sainz, Lead Machine Learning Engineer
INSPIRATION: Replace a heuristic-heavy job ranking system with an ML-driven system to improve job search relevance.
VALUE: More relevant jobs led to happier users and higher clickthrough.
OUTCOME: Concept to production in weeks.

Image Analytics and Deep Features
“Smart waste management.”
Others like Ben & Compology:
WHO: Ben Chehebar, Co-founder/Lead of Product
INSPIRATION: Use machine learning to predict how full dumpsters are.
VALUE: This lets them augment their human classification, done through Mechanical Turk, and scale their operations.
OUTCOME: Concept to deployed service in less than a month, with accuracy as good as or better than the humans.
