Machine Learning with H2O.ai
on Google Cloud
Nicholas Png
Partnerships Software Engineer
nicholas@h2o.ai
Who is H2O.ai?
Company
● Founded in Silicon Valley in 2012
● Funded: $75m
● Investors: Wells Fargo, NVIDIA, Nexus Ventures, Paxion Ventures
Products
● H2O Open Source Machine Learning (14,000 organizations)
● H2O Driverless AI - Automated Machine Learning
Leadership
● Leader in the Gartner MQ for machine learning and data science platforms
Team
● 90 AI experts (5 of the world’s top 100 data scientists, with Kaggle Grandmasters)
Global
● Mountain View, London, Prague, India
Technology leader with most completeness of vision
Recognized for the mindshare, partner network and status as a quasi-industry standard for machine learning and AI
H2O.ai customers gave the highest overall score among all the vendors for sales relationship and account management, customer support (onboarding, troubleshooting, etc.) and overall service and support
Get the Gartner Magic Quadrant here
H2O.ai is a Leader in the 2018 Gartner Data Science and Machine Learning Platforms Magic Quadrant
100% open source – Apache V2 licensed
● In-Memory, Distributed Machine Learning Algorithms with H2O Flow GUI
● H2O AI Open Source Engine Integration with Spark
● Lightning Fast machine learning on GPUs
● Built for data scientists – interface using R, Python on H2O Flow (interactive notebook interface)
● We offer Enterprise Support subscriptions
Commercial Licensed (closed source)
● Built for domain users, analysts & data scientists – GUI based interface for end-to-end data science
● Fully automated machine learning from ingest to deployment
● Automatic feature engineering, machine learning and interpretability
● We offer user licenses on a per seat basis (annual subscription)
H2O.ai Product Suite
H2O-3
What is H2O?
Math Platform
Open source in-memory AI engine
● Parallelized and distributed algorithms
● GLM, Random Forest, GBM, Deep Learning, etc.
Tech and API
Easy to use and adopt
● Written in Java - perfect for Java programmers
● Install is lightweight
● REST API (Java) - run H2O from R, Python, WebUI
Big data
More data or better models? BOTH
● Use all of your data - model without sampling
● More data + better models = better predictions
Clustering
• K-Means (Auto-K)
Dimension reduction
• Principal Component Analysis
• Generalized Low Rank Models
Word embedding
• Word2Vec
Time series
• iSAX
Machine Learning tuning
• Hyperparameter Search
• Early Stopping
Algorithms on H2O
Statistical analysis
• Linear Models (GLM)
• Naïve Bayes
Ensembles
• Random Forest
• Distributed Trees
• Gradient Boosting Machine
• Stacking / Super Learner
Deep Neural Networks
• NLP
• Autoencoder
• Anomaly Detection
• Deep Features
• CNN, RNN (Deep Water)
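The "Hyperparameter Search" and "Early Stopping" items on this slide can be illustrated with a minimal, library-free sketch. It does not use H2O's API: the toy objective, the function names `train_with_early_stopping` and `random_search`, and the `patience` parameter are all assumptions made for illustration. The idea shown is a random search over one hyperparameter, where each trial stops as soon as its loss has stopped improving.

```python
import random


def train_with_early_stopping(learning_rate, max_rounds=200, patience=5):
    """Toy iterative training: minimise (w - 3)^2 by gradient descent.

    Stops once the loss has failed to improve for `patience` consecutive
    rounds. Returns (best_loss, rounds_run).
    """
    w = 0.0
    best_loss = float("inf")
    rounds_without_gain = 0
    round_no = 0
    for round_no in range(1, max_rounds + 1):
        grad = 2.0 * (w - 3.0)
        w -= learning_rate * grad
        loss = (w - 3.0) ** 2
        if loss < best_loss - 1e-9:
            # Meaningful improvement: remember it and reset the counter.
            best_loss = loss
            rounds_without_gain = 0
        else:
            rounds_without_gain += 1
            if rounds_without_gain >= patience:
                break  # early stopping
    return best_loss, round_no


def random_search(n_trials=20, seed=0):
    """Random search over the learning rate, keeping the best trial."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        # Sample log-uniformly between 0.001 and 1 (hypothetical range).
        lr = 10 ** rng.uniform(-3, 0)
        loss, rounds = train_with_early_stopping(lr)
        if best is None or loss < best["loss"]:
            best = {"learning_rate": lr, "loss": loss, "rounds": rounds}
    return best


if __name__ == "__main__":
    print(random_search())
```

Early stopping keeps the cost of each trial low, which is what makes searching over many hyperparameter settings affordable.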
Data integration + Data quality and transformation → Modeling table (Features, Target) → Model building → Model
Simplified typical machine learning pipeline
Production
Environments
Job
Fluid Vector Frame
MRTask
Distributed K/V Store
Distributed Fork/Join
Non-Blocking Hash Table
Distributed In-Memory Processing
REST / JSON
Parse
Exploratory Analysis
Feature Engineering
ML Algorithms
Model Evaluation
Scoring
Data/Model Export
SQL
NFS
Local
GCS
HDFS
POJO
High level architecture
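The MRTask and Distributed Fork/Join boxes in this architecture suggest a map/reduce pattern over chunks of a data frame. The sketch below only illustrates that pattern and is not H2O's Java MRTask API. Threads on one machine stand in for cluster nodes, and the names `split_into_chunks`, `map_chunk`, `reduce_partials` and `distributed_mean_variance` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(values, n_chunks):
    """Split a column into roughly equal chunks (stand-ins for node-local data)."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def map_chunk(chunk):
    """Map step: compute partial (count, sum, sum of squares) for one chunk."""
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))


def reduce_partials(a, b):
    """Reduce step: combine two partial results element-wise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def distributed_mean_variance(values, n_chunks=4):
    """Mean and population variance of a non-empty column via map/reduce."""
    chunks = split_into_chunks(values, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(map_chunk, chunks))
    n, total, total_sq = reduce(reduce_partials, partials)
    mean = total / n
    variance = total_sq / n - mean * mean
    return mean, variance


if __name__ == "__main__":
    print(distributed_mean_variance([1, 2, 3, 4, 5, 6, 7, 8]))
```

Because the reduce step is associative, partial results can be combined in any order, which is the property that lets this kind of computation spread across nodes without moving the raw data.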
Driverless AI
Driverless AI delivers “Expert data scientist in a box”
Created and supported by world-renowned AI experts
Empowers companies to accomplish AI and ML with a single platform
Performs the function of an expert data scientist and adds more power to both novice and expert teams
Details and highlights insights and interpretability with easy-to-understand results and visualizations
21-day free trial for Driverless AI
Driverless AI
Data integration + Data quality and transformation → Modeling table (Features, Target) → Model building → Model
Typical enterprise machine learning workflow
Data is a team sport
Lack of AI talent: ~100 data science experts in the world
Time to insights slow: weeks to hours, time for a data scientist to build a model
Lack of trust in AI: black box models
“US alone faces a shortage of 190,000 people with analytical expertise.”
Driverless AI delivers
● Your digital data scientist
● Automatic Feature Engineering with GPU accelerated machine learning
● Explainable and Interpretable AI
Why Driverless AI for Enterprise AI adoption
Automatic feature engineering to increase accuracy - AlphaGo for AI
Automatic Kaggle Grandmaster recipes in a box for solving a wide variety of use cases
Automatic machine learning to find and tune the right ensemble of models
Accuracy
Original features
Generated features
Automatic Text Handling
Frequency Encoding
Cross Validation Target Encoding
Truncated Singular Value Decomposition
Clustering and more
Feature transformations
Auto feature generation
Kaggle Grandmaster Out of the Box
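Two of the transformations named on this slide, frequency encoding and cross-validation target encoding, can be sketched in plain Python. This is an illustrative sketch, not Driverless AI's implementation: the function names, the round-robin fold assignment and the fallback to the global mean are assumptions.

```python
from collections import Counter, defaultdict


def frequency_encode(categories):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(categories)
    n = len(categories)
    return [counts[c] / n for c in categories]


def cv_target_encode(categories, targets, n_folds=3):
    """Replace each category with the mean target computed on the other folds.

    A row's own target never contributes to its encoding, which limits
    target leakage. Rows are assigned to folds round-robin (an assumption).
    Categories unseen in the other folds fall back to the global mean.
    """
    n = len(categories)
    global_mean = sum(targets) / n
    encoded = [0.0] * n
    for fold in range(n_folds):
        sums = defaultdict(float)
        counts = defaultdict(int)
        # Accumulate target statistics from every row outside this fold.
        for i in range(n):
            if i % n_folds != fold:
                sums[categories[i]] += targets[i]
                counts[categories[i]] += 1
        # Encode the rows inside this fold with those out-of-fold statistics.
        for i in range(n):
            if i % n_folds == fold:
                c = categories[i]
                encoded[i] = sums[c] / counts[c] if counts[c] else global_mean
    return encoded


if __name__ == "__main__":
    cats = ["a", "b", "a", "a", "b", "c"]
    ys = [1, 0, 1, 0, 1, 1]
    print(frequency_encode(cats))
    print(cv_target_encode(cats, ys))
```

Both functions turn a categorical column into a numeric one that tree or linear models can consume directly. The fold-wise computation is what distinguishes cross-validation target encoding from naive target encoding.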
Deployment Options
YARN
CPU CPU
Model Building
SQL NFS
GCS
Kubernetes / Kubeflow
H2O.ai Driverless AI
H2O Distributed In-Memory
H2O.ai + Kubeflow
CPU
H2O Flow
H2O Cluster
(H2O can run anywhere: desktop, cloud, on-prem; Hadoop and Spark environments supported)
Model training
Model Repository
POJO (java file)
MOJO (zip file)
C++ MOJO Library
Java MOJO Library
Java R Py .NET ...
Apps
Language bindings
Model management (Store models in H2O Steam, git, HDFS, S3, etc.)
Model deployment (Add any language with C/C++ binding support)
Save Model
Load Model
H2O deployment options
Storage (BigQuery, NFS, Local, Cloud Storage, HDFS)
● Initial data stored on HDFS or Google BigQuery
Data Munging (Google Dataproc)
● Ingest data into Spark running on Google Dataproc.
● Use Sparkling Water for preliminary modeling and data munging.
● Current data pipeline can be added here
Driverless AI (Compute Engine)
● Save munged data to structured data file
● Ingest data file into Driverless AI
● Automatic feature engineering
● Automatic visualizations
● Complete model pipeline exported as MOJO
● Generate high performance model, ensemble XGBoost + TF + RuleFit
MOJO (.zip)
Inference (Compute Engine)
● Deploy MOJO file to serve real-time inference (millisecond response times)
● Additional logic can be placed before or after calling the MOJO
High Level Deployment Pipeline - Spark
Storage (BigQuery, NFS, Local, Cloud Storage, HDFS)
● Initial data stored on HDFS or Google BigQuery
Data Munging (Google BigQuery)
● Perform data cleaning and data munging in Google BigQuery.
● Driverless AI has an integrated connector with GBQ for direct data ingest via SQL queries
Driverless AI (Compute Engine)
● Automatic feature engineering
● Automatic visualizations
● Complete model pipeline exported as MOJO
● Generate high performance models, ensemble XGBoost + TF + RuleFit
MOJO (.zip)
Inference (Compute Engine)
● Deploy MOJO file to serve real-time inference (millisecond response times)
● Additional logic can be placed before or after calling the MOJO
High Level Deployment Pipeline - BigQuery
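The inference stage notes that additional logic can be placed before or after calling the MOJO. A minimal sketch of that wrapping pattern follows. It makes no claim about the real MOJO runtime API: `dummy_scorer` is a hypothetical stand-in for a loaded MOJO, and `make_scoring_service`, the `amount` field and the threshold are illustrative assumptions.

```python
def make_scoring_service(scorer, required_fields, threshold=0.5):
    """Wrap a scorer with pre- and post-processing logic.

    `scorer` is any callable taking a dict of features and returning a
    probability. In a real deployment it would delegate to the loaded MOJO.
    """

    def score_request(row):
        # Pre-processing: reject rows that lack a required field.
        missing = [f for f in required_fields if row.get(f) is None]
        if missing:
            return {"status": "rejected", "missing": missing}

        # Pass only the fields the model expects.
        features = {f: row[f] for f in required_fields}
        probability = scorer(features)

        # Post-processing: turn the raw score into a business decision.
        decision = "flag" if probability >= threshold else "pass"
        return {"status": "ok", "probability": probability, "decision": decision}

    return score_request


def dummy_scorer(features):
    """Hypothetical stand-in for a MOJO: score grows with the amount."""
    return min(1.0, features["amount"] / 1000.0)


if __name__ == "__main__":
    service = make_scoring_service(dummy_scorer, ["amount"])
    print(service({"amount": 800}))
    print(service({"amount": 100}))
    print(service({}))
```

Keeping validation and decision rules outside the model artifact means they can change without retraining or re-exporting the model.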
Demo Time!
nicholas@h2o.ai

Mais conteúdo relacionado

Mais procurados

Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Databricks
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneSri Ambati
 
Training of Python scikit-learn models on Azure
Training of Python scikit-learn models on AzureTraining of Python scikit-learn models on Azure
Training of Python scikit-learn models on AzureMark Tabladillo
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)Jasjeet Thind
 
CI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel KobranCI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel KobranDatabricks
 
Distributed Deep Learning At Scale On Apache Spark With BigDL
Distributed Deep Learning At Scale On Apache Spark With BigDLDistributed Deep Learning At Scale On Apache Spark With BigDL
Distributed Deep Learning At Scale On Apache Spark With BigDLYulia Tell
 
Bigdata Machine Learning Platform
Bigdata Machine Learning PlatformBigdata Machine Learning Platform
Bigdata Machine Learning PlatformMk Kim
 
H2O PySparkling Water
H2O PySparkling WaterH2O PySparkling Water
H2O PySparkling WaterSri Ambati
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Sri Ambati
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OSri Ambati
 
Accelerate Your AI Today
Accelerate Your AI TodayAccelerate Your AI Today
Accelerate Your AI TodayDESMOND YUEN
 
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
 Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep... Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...Databricks
 
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...Databricks
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Databricks
 
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...Rodney Joyce
 
Infrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload DeploymentInfrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload DeploymentDatabricks
 
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...Databricks
 
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019VMware Tanzu
 
H2O at Berlin R Meetup
H2O at Berlin R MeetupH2O at Berlin R Meetup
H2O at Berlin R MeetupJo-fai Chow
 
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...Sri Ambati
 

Mais procurados (20)

Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
Using BigDL on Apache Spark to Improve the MLS Real Estate Search Experience ...
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
 
Training of Python scikit-learn models on Azure
Training of Python scikit-learn models on AzureTraining of Python scikit-learn models on Azure
Training of Python scikit-learn models on Azure
 
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)DevOps and Machine Learning (Geekwire Cloud Tech Summit)
DevOps and Machine Learning (Geekwire Cloud Tech Summit)
 
CI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel KobranCI/CD for Machine Learning with Daniel Kobran
CI/CD for Machine Learning with Daniel Kobran
 
Distributed Deep Learning At Scale On Apache Spark With BigDL
Distributed Deep Learning At Scale On Apache Spark With BigDLDistributed Deep Learning At Scale On Apache Spark With BigDL
Distributed Deep Learning At Scale On Apache Spark With BigDL
 
Bigdata Machine Learning Platform
Bigdata Machine Learning PlatformBigdata Machine Learning Platform
Bigdata Machine Learning Platform
 
H2O PySparkling Water
H2O PySparkling WaterH2O PySparkling Water
H2O PySparkling Water
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
 
Accelerate Your AI Today
Accelerate Your AI TodayAccelerate Your AI Today
Accelerate Your AI Today
 
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
 Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep... Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
Cloud Computing Was Built for Web Developers—What Does v2 Look Like for Deep...
 
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
Portable Scalable Data Visualization Techniques for Apache Spark and Python N...
 
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
Building Deep Reinforcement Learning Applications on Apache Spark with Analyt...
 
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
 
Infrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload DeploymentInfrastructure Agnostic Machine Learning Workload Deployment
Infrastructure Agnostic Machine Learning Workload Deployment
 
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonom...
 
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
Operationalizing AI at scale using MADlib Flow - Greenplum Summit 2019
 
H2O at Berlin R Meetup
H2O at Berlin R MeetupH2O at Berlin R Meetup
H2O at Berlin R Meetup
 
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
Erin LeDell, H2O.ai - Scalable Automatic Machine Learning - H2O World San Fra...
 

Semelhante a Machine Learning on Google Cloud with H2O

Latest Developments in H2O
Latest Developments in H2OLatest Developments in H2O
Latest Developments in H2OSri Ambati
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AIJames Serra
 
Belgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterBelgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterSri Ambati
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Ian Gomez
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OSri Ambati
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventTrivadis
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?James Serra
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 editionDavid Talby
 
Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureMark Tabladillo
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAlberto Diaz Martin
 
Bring Your Own Recipes Hands-On Session
Bring Your Own Recipes Hands-On Session Bring Your Own Recipes Hands-On Session
Bring Your Own Recipes Hands-On Session Sri Ambati
 
Deep Learning Technical Pitch Deck
Deep Learning Technical Pitch DeckDeep Learning Technical Pitch Deck
Deep Learning Technical Pitch DeckNicholas Vossburg
 
Microsoft AI Platform Overview
Microsoft AI Platform OverviewMicrosoft AI Platform Overview
Microsoft AI Platform OverviewDavid Chou
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsDataWorks Summit/Hadoop Summit
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Sri Ambati
 
Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)Lviv Startup Club
 
sudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJAsudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJANicolas Poggi
 
Accelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWSAccelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWSSri Ambati
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated MLMark Tabladillo
 

Semelhante a Machine Learning on Google Cloud with H2O (20)

Latest Developments in H2O
Latest Developments in H2OLatest Developments in H2O
Latest Developments in H2O
 
Machine Learning and AI
Machine Learning and AIMachine Learning and AI
Machine Learning and AI
 
Belgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep WaterBelgrade R - Intro to H2O and Deep Water
Belgrade R - Intro to H2O and Deep Water
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
 
Scalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2OScalable and Automatic Machine Learning with H2O
Scalable and Automatic Machine Learning with H2O
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 edition
 
Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the Experts
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft Azure
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
 
Bring Your Own Recipes Hands-On Session
Bring Your Own Recipes Hands-On Session Bring Your Own Recipes Hands-On Session
Bring Your Own Recipes Hands-On Session
 
Deep Learning Technical Pitch Deck
Deep Learning Technical Pitch DeckDeep Learning Technical Pitch Deck
Deep Learning Technical Pitch Deck
 
Microsoft AI Platform Overview
Microsoft AI Platform OverviewMicrosoft AI Platform Overview
Microsoft AI Platform Overview
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
 
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
Ruben Diaz, Vision Banco + Rafael Coss, H2O ai + Luis Armenta, IBM - AI journ...
 
Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)Lviv Data Science Club (Sergiy Lunyakin)
Lviv Data Science Club (Sergiy Lunyakin)
 
sudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJAsudoers: Benchmarking Hadoop with ALOJA
sudoers: Benchmarking Hadoop with ALOJA
 
Accelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWSAccelerate ML Deployment with H2O Driverless AI on AWS
Accelerate ML Deployment with H2O Driverless AI on AWS
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
 

Mais de Sri Ambati

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxSri Ambati
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek Sri Ambati
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thSri Ambati
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionSri Ambati
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Sri Ambati
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMsSri Ambati
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the WaySri Ambati
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OSri Ambati
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Sri Ambati
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersSri Ambati
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Sri Ambati
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Sri Ambati
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...Sri Ambati
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability Sri Ambati
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email AgainSri Ambati
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Sri Ambati
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...Sri Ambati
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneySri Ambati
 

Mais de Sri Ambati (20)

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptx
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek
 
Machine Learning on Google Cloud with H2O

  • 1. Machine Learning with H2O.ai on Google Cloud Nicholas Png Partnerships Software Engineer nicholas@h2o.ai
  • 2. Who is H2O.ai? Company ● Founded in Silicon Valley in 2012 ● Funded: $75m ● Investors: Wells Fargo, NVIDIA, Nexus Ventures, Paxion Ventures Products ● H2O Open Source Machine Learning (14,000 organizations) ● H2O Driverless AI - Automated Machine Learning Leadership Leader in Gartner MQ machine learning and data science platform Team 90 AI experts (5 of the world’s top 100 data scientists with Kaggle Grandmasters) Global Mountain View London Prague India
  • 3. Technology leader with most completeness of vision Recognized for the mindshare, partner network and status as a quasi-industry standard for machine learning and AI H2O.ai customers gave the highest overall score among all the vendors for sales relationship and account management, customer support (onboarding, troubleshooting, etc.) and overall service and support Get the Gartner Magic Quadrant here H2O.ai is a Leader in the 2018 Gartner Data Science and Machine Learning Platforms Magic Quadrant
  • 4. In-Memory, Distributed Machine Learning Algorithms with H2O Flow GUI H2O AI Open Source Engine Integration with Spark Lightning Fast machine learning on GPUs 100% open source – Apache V2 licensed Built for data scientists – interface using R, Python on H2O Flow (interactive notebook interface) We offer Enterprise Support subscriptions Commercial Licensed (closed source) Built for domain users, analysts & data scientists – GUI based interface for end-to-end data science Fully automated machine learning from ingest to deployment We offer user licenses on a per seat basis (annual subscription) Automatic feature engineering, machine learning and interpretability H2O.ai Product Suite
  • 6. What is H2O? Math Platform Open source in-memory AI engine ● Parallelized and distributed algorithms ● GLM, Random Forest, GBM, Deep Learning, etc. Tech and API Easy to use and adopt ● Written in Java - perfect for Java programmers ● Install is lightweight ● REST API (Java) - run H2O from R, Python, WebUI Big data More data or better models? BOTH ● Use all of your data - model without sampling ● More data + better models = better predictions
  • 7. Clustering • K-Means (Auto-K) Dimension reduction • Principal Component Analysis • Generalized Low Rank Models Word embedding • Word2Vec Time series • iSAX Machine Learning tuning • Hyperparameter Search • Early Stopping Algorithms on H2O Statistical analysis • Linear Models (GLM) • Naïve Bayes Ensembles • Random Forest • Distributed Trees • Gradient Boosting Machine • Stacking / Super Learner Deep Neural Networks • NLP • Autoencoder • Anomaly Detection • Deep Features • CNN, RNN (Deep Water)
  • 8. + Data integration Data quality and transformation Modeling table Model building Model Features Target Simplified typical machine learning pipeline
  • 9. Production Environments Job Fluid Vector Frame MRTask Distributed K/V Store Distributed Fork/Join Non-Blocking Hash Table Distributed In-Memory Processing REST / JSON Parse Exploratory Analysis Feature Engineering ML Algorithms Model Evaluation Scoring Data/Model Export SQL NFS Local GCS HDFS POJO High level architecture
  • 11. Driverless AI delivers “Expert data scientist in a box” Created and supported by world renowned AI experts Empowers companies to accomplish AI and ML with a single platform Performs the function of an expert data scientist and adds more power to both novice and expert teams Details and highlights insights and interpretability with easy to understand results and visualizations 21 day free trial for Driverless AI
  • 12. Driverless AI + Data integration Data quality and transformation Modeling table Model building Model Features Target Typical enterprise machine learning workflow
  • 13. Data is a team sport ~100 Data science experts in the world Weeks to hours Time for a data scientist to build a model Black box models Lack of AI talent Time to insights slow Lack of trust in AI “US alone faces a shortage of 190,000 people with analytical expertise.” Driverless AI delivers Your digital data scientist Automatic Feature Engineering with GPU accelerated machine learning Explainable and Interpretable AI Why Driverless AI for Enterprise AI adoption
  • 14. Automatic feature engineering to increase accuracy - AlphaGo for AI Automatic Kaggle Grandmaster recipes in a box for solving wide variety of use-cases Automatic machine learning to find and tune the right ensemble of models Accuracy
  • 15. Original features Generated features Automatic Text Handling Frequency Encoding Cross Validation Target Encoding Truncated Singular Value Decomposition Clustering and more Feature transformations Auto feature generation Kaggle Grandmaster Out of the Box
  • 17. YARN CPU CPU Model Building SQL NFS GCS Kubernetes / Kubeflow H2O.ai Driverless AI H2O Distributed In-Memory H2O.ai + Kubeflow CPU
  • 18. H2O Flow H2O Cluster (H2O can run anywhere: desktop, cloud, on-prem; Hadoop and Spark environments supported) Model training Model Repository POJO (java file) MOJO (zip file) C++ MOJO Library Java MOJO Library Java R Py .NET ... ... ... Apps Language bindings Model management Model deployment (Store models in H2O Steam, git, HDFS, S3, etc.) (Add any language with C/C++ binding support) Save Model Load Model Load Model H2O deployment options
  • 19. BigQuery NFS Local Cloud Storage HDFS Storage Data Munging Driverless AI Compute Engine MOJO (.zip) Compute Engine Inference ● Initial data stored on HDFS or Google BigQuery ● Deploy MOJO file to serve real-time inference (millisecond response times) ● Additional logic can be placed before or after calling the MOJO High Level Deployment Pipeline - Spark Google Dataproc ● Save munged data to a structured data file ● Ingest data file into Driverless AI ● Automatic feature engineering ● Automatic visualizations ● Complete model pipeline exported as MOJO ● Generate high performance model, ensemble XGBoost + TF + RuleFit ● Ingest data into Spark running on Google Dataproc. ● Use Sparkling Water for preliminary modeling and data munging. ● Current data pipeline can be added here
  • 20. BigQuery NFS Local Cloud Storage HDFS Storage Google BigQuery Data Munging Driverless AI Compute Engine MOJO (.zip) Compute Engine Inference ● Initial data stored on HDFS or Google BigQuery ● Perform data cleaning and data munging in Google BigQuery. ● Driverless AI has an integrated connector with GBQ for direct data ingest via SQL queries ● Automatic feature engineering ● Automatic visualizations ● Complete model pipeline exported as MOJO ● Generate high performance models, ensemble XGBoost + TF + RuleFit ● Deploy MOJO file to serve real-time inference (millisecond response times) ● Additional logic can be placed before or after calling the MOJO High Level Deployment Pipeline - BigQuery
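
Slide 15 lists frequency encoding and cross-validated target encoding among the automatically generated features. The sketch below shows roughly what those two transformations compute, using only the Python standard library. It is not Driverless AI's implementation: the function names `frequency_encode` and `target_encode_oof` are hypothetical, and assigning folds by row index modulo `n_folds` is a simplifying assumption made for illustration.

```python
from collections import Counter, defaultdict


def frequency_encode(values):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    n = len(values)
    return [counts[v] / n for v in values]


def target_encode_oof(values, targets, n_folds=2):
    """Out-of-fold mean target encoding (illustrative only).

    Each row is encoded with the mean target of its category, computed
    from rows in the other folds only, so a row never sees its own label.
    Categories absent from the other folds fall back to the mean target
    of those folds. Folds are assigned by row index modulo n_folds, which
    is an assumption of this sketch.
    """
    n = len(values)
    encoded = [0.0] * n
    for fold in range(n_folds):
        sums = defaultdict(float)
        counts = defaultdict(int)
        total, total_n = 0.0, 0
        # Accumulate statistics from every row outside the held-out fold.
        for i in range(n):
            if i % n_folds != fold:
                sums[values[i]] += targets[i]
                counts[values[i]] += 1
                total += targets[i]
                total_n += 1
        prior = total / total_n if total_n else 0.0
        # Encode the held-out rows with those statistics.
        for i in range(n):
            if i % n_folds == fold:
                c = counts.get(values[i], 0)
                encoded[i] = sums[values[i]] / c if c else prior
    return encoded


if __name__ == "__main__":
    cities = ["a", "b", "a", "a"]
    labels = [1, 0, 0, 1]
    print(frequency_encode(cities))
    print(target_encode_oof(cities, labels, n_folds=2))
```

Computing the category means out of fold is what the "Cross Validation Target Encoding" item on the slide refers to: encoding a row with a mean that includes its own label would leak the target into the feature.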