Machine Learning automation. Advanced WhizzML workflows: feature selection, boosting, gradient descent, and stacking.
VSSML18: 4th edition of the Valencian Summer School in Machine Learning.
3. BigML, Inc #VSSML18 3
ML Automation is Awesome!
http://www.clparker.org/ml_benchmark/
Talk Overview
• Introduction to BigML
• Recent Ideas in Automation
• Automation vs. Abstraction
• Some Nice Abstractions
• BigML’s Abstractions
Goals of BigML
• Accessible to non-programmers
• Limit the distance from “I have data” to a model in production
• It sounds like more automation is what we need!
What Is ML Automation?
[Diagram: Data, Preprocessing, ML Algorithm, Model, Parameterization]
Solution: Just Re-Blacken The Box!
NIPS Workshops
• People want their learning automated . . .
• Towards an AI for Data Science
• Bayesian Parameter Optimization
• But they also want to interact with it?
• Deep Learning Visualizations
• The Future of Interactive Machine Learning
• Interpretable Machine Learning for Complex Systems
BayesOpt
• Bayesian Parameter Optimization: Tuning the parameters for anything
• One experiment at a time; BPO gives you the “next” experiment by modeling previous experiments and preferring novelty
• Classic case is neural networks
• Some important caveats learned at Twitter
• Engineers wanted importance metrics
• Difficult to elicit objective functions; often they were only determinable after interaction with consumers
• System has the tendency to be myopic or “cheat”
https://bayesopt.github.io/
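The "one experiment at a time" loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the objective function, the grid of candidates, the bandwidth, and the `kappa` weight are all made up for the example, and the surrogate is a simple kernel-weighted average with a distance-based novelty bonus, not the Gaussian process that real Bayesian optimization libraries typically use.

```python
import math

def objective(x):
    # Hypothetical experiment: the score peaks at x = 0.3.
    return -(x - 0.3) ** 2

def surrogate_mean(c, history, bandwidth=0.1):
    # Kernel-weighted average of past results near candidate c
    # (a crude stand-in for a Gaussian process posterior mean).
    weights = [math.exp(-((c - x) ** 2) / (2 * bandwidth ** 2)) for x, _ in history]
    total = sum(weights)
    if total == 0.0:
        return sum(y for _, y in history) / len(history)
    return sum(w * y for w, (_, y) in zip(weights, history)) / total

def novelty(c, history):
    # Distance to the nearest experiment already run.
    return min(abs(c - x) for x, _ in history)

def suggest_next(candidates, tried, history, kappa=1.0):
    # Pick the untried candidate with the best "predicted score + novelty".
    best_i, best_score = None, -math.inf
    for i, c in enumerate(candidates):
        if i in tried:
            continue
        score = surrogate_mean(c, history) + kappa * novelty(c, history)
        if score > best_score:
            best_i, best_score = i, score
    return best_i

def optimize(n_iter=15):
    candidates = [i / 100 for i in range(101)]
    tried = {0, 100}  # start with the two endpoints
    history = [(candidates[i], objective(candidates[i])) for i in sorted(tried)]
    for _ in range(n_iter):
        i = suggest_next(candidates, tried, history)
        tried.add(i)
        history.append((candidates[i], objective(candidates[i])))
    return history

history = optimize()
best_x, best_y = max(history, key=lambda pair: pair[1])
```

Note that the loop only ever sees `objective`. If that function is a poor proxy for what people actually want, the loop will optimize the proxy anyway, which is the "myopic or cheat" caveat in miniature.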
Interactive Machine Learning
• Program to Tutor Children In Math
• Has a series of tutorial modules it can show to children
• Optimizes number and order using RL
• But:
• Optimality determined using comprehensive test scores
• What happens when no module can improve students’ scores
in a given subject?
http://www.filmnips.com/
Fraud Prediction
• Model to detect fraudulent loan applications
• “Non-fraud” instances are passed to human underwriters for verification
• Human underwriters provide the training data for the model
• Which applications will be marked as “passing”?
• Legitimate ones, of course
• But also . . .
Themes
• People love the fact that computers can automate the optimization process
• People don’t want to do it (as there is often drudgery involved)
• People are worse at it
• People are suspicious (and should be) of the results of that optimization
• Humans are bad at specifying objective functions
• This isn’t just a nuisance, it’s a security issue
• Humans don’t mind being (and should be) in the loop on their own terms
Abstractions
• A “translational layer” used to simplify interaction with complex processes
• Establishes a mode of interaction
• Hides the remaining detail
• Programming languages are a canonical example
• The syntax is the mode of interaction
• The layer below is machine code
• Abstractions make technology usable
Evaluating Abstractions
• Abstractions can be strong or weak:
• Is the interaction free of much of the detail of the layer below?
• Is the interaction natural and intuitive, perhaps in a way that the layer below is not?
• Abstractions can be lossy or lossless
• Is the user able to get a satisfactory result with the mode of interaction that the abstraction provides?
• When they can’t, how graceful is the failure (example: wavelet-based image compression)?
• It’s easy to do just one or the other!
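The "graceful failure" of wavelet-based compression can be seen in a small sketch. The assumptions here are mine, not the slides': a 1-D signal stands in for an image, the Haar wavelet is chosen because it is the simplest one, and the sample values are made up. The point is only that reconstruction error grows step by step as coefficients are dropped, rather than the result breaking all at once.

```python
import math

def haar_forward(signal):
    # Orthonormal Haar transform; len(signal) must be a power of two.
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        diff = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = avg + diff
        n = half
    return coeffs

def haar_inverse(coeffs):
    # Undo haar_forward, rebuilding from the coarsest level outward.
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / math.sqrt(2))
            merged.append((a - d) / math.sqrt(2))
        out[:2 * n] = merged
        n *= 2
    return out

def compress(signal, keep):
    # Keep only the `keep` largest-magnitude coefficients, zero the rest.
    coeffs = haar_forward(signal)
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(ranked[:keep])
    truncated = [c if i in kept else 0.0 for i, c in enumerate(coeffs)]
    return haar_inverse(truncated)

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

signal = [2.0, 4.0, 6.0, 8.0, 9.0, 7.0, 3.0, 1.0]  # made-up sample data
# Error when keeping 8, 7, ..., 1 coefficients.
errors = [rms_error(signal, compress(signal, keep)) for keep in range(8, 0, -1)]
```

With all eight coefficients kept the reconstruction matches the input up to rounding, and each dropped coefficient can only add error, so `errors` never decreases along the list.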
Some Abstractions
• Some good abstractions (rich interaction mostly free of technical detail)
• Driving controls
• WYSIWYG editors
• What about Bayesian Parameter Optimization?
• Hides: Hyper-parameter optimization / Model selection
• Interaction?
• As an abstraction, BPO is lossy and weak (but I love it anyway)
Talk Overview
• Introduction to BigML
• Recent Ideas in Automation
• Automation vs. Abstraction
• Some Nice Abstractions
• You Can Help!
People Will Meet You Halfway
Case 1: Programming Language Primitives
(and the Sapir-Whorf Hypothesis)
Case 2: Google
(actually not very good)
Case 3: Decision Trees
(actually better than you think)
Models Are Abstractions!
• A learned model is an abstraction
• You want to answer a question using the data
• The model hides data details and provides modes of interaction
• How good of an abstraction is it?
• Hiding details: Pretty excellent
• Interaction? (a “predict” method?)
Going Somewhere?
“We’ve developed technology that
automates travel between two places!”
Here you go!
Go Broader with Abstraction
• Underlying Architecture
• Data storage / manipulation
• Speed / scaling
• “Technology Debt”
• How much software do you need before you even start (data manipulation, scaling, visualization)?
• How many different software systems will you have to integrate?
• Even worse: Having to hire technical people
• These things are “easier”, and also probably less lossy (very few people will miss them)
Technology Debt
• There are lots of programs out there for data preprocessing / feature engineering
• Download, pay for, learn, set up software
• Often has difficulties with data ingestion
• We want to be able to do all of this stuff server-side without constraining which operations we can do
• Solution: A DSL (Flatline)
What Can Flatline Do?
• A whole lot
• Binary, arithmetic, statistical functions
• Regular expressions
• Date/Time parsing
• Discretization / Binarization
• Temporary variable binding
• Window transformations
• It compiles down to Java, so adding new features is pretty trivial
https://github.com/bigmlcom/flatline/blob/master/user-manual.md
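To make a few of the operation types listed above concrete, here is a rough sketch in plain Python. It is not Flatline and not BigML's implementation: the helper names, field names, thresholds, and sample rows are all hypothetical. It only illustrates the kinds of row-level transformations (window transformations, discretization, regular expressions, date parsing) that such a DSL lets you express server-side.

```python
import re
from datetime import datetime
from statistics import mean

def window_mean(values, before, after):
    # Mean over positions i-before .. i+after, clipped at the ends.
    out = []
    for i in range(len(values)):
        lo = max(0, i - before)
        hi = min(len(values), i + after + 1)
        out.append(mean(values[lo:hi]))
    return out

def discretize(value, edges, labels):
    # labels must have one more entry than edges.
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]

# Made-up sample rows.
rows = [
    {"date": "2018-09-10", "temp": 20.0, "note": "sensor ok"},
    {"date": "2018-09-11", "temp": 22.0, "note": "sensor ok"},
    {"date": "2018-09-12", "temp": 27.0, "note": "fault: recalibrated"},
    {"date": "2018-09-13", "temp": 31.0, "note": "sensor ok"},
]

temps = [r["temp"] for r in rows]
smoothed = window_mean(temps, before=1, after=0)           # window transformation
bins = [discretize(t, [21.0, 28.0], ["cool", "warm", "hot"]) for t in temps]  # discretization
flagged = [re.search(r"\bfault\b", r["note"]) is not None for r in rows]      # regular expression
months = [datetime.strptime(r["date"], "%Y-%m-%d").month for r in rows]       # date parsing
```

Each derived list lines up with `rows` and could be attached as a new field. In the hosted setting described here, the equivalent expression would run on the server against the stored dataset, so the user would not need to set up any of this locally.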
Is This Automation?
• You’re making your users learn a language (true)!
• What didn’t they have to do?
• Install and/or learn sed, Perl, or Python
• Prepare their data for ingestion into those tools (and debug that process)
• Worry at all about scaling anything
• Take the results and feed them into BigML
• Iterate?
• Perhaps not automation, but a near-lossless and fairly useful abstraction