mabl software engineer Joe Lust presents the mabl cloud architecture at the Cambridge Cloud Exchange: Machine Learning meetup at Google Cambridge. The talk takes an in-depth look at mabl’s machine learning and specifically how mabl uses numerous Google Cloud systems for its intelligent auto-healing tests and visual change detection.
2. About Me: Joseph Lust
Engineer
■ Building the web for two decades
■ Cloud native since 2011
■ Currently building mabl ML cloud on GCP
■ Co-organizer GDG Cloud Boston Meetup
3. Agenda
■ mabl’s problem statement
■ ML on Google Cloud
■ Cloud best practices at mabl
■ Q & A
5. Developing Quality Software is Slow
Image Source: IBM SAGE
■ Software is complex
■ Testing software is slow
■ Humans have low clock speeds
■ Humans are expensive
6. Continuous Delivery breaks QA
Before continuous delivery:
■ Change happens infrequently
■ Weeks or months to write tests
■ Weeks to execute tests
With continuous delivery:
■ Change is constant
■ Hours to write tests
■ Minutes to execute tests
7. The Doughnut Hole: Continuous Testing (CT)
[Chart. Source: Google Trends. Annotations: "Hudson Released", "Facebook CD Article"]
10. Machine Learning Areas @ mabl
■ Anomaly detection
▲ Timing anomalies
▲ Visual change anomalies
■ Auto-healing tests
▲ Alternative element candidate election
▲ Multi-branch evaluation
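One way to picture "alternative element candidate election": when a recorded element can no longer be found, score the elements now on the page by how many of the recorded attributes they still share, then elect the best match. The sketch below is only an illustration under stated assumptions. The attribute dictionaries, the match-fraction score, and the names `similarity` and `elect_candidate` are hypothetical; the slides do not describe mabl's actual features or model.

```python
from typing import Optional


def similarity(recorded: dict, candidate: dict) -> float:
    """Fraction of the recorded attributes that the candidate still matches."""
    if not recorded:
        return 0.0
    matches = sum(1 for key, value in recorded.items() if candidate.get(key) == value)
    return matches / len(recorded)


def elect_candidate(recorded: dict, candidates: list, threshold: float = 0.5) -> Optional[dict]:
    """Return the best-scoring candidate, or None if none reaches the threshold."""
    best, best_score = None, 0.0
    for candidate in candidates:
        score = similarity(recorded, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


# Hypothetical data: the button's id changed between releases.
recorded = {"tag": "button", "id": "submit", "text": "Sign in", "class": "btn primary"}
page = [
    {"tag": "a", "id": "help", "text": "Help", "class": "link"},
    {"tag": "button", "id": "login-submit", "text": "Sign in", "class": "btn primary"},
]
elected = elect_candidate(recorded, page)
```

Here the second element matches three of the four recorded attributes and is elected. With a threshold, a page that contains no plausible match yields `None` rather than a wrong element.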
14. Why anomaly detection isn’t quite that easy
■ Domain variability: one app’s anomaly is another’s normal
■ Temporal variability: today’s anomaly is tomorrow’s normal
■ Detection quality: not all anomalies are bad (or important)
■ Initial quality: what can we detect with minimal app-specific data
15. Approach
■ Modeling the app
▲ domain variability, quality
■ Incremental learning
▲ temporal variability, initial quality
■ User feedback
▲ use case variability, quality
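Incremental learning for timing anomalies can be illustrated with a running baseline that is updated one observation at a time. The sketch below uses Welford's online mean/variance algorithm with a z-score test. This is an assumed stand-in chosen for brevity, not the model used in the talk, and the threshold, minimum sample count, and sample timings are all hypothetical.

```python
import math


class RunningStats:
    """Welford's online algorithm for mean and sample variance."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self) -> float:
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def is_anomaly(stats: RunningStats, x: float,
               z_threshold: float = 3.0, min_samples: int = 5) -> bool:
    """Flag x if it lies more than z_threshold standard deviations from the mean."""
    if stats.n < min_samples:
        return False  # too little history: stay quiet rather than guess
    std = stats.std()
    if std == 0.0:
        return x != stats.mean
    return abs(x - stats.mean) / std > z_threshold


stats = RunningStats()
flags = []
for load_ms in [410, 395, 402, 420, 398, 405, 1900, 400]:  # hypothetical page timings
    flags.append(is_anomaly(stats, load_ms))  # test against history so far
    stats.update(load_ms)                     # then learn from the new point
```

Only the 1900 ms observation is flagged. Note that this sketch folds every observation, including flagged ones, into the baseline, so a persistent change eventually becomes the new normal; whether to do that, and how quickly, is a design choice the user-feedback loop could inform.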
16. Analysis pipeline
Observations
▲ Raw data/events
Measurements
▲ Calculated features
State detections
▲ Aggregate measurements
▲ Apply detection models (abstraction)
▲ Integrate detections (higher abstraction)
Insights
▲ Recognize significant state changes
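The four stages above can be sketched as a chain of small functions. Everything concrete here is an assumption made for illustration: the event shape, the step-duration feature, the fixed `slow_ms` threshold standing in for a detection model, and the function names.

```python
from statistics import median

# Observations: raw events (hypothetical shape).
observations = [
    {"run": 1, "step": "login", "start_ms": 0, "end_ms": 400},
    {"run": 1, "step": "search", "start_ms": 400, "end_ms": 900},
    {"run": 2, "step": "login", "start_ms": 0, "end_ms": 420},
    {"run": 2, "step": "search", "start_ms": 420, "end_ms": 950},
    {"run": 3, "step": "login", "start_ms": 0, "end_ms": 1500},
    {"run": 3, "step": "search", "start_ms": 1500, "end_ms": 3100},
]


def measure(obs):
    """Measurements: one calculated feature (step duration) per observation."""
    return [{"run": o["run"], "duration_ms": o["end_ms"] - o["start_ms"]} for o in obs]


def detect_states(measurements, slow_ms=1000):
    """State detections: aggregate per run, then apply a simple threshold model."""
    by_run = {}
    for m in measurements:
        by_run.setdefault(m["run"], []).append(m["duration_ms"])
    return [(run, "slow" if median(durations) > slow_ms else "normal")
            for run, durations in sorted(by_run.items())]


def insights(states):
    """Insights: report only the transitions between consecutive run states."""
    changes = []
    for (_, previous), (run, current) in zip(states, states[1:]):
        if previous != current:
            changes.append(f"run {run}: {previous} -> {current}")
    return changes


states = detect_states(measure(observations))
report = insights(states)
```

Runs 1 and 2 are classified as normal and run 3 as slow, so the only insight emitted is the transition at run 3. Steady states, whether good or bad, produce no new insight.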
31. Pieces of the puzzle
■ Cloud ML Engine
▲ TensorFlow machine learning models
▲ Managed model training
▲ Online prediction service
■ Dataflow
▲ Streaming data processing
▲ I/O connectors for Pub/Sub, etc.
■ Datastore
▲ Augment trained model with mutable data
▲ Fast query by type and filter (e.g., by key)
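"Augment trained model with mutable data" can be pictured as a fixed prediction combined with a fast keyed lookup that changes between retrainings. The sketch below is an assumption-heavy illustration: `InMemoryStore` is a plain dictionary standing in for a keyed store (it is not the Datastore client API), `model_score` is a placeholder for a trained model, and the `IgnoredRegion` kind and `pixel_diff_ratio` feature are hypothetical.

```python
class InMemoryStore:
    """Dictionary-backed stand-in for a keyed store; not a real client API."""

    def __init__(self):
        self._rows = {}

    def put(self, kind, key, value):
        self._rows[(kind, key)] = value

    def get(self, kind, key, default=None):
        # Lookup by type ("kind") and key, mirroring a query by type and filter.
        return self._rows.get((kind, key), default)


def model_score(features):
    """Placeholder for a trained model's prediction, clamped to [0, 1]."""
    return min(1.0, features["pixel_diff_ratio"] * 2)


def is_reportable(store, app_id, region, features, threshold=0.5):
    """Combine the static model output with mutable, per-app data."""
    if store.get("IgnoredRegion", (app_id, region), False):
        return False  # mutable data overrides the model for this region
    return model_score(features) >= threshold


store = InMemoryStore()
features = {"pixel_diff_ratio": 0.4}
before = is_reportable(store, "app-1", "header-banner", features)
store.put("IgnoredRegion", ("app-1", "header-banner"), True)  # e.g. user feedback
after = is_reportable(store, "app-1", "header-banner", features)
```

The same features are reportable before the store is updated and suppressed afterwards, without retraining the model.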
34. mabl cloud design tenets
■ Serverless
■ Decoupled subsystems
■ Event driven architecture
35. Serverless just works
■ No Provisioning
■ Transparent Scaling
■ Event Driven
■ Pay only for Use
36. e.g. Kubernetes Engine
■ NoOps - developers’ containers in prod
■ Language agnostic
■ Massive scale
▲ running millions of containers a month
■ “Set it and forget it!”
38. Event Driven Architecture
■ No “cron jobs” or batch processes
■ Continuous pipelines
■ Surge buffers
■ End-to-end handling in 100s of ms to 1-2 s
Cloud Dataflow, Cloud Functions
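The "surge buffer" idea can be sketched with a bounded queue between a producer and an event handler: bursts accumulate in the buffer and the consumer drains them continuously, with no scheduled batch job. This is a single-process illustration using Python's standard library, assumed here purely for demonstration. In the architecture described, a managed messaging service and Cloud Dataflow or Cloud Functions would play these roles, and the event shape and buffer size below are hypothetical.

```python
import queue
import threading

events = queue.Queue(maxsize=1000)  # the surge buffer; put() blocks when it is full
handled = []


def handle(event):
    """Event handler: runs once per event as soon as the event is available."""
    handled.append(event["page_id"])


def worker():
    while True:
        event = events.get()  # blocks until an event arrives
        if event is None:     # sentinel value: shut down
            events.task_done()
            break
        handle(event)
        events.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# A burst of events arrives faster than they are handled; the queue absorbs it.
for page_id in range(50):
    events.put({"page_id": page_id})

events.put(None)  # signal shutdown after the burst
events.join()     # wait until every queued item has been processed
consumer.join()
```

The producer never waits for the handler unless the buffer is full, which decouples the two subsystems while keeping per-event latency low.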
39. Google Cloud’s Impact on mabl
■ From zero to alpha product in ~6 months with 8 developers
▲ Processing ~100M pages per month
■ Systems designed for scale on day one
■ No dreaded “rewrite”
■ Product families work very well together out of the box