Drilling Deep with Machine Learning as an Enterprise Enabled Micro Service
www.univa.com
Ian Lumb, Solutions Architect
Disruptive Technology Track
Rice University Oil/Gas HPC Conference
March 15, 2017
2. Hypothesis
Container clusters are disruptive enablers of enterprise-
grade Machine Learning capabilities in oil/gas
applications and workflows when delivered as a fully
converged platform
3. Univa ML Survey: Key Findings
Most organizations have been using Machine Learning for more
than 2 years
Available infrastructure for Machine Learning remains CPU-heavy
There is interest in deploying new infrastructure to support Machine
Learning over the next 6 months
Machine Learning applications will make use of all capabilities –
existing CPUs & GPUs plus new Big Data and containerized deployments
There is definitely interest in private/public/hybrid clouds – though
on-premise deployments are expected to dominate
5. Container Clusters for Machine Learning
Apache Spark is easily containerized as a service or an application
Navops Command delivers sophisticated, enterprise-grade
workload placement and advanced policy management capabilities
for Kubernetes-based container clusters that address mixed
workloads
Microservices-based approaches can be systematically introduced
into existing applications and/or workflows through refactoring
Univa offers unique solutions for fully converged infrastructures
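To make the first bullet concrete, here is a minimal sketch of what "containerizing Spark as an application" can look like on Kubernetes: a small helper that assembles a Pod manifest for a Spark driver container. The image name, labels, and arguments are illustrative assumptions, not Univa or Apache Spark defaults.

```python
# Sketch: build a Kubernetes Pod manifest (as a plain dict) for a
# containerized Spark driver. Image name, labels, and args are
# hypothetical placeholders for illustration only.

def spark_driver_pod(app_name, image="example/spark:2.1", cores=2, mem="4g"):
    """Return a minimal Pod manifest for a containerized Spark app."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"{app_name}-driver",
            "labels": {"app": app_name, "workload": "containerized-spark"},
        },
        "spec": {
            "containers": [{
                "name": "spark-driver",
                "image": image,
                "args": ["driver", "--cores", str(cores), "--memory", mem],
            }],
            "restartPolicy": "Never",
        },
    }

pod = spark_driver_pod("ml-training")
print(pod["metadata"]["name"])  # -> ml-training-driver
```

In practice such a manifest would be serialized to YAML and submitted to the cluster; the point here is only that a Spark workload reduces to an ordinary container spec that Kubernetes (and a policy layer above it) can place like any other.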
Main points:
Always enjoy sharing our most-disruptive ideas at this event
Last year, we introduced and promoted containerizing HPC applications – apps run in Docker containers and are managed by Univa Grid Engine (UGE) like any other kind of workload
Uptake continues to be strong – we have a number of PoCs underway with some moving towards production
Our intention is to focus this year’s DTT contribution on containers again, but in an even more disruptive way
Link to next slide: That being the case, a hypothesis seems like a reasonable place to initiate our disruptiveness
Main points:
Let’s start with a hypothesis for this year’s DTT contribution
We’re actually talking about clusters comprised entirely of containers – and this is quite a departure from the traditional HPC (built around UGE, for example) even for us!
Along with others in this container cluster ecosystem, we’re working expeditiously to deliver enterprise-grade capabilities to our customers and for our partners
There was an indication of interest in Machine Learning at last year’s RiceU event, and even more at this year’s
We thought it appropriate, then, to share use cases relating to ML
As you’ll see in a moment, we have reason to believe that there is no single solution for introducing ML capabilities into the enterprise so that apps and workflows might benefit – and that is why we hypothesize that a fully converged platform is required if ML in container clusters is to make good as a disruptive enabler
A fully converged platform refers to a single infrastructural element that supports a wide variety of use cases
Link to next slide: On the next slide we share some of the data that contributed to the formulation of this hypothesis
Main points:
Last August, we ran a webinar with a Machine Learning emphasis – it was (surprisingly) well attended and caused us to follow up with participants via a survey
Here we’ve summarized the key findings of that survey
ML’s been around for ~30 yrs, so perhaps it isn’t too surprising that many claimed to be using it for > 2 yrs
And although existing deployments were heavily slanted towards use of CPUs, there was interest in enhancing that infrastructure in the near term
Our survey, along with anecdotal evidence from our ML-facing interactions with prospects and existing customers, suggests that no single capability will deliver ML even within a single organization – in other words, some will favor GPUs, while others will favor Big Data offerings like Spark or those based around Hadoop … and some of this will be containerized
And, not too surprisingly, many expressed interest in making use of the cloud in adopting ML capabilities
Given the interest in a broad and deep array of ML capabilities, converged platforms that can handle this high degree of variability are extremely attractive to those needing to provide ML capabilities
Link to next slide: To fix ideas, on the next slide we return to our hypothesis and share a specific use case
Fine Print
Small sample size bias challenges extrapolation to larger markets
North American and EMEA sampling bias challenges extrapolation to other geographies
New Intel Xeon Phi processor requires interpolation to determine its impact
Existing GPU capabilities likely to be repurposed to accommodate Machine Learning requirements
Main points:
On this slide, we’re sharing a converged platform based on Kubernetes – originally from Google and now open source, Kubernetes allows you to deploy container clusters
In this case, we’re emphasizing Machine Learning capabilities
On the left side, ML apps that only require CPUs can be run the ‘traditional way’ – perhaps as they are being now … with or without workload managers like Univa Grid Engine – these are the legacy, non-containerized workloads alluded to at the top of the slide
For the record, we’re not restricted to Machine Learning apps, as HPC apps could also be included
On the other side of the slide we’ve introduced capabilities based on Apache Spark that have been containerized … In one case, we’ve illustrated a completely containerized Spark application; whereas in the other, Spark is provided as a containerized service
Taken together, this is a scenario that supports mixed workloads – legacy, non-containerized alongside containerized … and that’s really making good on the promise of convergence
Importantly, you can run your ML apps as they are or as they will be – and this includes an allowance for progressively refactoring toward microservices-based architectures, or developing new apps with microservices in mind from the outset
Our Navops Command is a key enterprise enabler as it introduces the ability to place workloads according to policies – policies, and variants thereof, that you are likely familiar with from workload managers in an HPC context … please see our poster for a more complete description of the policies Command brings to Kubernetes container clusters
Finally, Command works with vanilla, open-source Kubernetes or with enhanced distributions such as Red Hat OpenShift
Link to next slide: On the next slide, we wrap by sharing our conclusions
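The mixed-workload scenario in the notes above can be sketched as a simple routing decision on a converged platform: legacy, non-containerized jobs go to a traditional workload manager queue, while containerized jobs land on the Kubernetes cluster. The queue names and the `containerized`/`gpu` job flags are assumptions made up for this sketch, not actual Navops or UGE interfaces.

```python
def route_workload(job):
    """Toy placement decision on a converged platform.

    `job` is a dict with a 'containerized' flag and an optional
    'gpu' flag; the target names are illustrative only.
    """
    if job.get("containerized"):
        return "kubernetes-cluster"   # containerized Spark app or service
    if job.get("gpu"):
        return "uge-gpu-queue"        # legacy GPU-accelerated ML
    return "uge-cpu-queue"            # legacy CPU-only ML apps

jobs = [
    {"name": "legacy-ml", "containerized": False},
    {"name": "spark-app", "containerized": True},
    {"name": "dnn-train", "containerized": False, "gpu": True},
]
print([route_workload(j) for j in jobs])
# -> ['uge-cpu-queue', 'kubernetes-cluster', 'uge-gpu-queue']
```

The design point is convergence: one decision layer sees both kinds of workload, rather than two disjoint silos.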
The Unique Capabilities of Navops Command
Workload prioritization
Sophisticated policies include Maximize Resource Utilization / Proportional Shares / Runtime Quotas / Access Restrictions / Interleaving / Priority Ranking
Web-UI driven policy configuration
Workload affiliation based decision making
Pluggable support for any Kubernetes distribution
On-the-fly policy re-configuration
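To fix ideas on one item in the list above, here is a toy proportional-share calculation: given per-team weights, it divides a node budget in proportion, using largest-remainder rounding. The team names and weights are invented for illustration; this is not Navops Command's actual algorithm, only a sketch of what "proportional shares" means as a policy.

```python
def proportional_shares(weights, capacity):
    """Split `capacity` nodes among teams in proportion to `weights`,
    using largest-remainder rounding (toy proportional-share policy)."""
    total = sum(weights.values())
    exact = {team: capacity * w / total for team, w in weights.items()}
    alloc = {team: int(x) for team, x in exact.items()}
    leftover = capacity - sum(alloc.values())
    # hand remaining nodes to the largest fractional remainders
    for team in sorted(exact, key=lambda t: exact[t] - alloc[t],
                       reverse=True)[:leftover]:
        alloc[team] += 1
    return alloc

print(proportional_shares({"ml": 3, "hpc": 2}, 11))
# -> {'ml': 7, 'hpc': 4}  (a 3:2 split of 11 nodes)
```

Runtime quotas, access restrictions, and priority ranking compose on top of a share computation like this one.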
Main points:
Machine Learning is increasingly impacting all of us, and we may seek to leverage existing or deploy enhanced infrastructures to address the demand
Containerization can sensibly play a role in this introduction, and containerizing Spark apps or services serves as a compelling use case
Container clusters are rapidly becoming enterprise grade – and Navops Command delivers workload placement and policy management capabilities that inevitably arise as adoption progresses
In addition to providing a converged platform for mixed workloads (legacy, non-containerized plus containerized), container clusters present the ideal opportunity to systematically introduce microservices-based architectures – from refactoring to net-new implementations
CTA: We believe we have unique offerings to assist you in efficiently and effectively making this journey – into containerized clusters. Please visit our poster to learn more about converged platforms for Machine Learning – platforms that are enterprise ready through the introduction of Navops Command for Kubernetes container clusters – and drop by our table and posters in the exhibits area