SUSE Data Scientists Dev Program GPU Containers ML Frameworks
1. SUSE Developer Program for Data Scientists
Developers Program Architect
Marco Varlese
marco.varlese@suse.com
Sr. Product Manager, Accelerators & Artificial Intelligence
Alessandro Festa
alessandro.festa@suse.com
@bringyourownid
2. In this session…
You will learn…
Why a Dev Program for Data Scientists…
Containers, GPUs, Tricks and…
…how about some juggling?
4. Why a GPU-aware Container
• Technical needs:
• Machine Learning and Deep Learning need high computational power
• It’s not only about GPUs, but that is where the market is right now (see next slide): 90% of the users/customers
• Machine Learning in containers is the way to go: containers are simple to use for a non-technical person,
easy to deploy, and easy to “transport” (from on-prem to cloud and back)
• Challenges:
• NVIDIA drivers are not open source, so they cannot be shipped with Leap/Tumbleweed (no OBS); we (as a
community) need to find a solution to make users’ lives easier
• nvidia-docker (from NVIDIA) and CUDA are required
• Docker images for Machine Learning frameworks are HUGE (over 3 GB)
Wait, wait… What are you talking about? NVIDIA what?
6. Mandatory Requirements
“Make sure you have installed the NVIDIA driver and a supported version of Docker for your distribution”
GNU/Linux x86_64 with kernel version > 3.10
Docker >= 1.12
NVIDIA GPU with Architecture > Fermi (2.1)
NVIDIA drivers ~= 361.93 (untested on older versions)
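As a rough illustration, the four requirements above can be written as a small Python check. This is only a sketch and not part of nvidia-docker: the function names `parse_version` and `check_requirements` are hypothetical, the version strings are supplied by the caller rather than read from the system, only the major.minor part of each version is compared, and "~= 361.93" is interpreted here as "at least 361.93", which is an assumption.

```python
import re


def parse_version(text):
    """Return the major.minor part of a version string as a tuple of ints.

    Trailing suffixes such as "-default" or a third component are ignored.
    A missing minor component is treated as 0.
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    parts.append(0)  # pad so that "4" becomes (4, 0)
    return tuple(parts[:2])


def check_requirements(kernel, docker, compute_capability, driver):
    """Compare caller-supplied versions with the minimums listed on the slide."""
    return {
        "kernel > 3.10": parse_version(kernel) > (3, 10),
        "docker >= 1.12": parse_version(docker) >= (1, 12),
        "GPU architecture > Fermi (2.1)": parse_version(compute_capability) > (2, 1),
        # Assumption: "~= 361.93" is read as "361.93 or newer".
        "driver >= 361.93": parse_version(driver) >= (361, 93),
    }


if __name__ == "__main__":
    # Example values only; substitute the versions reported by your own host.
    report = check_requirements("4.12.14-default", "18.09.1", "6.1", "390.48")
    for requirement, satisfied in report.items():
        print(f"{requirement}: {'ok' if satisfied else 'NOT met'}")
```

Keeping the check free of system calls makes it easy to test; on a real host the inputs would typically come from tools such as `uname -r`, `docker --version` and `nvidia-smi`.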
10. Where to start
NVIDIA on Docker Hub: https://hub.docker.com/r/nvidia/cuda/
CUDA images come in three flavors:
• base: starting from CUDA 9.0, contains the bare minimum (libcudart) to
deploy a pre-built CUDA application. Use this image if you want to
manually select which CUDA packages you want to install.
• runtime: extends the base image by adding all the shared libraries from
the CUDA toolkit. Use this image if you have a pre-built application using
multiple CUDA libraries.
• devel: extends the runtime image by adding the compiler toolchain, the
debugging tools, the headers and the static libraries. Use this image to
compile a CUDA application from sources.
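The choice between the three flavors can be summarized as a short decision function. The sketch below is illustrative only: `choose_flavor` and `cuda_image` are hypothetical helper names, and the `<version>-<flavor>` tag pattern is an assumption, so check the Docker Hub page above for the tags that are actually published.

```python
FLAVORS = ("base", "runtime", "devel")


def choose_flavor(needs_compiler, uses_multiple_cuda_libraries):
    """Pick the smallest flavor that covers the stated needs."""
    if needs_compiler:
        # Building from sources needs the toolchain, headers and static libraries.
        return "devel"
    if uses_multiple_cuda_libraries:
        # A pre-built application that links against several CUDA shared libraries.
        return "runtime"
    # A pre-built application that only needs libcudart, or a hand-picked package set.
    return "base"


def cuda_image(flavor, cuda_version="9.0"):
    """Build an image reference, assuming tags of the form <version>-<flavor>."""
    if flavor not in FLAVORS:
        raise ValueError(f"unknown flavor {flavor!r}, expected one of {FLAVORS}")
    return f"nvidia/cuda:{cuda_version}-{flavor}"


if __name__ == "__main__":
    flavor = choose_flavor(needs_compiler=False, uses_multiple_cuda_libraries=True)
    print(cuda_image(flavor))
```

Since each flavor extends the previous one, starting from the smallest image that works also helps with the image-size problem mentioned earlier.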
11. Challenges (Recap)
• The host requires nvidia-docker v2 to be installed (GitHub pull request
waiting to be merged: https://github.com/NVIDIA/nvidia-docker/pull/790);
we are working on it (thanks to Darren Davis, our TAM, for pushing
NVIDIA!)
• cuDNN and CUDA require license acceptance by the user, so they
cannot easily be delivered as SUSE packages; Partner Hub to
the rescue! In containers they may be installed silently using
an explicit variable (e.g. -e ACCEPT_EULA=Y)
• Some dependencies are missing in SLE but not in
openSUSE when installing CUDA directly from the NVIDIA repo;
as an alternative we may use the CUDA script.
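To make the EULA variable concrete, the sketch below assembles a `docker run` command line that passes `-e ACCEPT_EULA=Y` into a container. It assumes a host where nvidia-docker v2 has registered the `nvidia` runtime; the image name `example/ml-framework:latest` and the helper `gpu_run_command` are hypothetical, and whether the variable is honored depends entirely on how the image's install scripts are written.

```python
import shlex


def gpu_run_command(image, command, accept_eula=True):
    """Return the argv list for running `command` in a GPU-enabled container."""
    # --runtime=nvidia selects the runtime installed by nvidia-docker v2.
    argv = ["docker", "run", "--rm", "--runtime=nvidia"]
    if accept_eula:
        # The explicit variable from the slide; the image must actually read it.
        argv += ["-e", "ACCEPT_EULA=Y"]
    argv.append(image)
    argv += list(command)
    return argv


if __name__ == "__main__":
    # Hypothetical image name, used for illustration only.
    argv = gpu_run_command("example/ml-framework:latest", ["nvidia-smi"])
    # Print a shell-safe form; pass argv to subprocess.run(argv) to execute it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the command as a list rather than a single string avoids quoting problems if it is later handed to `subprocess.run`.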
12. Both packages seem to be optional to me.
Do we need samples? Maybe.
Do we need an X11 driver in a container? I would say it depends…
Both are published as openSUSE packages
15. But the container is not enough…
You’re a Data Scientist, not a SysAdmin/DevOps
17. AI Use Cases (for openSUSE)
Data Scientist
• Run an experiment with different coefficients and summarize the results
• Work “local” first
• Create a “template” and need to re-apply it to a production-ready environment
Machine Learning Engineer
• Write code based on dataset samples
• Work either “local” or “remote” connected
• Need to re-test (QA) code on a different environment
Can Customers Do It Alone?
So why would you need SUSE Global Services when you already have SUSE Support with your subscription?
Excellent question.
To put it simply, your team has its everyday job. They are tasked with “keeping the lights on,” which means they are responsible for all the baseline needs, including:
• Maintenance and security of all your servers
• Maintaining uptime and avoiding business disruption
• Providing quality services to your business and customers
At the same time, your business is asking you to transform to meet the needs of the digital economy.
That is, your team has to grow to become IT generalists who can span both development and operations.
They need to speed up software and application delivery so that they are not merely doing yearly releases but maybe releasing products quarterly, monthly or even faster.
You are also grappling with a skills gap. And it seems as soon as you have someone certified or trained on the technology another company or recruiter poaches that talent.
So can you do it alone?