Jupyter Notebooks provide an interactive computational environment in which you can combine Python code, rich text, mathematics, plots and rich media. They provide a convenient way for data analysts to explore, capture and share their research.
Numerous options exist for working with Jupyter Notebooks, including running a Jupyter Notebook instance locally or using a Jupyter Notebook hosting service.
This talk will provide a quick tour of some of the better-known options available for running Jupyter Notebooks. It will then look at custom options for hosting Jupyter Notebooks yourself using public or private cloud infrastructure.
An in-depth look at how you can run Jupyter Notebooks in OpenShift will be presented. This will cover how you can directly deploy a Jupyter Notebook server image, as well as how you can use Source-to-Image (S2I) to create a custom application for your requirements by combining an existing Jupyter Notebook server image with your own notebooks, additional code and research data.
Specific use cases for Jupyter Notebooks that will be explored include individual use, team use within an organisation, and classroom environments for teaching. Other issues that will be covered include importing notebooks and data into an environment, and storing data using persistent volumes and other forms of centralised storage.
As an example of what is possible when running Jupyter Notebooks in a cloud environment, the talk will show how you can use OpenShift to set up a distributed parallel computing cluster using ‘ipyparallel’ and use it in conjunction with a Jupyter Notebook.
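As a rough illustration of what using ‘ipyparallel’ from a notebook might look like, here is a minimal sketch. It assumes an ipyparallel cluster is already running and reachable through its default connection file; the helper names `square`, `connect_view` and `parallel_squares` are hypothetical, and nothing in the code is specific to OpenShift.

```python
def square(x):
    """Work function that each engine applies to its share of the input."""
    return x * x


def connect_view():
    """Return a view over every engine in an already-running cluster.

    Assumption: the cluster's connection file is in the default location,
    so ipp.Client() can be called without arguments.
    """
    import ipyparallel as ipp  # imported lazily; only needed when connecting

    client = ipp.Client()
    return client[:]  # DirectView spanning all available engines


def parallel_squares(view, values):
    """Spread `values` across the engines and collect the results in order."""
    return list(view.map_sync(square, values))
```

In a notebook cell, calling `parallel_squares(connect_view(), range(8))` would send the work to the engines and block until all results are back. Keeping the view as a parameter means the same function can be tried out with any object that offers a compatible `map_sync` method.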
16. Positives
• Save notebooks/data locally.
• Python virtual environments.
• Select the Python version you want.
• Install required Python packages.
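The virtual environment point above can be sketched with the standard library `venv` module. This is only an illustration: the directory name `notebook-env` and the helper `create_env` are hypothetical, and pip is left out of the environment so the example runs quickly and offline.

```python
import tempfile
import venv
from pathlib import Path


def create_env(target):
    """Create a virtual environment at `target` using the running interpreter.

    with_pip=False is an assumption made to keep the sketch fast; a real
    notebook environment would normally want pip available.
    """
    builder = venv.EnvBuilder(with_pip=False)
    builder.create(target)
    return Path(target)


# Build the environment in a throwaway directory and show its configuration.
env_dir = create_env(Path(tempfile.mkdtemp()) / "notebook-env")
print((env_dir / "pyvenv.cfg").read_text())
```

For actual use you would more likely pass `with_pip=True` and then install Jupyter and your other packages with that environment's own pip.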
17. Negatives
• Operating system differences.
• Python distribution differences.
• Python version differences.
• Package index differences.
• PyPI (pip) vs Anaconda (conda)
• Effort to set up and maintain.
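One way to see the differences listed above is to record what a given local installation actually looks like, so that two machines can be compared. The sketch below uses only the standard library; the function name `describe_environment` is hypothetical, and the conda check is a heuristic (it assumes conda environments contain a `conda-meta` directory under the interpreter prefix), not an official detection method.

```python
import platform
import sys
from pathlib import Path


def describe_environment():
    """Collect the details that most often differ between local installs."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "python_implementation": platform.python_implementation(),
        "python_version": platform.python_version(),
        "executable": sys.executable,
        # Heuristic: conda environments keep metadata in <prefix>/conda-meta.
        "conda_environment": (Path(sys.prefix) / "conda-meta").is_dir(),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this at the top of a shared notebook gives collaborators a quick way to spot a mismatch before they start debugging package problems.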
21. Positives
• Pre-created images.
• Bundled operating system packages.
• Known Python distribution/vendor.
• Bundled Python packages.
• Docker images are read-only.
• Don’t need to maintain the image.
22. Negatives (1)
• More effort to customise experience.
• Build a custom Docker image to extend.
• Install extra packages each time you run it.
• Images can be very large.
• Multiple Python versions.
• Packages that you do not need.
23. Negatives (2)
• Access to and saving your notebooks/data.
• Need to mount persistent storage volumes.
• Ensuring access is done securely.
40. Positives
• Use existing features of OpenShift
• No special storage backends required.
• No custom provisioning applications.
• Cluster can still be used for other applications.
• Simply set quotas and users do what they want.
42. Positives
• Easily build custom images.
• Pre-populated with required Python packages.
• Pre-populated with required Jupyter Notebooks.
• Pre-populated with required data files.
• Build directly to an application, or to create images.
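To make the "pre-populated" points above more concrete, here is a sketch of assembling a source directory that a build could take as input. It assumes the builder follows the common Python convention of installing packages listed in `requirements.txt`; the layout, the `data` directory name and the helper `make_source_tree` are hypothetical and not taken from any particular builder image.

```python
import json
import tempfile
from pathlib import Path

# Minimal valid nbformat 4 document, used as a stand-in for real notebooks.
EMPTY_NOTEBOOK = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}


def make_source_tree(root, packages, notebooks=()):
    """Lay out packages, notebooks and a data directory under `root`.

    Returns the relative paths that were created, sorted by name.
    """
    root = Path(root)
    (root / "data").mkdir(parents=True, exist_ok=True)
    # Assumption: the builder installs whatever requirements.txt lists.
    (root / "requirements.txt").write_text("\n".join(packages) + "\n")
    for name in notebooks:
        (root / name).write_text(json.dumps(EMPTY_NOTEBOOK, indent=1))
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


source_dir = tempfile.mkdtemp()
for path in make_source_tree(source_dir, ["numpy", "pandas"], ["analysis.ipynb"]):
    print(path)
```

In practice the notebooks and data files would be your own, and the directory would usually live in a Git repository that the build pulls from.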