SlideShare uma empresa Scribd logo
1 de 29
Baixar para ler offline
Computable Content with
Jupyter, Docker, Mesos
Strata+HW Singapore

2016-12-07
Paco Nathan, @pacoid

Director, Learning Group @ O’Reilly Media
1
Project Jupyter
3
Project Jupyter is the evolution of iPython notebooks,
applied to a range of different programming languages
and environments
https://jupyter.org/
https://github.com/ipython/ipython/wiki/IPython-
kernels-for-other-languages
Some history…
4
Download Anaconda:
continuum.io/downloads
Activate the environment needed:
source activate py3k
Launch Juypter:
jupyter notebook
An example notebook (requires installs; see notes):
github.com/ceteri/oriole_jupyterday_atl/blob/master/example.ipynb
Installation and launch using Anaconda
5
text = '''
The titular threat of The Blob has always struck me as the ultimate movie
monster: an insatiably hungry, amoeba-like mass able to penetrate
virtually any safeguard, capable of--as a doomed doctor chillingly
describes it--"assimilating flesh on contact.
Snide comparisons to gelatin be damned, it's a concept with the most
devastating of potential consequences, not unlike the grey goo scenario
proposed by technological theorists fearful of
artificial intelligence run rampant.
'''
from textblob import TextBlob
blob = TextBlob(text)
print(blob.tags)
print(blob.noun_phrases)
Installation and launch using Anaconda
7
At its core, one can think of Jupyter as a suite 

of network protocols:
Jupyter is to the remote semantics of a REPL

as…

HTTP is to the remote semantics of file share
A suite of network protocols
8
An excellent team
9
JupyterHub
github.com/jupyterhub/jupyterhub
Jupyter in Education
groups.google.com/forum/#!forum/jupyter-education
JupyterLab (alpha preview)
github.com/jupyterlab/jupyterlab
Jupyter Kernels
github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages
Projects:
10
documentation
jupyter.readthedocs.io/en/latest/index.html
discussions
groups.google.com/forum/#!forum/jupyter
gitter.im/jupyter/jupyter
events
calendar.google.com/calendar/embed?
src=p51j0ac1iccmj44tae12hq4dk0%40group.calendar.google.com
Resources:
11
speaking of upcoming events, stay tuned for …
JupyterCon
Resources:
Computable Content
13
An observation…
14
Jupyter @ O’Reilly Media
Embracing Jupyter Notebooks at O'Reilly

oreilly.com/ideas/jupyter-at-oreilly
Learn alongside innovators, thought-by-thought, in context

oreilly.com/ideas/oreilly-oriole-learn-alongside-innovators-
thought-by-thought-in-context
Oriole Online Tutorials

safaribooksonline.com/oriole/
How Do You Learn?
oreilly.com/learning/how-do-you-learn
15
For example…
• A unique new medium blends code,
data, text, and video into a narrated
learning experience with computable
content
• Purely browser-based UX; zero
installation required
• Substantially higher engagement
metrics
• Opens the door for live coding 

in assessments
• GitHub lists over 300K public 

Jupyter notebooks
Regex Golf by Peter Norvig

oreilly.com/learning/regex-golf-
with-peter-norvig
16
Motivations
O’Reilly needed a way for authors to use Jupyter notebooks to create
professional publications. We also wanted to integrate video narration
into the UX. The result is a unique new medium called Oriole:
• Jupyter notebooks are used in the middleware
• each viewer gets a 100% HTML experience 

(no download/install needed)
• context as a “unit of thought”
• the code and video are sync’ed together
• each web session has a Docker container running in the cloud
17
Motivations
Innovators in programming, data science, dev ops, design, etc., tend to
be really busy people. Tutorials are now much quicker to publish than
“traditional” books and videos. The audience gets direct, hands-on,
contextualized experience across a wide variety of programming
environments.
18
A notebook, a container, and ~20 minutes of
informal video walk into a bar...
19
Literate Programming, Don Knuth

literateprogramming.com/
Paraphrased:
Instead of telling computers what to do, tell other
people what you want the computers to do
Some history
20
Wolfram Research introduced notebooks in 1988 

for working with Mathematica…
Some history
21
PyCon 2016 Keynote, Lorena Barba
youtu.be/ckW1xuGVpug?t=35m11s (video)
figshare.com/articles/PyCon2016_Keynote/3407779 (slides)
Highly recommended: speech acts (based 

on Winograd and Flores) as theory for what 

we’re doing here
More recently
Notebook Practice
23
• focus on a concise “unit of thought”
• invest the time and editorial effort to create a good intro
• keep your narrative simple and reasonably linear
• “chunk” the text and code into understandable parts
• alternate between text, code, output, further links, etc.
• use markdown for interesting links: background, deep-dive, etc.
• code cells shouldn’t be long (< 10 lines), must show output
• load data+libraries from the container, not the network
• clear all output then “Run All” – or it didn’t happen
• video narratives: there’s text, and there’s subtext...
• pause after each “beat” – smile, breathe, let people follow you
Tips learned by teaching with Jupyter
For the JVM people: stop thinking only about IDEs, Ivy, Maven, etc. (ibid, Knuth1984)

BUILD UBER JARS, LOAD LIBS FROM CONTAINER, NOT THE NETWORK!

(apologies for shouting)
24
Jupyter notebooks + Git repos provide a low-cost,
pragmatic way toward the practice of repeatable
science – in this case, repeatable Data Science
• executable documents
• code + params + results + descriptions
• shareable insights
Notebooks: a cure for silos
25
In data science, we see the benefits to teams for shared
insights, storytelling, etc.
Meanwhile domain expertise is generally more important than
knowledge about tools
There’s a value for developers to use notebooks in lieu of IDEs
in some cases – what are those cases?
GitHub now renders notebooks, so they can be used for
documentation, reporting, etc.
Digital Object Identifiers (DOI) can be assigned through
Zenodo, making notebooks citable for academic publication
“Sharing is caring”
Authoring & Scale-Out
27
Launchbot.io
28
Launchbot allows a notebook author to build a
container that includes the required Jupyter kernel,
installed libraries, datasets, etc.
You need to have Docker installed on your laptop
The backend uses Git and DockerHub to manage
containers
For scale, deploy to DC/OS
Achieving scale
presenter:
Just Enough Math
O’Reilly (2014)
justenoughmath.com
monthly newsletter for updates, 

events, conf summaries, etc.:
liber118.com/pxn/

@pacoid

Mais conteúdo relacionado

Mais de Paco Nathan

Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
Paco Nathan
 
GraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communitiesGraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communities
Paco Nathan
 

Mais de Paco Nathan (20)

Use of standards and related issues in predictive analytics
Use of standards and related issues in predictive analyticsUse of standards and related issues in predictive analytics
Use of standards and related issues in predictive analytics
 
Data Science in 2016: Moving Up
Data Science in 2016: Moving UpData Science in 2016: Moving Up
Data Science in 2016: Moving Up
 
Data Science Reinvents Learning?
Data Science Reinvents Learning?Data Science Reinvents Learning?
Data Science Reinvents Learning?
 
Jupyter for Education: Beyond Gutenberg and Erasmus
Jupyter for Education: Beyond Gutenberg and ErasmusJupyter for Education: Beyond Gutenberg and Erasmus
Jupyter for Education: Beyond Gutenberg and Erasmus
 
GalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About DataGalvanizeU Seattle: Eleven Almost-Truisms About Data
GalvanizeU Seattle: Eleven Almost-Truisms About Data
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
 
GraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communitiesGraphX: Graph analytics for insights about developer communities
GraphX: Graph analytics for insights about developer communities
 
Graph Analytics in Spark
Graph Analytics in SparkGraph Analytics in Spark
Graph Analytics in Spark
 
Apache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataApache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big Data
 
QCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark StreamingQCon São Paulo: Real-Time Analytics with Spark Streaming
QCon São Paulo: Real-Time Analytics with Spark Streaming
 
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and MoreStrata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
 
A New Year in Data Science: ML Unpaused
A New Year in Data Science: ML UnpausedA New Year in Data Science: ML Unpaused
A New Year in Data Science: ML Unpaused
 
Microservices, Containers, and Machine Learning
Microservices, Containers, and Machine LearningMicroservices, Containers, and Machine Learning
Microservices, Containers, and Machine Learning
 
Databricks Meetup @ Los Angeles Apache Spark User Group
Databricks Meetup @ Los Angeles Apache Spark User GroupDatabricks Meetup @ Los Angeles Apache Spark User Group
Databricks Meetup @ Los Angeles Apache Spark User Group
 
How Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapeHow Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscape
 
What's new with Apache Spark?
What's new with Apache Spark?What's new with Apache Spark?
What's new with Apache Spark?
 
How Apache Spark fits in the Big Data landscape
How Apache Spark fits in the Big Data landscapeHow Apache Spark fits in the Big Data landscape
How Apache Spark fits in the Big Data landscape
 
Strata EU 2014: Spark Streaming Case Studies
Strata EU 2014: Spark Streaming Case StudiesStrata EU 2014: Spark Streaming Case Studies
Strata EU 2014: Spark Streaming Case Studies
 
Big Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely headingBig Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely heading
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
 

Último

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
PECB
 

Último (20)

Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 

Computable Content with Jupyter Docker Mesos

  • 1. Computable Content with Jupyter, Docker, Mesos Strata+HW Singapore
 2016-12-07 Paco Nathan, @pacoid
 Director, Learning Group @ O’Reilly Media 1
  • 3. 3 Project Jupyter is the evolution of iPython notebooks, applied to a range of different programming languages and environments https://jupyter.org/ https://github.com/ipython/ipython/wiki/IPython- kernels-for-other-languages Some history…
  • 4. 4 Download Anaconda: continuum.io/downloads Activate the environment needed: source activate py3k Launch Juypter: jupyter notebook An example notebook (requires installs; see notes): github.com/ceteri/oriole_jupyterday_atl/blob/master/example.ipynb Installation and launch using Anaconda
  • 5. 5 text = ''' The titular threat of The Blob has always struck me as the ultimate movie monster: an insatiably hungry, amoeba-like mass able to penetrate virtually any safeguard, capable of--as a doomed doctor chillingly describes it--"assimilating flesh on contact. Snide comparisons to gelatin be damned, it's a concept with the most devastating of potential consequences, not unlike the grey goo scenario proposed by technological theorists fearful of artificial intelligence run rampant. ''' from textblob import TextBlob blob = TextBlob(text) print(blob.tags) print(blob.noun_phrases) Installation and launch using Anaconda
  • 6.
  • 7. 7 At its core, one can think of Jupyter as a suite 
 of network protocols: Jupyter is to the remote semantics of a REPL
 as…
 HTTP is to the remote semantics of file share A suite of network protocols
  • 9. 9 JupyterHub github.com/jupyterhub/jupyterhub Jupyter in Education groups.google.com/forum/#!forum/jupyter-education JupyterLab (alpha preview) github.com/jupyterlab/jupyterlab Jupyter Kernels github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages Projects:
  • 11. 11 speaking of upcoming events, stay tuned for … JupyterCon Resources:
  • 14. 14 Jupyter @ O’Reilly Media Embracing Jupyter Notebooks at O'Reilly
 oreilly.com/ideas/jupyter-at-oreilly Learn alongside innovators, thought-by-thought, in context
 oreilly.com/ideas/oreilly-oriole-learn-alongside-innovators- thought-by-thought-in-context Oriole Online Tutorials
 safaribooksonline.com/oriole/ How Do You Learn? oreilly.com/learning/how-do-you-learn
  • 15. 15 For example… • A unique new medium blends code, data, text, and video into a narrated learning experience with computable content • Purely browser-based UX; zero installation required • Substantially higher engagement metrics • Opens the door for live coding 
 in assessments • GitHub lists over 300K public 
 Jupyter notebooks Regex Golf by Peter Norvig
 oreilly.com/learning/regex-golf- with-peter-norvig
  • 16. 16 Motivations O’Reilly needed a way for authors to use Jupyter notebooks to create professional publications. We also wanted to integrate video narration into the UX. The result is a unique new medium called Oriole: • Jupyter notebooks are used in the middleware • each viewer gets a 100% HTML experience 
 (no download/install needed) • context as a “unit of thought” • the code and video are sync’ed together • each web session has a Docker container running in the cloud
  • 17. 17 Motivations Innovators in programming, data science, dev ops, design, etc., tend to be really busy people. Tutorials are now much quicker to publish than “traditional” books and videos. The audience gets direct, hands-on, contextualized experience across a wide variety of programming environments.
  • 18. 18 A notebook, a container, and ~20 minutes of informal video walk into a bar...
  • 19. 19 Literate Programming, Don Knuth
 literateprogramming.com/ Paraphrased: Instead of telling computers what to do, tell other people what you want the computers to do Some history
  • 20. 20 Wolfram Research introduced notebooks in 1988 
 for working with Mathematica… Some history
  • 21. 21 PyCon 2016 Keynote, Lorena Barba youtu.be/ckW1xuGVpug?t=35m11s (video) figshare.com/articles/PyCon2016_Keynote/3407779 (slides) Highly recommended: speech acts (based 
 on Winograd and Flores) as theory for what 
 we’re doing here More recently
  • 23. 23 • focus on a concise “unit of thought” • invest the time and editorial effort to create a good intro • keep your narrative simple and reasonably linear • “chunk” the text and code into understandable parts • alternate between text, code, output, further links, etc. • use markdown for interesting links: background, deep-dive, etc. • code cells shouldn’t be long (< 10 lines), must show output • load data+libraries from the container, not the network • clear all output then “Run All” – or it didn’t happen • video narratives: there’s text, and there’s subtext... • pause after each “beat” – smile, breathe, let people follow you Tips learned by teaching with Jupyter For the JVM people: stop thinking only about IDEs, Ivy, Maven, etc. (ibid, Knuth1984)
 BUILD UBER JARS, LOAD LIBS FROM CONTAINER, NOT THE NETWORK!
 (apologies for shouting)
  • 24. 24 Jupyter notebooks + Git repos provide a low-cost, pragmatic way toward the practice of repeatable science – in this case, repeatable Data Science • executable documents • code + params + results + descriptions • shareable insights Notebooks: a cure for silos
  • 25. 25 In data science, we see the benefits to teams for shared insights, storytelling, etc. Meanwhile domain expertise is generally more important than knowledge about tools There’s a value for developers to use notebooks in lieu of IDEs in some cases – what are those cases? GitHub now renders notebooks, so they can be used for documentation, reporting, etc. Digital Object Identifiers (DOI) can be assigned through Zenodo, making notebooks citable for academic publication “Sharing is caring”
  • 28. 28 Launchbot allows a notebook author to build a container that includes the required Jupyter kernel, installed libraries, datasets, etc. You need to have Docker installed on your laptop The backend uses Git and DockerHub to manage containers For scale, deploy to DC/OS Achieving scale
  • 29. presenter: Just Enough Math O’Reilly (2014) justenoughmath.com monthly newsletter for updates, 
 events, conf summaries, etc.: liber118.com/pxn/
 @pacoid