What makes data-driven environments more efficient, and how to build a data science toolchain around notebook technologies
Creator of Apache Zeppelin
Co-Founder, CTO
Moon soo Lee
moon@zepl.com
#GDSC 2018
Who am I
A true believer that data science notebooks change how people collaborate
Creator of Apache Zeppelin
Co-founder
https://github.com/Leemoonsoo
It was 2013, and I really wanted an interactive analytics interface for [logo not recovered].
Started an open-source project, Zeppelin (http://zeppelin-project.org/), a data science notebook.
Became an Apache project in 2016 (http://zeppelin.apache.org).
Iterations
REPL interface (2012)
Editor / Result interface (2013)
Notebook interface (2014)
Pilot to Production in 1 day
Hey, take a look
I need an update every morning!
More notebook consumers than producers
At the same time
The open-source project was receiving contributions such as
● Authentication
● Access control
Realized that the notebook is a great collaboration tool
Why notebook?
A notebook is
- Interactive
- Flexible
- Visualized
- Described inline
- Able to contain a story
- Shareable
How to build a collaborative environment with notebook technology
Data sharing
Multi-user environment
Notebook sharing
Notebook Sharing
[Diagram labels: You; Data scientist, Data engineer, Data analyst, Marketing, SW engineer, Sales, Executive]
You’re using only half of a notebook’s potential if you’re not sharing it
VCS: GitHub
Open source: nbviewer, Zeppelin, Airbnb/knowledge-repo
Service: Commercial services for notebook sharing
GitHub
● Store notebooks in GitHub
● Versioning
● GitHub provides an .ipynb viewer
● Fork / pull request / merge
● Private / Public / Team / Org
● Hard to apply notebook-level ACL
● Not easy for non-engineers
nbviewer
● Publishing notebooks
● Share a notebook by sharing a link
● Easy to use
● No access control
nbconvert (rendering .ipynb to static HTML) as a web service
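As a rough illustration of sharing by link, the sketch below builds an nbviewer URL for a notebook stored in a public GitHub repository. The `/github/<owner>/<repo>/blob/<ref>/<path>` layout follows the URL pattern nbviewer uses for GitHub-hosted notebooks, but the host name, default branch, and the example owner, repository, and path are assumptions rather than values from the talk.

```python
from urllib.parse import quote

# Assumed public nbviewer host; a self-hosted instance would use its own.
NBVIEWER_HOST = "https://nbviewer.org"


def nbviewer_url(owner: str, repo: str, path: str, ref: str = "master") -> str:
    """Return a shareable nbviewer link for a notebook in a public GitHub repo."""
    # Escape spaces and other unsafe characters but keep the slashes in the path.
    safe_path = quote(path.lstrip("/"), safe="/")
    return f"{NBVIEWER_HOST}/github/{owner}/{repo}/blob/{ref}/{safe_path}"


if __name__ == "__main__":
    # Hypothetical repository and notebook path, for illustration only.
    print(nbviewer_url("example-org", "analysis", "reports/weekly report.ipynb"))
```

Anyone holding the link can open the rendered notebook, which is the "no access control" trade-off listed on the slide.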
Apache Zeppelin
● Share notebooks with ACL: Read/Write/Execute
● For Jupyter notebooks, you need to convert .ipynb to Zeppelin format on the command line.
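Notebook-level ACL with separate read, write, and execute permissions can be pictured with a small model like the one below. This is a hypothetical sketch, not Zeppelin's implementation or API: the `NotebookAcl` class, its field names, the user names, and the rule that owners hold every permission are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NotebookAcl:
    """Hypothetical per-notebook ACL with read / write / execute permissions."""
    owners: set = field(default_factory=set)
    readers: set = field(default_factory=set)
    writers: set = field(default_factory=set)
    runners: set = field(default_factory=set)

    def allows(self, user: str, action: str) -> bool:
        granted = {
            "read": self.readers,
            "write": self.writers,
            "execute": self.runners,
        }
        if action not in granted:
            raise ValueError(f"unknown action: {action}")
        # Assumption: owners can do everything on their own notebook.
        return user in self.owners or user in granted[action]


if __name__ == "__main__":
    # Hypothetical users: an analyst who edits and runs, a sales lead who only reads.
    acl = NotebookAcl(owners={"moon"}, readers={"sales_lead", "analyst"},
                      writers={"analyst"}, runners={"analyst"})
    print(acl.allows("sales_lead", "read"))     # True
    print(acl.allows("sales_lead", "execute"))  # False
```

Keeping execute separate from read lets notebook consumers view results without being able to trigger new runs against the data.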
Airbnb/knowledge-repo
https://github.com/airbnb/knowledge-repo
● .ipynb and .md files as posts
● Git repo for version control
● Feeds
● Search
● No access control
Commercial services for notebook sharing
Google Colab
● Share notebooks through Google Drive
● View/Edit/Run .ipynb notebooks using Colab
● Real-time collaboration
ZEPL
● Notebook-level ACL
● View/Edit/Run .ipynb and Zeppelin notebooks
● Real-time collaboration
● Import existing notebooks from Git/S3 storage
www.zepl.com
Data Sharing
DON’Ts
● Email attachments
● Direct send
● Sharing through a USB drive
● Local copies on laptops
● ...
DOs
● Provide access to the same dataset
● Access control capability
● Horizontal scalability
Data catalog
● Provides the location of data, what it means, and how to load it
○ e.g. the example table at the end of this slide
● The catalog needs to be accessible / searchable / annotatable
● Many different ways to build one, depending on team / infra
○ Hive Metastore as a data catalog
○ Cloud infrastructure service (e.g. AWS Glue Data Catalog, Azure Data Catalog)
○ Data catalog / publishing software (e.g. CKAN, DKAN)
○ Custom built on top of RDBMS, NoSQL, indexing engine
○ Build a data catalog using a notebook

Dataset  | Location              | Schema                                       | Note
Activity | s3://service/activity | Date (DateTime), type (INT), action (String) | Type is either RUN or STOP. ….
Images   | s3://service/images   | 512x256 pixel images                         | Images are collected from profile photo...
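One lightweight way to make such a catalog searchable and annotatable is to keep the entries as structured records. The sketch below uses the two example rows from this slide; the `CatalogEntry` type, the `search` helper, and the annotation text are hypothetical names introduced for illustration, and the elided parts of the notes are left elided.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    dataset: str
    location: str
    schema: str
    note: str
    annotations: list = field(default_factory=list)  # free-form team comments


CATALOG = [
    CatalogEntry("Activity", "s3://service/activity",
                 "Date (DateTime), type (INT), action (String)",
                 "Type is either RUN or STOP. …."),
    CatalogEntry("Images", "s3://service/images",
                 "512x256 pixel images",
                 "Images are collected from profile photo..."),
]


def search(catalog, keyword):
    """Return entries whose name, schema, note, or annotations mention keyword."""
    needle = keyword.lower()
    return [
        e for e in catalog
        if needle in " ".join([e.dataset, e.schema, e.note, *e.annotations]).lower()
    ]


if __name__ == "__main__":
    # Hypothetical annotation added by a teammate.
    CATALOG[1].annotations.append("Check with the owning team before reuse.")
    for entry in search(CATALOG, "pixel"):
        print(entry.dataset, "->", entry.location)
```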
Build data catalog using Notebook
● Flexible enough to describe data
● Searchable, shareable, annotatable
● Programmatic generation
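Programmatic generation could look something like the following: a notebook paragraph that turns catalog records into a Markdown table. This is a minimal sketch; the `render_catalog` function and the record layout are assumptions, and in a real setup the records would more likely come from a metastore or a storage listing than from a hard-coded list.

```python
def render_catalog(rows, columns=("Dataset", "Location", "Schema", "Note")):
    """Render catalog rows (dicts keyed by column name) as a Markdown table."""
    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for row in rows:
        # Missing fields render as empty cells rather than raising.
        lines.append("| " + " | ".join(str(row.get(c, "")) for c in columns) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example rows taken from the data catalog slide.
    rows = [
        {"Dataset": "Activity", "Location": "s3://service/activity",
         "Schema": "Date (DateTime), type (INT), action (String)"},
        {"Dataset": "Images", "Location": "s3://service/images",
         "Schema": "512x256 pixel images"},
    ]
    print(render_catalog(rows))
```

In a Jupyter notebook, the returned string could be passed to `IPython.display.Markdown` to render it inline; regenerating the table on a schedule keeps the catalog in step with the data.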
Multi-user environment
I like my notebook running on my laptop.
No, you don’t.
Sign in and Run
Install libraries and
Install notebook and
Configure driver, environments and
Request access to data and
Setup access to notebook repo and
….
Run
JupyterHub
● Reverse Proxy
○ /hub → JupyterHub
○ /user/[name] → Jupyter server, one per user, each with a Kernel (Python, R)
● JupyterHub
○ Authenticator (LDAP, OAuth, etc)
○ Spawner (Docker, k8s)
● Notebook Storage (Filesystem, Git, etc)
Zeppelin Server
● Auth / ACL (LDAP, OAuth, etc)
● Interpreter Manager
○ Interpreter (kernel)
○ Interpreter (kernel)
○ Interpreter (kernel)
● Notebook Storage (Filesystem, Git, etc)
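The routing role of the reverse proxy in the JupyterHub layout can be sketched as a simple path lookup: requests under `/hub` go to the hub, and requests under `/user/[name]` go to that user's own Jupyter server. This is a simplified model, not JupyterHub's actual proxy code; the backend addresses and user names are made up for illustration.

```python
HUB_BACKEND = "http://127.0.0.1:8081"  # hypothetical hub address

# Hypothetical table of running single-user servers, as a spawner might register them.
USER_SERVERS = {
    "alice": "http://10.0.0.11:8888",
    "bob": "http://10.0.0.12:8888",
}


def route(path: str) -> str:
    """Pick the backend that should receive a request for the given URL path."""
    parts = path.strip("/").split("/")
    if parts[0] == "user" and len(parts) > 1:
        name = parts[1]
        # A user with a running server goes to it; otherwise fall back to the hub,
        # which can authenticate the user and ask the spawner to start a server.
        return USER_SERVERS.get(name, HUB_BACKEND)
    # /hub and everything else is handled by the hub in this sketch.
    return HUB_BACKEND


if __name__ == "__main__":
    for p in ("/hub/login", "/user/alice/notebooks/report.ipynb", "/user/carol/tree"):
        print(p, "->", route(p))
```

Because each user lands on a separate server process, execution environments stay isolated, while sharing has to happen through the notebook storage rather than inside the server.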
Single-user notebook servers behind a reverse proxy
● Easier to implement / manage
● Notebook sharing is decoupled from the execution environment
● Notebook sharing is usually basic or restricted (no notebook-level ACL)
● e.g.
○ JupyterHub
○ AWS SageMaker
[Diagram: Browser, Reverse Proxy, Single-user Notebook servers each with a Kernel, Notebook Storage]
Multi-user notebook server
● More complex to implement / manage
● Notebook sharing is coupled with the execution environment
● Notebook sharing is usually more advanced and fine-grained
● e.g.
○ Apache Zeppelin
○ ZEPL
○ Google Colab
[Diagram: Browser, Multi-user Notebook server with Kernels, Notebook Storage]
Conclusion
Notebook Share
Data share
Multi-user environment
Collaboration
Thanks

Collaborative environment with data science notebook
