SlideShare uma empresa Scribd logo
1 de 24
Leveraging scriptable
infrastructures,
Towards a paradigm shift in
software for data science
Cloud Era Ltd 14 June 2013 karim.chine@cloudera.co.uk
Karim Chine
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations
/elastic-r-data
Presented at QCon New York
www.qconnewyork.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
22
Outline
 Introduction
 Fragmentation and friction in the Data Science arena
 The understated potential of Cloud scriptability
 Elastic-R: Towards a universal platform for Data Science
 Elastic-R: Design and Technologies Overview
 Elastic-R: The scriptability toolset
 Demo
 Conclusion
33
Introduction
“The next information revolution is well underway. But it is
not happening where information scientists, information
executives, and the information industry in general are
looking for it. It is not a revolution in technology, machinery,
techniques, software, or speed. It is a revolution in
concepts.”
Peter Drucker. Forbes ASAP, August 24, 1998
44
Introduction
 Science and the 4th Paradigm
 The e-Research Dancing Bears
2
2
2.
3
4
a
cG
a
a










Experimental Science Theoretical Science Computational Science e-Science / Data-intensive Science
The townspeople gather to
see the wondrous sight as the
massive, lumbering beast
shambles and shuffles from
paw to paw. The bear is really
a terrible dancer, and the
wonder isn't that the bear
dances well but that the bear
dances at all.
5
Fragmentation and friction in the
Data Science Arena
www.scilab.org
http://root.cern.ch
www.sagemath.org
www.sas.com
office.microsoft.com
www.mathworks.com
www.scipy.org
www.spss.com
www.wolfram.com
www.python.org
 Open-source (GPL) software
environment for statistical computing
and graphics
 Lingua franca of data analysis.
 Repositories of contributed R
packages related to a variety of
problem domains in life sciences,
social sciences, finance, econometrics,
chemo metrics, etc. are growing at an
exponential rate.
 R is Super Glue
6
The understated potential of
Cloud scriptability
 3D printers are becoming
a common place:
 Creating three-dimensional
solid object of virtually any
shape from a digital model.
 Scripting the physical world
 Sharing physical reality on
Facebook is now as easy as
sharing your holiday pictures
7
Elastic-R: Towards a universal
platform for Data Science
Computational Components
R packages, Wrapped C,C++,Fortran code, Python modules, Matlab Toolkits…
Open source or commercial
Computational Resources
Clusters, grids, private or public clouds
Free or pay-per-use
Computational GUIs
HTML5 and Desktop Workbench
Built-in views /Plugins /Collaborative views
Open source or commercial
Computational Scripts
R / Python / Matlab / Groovy
Computational APIs
Java / SOAP / REST, Stateless and stateful
Computational Storage
Local, NFS, FTP, Amazon S3, EBS
Generated Computational Web Services
Stateful or stateless, mapping of R objects/functions
Elastic
-R
8
Elastic-R: Towards a universal
platform for Data Science
9
Elastic-R: Towards a universal
platform for Data Science
Public
Clouds
Private Cloud
10
Elastic-R: Design and
Technologies Overview
You are here
Your data
is here
11
Elastic-R: Design and
Technologies Overview
 Remote Java/R
Processes
 Events-driven Remote
Objects/Engines
 R, Python, Mathematica,
Matlab, Scilab, ...
 Collaborative Spreadsheets
 Collaborative Scientific
Graphics Canvas
 Collaborative Dashboard with
collaborative widgets
12
Elastic-R: Design and
Technologies Overview
Elastic-R
AMI 1
R 2.10
BioC 2.5
Elastic-R
AMI 2
R 2.9
BioC 2.3
Elastic-R
AMI 3
R 2.8
BioC 2.0
Elastic-R Amazon Machine Images
Elastic-R
EBS 1
Data Set XXX
Elastic-R
EBS 2
Data Set
YYY
Elastic-R
EBS 3
Data Set
ZZZ
Elastic-R
EBS 4
Data Set VVV
Elastic-R
AMI 2
R 2.9
BioC 2.3
Elastic-R
EBS 4
Data Set
VVV
Amazon Elastic Block Stores
Eastic-R
AMI 2
R 2.9
BioC 2.3
Elastic-R.org
Elastic-R
EBS 4
Data Set
VVV
1313
Modes of access
 Individuals with AWS accounts
 Standard AMIs (Amazon Machine Images): paid-per-use to Amazon.
For Academic use and trial purposes
 Paid AMI: Paid-per-use software model. For business users
 Individuals without AWS accounts
 Trial tokens, purchased tokens, tokens granted by other users.
 Resources (data science engines) shared by other users
 Individual subscribtions
 Companies/Educational & Research Institutions
 Dedicated Platform and AMIs on an Amazon VPC (Virtual Private
Cloud). Paid via subscribtion
1414
Elastic-R: The scriptability
toolset
Command Line
Web Console
SDK
API
1515
Elastic-R: The scriptability
toolset
Command Line
Web Console
SDK
API
1616
Elastic-R: The scriptability
toolset
WS generator
rws.war
+ mapping.jar
+ pooling framework
+ R Java Bridge
+ JAX-WS
- Servlets
- Generated artifacts
Eclipse Web Service Client Generator
public static void main(String[] args) throws Exception {
RGlobalEnvFunctionWeb g=new
RGlobalEnvFunctionWebServiceLocator().getrGlobalEnvFunctionWebPort();
RNumeric x=new RNumeric(); x.setValue(new Double[]{6.0});
System.out.println(g.square(x).getValue()[0]);
}
Web Services Generation
square  function(x) {return(x^2) }
typeInfo(square)  SimultaneousTypeSpecification(
TypedSignature(x = "numeric"), returnType =
"numeric")
Script / globals.r
Script / rjmap.xml
<rj>
<publish>
<functions> <function name="square" forWeb="true"/> </functions>
</publish>
<scripts> <initScript name="globals.r" embed="true"/> </scripts>
</rj>
Deploy
1717
Elastic-R: The scriptability
toolset
 API: Soap and Restful Web Services
 Data Analysis Engines control
 Data Analysis Engines Management (Life cycle, etc.)
 Virtual appliances/artifacts management
 Platform administartion
 SDKs: Java, R, Microsoft Office (Vba)
 Command line: elasticR package
 Html5 Workbench
 Elastic-R artifacts management interface
 Engines control interface
1818
Demo
 Register to Elastic-R academic and trial portal
(www.elastic-r.org)
 Create Data Science Engines using trial tokens
 Work with R, Python and Scientific Spreadsheets in the
browser
 Share Data Science Engine and Collaborate
 Connect to the remote Data Science Engine from within
a local R session, push and pull data, execute
commands and show impact on spreadsheets and
dashboards
1919
Conclusion
 Elastic -R unlocks the potential of the cloud for Data
Scientists and analytical applications developers and
improves dramatically their productivity
 The Data Science IT Factory chain, from resources
acquisition to services and applications design and
publication becomes:
 Fully scriptable in R
 under the direct control of the Data Scientist and the developer
 Fully traceable and reproducible
 Elastic-R can seamlessly extend any existing application
or service and can be used as a collaboration backend
 Openness, security and reliability are among the key
capabilities delivered
2020
What to do Next
 Register to Elastic-R and try the HTML 5 Workbench and
the collaboration capabilities
 Download the R package elasticR and use it to access
the cloud from local R sessions
 Download the Java SDK and create your first analytical
application using AWS and advanced tools for
programming with data.
 Get in touch with me to explore potential partnerships
and collaborations
2121
Contact details
 Karim Chine
 karim.chine@cloudera.co.uk
Watch the video with slide synchronization on
InfoQ.com!
http://www.infoq.com/presentations/elastic-r-
data

Mais conteúdo relacionado

Destaque

Case Studies in Ethics Teaching
Case Studies in Ethics TeachingCase Studies in Ethics Teaching
Case Studies in Ethics TeachingChris Willmott
 
Reward system by thinker
Reward system by thinkerReward system by thinker
Reward system by thinkerRavi Patel
 
Module 4 Leverage points and systemic interventions
Module 4 Leverage points and systemic interventionsModule 4 Leverage points and systemic interventions
Module 4 Leverage points and systemic interventionsThink2Impact
 
Module 3 Systems archetypes
Module 3 Systems archetypesModule 3 Systems archetypes
Module 3 Systems archetypesThink2Impact
 
Systems Thinking: A Foxy Approach
Systems Thinking: A Foxy ApproachSystems Thinking: A Foxy Approach
Systems Thinking: A Foxy ApproachVenkatesh Rao
 
Systems Thinking & The Building Blocks of Health Systems
Systems Thinking & The Building Blocks of Health SystemsSystems Thinking & The Building Blocks of Health Systems
Systems Thinking & The Building Blocks of Health SystemsBorwornsom Leerapan
 
A Survey of HBase Application Archetypes
A Survey of HBase Application ArchetypesA Survey of HBase Application Archetypes
A Survey of HBase Application ArchetypesHBaseCon
 
8 Archetypes of Leadership
8 Archetypes of Leadership 8 Archetypes of Leadership
8 Archetypes of Leadership Thane Ritchie
 
Module 2 Causal loop modelling
Module 2 Causal loop modellingModule 2 Causal loop modelling
Module 2 Causal loop modellingThink2Impact
 
Design Tools for Systems Thinking
Design Tools for Systems ThinkingDesign Tools for Systems Thinking
Design Tools for Systems ThinkingPeter Vermaercke
 
MENTAL MODELS: Lessons From The Fifth Discipline Fieldbook by Senge, Kleiker...
MENTAL MODELS:  Lessons From The Fifth Discipline Fieldbook by Senge, Kleiker...MENTAL MODELS:  Lessons From The Fifth Discipline Fieldbook by Senge, Kleiker...
MENTAL MODELS: Lessons From The Fifth Discipline Fieldbook by Senge, Kleiker...Amy Rae
 
Business Model Archetypes
Business Model ArchetypesBusiness Model Archetypes
Business Model ArchetypesNeal Cabage
 
Module 1 Introduction to systems thinking
Module 1 Introduction to systems thinkingModule 1 Introduction to systems thinking
Module 1 Introduction to systems thinkingThink2Impact
 
The Power Of Archetypes In Brand Creation
The Power Of Archetypes In Brand CreationThe Power Of Archetypes In Brand Creation
The Power Of Archetypes In Brand CreationBianca Cawthorne
 
Cola wars case presentation
Cola wars case presentationCola wars case presentation
Cola wars case presentationjkwong5
 

Destaque (19)

Case Studies in Ethics Teaching
Case Studies in Ethics TeachingCase Studies in Ethics Teaching
Case Studies in Ethics Teaching
 
The Archetypal Lens
The Archetypal LensThe Archetypal Lens
The Archetypal Lens
 
Reward system by thinker
Reward system by thinkerReward system by thinker
Reward system by thinker
 
Archetypes
ArchetypesArchetypes
Archetypes
 
Module 4 Leverage points and systemic interventions
Module 4 Leverage points and systemic interventionsModule 4 Leverage points and systemic interventions
Module 4 Leverage points and systemic interventions
 
Module 3 Systems archetypes
Module 3 Systems archetypesModule 3 Systems archetypes
Module 3 Systems archetypes
 
Systems Thinking: A Foxy Approach
Systems Thinking: A Foxy ApproachSystems Thinking: A Foxy Approach
Systems Thinking: A Foxy Approach
 
Systems Thinking & The Building Blocks of Health Systems
Systems Thinking & The Building Blocks of Health SystemsSystems Thinking & The Building Blocks of Health Systems
Systems Thinking & The Building Blocks of Health Systems
 
A Survey of HBase Application Archetypes
A Survey of HBase Application ArchetypesA Survey of HBase Application Archetypes
A Survey of HBase Application Archetypes
 
8 Archetypes of Leadership
8 Archetypes of Leadership 8 Archetypes of Leadership
8 Archetypes of Leadership
 
Module 2 Causal loop modelling
Module 2 Causal loop modellingModule 2 Causal loop modelling
Module 2 Causal loop modelling
 
Design Tools for Systems Thinking
Design Tools for Systems ThinkingDesign Tools for Systems Thinking
Design Tools for Systems Thinking
 
MENTAL MODELS: Lessons From The Fifth Discipline Fieldbook by Senge, Kleiker...
MENTAL MODELS:  Lessons From The Fifth Discipline Fieldbook by Senge, Kleiker...MENTAL MODELS:  Lessons From The Fifth Discipline Fieldbook by Senge, Kleiker...
MENTAL MODELS: Lessons From The Fifth Discipline Fieldbook by Senge, Kleiker...
 
Business Model Archetypes
Business Model ArchetypesBusiness Model Archetypes
Business Model Archetypes
 
Module 1 Introduction to systems thinking
Module 1 Introduction to systems thinkingModule 1 Introduction to systems thinking
Module 1 Introduction to systems thinking
 
Systems thinking in a nutshell
Systems thinking in a nutshellSystems thinking in a nutshell
Systems thinking in a nutshell
 
The Power Of Archetypes In Brand Creation
The Power Of Archetypes In Brand CreationThe Power Of Archetypes In Brand Creation
The Power Of Archetypes In Brand Creation
 
Cola wars case presentation
Cola wars case presentationCola wars case presentation
Cola wars case presentation
 
Intro to Systems Thinking
Intro to Systems ThinkingIntro to Systems Thinking
Intro to Systems Thinking
 

Mais de C4Media

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoC4Media
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 

Mais de C4Media (20)

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live Video
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 

Último

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 

Último (20)

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 

Leveraging Scriptable Infrastructures, Towards a Paradigm Shift in Software for Data Science

  • 1. Leveraging scriptable infrastructures, Towards a paradigm shift in software for data science Cloud Era Ltd 14 June 2013 karim.chine@cloudera.co.uk Karim Chine
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations /elastic-r-data
  • 3. Presented at QCon New York www.qconnewyork.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 4. 22 Outline  Introduction  Fragmentation and friction in the Data Science arena  The understated potential of Cloud scriptability  Elastic-R: Towards a universal platform for Data Science  Elastic-R: Design and Technologies Overview  Elastic-R: The scriptability toolset  Demo  Conclusion
  • 5. 33 Introduction “The next information revolution is well underway. But it is not happening where information scientists, information executives, and the information industry in general are looking for it. It is not a revolution in technology, machinery, techniques, software, or speed. It is a revolution in concepts.” Peter Drucker. Forbes ASAP, August 24, 1998
  • 6. 44 Introduction  Science and the 4th Paradigm  The e-Research Dancing Bears 2 2 2. 3 4 a cG a a           Experimental Science Theoretical Science Computational Science e-Science / Data-intensive Science The townspeople gather to see the wondrous sight as the massive, lumbering beast shambles and shuffles from paw to paw. The bear is really a terrible dancer, and the wonder isn't that the bear dances well but that the bear dances at all.
  • 7. 5 Fragmentation and friction in the Data Science Arena www.scilab.org http://root.cern.ch www.sagemath.org www.sas.com office.microsoft.com www.mathworks.com www.scipy.org www.spss.com www.wolfram.com www.python.org  Open-source (GPL) software environment for statistical computing and graphics  Lingua franca of data analysis.  Repositories of contributed R packages related to a variety of problem domains in life sciences, social sciences, finance, econometrics, chemo metrics, etc. are growing at an exponential rate.  R is Super Glue
  • 8. 6 The understated potential of Cloud scriptability  3D printers are becoming a common place:  Creating three-dimensional solid object of virtually any shape from a digital model.  Scripting the physical world  Sharing physical reality on Facebook is now as easy as sharing your holiday pictures
  • 9. 7 Elastic-R: Towards a universal platform for Data Science Computational Components R packages, Wrapped C,C++,Fortran code, Python modules, Matlab Toolkits… Open source or commercial Computational Resources Clusters, grids, private or public clouds Free or pay-per-use Computational GUIs HTML5 and Desktop Workbench Built-in views /Plugins /Collaborative views Open source or commercial Computational Scripts R / Python / Matlab / Groovy Computational APIs Java / SOAP / REST, Stateless and stateful Computational Storage Local, NFS, FTP, Amazon S3, EBS Generated Computational Web Services Stateful or stateless, mapping of R objects/functions Elastic -R
  • 10. 8 Elastic-R: Towards a universal platform for Data Science
  • 11. 9 Elastic-R: Towards a universal platform for Data Science Public Clouds Private Cloud
  • 12. 10 Elastic-R: Design and Technologies Overview You are here Your data is here
  • 13. 11 Elastic-R: Design and Technologies Overview  Remote Java/R Processes  Events-driven Remote Objects/Engines  R, Python, Mathematica, Matlab, Scilab, ...  Collaborative Spreadsheets  Collaborative Scientific Graphics Canvas  Collaborative Dashboard with collaborative widgets
  • 14. 12 Elastic-R: Design and Technologies Overview Elastic-R AMI 1 R 2.10 BioC 2.5 Elastic-R AMI 2 R 2.9 BioC 2.3 Elastic-R AMI 3 R 2.8 BioC 2.0 Elastic-R Amazon Machine Images Elastic-R EBS 1 Data Set XXX Elastic-R EBS 2 Data Set YYY Elastic-R EBS 3 Data Set ZZZ Elastic-R EBS 4 Data Set VVV Elastic-R AMI 2 R 2.9 BioC 2.3 Elastic-R EBS 4 Data Set VVV Amazon Elastic Block Stores Eastic-R AMI 2 R 2.9 BioC 2.3 Elastic-R.org Elastic-R EBS 4 Data Set VVV
  • 15. 1313 Modes of access  Individuals with AWS accounts  Standard AMIs (Amazon Machine Images): paid-per-use to Amazon. For Academic use and trial purposes  Paid AMI: Paid-per-use software model. For business users  Individuals without AWS accounts  Trial tokens, purchased tokens, tokens granted by other users.  Resources (data science engines) shared by other users  Individual subscribtions  Companies/Educational & Research Institutions  Dedicated Platform and AMIs on an Amazon VPC (Virtual Private Cloud). Paid via subscribtion
  • 18. 1616 Elastic-R: The scriptability toolset WS generator rws.war + mapping.jar + pooling framework + R Java Bridge + JAX-WS - Servlets - Generated artifacts Eclipse Web Service Client Generator public static void main(String[] args) throws Exception { RGlobalEnvFunctionWeb g=new RGlobalEnvFunctionWebServiceLocator().getrGlobalEnvFunctionWebPort(); RNumeric x=new RNumeric(); x.setValue(new Double[]{6.0}); System.out.println(g.square(x).getValue()[0]); } Web Services Generation square  function(x) {return(x^2) } typeInfo(square)  SimultaneousTypeSpecification( TypedSignature(x = "numeric"), returnType = "numeric") Script / globals.r Script / rjmap.xml <rj> <publish> <functions> <function name="square" forWeb="true"/> </functions> </publish> <scripts> <initScript name="globals.r" embed="true"/> </scripts> </rj> Deploy
  • 19. 1717 Elastic-R: The scriptability toolset  API: Soap and Restful Web Services  Data Analysis Engines control  Data Analysis Engines Management (Life cycle, etc.)  Virtual appliances/artifacts management  Platform administartion  SDKs: Java, R, Microsoft Office (Vba)  Command line: elasticR package  Html5 Workbench  Elastic-R artifacts management interface  Engines control interface
  • 20. 1818 Demo  Register to Elastic-R academic and trial portal (www.elastic-r.org)  Create Data Science Engines using trial tokens  Work with R, Python and Scientific Spreadsheets in the browser  Share Data Science Engine and Collaborate  Connect to the remote Data Science Engine from within a local R session, push and pull data, execute commands and show impact on spreadsheets and dashboards
  • 21. 1919 Conclusion  Elastic -R unlocks the potential of the cloud for Data Scientists and analytical applications developers and improves dramatically their productivity  The Data Science IT Factory chain, from resources acquisition to services and applications design and publication becomes:  Fully scriptable in R  under the direct control of the Data Scientist and the developer  Fully traceable and reproducible  Elastic-R can seamlessly extend any existing application or service and can be used as a collaboration backend  Openness, security and reliability are among the key capabilities delivered
  • 22. 2020 What to do Next  Register to Elastic-R and try the HTML 5 Workbench and the collaboration capabilities  Download the R package elasticR and use it to access the cloud from local R sessions  Download the Java SDK and create your first analytical application using AWS and advanced tools for programming with data.  Get in touch with me to explore potential partnerships and collaborations
  • 23. 2121 Contact details  Karim Chine  karim.chine@cloudera.co.uk
  • 24. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/elastic-r- data