MedTech Pharma
Nürnberg 2014
Taking (some of) the mystery out of Big Data
www.gritsystems.dk
Contact
Claus Stie Kallesøe
Founder, CEO
claus@gritsystems.dk
+45 30 14 15 36
Introduction
Big Data –
Either VERY large datasets AND/OR other complexities
Characteristics of big data
Source: IBM methodology
A couple of words about scale
• 100s of Megabytes
• This should not be a problem. Can be handled with Matlab, R, Ruby
• 100/500 Gigabytes – 1 Terabyte
• 2 Terabyte hard drives can be bought in the local shop for €100
• Connect it to your laptop and install PostgreSQL or a NoSQL database on it
• > 5 Terabytes
• Now you might have a size issue
Inspired by: http://www.chrisstucchio.com/blog/2013/hadoop_hatred.html
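
The rule of thumb on this slide can be written down as a small sketch. The function name `suggest_tooling` and the exact cut-off values are assumptions made for illustration: the slide only names "100s of Megabytes", "100/500 Gigabytes – 1 Terabyte" and "> 5 Terabytes", so the boundaries in between are a guess.

```python
# Decimal units, as used on hard drive labels.
MB = 10**6
GB = 10**9
TB = 10**12


def suggest_tooling(size_bytes):
    """Map a dataset size to the tier described on the slide.

    The thresholds (1 GB and 5 TB) are assumed boundaries; the slide
    does not say where one tier ends and the next begins.
    """
    if size_bytes < GB:
        return "Matlab, R or Ruby on a laptop"
    if size_bytes <= 5 * TB:
        return "external hard drive plus PostgreSQL or a NoSQL database"
    return "you might have a size issue"


for size in (300 * MB, 500 * GB, 6 * TB):
    print(size, "->", suggest_tooling(size))
```

The point of the sketch is only that the decision depends on a single number you can measure before buying anything.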
Big Data - “Definition”
"Big Data is high volume, high velocity, and/or high variety information
assets that require new forms of processing to enable
enhanced decision making, insight discovery and process
optimization."
Cool, but remember where we are!
Gartner Hype Cycle 2013
Big Data in Pharma R&D
What is Big Data in Pharma R&D?
• Many ideas/possibilities across Pharma R&D and market access
• But many of them are likely NOT “real” Big Data problems!
• Are they relevant and can they bring insights?
• Yes, very much so
• Should we then find a way to handle them?
• Absolutely
Disclaimer
• I am a (web) tech geek
• I have nothing against new technologies
• Like many other geeks, I like them
• But do try to use the right tool for the right job
http://blog.mongohq.com/you-dont-have-big-data/
Another great tool - for some
Q: “Could you help me get to Nürnberg, pls?”
A: “Yes, absolutely. Not a problem”
Q: “Ok, btw I want to try the Endeavour”
A: “...ahh why?”
Q: “Because I have read it’s great”
A: “Yes, but the ICE….”
MapReduce explained in 41 words
Goal: Count the number of books in the library.
Map: You count up shelf #1, I count up shelf #2.
(The more people we get, the faster this part goes.)
Reduce: We all get together and add up our individual counts.
http://www.chrisstucchio.com/blog/2011/mapreduce_explained.html
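
The library analogy can be sketched in a few lines of plain Python. This is a single-machine illustration of the map and reduce steps, not Hadoop or any other framework; the `library` data and the function names are made up for the example.

```python
from functools import reduce

# Hypothetical library: each shelf is a list of book titles.
library = {
    "shelf_1": ["Book A", "Book B", "Book C"],
    "shelf_2": ["Book D", "Book E"],
    "shelf_3": ["Book F"],
}


def count_shelf(books):
    """Map step: one person counts the books on one shelf."""
    return len(books)


def add_counts(total, shelf_count):
    """Reduce step: add one shelf's count to the running total."""
    return total + shelf_count


# Map: every shelf is counted independently, so this part could be
# handed out to as many people (or machines) as are available.
shelf_counts = list(map(count_shelf, library.values()))

# Reduce: the individual counts are added up into one number.
total_books = reduce(add_counts, shelf_counts, 0)

print(shelf_counts, total_books)
```

Only the map step benefits from more helpers; the reduce step is a plain sum over a short list of counts.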
What is it then? Linked data?
Does it matter what it is?
No!
It’s data - and potential analytics (business) opportunities.
Size and complexity should drive the technology
Technologies
Can we do anything on our own?
For many people/companies
”Big data technology” is a black box
”A lot of stuff”
And then the vendors go:
If
{ box = magic or money}
then
{ box = expensive}
Working within a community
A lot of tools available
From: http://people10.com/blog/ruby-on-rails-the-popular-platform-for-web-development/
New visualisations – easy and free
http://philogb.github.io/jit/demos.html
Automated calculations - can bring you far
Job submitted to async calculation server
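
As a rough sketch of what submitting a job to an asynchronous calculation server can look like, the example below uses a thread pool from Python's standard library as a stand-in for the server. The `mean_reading` calculation and the plate data are hypothetical; the slide does not say which calculations or which server were used.

```python
from concurrent.futures import ThreadPoolExecutor


def mean_reading(readings):
    """Hypothetical calculation: average the readings of one plate."""
    return sum(readings) / len(readings)


# Made-up input data: one list of readings per plate.
jobs = {
    "plate_1": [1, 2, 3],
    "plate_2": [2, 4],
}

# The pool stands in for the calculation server: submit() returns
# immediately with a future, and the caller collects results later.
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = {
        name: executor.submit(mean_reading, readings)
        for name, readings in jobs.items()
    }
    results = {name: future.result() for name, future in futures.items()}

print(results)
```

The caller does not wait while each job runs; it only blocks when it asks for a result.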
https://circleci.com/
Also a lot of great tools to handle data
Elasticsearch text indexes
• Indexed research assay metadata
=> Google-like search to find the relevant assay
• Indexed SharePoint project workspaces
=> Enable easy, fast cross-project queries to find trends
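
To show the idea behind a text index without depending on a running search server, here is a toy inverted index in plain Python. It is not the Elasticsearch API, and the assay descriptions and identifiers are invented for the example.

```python
import re
from collections import defaultdict

# Hypothetical assay metadata: id -> free-text description.
assays = {
    "assay-001": "Kinase inhibition assay in human liver cells",
    "assay-002": "Cell viability assay for kinase inhibitors",
    "assay-003": "Receptor binding study in rat brain tissue",
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Inverted index: word -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


index = build_index(assays)
print(sorted(search(index, "kinase assay")))
```

A real search engine adds ranking, stemming and persistence on top, but the lookup principle is the same: the words are indexed once, so a query does not have to scan every document.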
Conclusion – Big data in Pharma R&D
• Many opportunities across R&D and market access
• More data linking and data analytics than Big Data
• You can use freely available tools on ”normal” hardware
• No magic ”under the hood” – it’s just data
BUT you still need to define the questions you want to answer – before diving into technology!
www.gritsystems.dk
Ask….

Taken some of the hype out of Big Data again - Medtech Pharma, Nürnberg, July 2014
