Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
Romeo Kienzler
Presentation held at Data Science Connect, July 22nd 2014 @IBM Innovation Center Zurich
1.
© 2013 IBM Corporation
The Data Scientists Workplace of the Future - Data Science Connect, 22nd of July, 2014
Romeo Kienzler
IBM Center of Excellence for Data Science, Cognitive Systems and BigData (a joint venture between IBM Research Zurich and IBM Innovation Center DACH)
Source: http://www.kdnuggets.com/2012/04/data-science-history.jpg
2.
What is DataScience?
Source: Statoo.com http://slidesha.re/1kmNiX0
3.
DataScience at present
● Tools (http://blog.revolutionanalytics.com/2014/01/in-data-scientist-survey-r-is-the-most-used-tool-other-than-databases.html)
  ● SQL (42%)
  ● R (33%)
  ● Python (26%)
  ● Excel (25%)
  ● Java, Ruby, C++ (17%)
  ● SPSS, SAS (9%)
● Limitations (Single Node usage)
  ● Main Memory
  ● CPU <> Main Memory Bandwidth
  ● CPU
  ● Storage <> Main Memory Bandwidth (either Single node or SAN)
4.
What is BIG data?
5.
What is BIG data?
6.
What is BIG data?
Diagram labels: Big Data, Hadoop
7.
What is BIG data?
Diagram labels: Business Intelligence, Data Warehouse
8.
BigData == Hadoop?
Diagram labels: Hadoop, BigData
9.
What is beyond “Data Warehouse”?
Diagram labels: Data Lake, Data Warehouse
10.
First “BigData” UseCase?
● Google Index
  ● 40 × 10^9 = 40.000.000.000 => 40 billion pages indexed
  ● Will break 100 PB barrier soon
  ● Derived from MapReduce
  ● Now “Caffeine”, based on “Percolator”
  ● Incremental vs. batch
  ● In-Memory vs. disk
11.
Map-Reduce → Hadoop → BigInsights
12.
BigData Analytics – Predictive Analytics
"sometimes it's not who has the best algorithm that wins; it's who has the most data."
(C) Google Inc., The Unreasonable Effectiveness of Data¹
¹ http://www.csee.wvu.edu/~gidoretto/courses/2011-fall-cp/reading/TheUnreasonable%20EffectivenessofData_IEEE_IS2009.pdf
No Sampling => Work with full dataset => No p-Value/z-Scores anymore
13.
Aggregated Bandwidth between CPU, Main Memory and Hard Drive
1 TB (at 10 GByte/s)
- 1 Node - 100 sec
- 10 Nodes - 10 sec
- 100 Nodes - 1 sec
- 1000 Nodes - 100 msec
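The figures on this slide follow from dividing the data volume by the aggregate bandwidth of all nodes. A minimal sketch, assuming every node reads a disjoint share at 10 GByte/s and the work divides perfectly (the function name `scan_time_seconds` is illustrative and does not come from the talk):

```python
def scan_time_seconds(data_bytes: float, nodes: int, node_bandwidth: float = 10e9) -> float:
    """Time to scan data_bytes when all nodes read disjoint shares in parallel.

    Assumption: perfect scaling, no coordination or network overhead.
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return data_bytes / (nodes * node_bandwidth)


ONE_TB = 1e12  # decimal terabyte

for n in (1, 10, 100, 1000):
    print(f"{n:>4} nodes: {scan_time_seconds(ONE_TB, n):g} s")
```

Real clusters rarely scale this cleanly, so the numbers are best read as an upper bound on what scale-out can deliver.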
14.
Fault Tolerance / Commodity Hardware
AMD Turion II Neo N40L (2x 1,5GHz / 2MB / 15W), 8 GB RAM, 3TB SEAGATE Barracuda 7200.14 < CHF 500
100 K => 200 X (2, 4, 3) => 400 Cores, 1,6 TB RAM, 200 TB HD
MTBF ~ 365 d > 1,5 d
Source: http://www.cloudcomputingpatterns.org/Watchdog
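The cluster totals on this slide can be re-derived from the per-node figures. The sketch below is a rough model and not the speaker's calculation: it assumes 3x replication of disk data (which would explain 200 TB usable out of 600 TB raw) and a per-node MTBF of 365 days with independent failures. Under those assumptions a 200-node cluster sees a node failure roughly every 1.8 days, the same order of magnitude as the 1,5 d on the slide.

```python
NODES = 200               # 100,000 CHF budget / 500 CHF per node
CORES_PER_NODE = 2
RAM_GB_PER_NODE = 8
DISK_TB_PER_NODE = 3
REPLICATION = 3           # assumption: every block is stored three times
NODE_MTBF_DAYS = 365      # assumption: one failure per node per year


def cluster_summary(nodes: int = NODES) -> dict:
    """Aggregate capacity and expected time between node failures."""
    return {
        "cores": nodes * CORES_PER_NODE,
        "ram_tb": nodes * RAM_GB_PER_NODE / 1000,
        "usable_disk_tb": nodes * DISK_TB_PER_NODE / REPLICATION,
        # With independent failures, the failure rate adds up across nodes.
        "days_between_failures": NODE_MTBF_DAYS / nodes,
    }


print(cluster_summary())
```

The point of the slide survives the approximation: at this scale a failed node is a routine event, so the software layer has to tolerate it.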
15.
“Elastic” Scale-Out
Source: http://www.cloudcomputingpatterns.org/Continuously_Changing_Workload
16.
“Elastic” Scale-Out of
17.
“Elastic” Scale-Out of CPU Cores
18.
“Elastic” Scale-Out of CPU Cores, Storage
19.
“Elastic” Scale-Out of CPU Cores, Storage, Memory
20.
“Elastic” Scale-Out: linear
Source: http://www.cloudcomputingpatterns.org/Elastic_Platform
21.
How do Databases Scale-Out? Shared Disk Architectures
22.
How do Databases Scale-Out? Shared Nothing Architectures
23.
Hadoop? Shared Nothing Architecture? Shared Disk Architecture?
http://bluemix.net/
6 Node Hadoop Cluster 4 Free
24.
Data Science on Hadoop
● SQL (42%)
● R (33%)
● Python (26%)
● Excel (25%)
● Java, Ruby, C++ (17%)
● SPSS, SAS (9%)
Diagram labels: Data Science, Hadoop
25.
SQL on Hadoop
● IBM BigSQL (ANSI 92 compliant)
● HIVE, Presto
● Cloudera Impala
● Lingual
● Shark
● ...
Diagram labels: SQL, Hadoop
26.
Two types of SQL Engines
● Type I
  ● Compiler and Optimizer SQL->MapReduce
● Type II
  ● Brings own distributed execution engine on Data Nodes
  ● Brings own Task Scheduler
● The Hadoop SQL Ecosystem is evolving very fast
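To make the Type I idea concrete, the sketch below shows how a query such as `select count(hour), hour from trace group by hour order by hour` could be expressed as a map phase, a shuffle and a reduce phase. It is a single-process illustration in plain Python, not how Hive or any other engine is actually implemented; all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(rows):
    """Emit (hour, 1) for every input row, as a mapper would."""
    for row in rows:
        yield row["hour"], 1


def shuffle(pairs):
    """Group values by key; a real framework does this across the network."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts per key and return them ordered by key."""
    return sorted((key, sum(values)) for key, values in groups.items())


def count_by_hour(rows):
    return reduce_phase(shuffle(map_phase(rows)))


rows = [{"hour": 9}, {"hour": 10}, {"hour": 9}, {"hour": 11}, {"hour": 9}]
print(count_by_hour(rows))  # [(9, 3), (10, 1), (11, 1)]
```

A Type II engine skips this translation step and runs its own distributed operators directly on the data nodes.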
27.
Hive
● Runs on top of MapReduce
● → Type I
Source: http://cdn.venublog.com/wp-content/uploads/2013/07/hive-1.jpg
28.
Lingual
● ANSI SQL Layer on top of Cascading
● Cascading
  ● Java API to express DAG
  ● Runs on top of MapReduce
● → Type I
29.
Limits of MapReduce
● Disk writes between Map and Reduce
● Slow for computations which depend on previously computed values
● JOINs are very slow and difficult to implement
● Only sequential data access
● Only tuple-wise data access
● Map-Side joins have sort and size constraints
● Reduce-Side joins require secondary sorting of values
● ...
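The join bullets above are easier to follow with a small reduce-side join in hand. The sketch below tags each record with its source table, groups records by join key, and pairs them up per key. It runs in a single process and buffers both sides in memory, which is exactly what a real reducer cannot afford at scale and why secondary sorting of the values is needed there. Function names and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import product


def map_tagged(rows, tag, key_field):
    """Mapper: emit (join key, (source tag, row)) for every row."""
    for row in rows:
        yield row[key_field], (tag, row)


def reduce_side_join(left, right, key_field):
    """Inner join of two row lists on key_field, MapReduce style."""
    # Shuffle: bring all records with the same key together.
    groups = defaultdict(lambda: {"L": [], "R": []})
    for source in (map_tagged(left, "L", key_field),
                   map_tagged(right, "R", key_field)):
        for key, (tag, row) in source:
            groups[key][tag].append(row)
    # Reduce: pair every left record with every right record per key.
    for key in sorted(groups):
        for l_row, r_row in product(groups[key]["L"], groups[key]["R"]):
            yield key, l_row, r_row


trace1 = [{"hour": 1, "id": "a"}, {"hour": 2, "id": "b"}, {"hour": 1, "id": "c"}]
trace2 = [{"hour": 1, "id": "x"}, {"hour": 3, "id": "y"}]

joined = list(reduce_side_join(trace1, trace2, "hour"))
print(len(joined))  # 2
```

Every record of both inputs crosses the shuffle, which is one reason such joins are slow on MapReduce.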
30.
Impala (Type II)
http://blog.cloudera.com/blog/wp-content/uploads/2012/10/impala.png
31.
Presto (Type II)
https://www.facebook.com/notes/facebook-engineering/presto-interacting-with-petabytes-of-data-at-facebook/10151786197628920
32.
Spark / Shark (Type II)
Source: http://bighadoop.files.wordpress.com/2014/04/spark-architecture.png
33.
BigSQL V3.0 (Type II)
Like in Spark, MapReduce has been kicked out :)
(No JobTracker, no TaskTracker, but HDFS/GPFS remains)
34.
BigSQL V3.0 – Architecture
Putting the story together...
Big SQL shares a common SQL dialect with DB2
Big SQL shares the same client drivers with DB2
35.
BigSQL V3.0 – Performance
Query rewrites
● Exhaustive query rewrite capabilities
● Leverages additional metadata such as constraints and nullability
Optimization
● Statistics and heuristic driven query optimization
● Query optimizer based upon decades of IBM RDBMS experience
Tools and metrics
● Highly detailed explain plans and query diagnostic tools
● Extensive number of available performance metrics
Example query:
SELECT ITEM_DESC, SUM(QUANTITY_SOLD), AVG(PRICE), AVG(COST)
FROM PERIOD, DAILY_SALES, PRODUCT, STORE
WHERE PERIOD.PERKEY=DAILY_SALES.PERKEY
  AND PRODUCT.PRODKEY=DAILY_SALES.PRODKEY
  AND STORE.STOREKEY=DAILY_SALES.STOREKEY
  AND CALENDAR_DATE BETWEEN '01/01/2012' AND '04/28/2012'
  AND STORE_NUMBER='03'
  AND CATEGORY=72
GROUP BY ITEM_DESC
Query transformation: Dozens of query transformations
Access plan generation: Hundreds or thousands of access plan options
Diagram: alternative access plans joining Store, Product, Daily Sales and Period with NLJOIN, HSJOIN and ZZJOIN operators
36.
BigSQL V3.0 – Performance
Queries run substantially faster if you don't use MapReduce.
IBM BigInsights v3.0, with Big SQL 3.0, is the only Hadoop distribution to successfully run ALL 99 TPC-DS queries and ALL 22 TPC-H queries without modification.
Source: http://www.ibmbigdatahub.com/blog/big-deal-about-infosphere-biginsights-v30-big-sql
37.
BigSQL V3.0 – Query Federation
[Figure: a head node running Big SQL and four compute nodes, each running a Task Tracker, a Data Node, and Big SQL]
38.
BigSQL V1.0 – Demo (small)
● 32 GB Data, ~650.000.000 rows (small, Innovation Center Zurich)
● 3 TB Data, ~60.937.500.000 rows (middle, Innovation Center Zurich)
● 0.7 PB Data, ~1.421875×10¹³ rows (large, Innovation Center Hursley)
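The row counts for the middle and large datasets are consistent with a linear extrapolation from the small one. The sketch below checks that arithmetic; it assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000,000 GB) and a constant number of rows per GB, neither of which is stated on the slide.

```python
# Small dataset from the slide: 32 GB holding about 650 million rows.
ROWS_SMALL = 650_000_000
GB_SMALL = 32


def estimated_rows(size_gb):
    """Scale the small dataset's row count linearly to another size in GB."""
    return ROWS_SMALL * size_gb // GB_SMALL


rows_middle = estimated_rows(3_000)    # 3 TB, assuming 1 TB = 1,000 GB
rows_large = estimated_rows(700_000)   # 0.7 PB, assuming 1 PB = 1,000,000 GB
```

Under those assumptions the results match the slide: 60,937,500,000 rows for 3 TB and 1.421875×10¹³ rows for 0.7 PB.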
39.
BigSQL V1.0 – Demo (small)
CREATE EXTERNAL TABLE trace (
  hour integer,
  employeeid integer,
  departmentid integer,
  clientid integer,
  date string,
  timestamp string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/biadmin/32Gtest';
40.
BigSQL V1.0 – Demo (small)
41.
BigSQL V1.0 – Demo (small)
42.
BigSQL V1.0 – Demo (small)
[bivm.ibm.com][biadmin] 1> select count(*) from trace1;
+----------+
|          |
+----------+
| 11416740 |
+----------+
1 row in results(first row: 39.78s; total: 39.78s)
43.
BigSQL V1.0 – Demo (small)
select count(hour), hour from trace group by hour order by hour
30 rows in results(first row: 37.98s; total: 37.99s)
44.
BigSQL V1.0 – Demo (small)
[bivm.ibm.com][biadmin] 1> select count(*) from trace1 t3 inner join trace2 t4 on t3.hour=t4.hour;
+--------+
|        |
+--------+
| 477340 |
+--------+
1 row in results(first row: 32.24s; total: 32.25s)
45.
BigSQL V3.0 – Demo (small)
CREATE HADOOP TABLE trace3 (
  hour int,
  employeeid int,
  departmentid int,
  clientid int,
  date varchar(30),
  timestamp varchar(30)
)
row format delimited
fields terminated by '|'
stored as textfile;
46.
BigSQL V3.0 – Demo (small)
[bivm.ibm.com][biadmin] 1> select count(*) from trace3;
+----------+
| 1        |
+----------+
| 12014733 |
+----------+
1 row in results(first row: 2.94s; total: 2.95s)
47.
BigSQL V3.0 – Demo (small)
[bivm.ibm.com][biadmin] 1> select count(*) from trace3 t3 inner join trace4 t4 on t3.hour=t4.hour;
+--------+
| 1      |
+--------+
| 504360 |
+--------+
1 row in results(first row: 0.79s; total: 0.80s)
48.
BigSQL V3.0 – Demo (small)
[bivm.ibm.com][biadmin] 1> select count(hour), hour from trace3 group by hour order by hour;
29 rows in results(first row: 1.88s; total: 1.89s)
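The demo timings can be put side by side as rough speedup factors. The sketch below only divides the total times reported on the slides; note that the V1.0 and V3.0 runs use different tables with slightly different row counts, so the ratios are indicative rather than a controlled benchmark.

```python
# Total query times in seconds, as reported on the demo slides:
# (Big SQL V1.0, Big SQL V3.0)
timings = {
    "count(*)": (39.78, 2.95),
    "join on hour": (32.25, 0.80),
    "group by hour": (37.99, 1.89),
}

# Speedup factor of V3.0 over V1.0 for each query shape.
speedups = {query: v1 / v3 for query, (v1, v3) in timings.items()}

for query, factor in speedups.items():
    print(f"{query}: about {factor:.1f}x faster on V3.0")
```

The ratios come out at roughly 13x for the count, 40x for the join, and 20x for the group-by.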
49.
R on Hadoop
● IBM BigR (based on the SystemML research project from IBM Research Almaden)
● RHadoop
● RHIPE
● ...
51.
Goal: Find column mean
Problems:
• Column vector cannot fit into memory
You have to partition and parallelize
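A minimal Python sketch of "partition and parallelize" for the column mean follows. It is an illustration, not BigR or Hadoop code: the partitions are small in-memory lists standing in for file splits, and a thread pool stands in for cluster workers. Each worker returns only a (sum, count) pair, so the full column never has to sit in one process.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_count(partition):
    """Map step: reduce one partition to a (sum, count) pair."""
    total = 0.0
    count = 0
    for value in partition:
        total += value
        count += 1
    return total, count


def partitioned_mean(partitions, workers=4):
    """Combine per-partition (sum, count) pairs into the global mean."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_count, partitions))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    if count == 0:
        raise ValueError("mean of an empty column is undefined")
    return total / count


# Hypothetical partitions of one column, e.g. one list per file split.
partitions = [[7, 8, 9], [10, 11], [12, 13, 14, 15]]
column_mean = partitioned_mean(partitions)
```

Averaging the per-partition means directly would be wrong when partitions differ in size, which is why the count travels with each partial sum.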
52.
● Sampling
  Full dataset > RAM
  Example: use 1% vs 100% of dataset
  Precision loss from skewed/sparse data
● Numerical Stability
  Limitation from finite precision in computing
  Algorithms must be carefully implemented
  Instability causes errors to cascade throughout your analysis
Catastrophic cancellation error: 6.375 – 5.625
  6.375 rounds to 6.0
  5.625 rounds to 6.0
  True value: 0.75
  Computed: 0
  Relative error: 1.0
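The cancellation example can be reproduced in a few lines of Python. The first part mirrors the slide by rounding both operands to whole numbers before subtracting. The second part is an added illustration, not from the slide, of the same effect in double precision, using a naive running total versus the standard library's `math.fsum`.

```python
import math


def relative_error(true_value, computed):
    """Relative error of a computed result against the true value."""
    return abs(true_value - computed) / abs(true_value)


# Slide example: both operands round to 6.0, so the difference vanishes.
a, b = 6.375, 5.625
true_diff = a - b                                   # 0.75
rounded_diff = float(round(a)) - float(round(b))    # 6.0 - 6.0
slide_error = relative_error(true_diff, rounded_diff)

# Same effect in double precision: the 1.0 is absorbed by the huge term.
values = [1e16, 1.0, -1e16]
naive_total = 0.0
for v in values:
    naive_total += v                  # plain left-to-right accumulation
exact_total = math.fsum(values)       # tracks the lost low-order bits
```

The naive total loses the 1.0 entirely, while `math.fsum` recovers it, which is the kind of careful implementation the slide asks for.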
53.
[Figure: you, the R user, and the data in Hadoop held in distributed memory]
54.
Data in Hadoop: Can run R on a single node
[Figure: you, the R user, and the data in Hadoop held in distributed memory]
55.
BigR (based on SystemML)
SystemML compiles hybrid runtime plans ranging from in-memory, single machine (CP) to large-scale, cluster (MR) compute
● Challenge
  ● Guaranteed hard memory constraints (budget of JVM size)
  ● for arbitrary complex ML programs
● Key Technical Innovations
  ● CP & MR Runtime: Single machine & MR operations, integrated runtime
  ● Caching: Reuse and eviction of in-memory objects
  ● Cost Model: Accurate time and worst-case memory estimates
  ● Optimizer: Cost-based runtime plan generation
  ● Dyn. Recompiler: Re-optimization for initial unknowns
Hybrid Plans: gradually exploit MR parallelism. High performance computing for small data sizes. Scalable computing for large data sizes.
[Figure: runtime versus data size for CP, CP/MR, and MR plans]
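The idea of picking CP or MR from a worst-case memory estimate can be sketched as follows. This is a deliberately simplified, hypothetical model and not SystemML's actual cost model or API: it assumes dense matrices of 8-byte doubles, a single operator, and a made-up 2 GiB budget.

```python
BYTES_PER_DOUBLE = 8


def dense_matrix_bytes(rows, cols):
    """Worst-case size of a dense matrix of doubles (assumed model)."""
    return rows * cols * BYTES_PER_DOUBLE


def choose_runtime(input_shapes, output_shape, memory_budget_bytes):
    """Return "CP" if inputs plus output fit the budget, otherwise "MR"."""
    needed = sum(dense_matrix_bytes(r, c) for r, c in input_shapes)
    needed += dense_matrix_bytes(*output_shape)
    return "CP" if needed <= memory_budget_bytes else "MR"


budget = 2 * 1024**3  # hypothetical 2 GiB JVM budget

# A small input fits in memory; a 650-million-row, 6-column input does not.
small_plan = choose_runtime([(10_000, 100)], (100, 100), budget)
large_plan = choose_runtime([(650_000_000, 6)], (6, 6), budget)
```

A real hybrid plan makes this decision per operator and revisits it at runtime once initially unknown sizes become known, as the Dyn. Recompiler bullet indicates.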
56.
BigR Architecture
[Figure: R clients with IBM R packages, connected to the data sources through the SystemML statistics engine and embedded R execution]
● Pull data (summaries) to R client
● Or, push R functions right on the data
57.
Big R Data Structures: Proxy to entire dataset
data <- bigr.frame(…)
Appears and acts as if all of the data were on your laptop
58.
BigR Demo (small)
● 32 GB Data, ~650.000.000 rows (small, Innovation Center Zurich)
● 3 TB Data, ~60.937.500.000 rows (middle, Innovation Center Zurich)
● 0.7 PB Data, ~1.421875×10¹³ rows (large, Innovation Center Hursley)
59.
BigR Demo (small)
library(bigr)
bigr.connect(host="bigdata", port=7052, database="default", user="biadmin", password="xxx")
is.bigr.connected()
tbr <- bigr.frame(dataSource="DEL",
  coltypes = c("numeric","numeric","numeric","numeric","character","character"),
  dataPath="/user/biadmin/32Gtest",
  delimiter=",", header=F, useMapReduce=T)
h <- bigr.histogram.stats(tbr$V1, nbins=24)
60.
BigR Demo (small)
  class bins   counts centroids
1   ALL    0 18289280  1.583333
2   ALL    1    15360  2.750000
3   ALL    2    55040  3.916667
4   ALL    3   189440  5.083333
5   ALL    4   579840  6.250000
6   ALL    5  5292160  7.416667
7   ALL    6  8074880  8.583333
8   ALL    7 15653120  9.750000
...
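The centroids in this output are consistent with 24 equal-width bins. The sketch below reproduces them in Python under an assumption that is inferred from the numbers and not stated on the slide: that the column ranges from 1 to 29. It does not call or emulate the `bigr` package.

```python
def bin_centroids(lo, hi, nbins):
    """Midpoints of nbins equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / nbins
    return [lo + (i + 0.5) * width for i in range(nbins)]


# Assumed value range [1, 29]; nbins=24 as in the bigr.histogram.stats call.
centroids = bin_centroids(1.0, 29.0, 24)

for i, c in enumerate(centroids[:8]):
    print(f"bin {i}: centroid {c:.6f}")
```

With that range each bin is 28/24 ≈ 1.1667 wide, which gives 1.583333 for bin 0 and 9.750000 for bin 7, matching the listing.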
61.
BigR Demo (small)
62.
BigR Demo (small)
jpeg('hist.jpg')
bigr.histogram(tbr$V1, nbins=24) # This command runs on 32 GB / ~650.000.000 rows in HDFS
dev.off()
63.
SPSS on Hadoop
64.
SPSS on Hadoop
65.
BigSheets Demo (small)
● 32 GB Data, ~650.000.000 rows (small, Innovation Center Zurich)
● 3 TB Data, ~60.937.500.000 rows (middle, Innovation Center Zurich)
● 0.7 PB Data, ~1.421875×10¹³ rows (large, Innovation Center Hursley)
66.
BigSheets Demo (small)
67.
BigSheets Demo (small)
This command runs on 32 GB / ~650.000.000 rows in HDFS
68.
BigSheets Demo (small)
69.
Text Extraction (SystemT, AQL)
70.
Text Extraction (SystemT, AQL)
71.
If this is not enough? → BigData AppStore
72.
BigData AppStore, Eclipse Tooling
● Write your apps in
  ● Java (MapReduce)
  ● Pig Latin, Jaql
  ● BigSQL/Hive/BigR
● Deploy it to BigInsights via Eclipse
● Automatically
  ● Schedule
  ● Update
    ● HDFS files
    ● BigSQL tables
    ● BigSheets collections
73.
Questions?
http://www.ibm.com/software/data/bigdata/
Twitter: @RomeoKienzler, @IBMEcosystem_DE, @IBM_ISV_Alps