Daniel Sikar: Hadoop MapReduce - 06/09/2010
Skills Matter
In this podcast, Daniel Sikar talks about Hadoop MapReduce.
1.
QUICK AND DIRTY PARALLEL PROCESSING ON THE CLOUD
Daniel Sikar
2.
EC2 S3
3.
4.
5.
Elastic MapReduce Ruby library
6.
Hadoop
7.
s3cmd
8.
Hadoop
MapReduce: Job Tracker + Task Tracker + Slaves
HDFS – Distributed file system
9.
Hadoop MapReduce usage
Data crunching in general
Clicks
Statistics
etc.
10.
Hadoop Project Management Committee
11.
MapReduce?
12.
MapReduce Key Pairs
<key,value>
13.
MapReduce
14.
HTTP Logs
Log file A:
(...) FreeTouchScreenNokia5230 (...)
(...) GetRidofAllSpeedCameras (...)
(...) USManWinsLottery (...)
(...) BNPToLaunchElectionManifesto (...)
Log file B:
(...) FreeTouchScreenNokia5230 (...)
(...) BodyLanguageTellsAll (...)
15.
MapReduce
<FreeTouchScreenNokia5230, 1> + <FreeTouchScreenNokia5230, 1> = <FreeTouchScreenNokia5230, 2>
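The pair arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two lists stand in for the log files from the previous slide with the elided `(...)` fields dropped, and `map_phase` and `reduce_phase` are hypothetical helper names, not part of Hadoop.

```python
from collections import defaultdict

# Hypothetical stand-ins for log files A and B; each entry is one requested item.
LOG_FILE_A = [
    "FreeTouchScreenNokia5230",
    "GetRidofAllSpeedCameras",
    "USManWinsLottery",
    "BNPToLaunchElectionManifesto",
]
LOG_FILE_B = [
    "FreeTouchScreenNokia5230",
    "BodyLanguageTellsAll",
]


def map_phase(records):
    """Emit one <key, 1> pair per record."""
    return [(record, 1) for record in records]


def reduce_phase(pairs):
    """Sum the values of all pairs that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Each file is mapped independently; the pairs are then reduced together.
pairs = map_phase(LOG_FILE_A) + map_phase(LOG_FILE_B)
counts = reduce_phase(pairs)
print(counts["FreeTouchScreenNokia5230"])  # 2
```

On a cluster the map calls would run on different machines and the pairs would be routed to reducers by key; here everything runs in one process to keep the idea visible.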
16.
Hadoop Streaming
Running MapReduce jobs with executables and scripts
$ <list> | mapper | reducer
17.
Hadoop Streaming
Running MapReduce jobs with executables and scripts
$ <list> | mapper | reducer
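A rough Python sketch of the `mapper | reducer` pipe follows. It assumes the usual Hadoop Streaming convention of tab-separated `key<TAB>value` text lines, and it assumes the reducer sees its input sorted by key, which Hadoop does between the two phases and which a local test has to imitate with a sort. The function names and the sample input are made up for illustration.

```python
from itertools import groupby


def mapper(lines):
    """Emit 'key<TAB>1' for every non-empty input line."""
    for line in lines:
        key = line.strip()
        if key:
            yield f"{key}\t1"


def reducer(sorted_lines):
    """Sum counts per key; input must be sorted so equal keys are adjacent."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


def run_pipeline(lines):
    """Local stand-in for: <list> | mapper | sort | reducer."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    # Hypothetical input; a real job would read log lines from standard input.
    for out in run_pipeline(["b", "a", "b", "", "c", "b"]):
        print(out)
```

Because both functions only consume and produce lines of text, the same logic could be split into two scripts that read `sys.stdin` and print to standard output, which is the shape Hadoop Streaming expects.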
18.
Real-life example of Hadoop Streaming usage
19.
Wikipedia Page Access Logs
20.
Wine Grape Varieties
21.
Wikipedia WGV Page Access Stats
22.
Business Decisions
23.
Launching a virtual Hadoop Cluster
$ elastic-mapreduce --create --name "Wiki log crunch" --alive --num-instances 20 --instance-type c1.medium
Created job flow <job flow id>
$ ec2din
(...)
24.
25.
26.
Pseudo-Distributed Operation
27.
Fully-Distributed Operation
28.
NameNode
29.
JobTracker
30.
DataNode + TaskTracker
31.
32.
Pseudo-Distributed Operation
33.
Fully-Distributed Operation
34.
NameNode
35.
JobTracker
36.
DataNode + TaskTracker
37.
Add a step
$ elastic-mapreduce --jobflow <jfid> --stream --step-name "Wiki log crunch" \
    --input s3n://dsikar-wikilogs-2009/dec/ \
    --output s3n://dsikar-wikilogs-output/21 \
    --mapper s3n://dsikar-wiki-scripts/wikidictionarymap.pl \
    --reducer s3n://dsikar-wiki-scripts/wikireduce.pl
http://<instance public dns>:9100
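The Perl mapper named in this step is not reproduced in the slides. As a guess at what a dictionary-based mapper might do, here is a hypothetical Python equivalent. It assumes each pagecounts record has four whitespace-separated fields (project, page title, view count, bytes) and that the dictionary is a set of page titles of interest; the titles and sample records below are invented.

```python
# Hypothetical dictionary of page titles to keep; the real list is not in the slides.
GRAPE_VARIETIES = {"Merlot", "Pinot_noir", "Chardonnay"}


def parse_pagecount(line):
    """Split an assumed 'project title views bytes' record; None if malformed."""
    fields = line.split()
    if len(fields) != 4 or not fields[2].isdigit():
        return None
    project, title, views, _size = fields
    return project, title, int(views)


def dictionary_mapper(lines, dictionary=GRAPE_VARIETIES, project="en"):
    """Emit 'title<TAB>views' for records whose title is in the dictionary."""
    for line in lines:
        record = parse_pagecount(line)
        if record is None:
            continue  # skip lines that do not match the assumed layout
        rec_project, title, views = record
        if rec_project == project and title in dictionary:
            yield f"{title}\t{views}"


# Invented sample records in the assumed layout.
sample = [
    "en Merlot 12 345678",
    "en Main_Page 9000 123456789",
    "fr Merlot 3 4567",
    "garbage line",
    "en Pinot_noir 7 23456",
]
print(list(dictionary_mapper(sample)))
```

Filtering in the mapper keeps the data sent to the reducers small, since only records that match the dictionary leave the map phase.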
38.
s3cmd
# make bucket
$ s3cmd mb s3://dsikar-wikilogs
# put log files
$ s3cmd put pagecounts-200912*.gz s3://dsikar-wikilogs/dec
$ s3cmd put pagecounts-201004*.gz s3://dsikar-wikilogs/apr
# list log files
$ s3cmd ls s3://dsikar-wikilogs/
# put scripts
$ s3cmd put *.pl s3://dsikar-wiki-scripts/
# delete log files
$ s3cmd del --recursive --force s3://dsikar-wikilogs/
# remove bucket
$ s3cmd rb s3://dsikar-wikilogs/
39.
Elastic MapReduce
--create
--list
--jobflow
--describe
--stream
--terminate
40.
Output files
part-00000
part-00001
part-00002
(...)
41.
Further aggregation
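The slides do not say what the further aggregation consisted of. One plausible reading, sketched here in Python, is merging the `part-NNNNN` files (possibly from more than one job run) into a single ranked table. The sketch assumes each output line is `key<TAB>count`; the file contents shown are invented, and `aggregate_parts` is a hypothetical helper.

```python
from collections import Counter


def aggregate_parts(parts):
    """Sum 'key<TAB>count' lines across several part files (any iterables of lines)."""
    totals = Counter()
    for part in parts:
        for line in part:
            line = line.rstrip("\n")
            if not line:
                continue
            key, count = line.rsplit("\t", 1)
            totals[key] += int(count)
    return totals


# Invented contents of two output files, for example from two separate runs.
part_00000 = ["Merlot\t12\n", "Chardonnay\t30\n"]
part_00001 = ["Pinot_noir\t7\n", "Merlot\t5\n"]

totals = aggregate_parts([part_00000, part_00001])
for title, views in totals.most_common(2):
    print(title, views)
```

With real files, each element of `parts` could be an open file handle, since the function only iterates over lines.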
42.
Conclusion
Hadoop MapReduce provides out-of-the-box, ready-to-go distributed computing.
Editor's Notes
So, without further ado, let's get this show on the road and run a job concurrently on a few virtual machines.