MapReduce Code as Seen Through the Samples (サンプルから見るMapReduceコード)
Shinpei Ohtani
I only got as far as the Mapper, but here it is for now.
1.
MapReduce @shot6
2.
Cloudera
Avro, Sqoop, Desktop, Pig, Hive, HBase, Chukwa, MapReduce, ZooKeeper, HDFS, Core
3.
Cloudera
Avro, Sqoop, Desktop, Pig, Hive, HBase, Chukwa, MapReduce, ZooKeeper, HDFS, Core
4.
• MapReduce
  – Mapper/Reducer
5.
MapReduce
• WordCount
  – Mapper/Reducer and Job execution
  – InputFormat/OutputFormat usage
  – HDFS (FileSystem)
  – Writable usage
6.
WordCount
• The Hello World of Hadoop
• New API (org.apache.hadoop.mapreduce)
• API
7.
Grep
• grep
  – Runs two jobs: grepJob and sortJob
  – Shows how JobConf/Mapper/Reducer are used
  – The Mapper is RegexMapper; <Text, Long> pairs are written in SequenceFileFormat
  – sortJob
  – Output
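The map step of grepJob can be sketched in plain Java, without any Hadoop types. This is a minimal illustration, assuming the mapper emits one (matched text, 1) pair per regex match in a line; the class `RegexMapSketch`, its `map` method, and the sample pattern are hypothetical stand-ins, not Hadoop's RegexMapper itself.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Plain-Java stand-in for the grep map step (hypothetical, no Hadoop types). */
class RegexMapSketch {

    /** Emits one (matched text, 1) pair for every regex match found in the line. */
    static List<Map.Entry<String, Long>> map(String line, Pattern pattern, int group) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        Matcher m = pattern.matcher(line);
        while (m.find()) {
            out.add(new AbstractMap.SimpleImmutableEntry<>(m.group(group), 1L));
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample pattern and input line are illustrative assumptions.
        Pattern p = Pattern.compile("dfs[a-z.]+");
        System.out.println(map("dfs.replication and dfs.name.dir", p, 0));
    }
}
```

Summing the 1s per key (the reduce step) and then sorting by count would correspond to the second job, sortJob.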
8.
Grep
• JobConf
• Mapper
• Reducer
9.
o.a.hadoop.mapred.JobConf
  – mapred-default.xml
  – conf/mapred-site.xml
  – The XML itself is read through DOM
  – Child configuration
    • JobConf child = new JobConf( Conf, jar );
10.
mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>your-site:9001</value>
  </property>
</configuration>
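The relationship between mapred-default.xml and conf/mapred-site.xml can be pictured as two layers, where a site value shadows the default for the same key. The sketch below uses `java.util.Properties` as a stand-in for Hadoop's configuration class; `ConfLayeringSketch` is a hypothetical name, and the default values shown ("local" and "5") are what I recall from the old mapred-default.xml, so treat them as assumptions.

```java
import java.util.Properties;

/** Two-layer lookup: site settings shadow defaults (stand-in for Hadoop's Configuration). */
class ConfLayeringSketch {

    static Properties layered() {
        // Stands in for mapred-default.xml (values are assumptions).
        Properties defaults = new Properties();
        defaults.setProperty("mapred.job.tracker", "local");
        defaults.setProperty("mapred.reduce.parallel.copies", "5");

        // Stands in for conf/mapred-site.xml; falls back to defaults for unset keys.
        Properties site = new Properties(defaults);
        site.setProperty("mapred.job.tracker", "your-site:9001");
        return site;
    }

    public static void main(String[] args) {
        Properties conf = layered();
        System.out.println(conf.getProperty("mapred.job.tracker"));
        System.out.println(conf.getProperty("mapred.reduce.parallel.copies"));
    }
}
```

A key set only in the defaults layer is still visible through the site layer, which is why a site file needs to list only the values it changes.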
11.
o.a.hadoop.mapred.Mapper
• Mapper
• One Mapper per InputSplit
• MapTask/MapRunner
• map(KEY, VALUE, COLLECTOR, REPORTER)
  – KEY: Map input key; VALUE: Map input value
  – COLLECTOR: collects the output
  – REPORTER: reporting API
• MapReduceBase
12.
o.a.hadoop.mapred.MapTask
• Map
• initialize (in Task; shared with the Reducer)
  – Task state (o.a.h.mapred.TaskStatus.State)
    • RUNNING, SUCCEEDED, FAILED, UNASSIGNED, KILLED, COMMIT_PENDING, FAILED_UNCLEAN, KILLED_UNCLEAN
  – Creates the OutputCommitter
    • Task output
    • Output directory
      – mapred.work.output.dir
13.
o.a.h.mapred.MapTask cont
• run calls runOldMapper
• JobClient, InputSplit
• RecordReader
14.
o.a.h.mapred.MapTask cont2
• Reduce
  – spill (*)
    • $mapred.local.dir/taskTracker/jobcache/${taskid}/output/spill${spillNumber}.out
  – Output for the Reducer
    • Combiner: min.num.spills.for.combine, combiner
  – Output through the RecordWriter
• MapRunner
15.
o.a.h.mapred.MapRunner
• MapRunnable
  – mapred.map.runner.class
  – Hadoop PipeMapRunner
  – Multithreaded Map: MultithreadedMapRunner
16.
o.a.h.mapred.MapRunner cont
• run(RecordReader, OutputCollector, Reporter)
  – RecordReader: the reader for the Split, obtained from the InputFormat (InputFormat/RecordReader)
  – RecordReader
17.
[Sequence diagram. Participants: MapTask, MapRunner, Mapper, RecordReader, OutputCollector. Labels: Input Split creation, Spill & run, SpillThread, createKey(), createValue(), next(key, value), EOF, map(key, value, outputCollector, reporter), Spill]
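The call sequence in the diagram (createKey(), createValue(), then next(key, value) until EOF, with each record passed to map) can be sketched as a small loop. The interfaces below are simplified, hypothetical stand-ins declared inside `MapRunnerSketch`, not the Hadoop ones; the Reporter argument is left out, and `LineReader` and `demo` exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Simplified sketch of the record loop driven by a map runner (no Hadoop types). */
class MapRunnerSketch {

    interface RecordReader<K, V> {
        K createKey();
        V createValue();
        boolean next(K key, V value); // fills key/value in place; false at EOF
    }

    interface OutputCollector<K, V> {
        void collect(K key, V value);
    }

    interface Mapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, OutputCollector<K2, V2> output);
    }

    /** One key object and one value object are created once and reused for every record. */
    static <K1, V1, K2, V2> void run(RecordReader<K1, V1> input,
                                     Mapper<K1, V1, K2, V2> mapper,
                                     OutputCollector<K2, V2> output) {
        K1 key = input.createKey();
        V1 value = input.createValue();
        while (input.next(key, value)) {
            mapper.map(key, value, output);
        }
    }

    /** Hypothetical reader: key is the line number, value is the line text. */
    static final class LineReader implements RecordReader<int[], StringBuilder> {
        private final Iterator<String> lines;
        private int lineNo = 0;

        LineReader(List<String> lines) { this.lines = lines.iterator(); }

        public int[] createKey() { return new int[1]; }

        public StringBuilder createValue() { return new StringBuilder(); }

        public boolean next(int[] key, StringBuilder value) {
            if (!lines.hasNext()) return false; // EOF
            key[0] = lineNo++;
            value.setLength(0);
            value.append(lines.next());
            return true;
        }
    }

    /** Upper-cases each line and collects "TEXT@lineNumber" strings. */
    static List<String> demo(List<String> lines) {
        List<String> collected = new ArrayList<>();
        Mapper<int[], StringBuilder, String, Integer> upper =
            (k, v, out) -> out.collect(v.toString().toUpperCase(), k[0]);
        run(new LineReader(lines), upper, (k, v) -> collected.add(k + "@" + v));
        return collected;
    }

    public static void main(String[] args) {
        System.out.println(demo(Arrays.asList("a b", "c")));
    }
}
```

Because the key and value holders are reused across iterations, a mapper that wants to keep a record beyond the current call has to copy it, which the demo does by calling toString().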
18.
m(_ _)m
19.
• Mapper
  – JobConf
  – Mapper/MapRunner/MapTask
• Reducer
  – Reducer execution
  – InputFormat/RecordReader
20.
o.a.h.mapred.Reducer
• Reducer
• InputSplit, Mapper
• ReduceTask/ReduceRunner
• reduce(KEY, Iterator<VALUE>, COLLECTOR, REPORTER)
  – KEY:
  – Iterator<VALUE>:
  – COLLECTOR:
  – REPORTER: API
• MapReduceBase
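The shape of reduce(KEY, Iterator<VALUE>, COLLECTOR, REPORTER) is easier to see with a concrete case: one call per key, with an iterator over all values that share that key. The sketch below sums long counts per key in plain Java; `ReduceSketch`, its nested `OutputCollector`, and `runReduce` are hypothetical names, the Reporter is omitted, and the grouping by `TreeMap` only imitates the sorted, grouped input that a real reduce phase would receive.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java sketch of a summing reduce step (no Hadoop types). */
class ReduceSketch {

    interface OutputCollector<K, V> {
        void collect(K key, V value);
    }

    /** Called once per key with every value emitted for that key. */
    static void reduce(String key, Iterator<Long> values, OutputCollector<String, Long> output) {
        long sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        output.collect(key, sum);
    }

    /** Groups map output by key in sorted order, then calls reduce for each group. */
    static Map<String, Long> runReduce(List<Map.Entry<String, Long>> mapOutput) {
        TreeMap<String, List<Long>> grouped = new TreeMap<>();
        for (Map.Entry<String, Long> e : mapOutput) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> g : grouped.entrySet()) {
            reduce(g.getKey(), g.getValue().iterator(), result::put);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        pairs.add(new java.util.AbstractMap.SimpleImmutableEntry<>("b", 1L));
        pairs.add(new java.util.AbstractMap.SimpleImmutableEntry<>("a", 1L));
        pairs.add(new java.util.AbstractMap.SimpleImmutableEntry<>("b", 1L));
        System.out.println(runReduce(pairs));
    }
}
```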
21.
o.a.h.mapred.ReduceTask
• SHUFFLE
• ReduceTask.ReduceCopier
  – fetchOutputs (Merger.MergeQueue)
• Map x mapred.reduce.parallel.copies
  – MapOutputCopier
• On-disk merge of Map output: LocalFSMerger
• In-memory merge: InMemFSMergeThread
• GetMapEventsThread
  – Map
  – < , MapOutputLocation(taskId, host, httpUrl)>
• TaskTracker
22.
o.a.h.mapred.ReduceTask
• run(RecordReader, OutputCollector, Reporter)
• SORT
  – Memory, disk
• RawKeyValueIterator
  – Creates the Reducer
  – Creates the RecordWriter
  – Runs over ReduceValuesIterator