O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.

“WordCount”, a “Hello World” for Getting Started on Hadoop

21.181 visualizações

Publicada em

“WordCount”, a “Hello World” for MapReduce

Definition: count how often each word appears within a collection
of text documents.

A simple program which illustrates a pretty good test case for what
MapReduce can perform, since it incorporates:

• minimal amount code
• document feature extraction (where words are “terms”)
• symbolic and numeric values
• potential use of a combiner
• bipartite graph of (doc, term) tuples
• not so many steps away from useful indexing…
When a framework can run “WordCount” in parallel at scale, then it
can handle much larger, more interesting compute problems as well.

Publicada em: Tecnologia
  • Seja o primeiro a comentar