MapReduce
INTRODUCTION:
Disk seek time is improving more slowly than transfer rate. If the data access pattern is dominated
by seeks, reading or writing large portions of the dataset takes longer than streaming through
it, which operates at the transfer rate. MapReduce is built around this streaming access pattern.
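The trade-off can be made concrete with a little arithmetic. The sketch below is only illustrative: the seek time, transfer rate, and dataset size are hypothetical round figures rather than measurements, and the function names are invented for this example.

```python
def seek_dominated_seconds(num_records, seek_ms):
    """Time to touch records one by one, paying one disk seek per record."""
    return num_records * seek_ms / 1000.0


def streaming_seconds(dataset_mb, transfer_mb_per_s):
    """Time to read the whole dataset sequentially at the transfer rate."""
    return dataset_mb / transfer_mb_per_s


def break_even_records(dataset_mb, transfer_mb_per_s, seek_ms):
    """Number of seeks that costs as much as one full streaming pass."""
    return streaming_seconds(dataset_mb, transfer_mb_per_s) * 1000.0 / seek_ms


# Assumed figures (hypothetical): 10 ms per seek, 100 MB/s transfer, 1 TB dataset.
SEEK_MS = 10.0
TRANSFER_MB_PER_S = 100.0
DATASET_MB = 1_000_000

full_pass = streaming_seconds(DATASET_MB, TRANSFER_MB_PER_S)
limit = break_even_records(DATASET_MB, TRANSFER_MB_PER_S, SEEK_MS)
print(f"Streaming the whole dataset: {full_pass:.0f} s")
print(f"Seeks that cost the same:    {limit:.0f}")
```

Under these assumptions, once more than the break-even number of records must be touched individually, a single streaming pass over the entire dataset is the cheaper option.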
MAPREDUCE:
Some key concepts of MapReduce are listed below:
Linearly scalable programming model.
Updating a small portion of records: RDBMS is the better fit.
Updating the majority of records: MapReduce is the better fit.
MapReduce is good for ad hoc analysis.
MapReduce is suitable for applications where the data is written once, read many
times, whereas RDBMS is good for datasets that are continually updated.
MapReduce works well on unstructured or semi-structured data.
[Structured Data: Has a defined format, e.g. XML documents.
Semi-structured Data: There may be a schema, but it is often ignored, so it serves only as a
guide to the structure of the data, e.g. a spreadsheet.
Unstructured Data: Does not have any particular internal structure, e.g. plain text or image
data.]
MapReduce interprets the data at processing time.
Keys and Values of MapReduce are chosen by the person analyzing the data.
High-level query languages, e.g. Pig and Hive, can be used on top of MapReduce.
If you double the size of the input data, a job will run twice as slowly. But if you also double
the size of the cluster, the job will run as fast as the original one. This is true for MapReduce,
but generally not for SQL queries.
“Data Locality” is the heart of MapReduce: the computation runs on the node where the data is
stored, so data access is local and fast.
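To illustrate how the analyst chooses the keys and values and how the data is interpreted only at processing time, here is a minimal single-process sketch of the map, group, and reduce steps, using word counting as the example. It is not the Hadoop API: `map_fn`, `reduce_fn`, and `run_job` are hypothetical names made up for this illustration, and a real cluster would run the map and reduce calls in parallel across nodes.

```python
from collections import defaultdict


def map_fn(line):
    """Interpret a raw text line at processing time; emit (word, 1) pairs."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(key, values):
    """Combine all values that share a key into one result."""
    return key, sum(values)


def run_job(lines, mapper, reducer):
    """Run map, group by key (the shuffle step), then reduce, all in one process."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    # Keys are sorted before the reduce step, as in the usual MapReduce flow.
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


lines = ["big data needs big clusters", "data locality matters"]
counts = run_job(lines, map_fn, reduce_fn)
print(counts)
```

Here the key is the word and the value is a count, but that is purely the analyst's choice: swapping in a different `map_fn` and `reduce_fn` reinterprets the same unstructured input without changing how it is stored.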

CONCLUSION:
Here, we have tried to identify some characteristics of MapReduce that make it a far better
option for handling Big Data. The document is still somewhat abstract; further analysis
will clarify the concepts.
