3. Introduction
Hadoop is Apache’s open-source software framework for
distributed storage and distributed processing of large datasets
across large clusters of computers built on commodity
hardware.
18. Hadoop Computing Model
• Two main phases: Map and Reduce
• Every job is broken into map and reduce tasks
• Developers only need to implement the Map and Reduce classes
Data Flow
Blocks of the input file in HDFS → Map tasks (one per block) → Shuffling and Sorting → Reduce tasks → Output written to HDFS
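The data flow above can be sketched in plain Java. This is a simplified in-memory illustration of the computing model, not real Hadoop code: the class name, the word-count example, and the use of a `TreeMap` to stand in for the framework's shuffle-and-sort step are all assumptions for illustration. In real Hadoop, developers would extend the `Mapper` and `Reducer` classes and the framework would handle grouping across the cluster.

```java
import java.util.*;

// Simplified in-memory sketch of the MapReduce model (no Hadoop dependency):
// map emits (word, 1) pairs, the "shuffle and sort" step groups pairs by key,
// and reduce sums the counts per key. Class and method names are illustrative.
public class WordCountSketch {

    // Map phase: emit one (word, 1) pair per word in the input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum all values that share the same key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        // Each string stands in for one HDFS block; one map task per block.
        List<String> blocks = List.of("the quick brown fox", "the lazy dog the fox");

        // Shuffle and sort: group intermediate pairs by key, sorted by key.
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (String block : blocks) {
            for (Map.Entry<String, Integer> pair : map(block)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }

        // Reduce each key group and print the final counts (stand-in for HDFS output).
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            System.out.println(e.getKey() + "\t" + reduce(e.getKey(), e.getValue()));
        }
    }
}
```

Running it prints each word with its total count; "the" appears three times across the two blocks, so its reduced value is 3.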