Technical Seminar
on
HADOOP TECHNOLOGY
Under the Guidance of
P.V.R.K.MURTHY, M.Tech
Assistant Professor
CONTENTS
What is Hadoop Technology?
Why Hadoop?
Developers of Hadoop Technology
Famous Hadoop users
Hadoop Features
Hadoop Architecture
Core Components of Hadoop
Hadoop High-Level Architecture
Hadoop cluster
What is HDFS?
HDFS – Name Node features
HDFS – Name Node architecture
HDFS – Data Node
Hadoop MapReduce
Benefits of Hadoop
Conclusion
References
HADOOP TECHNOLOGY
What is Hadoop Technology?
•Hadoop is the best-known technology used for Big Data.
•It is a large-scale batch data processing system.
Why Hadoop?
•Distributed cluster system
•Platform for massively scalable applications
•Enables parallel data processing
Developers of Hadoop Technology:
Michael J. Cafarella
Doug Cutting
Famous Hadoop users
Hadoop Features
•Hadoop Common provides access to the file systems supported by Hadoop
•The Hadoop Common package contains the JAR files and scripts needed to start Hadoop
•The package also provides source code, documentation, and a contribution section that includes projects from the Hadoop community
HADOOP ARCHITECTURE
Core Components of Hadoop:
Hadoop Distributed File System (HDFS)
MapReduce
What is HDFS?
•Distributed file system
•Traditional hierarchical file organization
•Single namespace for the entire cluster
•Write-once-read-many access model
•Aware of the network topology
Hadoop High-Level Architecture
Hadoop cluster
•A small Hadoop cluster includes a single master and multiple worker nodes
Master node:
Data Node
Job Tracker
Task Tracker
Name Node
Slave node:
Data Node
Task Tracker
HDFS – Name Node Features
Metadata in main memory:
•List of files
•List of blocks for each file
•List of Data Nodes for each block
•File attributes
•Creation time
Transaction log:
•Records every change in the metadata
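The metadata listed on this slide can be pictured as a few in-memory maps plus an append-only log. The sketch below is only an illustration under that assumption: `ToyNameNode`, `FileEntry`, and the block and node names are hypothetical and are not part of the Hadoop API.

```python
from dataclasses import dataclass, field
import time


@dataclass
class FileEntry:
    """Per-file metadata (hypothetical structure, for illustration only)."""
    blocks: list = field(default_factory=list)          # ordered block IDs
    created: float = field(default_factory=time.time)   # creation-time attribute


class ToyNameNode:
    """Keeps all metadata in main memory and logs every change."""

    def __init__(self):
        self.files = {}            # path -> FileEntry (list of files)
        self.block_locations = {}  # block ID -> set of data node names
        self.edit_log = []         # one record per metadata change

    def create_file(self, path):
        self.files[path] = FileEntry()
        self.edit_log.append(("create", path))

    def add_block(self, path, block_id, data_nodes):
        self.files[path].blocks.append(block_id)
        self.block_locations[block_id] = set(data_nodes)
        self.edit_log.append(("add_block", path, block_id))

    def locate(self, path):
        """Return (block ID, sorted data node names) pairs for a file."""
        return [(b, sorted(self.block_locations[b]))
                for b in self.files[path].blocks]


nn = ToyNameNode()
nn.create_file("/logs/a.txt")
nn.add_block("/logs/a.txt", "blk_1", ["dn1", "dn2", "dn3"])
nn.add_block("/logs/a.txt", "blk_2", ["dn2", "dn3", "dn4"])
print(nn.locate("/logs/a.txt"))
```

A client that wants to read a file would first ask for this block-to-node mapping and then fetch the blocks from the data nodes themselves.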
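HDFS – Name Node architecture
[Diagram: the primary name node and the secondary name node, each with RAM and an HDD]
Steps performed by the secondary name node:
1. Pull the transaction log from the primary name node
2. Merge the changes
3. Store the result to HDD
4. Push the result back to the primary name node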
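The four numbered steps in the diagram (pull the transaction log, merge changes, store to HDD, push) can be sketched as a small function. This is a toy model, not Hadoop code: the dictionary layout, the operation names `create`, `add_block`, and `delete`, and the function names are assumptions made for illustration.

```python
def apply_edit(image, edit):
    """Apply one logged change to a namespace image (dict: path -> block list)."""
    op, path, *rest = edit
    if op == "create":
        image[path] = []
    elif op == "add_block":
        image[path].append(rest[0])
    elif op == "delete":
        image.pop(path, None)
    return image


def checkpoint(primary):
    """Toy version of the four-step cycle run by the secondary name node."""
    # 1. Pull the transaction log (and the last saved image) from the primary.
    image = {path: list(blocks) for path, blocks in primary["image"].items()}
    edits = list(primary["edit_log"])
    # 2. Merge the logged changes into the image.
    for edit in edits:
        apply_edit(image, edit)
    # 3. Store the merged image on the secondary's disk (a dict stands in here).
    secondary_disk = {"image": image}
    # 4. Push the new image back; the primary can then drop the merged edits.
    primary["image"] = {p: list(b) for p, b in secondary_disk["image"].items()}
    primary["edit_log"] = primary["edit_log"][len(edits):]
    return secondary_disk


primary = {
    "image": {"/a": ["blk_1"]},
    "edit_log": [("create", "/b"), ("add_block", "/b", "blk_2"), ("delete", "/a")],
}
saved = checkpoint(primary)
print(primary["image"], primary["edit_log"])
```

The point of the cycle is that the log on the primary stays short, so a restart does not have to replay a long history of changes.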
HDFS – Data Node
•Block server: stores data in the local file system
•Periodic validation of checksums
•Periodically sends a report of all existing blocks to the Name Node
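A minimal sketch of the two periodic duties named above, checksum validation and the block report. `ToyDataNode` is a hypothetical class, and CRC32 from Python's `zlib` is used only as a stand-in checksum; the real on-disk format is not modeled.

```python
import zlib


class ToyDataNode:
    """Stores blocks with a checksum recorded at write time."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block ID -> (data bytes, CRC32 stored at write time)

    def store(self, block_id, data):
        self.blocks[block_id] = (data, zlib.crc32(data))

    def verify(self):
        """Return IDs of blocks whose data no longer matches the stored checksum."""
        return [block_id for block_id, (data, crc) in self.blocks.items()
                if zlib.crc32(data) != crc]

    def block_report(self):
        """The list of block IDs this node would send to the name node."""
        return sorted(self.blocks)


dn = ToyDataNode("dn1")
dn.store("blk_1", b"hello")
dn.store("blk_2", b"world")

# Simulate on-disk corruption of blk_2 while keeping its old checksum.
_, old_crc = dn.blocks["blk_2"]
dn.blocks["blk_2"] = (b"w0rld", old_crc)

print(dn.verify())        # blocks that failed validation
print(dn.block_report())  # blocks reported to the name node
```

A block that fails validation can be re-copied from another data node that holds a healthy replica.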
Hadoop MapReduce
Job Tracker:
Splits a job into map and reduce tasks
Schedules tasks on cluster nodes
Task Tracker:
Runs map and reduce tasks and periodically reports status to the Job Tracker
MapReduce implementation:
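The map and reduce phases can be illustrated with the classic word-count job. The sketch below runs in a single Python process and does not use the Hadoop API; the function names `map_phase`, `shuffle`, and `reduce_phase` are hypothetical, and in a real cluster the framework would run these steps as separate tasks on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["Hadoop stores data", "Hadoop processes data"])
print(counts)
```

Because each line is mapped independently and each word is reduced independently, both phases can be split into many tasks that run in parallel.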
Benefits of Hadoop
•Cost-saving, efficient, and reliable data processing
•Provides an economically scalable solution
•Stores and processes large amounts of data
•Data grid operating system
•Deployed on industry-standard servers rather than expensive specialized data storage systems
•Parallel processing of huge amounts of data across inexpensive, industry-standard servers
CONCLUSION
Why commodity hardware?
It is cheaper
Hadoop is designed to tolerate faults
Why HDFS?
Network bandwidth vs. seek latency
Why the MapReduce programming model?
Parallel programming
Large data sets
Moving computation to data
Single compute + data cluster
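"Moving computation to data" means that a task is preferably run on a node that already stores the block it needs, so the block does not have to cross the network. The sketch below shows that preference in its simplest form; `pick_node` and the block and node names are hypothetical and do not reflect the actual scheduler.

```python
def pick_node(block_id, block_locations, free_nodes):
    """Prefer a free node that already stores the block; else take any free node.

    Returns (node name, True if the task will read the block locally).
    """
    holders = block_locations.get(block_id, set())
    local = [node for node in free_nodes if node in holders]
    if local:
        return local[0], True
    return free_nodes[0], False


locations = {"blk_1": {"dn1", "dn2"}, "blk_2": {"dn3"}}
free = ["dn2", "dn4"]

choice_1 = pick_node("blk_1", locations, free)  # dn2 holds blk_1: local read
choice_2 = pick_node("blk_2", locations, free)  # no free holder: remote read
print(choice_1, choice_2)
```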
REFERENCES
•Apache Hadoop (http://hadoop.apache.org)
•Hadoop on Wikipedia (http://en.wikipedia.org/wiki/Hadoop)
•Cloudera - Apache Hadoop for the Enterprise (http://www.cloudera.com)