• Introduction to Distributed Programming
› Background of Hadoop
› What is Hadoop?
› How does Hadoop work?
• Installing Hadoop
› Setting up SSH
› Setting up Environment Variables
› Running Hadoop
› Web-Based Cluster
• Components of Hadoop
› Working with the Hadoop File System
› Understanding Hadoop MapReduce
› Reading and Writing
• Writing a Basic MapReduce Program
› Getting the Patent Data Set
› Constructing a Basic MapReduce Program
› Working with Hadoop Streaming
› Improving Performance with Combiners
• Advanced MapReduce
› Summarization Patterns
› Filtering Patterns
› Data Organization Patterns
› Join Patterns
› Meta Patterns
› Input and Output Patterns
• Programming Practices
› Developing MapReduce Programs
› Monitoring and Debugging on a Cluster
› Tuning for Performance
• Hadoop Cookbook
› Passing Job-Specific Parameters to Your Tasks
› Probing for Task-Specific Parameters
› Partitioning into Multiple Output Files
› Reading Input from and Writing Output to a Database
› Keeping Output in Sorted Order
• Managing Hadoop
› Checking System Health
› Setting Permissions
› Managing Quotas, Enabling Trash, Adding/Deleting Nodes, Recovering from a Failed NameNode
• Running Hadoop in the Cloud
› Introducing Amazon Web Services
› Setting up AWS and Setting up Cloud on EC2
› Running MapReduce Programs on EC2
› Cleaning up and Shutting down Your EC2 Instances
› Amazon Elastic MapReduce and Other AWS Services
• Programming with Pig
› Thinking Like a Pig
› Installing Pig
› Running Pig
› Learning Pig Latin through Grunt
› Pig Latin Syntax
› Working with UDFs
› Working with Scripts
• Getting Started with Hive
• Data Types and File Formats
• HiveQL – Data Definition
• HiveQL – Data Manipulation
• HiveQL – Queries, Views and Indexes
• Schema Design, Tuning & Record Formats
• Hive Integration with Oozie
• Hive and Amazon Web Services
• NoSQL Databases
› Why NoSQL?
› Aggregate Data Models
› Distribution Models
› Consistency
• NoSQL DBs
› Key-Value Databases
› Document Databases
› Column-Family Stores
› Graph Databases
• MongoDB
› Introduction
› MongoDB through JavaScript Shell
› Writing Programs using MongoDB
› Document-Oriented Data
› Queries and Aggregation
› Updates, Atomic Operations and Deletes
› Indexing, Replication and Sharding
• Mahout – Machine Learning
› Introduction
› Recommenders
    • Representing Recommender Data
    • Making Recommendations
› Clustering
    • Clustering Algorithms in Mahout
› Classification
    • Training a Classifier
    • Evaluating and Tuning a Classifier
• Moving Data in and out of Hadoop
› Flume
› Oozie
› Sqoop
› HBase
• Data Serialization Formats
› XML, JSON
› SequenceFiles, Protocol Buffers, Thrift and Avro
• Utilizing Data Structures and Algorithms
› Modelling Data & Solving Problems with Graphs
› Parallelized Bloom Filter Creation in MapReduce
• Programming Pipelines with Pig
› Using Pig to Find Malicious Actors in Log Data
› Optimizing User Workflow with Pig
• Crunch
• Cascading
• Puppet
• Unit Testing MapReduce
• Heavyweight Job Testing Using LocalJobRunner
• Debugging User-Space Problems

Hadoop course curriculum
