Geek Trainings, founded by a team of trainers and HR specialists, is a pioneer in technology training in India. It has a proven track record of delivering corporate, classroom and online training through skilled professional trainers who work across the ever-expanding field of Information Technology (IT).
Apache Hadoop and Amazon EMR Course Details
APACHE HADOOP AND AMAZON EMR COURSE
Introduction to Hadoop
• Distributed computing and cloud computing
• Big data basics and the need for parallel processing
• How Hadoop works
• Introduction to HDFS and MapReduce
Hadoop Architecture Details
• NameNode
• DataNode
• Secondary NameNode
• JobTracker
• TaskTracker
HDFS (Hadoop Distributed File System)
• Hadoop Distributed File System: background and the Google File System (GFS)
• Data Replication
• Data Storage
• Data Retrieval
• Additional HDFS commands
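
As a rough illustration of the data replication and storage topics above, the Python sketch below estimates how many blocks a file occupies in HDFS and how much raw disk space its replicas consume. The 128 MB block size and replication factor of 3 are assumed defaults used only for illustration (older Hadoop releases used 64 MB blocks), and the function name hdfs_footprint is hypothetical.

```python
import math

def hdfs_footprint(file_size_bytes, block_size_bytes=128 * 1024 * 1024, replication=3):
    """Return (number of blocks, raw bytes stored across all replicas).

    Assumed defaults: 128 MB blocks and 3 replicas per block.
    """
    if file_size_bytes == 0:
        return 0, 0
    # A file is cut into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_bytes / block_size_bytes)
    # HDFS does not pad the final block, so raw usage is size times replicas.
    raw_bytes = file_size_bytes * replication
    return blocks, raw_bytes

if __name__ == "__main__":
    blocks, raw = hdfs_footprint(1024 ** 3)  # a 1 GiB file
    print(blocks, raw)
```

Under these assumptions, a 1 GiB file is split into 8 blocks and occupies 3 GiB of raw storage across the cluster.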
MapReduce Programming
• MapReduce background
• Writing MapReduce Programs
• Writable and WritableComparable
• Input Format, Output Format
• Input Split and Block size
• Combiner
• Partitioner
• Number of Mappers and Reducers
• Counters
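
To make the partitioner topic above more concrete, here is a minimal Python sketch of hash partitioning: each key is hashed and taken modulo the number of reducers, so every occurrence of a key reaches the same reducer. This is a simulation rather than Hadoop code. The names partition_for and assign are hypothetical, and zlib.crc32 is an assumed stand-in for the Java hashCode() that Hadoop's default hash partitioner relies on.

```python
import zlib

def partition_for(key, num_reducers):
    """Pick a reducer index for a key by hashing it.

    zlib.crc32 is a deterministic stand-in for Java's key.hashCode().
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def assign(keys, num_reducers):
    """Group keys into one bucket per reducer."""
    buckets = {i: [] for i in range(num_reducers)}
    for key in keys:
        buckets[partition_for(key, num_reducers)].append(key)
    return buckets

if __name__ == "__main__":
    print(assign(["hadoop", "hive", "pig", "hadoop"], 2))
```

Because the reducer index depends only on the key, duplicate keys always land in the same bucket, which is what allows a reducer to see all values for a key.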
MapReduce Algorithms and Exercises
• Word Count
• Distributed grep
• Sorting Data
• Log File Analysis
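
The Word Count exercise listed above can be sketched in plain Python to show the three phases of a MapReduce job: map, shuffle/sort, and reduce. This is an in-process simulation under simplifying assumptions (whitespace tokenization, lowercase folding), not a Hadoop program, and all function names are hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group values by key and sort by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs))

if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

The same map and reduce shapes carry over to the grep and log analysis exercises: only the key the mapper emits and the aggregation in the reducer change.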
Hadoop Streaming
• How Streaming works
• Writing MapReduce programs in other languages
• Amazon EMR-based exercise session
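
As an illustration of writing MapReduce programs in another language, the sketch below shows a word-count mapper and reducer in the style Hadoop Streaming expects: each reads text lines and emits tab-separated key/value lines, and the reducer assumes its input arrives sorted by key. The single-file layout and the "map"/"reduce" command-line argument are assumptions made for this example.

```python
import sys
from itertools import groupby

def mapper(lines):
    """Emit 'word<TAB>1' for every word on every input line."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(sorted_lines):
    """Sum counts per word; input lines must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines if line.strip())
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

if __name__ == "__main__":
    # The role is chosen by a command-line argument (an assumed convention).
    role = sys.argv[1] if len(sys.argv) > 1 else ""
    if role == "map":
        for out in mapper(sys.stdin):
            print(out)
    elif role == "reduce":
        for out in reducer(sys.stdin):
            print(out)
```

Saved under a hypothetical name such as wc.py, the script could be passed to the Hadoop Streaming jar through its -mapper and -reducer options (for example "python wc.py map" and "python wc.py reduce"), together with -input and -output paths. The jar location varies by installation.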
Introduction to Amazon Elastic MapReduce (AWS EMR)
• Hadoop using Amazon Web Services
• AWS EMR and EC2
• AWS EMR architecture
• Multiple cluster deployment using AWS S3
Hadoop Ecosystem and Other Related Projects
• Hive Introduction
• Hive Installation
• Hive Exercises and Samples
• Pig Installation
• Pig Scripts execution
• HBase Installation
• HBase Exercises
• Sqoop Installation
• Using Sqoop for RDBMS-to-HDFS data flow
Hadoop Deployment
• Basic Hadoop deployment techniques
• Directory Layout and component details
• Networking challenges in Hadoop Deployment
• Disaster Recovery (DR) in Hadoop
Hadoop Cluster Configuration and Monitoring
• Master/Slave Configuration
• Important Directories
• Small, Medium and Large Cluster considerations
• Hadoop Monitoring: Ganglia, Nagios
Hadoop Business Case
• Why Hadoop is NOT a silver bullet for all your problems
• When to use Hadoop: business cases
• When NOT to use Hadoop: business cases
Hadoop and Cloud Computing
• Using Cloud technologies for distributed processing
• Hadoop on Amazon Web Services
• Hadoop in Oracle Cloud / Rackspace
===========================================
HADOOP AND AWS EXERCISES:
• Hadoop virtual machine setup.
• Configuring Hadoop on a single-node cluster.
• Loading and unloading data in HDFS.
• MapReduce programs: WordCount, Grep, Sort, etc.
• Amazon EMR programs using Hadoop Streaming.
• Process and metrics analysis for Hadoop output.
• Apache Pig installation and script execution.
• Hadoop and Flume examples.
• HiveQL commands and scripts.
• HBase installation and samples.
• Many more examples, exercises and assignments.
=====================================================