2. Slide 2 www.edureka.co/hadoop-admin
At the end of this webinar, we will know about:
The daily tasks a Hadoop admin does
Cluster monitoring tools
How fault tolerance is maintained in a cluster
Demo on Hadoop High Availability
Demo on YARN High Availability
Agenda
4. Slide 4
First thing in the morning: check the monitoring console (Cloudera Manager, Nagios, Ganglia, etc.) and the
JobTracker UI.
Cluster Monitoring
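A morning check like this can be partly scripted. The sketch below is a minimal example, assuming the NameNode exposes its usual JMX JSON servlet (`/jmx` on the NameNode web port, which differs between Hadoop versions) and that the `FSNamesystemState` bean carries the fields used here. The function name, the sample payload, and the 80% threshold are illustrative assumptions, not values from the webinar.

```python
import json


def check_namenode_state(jmx_payload, max_used_pct=80.0):
    """Return a list of warning strings derived from a NameNode JMX JSON payload."""
    beans = json.loads(jmx_payload).get("beans", [])
    warnings = []
    for bean in beans:
        # Only the FSNamesystemState bean is inspected in this sketch.
        if not bean.get("name", "").endswith("name=FSNamesystemState"):
            continue
        dead = bean.get("NumDeadDataNodes", 0)
        if dead > 0:
            warnings.append(f"{dead} dead DataNode(s)")
        total = bean.get("CapacityTotal", 0)
        used = bean.get("CapacityUsed", 0)
        if total > 0:
            used_pct = 100.0 * used / total
            if used_pct > max_used_pct:
                warnings.append(f"HDFS {used_pct:.1f}% full")
    return warnings


# Hand-written sample standing in for the response of
# http://<namenode-host>:<web-port>/jmx (host and port are cluster specific).
SAMPLE = json.dumps({"beans": [{
    "name": "Hadoop:service=NameNode,name=FSNamesystemState",
    "NumLiveDataNodes": 9,
    "NumDeadDataNodes": 1,
    "CapacityTotal": 1000,
    "CapacityUsed": 850,
}]})

if __name__ == "__main__":
    for warning in check_namenode_state(SAMPLE):
        print("WARNING:", warning)
```

In practice the payload would be fetched over HTTP and the warnings forwarded to whichever console (Nagios, Ganglia, Cloudera Manager) the team already watches.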
5. Slide 5
A Few Cluster Monitoring Tools
6. Slide 6
Planning the day and reviewing past tasks in a meeting
Cluster Plan
7. Slide 7
Midline configuration (all-around, deep storage, 1 Gb Ethernet)
CPU: 2 × 6-core 2.9 GHz, 15 MB cache
Memory: 64 GB DDR3-1600 ECC
Disk controller: SAS 6 Gb/s
Disks: 12 × 3 TB LFF SATA II 7200 RPM
Network controller: 2 × 1 Gb Ethernet
Notes: CPU features such as Intel’s Hyper-Threading and QPI are desirable. Allocate memory to take advantage of triple- or quad-channel memory configurations.
Typical slave node hardware configurations
Cluster Plan
8. Slide 8
High-end configuration (high memory, spindle dense, 10 Gb Ethernet)
CPU: 2 × 6-core 2.9 GHz, 15 MB cache
Memory: 96 GB DDR3-1600 ECC
Disk controller: 2 × SAS 6 Gb/s
Disks: 24 × 1 TB SFF Nearline/MDL SAS 7200 RPM
Network controller: 1 × 10 Gb Ethernet
Notes: Same as the midline configuration
Cluster Plan
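To compare the two slave node configurations, it helps to work out per-node storage. The sketch below takes the disk and CPU counts from the slides and assumes the HDFS default replication factor of 3. The 25% reserve for non-HDFS use (intermediate job output, logs, OS) is an illustrative assumption, and `NodeSpec` and `usable_tb` are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass
class NodeSpec:
    name: str
    disks: int        # number of data disks
    disk_tb: float    # capacity per disk in TB
    cores: int        # physical cores (sockets x cores per socket)
    memory_gb: int

    @property
    def raw_tb(self):
        """Raw disk capacity of the node in TB."""
        return self.disks * self.disk_tb


def usable_tb(node, replication=3, reserve_fraction=0.25):
    """Rough unique-data capacity per node after the reserve and replication."""
    return node.raw_tb * (1 - reserve_fraction) / replication


# Figures from the two slides: 2 x 6 cores in both configurations.
MIDLINE = NodeSpec("midline", disks=12, disk_tb=3.0, cores=12, memory_gb=64)
HIGH_END = NodeSpec("high end", disks=24, disk_tb=1.0, cores=12, memory_gb=96)

if __name__ == "__main__":
    for node in (MIDLINE, HIGH_END):
        print(f"{node.name}: raw {node.raw_tb:.0f} TB, "
              f"usable ~{usable_tb(node):.1f} TB, "
              f"{node.disks / node.cores:.1f} disks per core")
```

Under these assumptions the midline node holds more data per node, while the high-end node offers twice as many spindles per core, which is what "spindle dense" refers to.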
9. Slide 9
Developing and running a file merger so that the many small files and directories our data suppliers create
are consolidated into fewer, larger ones.
Execute a Few Regular Utility Tasks
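The planning step of such a file merger can be sketched as follows. This is a minimal example, not the presenters' tool: it greedily groups small files into batches of at most a target size, assumed here to be 128 MB (a common HDFS block size; your cluster's may differ). `plan_merges` and the sample paths are hypothetical, and the actual concatenation on HDFS is left out.

```python
MB = 1024 * 1024


def plan_merges(file_sizes, target_bytes=128 * MB):
    """Group (path, size) pairs into merge batches of at most target_bytes each.

    Files at or above the target are skipped, and batches holding a single
    file are dropped because merging one file changes nothing.
    """
    batches, current, current_size = [], [], 0
    # Largest first, so big files open a batch and smaller ones fill it up.
    for path, size in sorted(file_sizes, key=lambda item: item[1], reverse=True):
        if size >= target_bytes:
            continue
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return [batch for batch in batches if len(batch) > 1]


if __name__ == "__main__":
    listing = [("/in/a", 100 * MB), ("/in/b", 60 * MB), ("/in/c", 50 * MB),
               ("/in/d", 10 * MB), ("/in/e", 200 * MB)]
    for batch in plan_merges(listing):
        print("merge:", ", ".join(batch))
```

Fewer, larger files matter because the NameNode keeps metadata for every file and block in memory, so many small files inflate its heap usage regardless of how little data they hold.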
12. Slide 12
Keep the farm working: we build monitoring, manage resources between our users and our tools, and tune
configurations for the farm stack, for MapReduce and Spark jobs, and of course for the servers.
Job Scheduling And Configuration
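As one illustration of managing resources between users, the sketch below generates queue settings, assuming the cluster runs YARN with the Capacity Scheduler (the slides do not say which scheduler is used). The property names follow the Capacity Scheduler's `yarn.scheduler.capacity.root.*` scheme. The queue names, the shares, and the function name are hypothetical.

```python
def capacity_scheduler_properties(queue_shares):
    """Build Capacity Scheduler properties for top-level queues under root.

    queue_shares maps a queue name to its percentage of cluster capacity;
    the percentages of sibling queues must add up to 100.
    """
    total = sum(queue_shares.values())
    if abs(total - 100) > 1e-9:
        raise ValueError(f"queue capacities must sum to 100, got {total}")
    props = {"yarn.scheduler.capacity.root.queues": ",".join(queue_shares)}
    for queue, share in queue_shares.items():
        props[f"yarn.scheduler.capacity.root.{queue}.capacity"] = str(share)
    return props


if __name__ == "__main__":
    # Hypothetical split between batch ETL, ad hoc users, and Spark jobs.
    shares = {"etl": 50, "adhoc": 20, "spark": 30}
    for key, value in capacity_scheduler_properties(shares).items():
        print(f"{key}={value}")
```

The printed pairs would go into `capacity-scheduler.xml` as property entries. Validating the sum up front catches a common mistake before the scheduler rejects the configuration.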
13. Slide 13
Analyzing overly heavy or failed jobs and fixing problems
Analyzing Failed Tasks
15. Slide 15
Collecting and defining requirements for new hosts
Evaluating New Host Requests
16. Slide 16
Upgrading and updating the farm from time to time
Updates And Upgrades
17. Slide 17
Testing and benchmarking new projects
Try And Finalize New Solutions
18. Slide 18
Setting up a configuration management tool for our test and production environments
Be In Touch With New Configuration Tools
19. Slide 19
Developing a simple infrastructure for loading data into the cluster and into Hive and HBase
Execute a Few DWH Responsibilities
20. Slide 20
Daily support for developers who use the Hadoop stack
Assisting Hadoop Developers
24. Slide 24
NameNode startup fails
Exception when initializing the filesystem
Could only be replicated to 0 nodes instead of 1
Server not available
Could not obtain block blk_-4157273618194597760_1160 from any node
Could not get block locations. Aborting...
Common Error Messages
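When triaging logs, it can help to count how often each of these messages appears. The sketch below is one way to do that: the regular expressions are derived from the messages listed on this slide, while the short labels, the function names, and the sample lines are hypothetical choices for the example.

```python
import re
from collections import Counter

# One pattern per error message from the slide; labels are made up here.
ERROR_PATTERNS = [
    ("replication_failed",
     re.compile(r"could only be replicated to \d+ nodes", re.I)),
    ("block_unavailable",
     re.compile(r"could not obtain block \S+ from any node", re.I)),
    ("block_locations",
     re.compile(r"could not get block locations", re.I)),
    ("fs_init",
     re.compile(r"exception when initializing the filesystem", re.I)),
    ("server_unavailable",
     re.compile(r"server not available", re.I)),
]


def classify(line):
    """Return the label of the first matching pattern, or None."""
    for label, pattern in ERROR_PATTERNS:
        if pattern.search(line):
            return label
    return None


def summarize(lines):
    """Count known error categories across an iterable of log lines."""
    return Counter(label for label in map(classify, lines) if label is not None)


if __name__ == "__main__":
    sample = [
        "Could only be replicated to 0 nodes instead of 1",
        "Could not obtain block blk_-4157273618194597760_1160 from any node",
        "INFO heartbeat received",
        "could only be replicated to 0 nodes, instead of 1",
    ]
    for label, count in summarize(sample).most_common():
        print(label, count)
```

A sudden rise in one category, for example replication failures, usually points at a specific place to look first, such as DataNode availability or free disk space.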