Hadoop Course Contents
(Includes theoretical as well as practical sessions)


Table of Contents

    1. Basics of Parallel Programming (4 hours)
           a. Multi-Threading
           b. OpenMP (Open Multi-Processing) and MPI (Message Passing Interface)
           c. Performance tuning and optimization
                     i. Matrix Multiplication
                    ii. Unique word count problem
    2. Distributed computing concepts (2 hours)

    3. Hadoop Overview (6 hours)

            a.   Why Hadoop?
            b.   Brief history of Hadoop
            c.   Architecture of Hadoop
            d.   Overview of HDFS (Hadoop Distributed File System) and the MR (MapReduce) framework
            e.   Overview of problems solved by Hadoop
                       i. Data Mining
                      ii. Web Mining
                     iii. Natural Language Processing
                     iv. K-means clustering
                      v. Sentiment Analysis
    4.   MapReduce Programming Model (8 hours)
             a. Details of execution of the MapReduce framework
             b. Word count problem solved using the MapReduce programming model
             c. Data Mining on Wikipedia data set
    5.   Hadoop ecosystem (2 hours)
    6.   Hadoop Programming Languages (4 hours)
             a. Pig
             b. Hadoop Pipes (C++)
             c. Hadoop Streaming
             d. Hadoop and R
    7.   Distributed database concepts (4 hours)
             a. RDBMS vs. NoSQL DB
             b. Overview of HBase and Cassandra
    8.   Advanced MapReduce Programming (chaining Mappers and Reducers)
    9. Case Studies
       A. Data Mining on Wikipedia data set using:
          a. Batch mode processing (MR)
          b. Hive
          c. HBase and Hive
       B. Web Mining using Apache Nutch, Apache Solr and Hadoop
       C. Web Log processing using Flume and Hadoop
       D. Complex Event processing using Flume, Hadoop and EPL (Event Processing Language)
       E. Integrating Hadoop and RDBMS

Prerequisites:

    (1) Hands-on Core Java / C++ / R / Python programming
    (2) Hands-on parallel/multithreaded programming
    (3) Query Language (SQL or EPL) (Optional)
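
As an illustration of item 4(b) in the outline, the word count problem can be sketched in the MapReduce style. This is a minimal single-process sketch in plain Python, not the course's own material and not Hadoop's API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and the grouping step that a real framework performs across machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key (simulates the framework's shuffle step)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reducer: sum all counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["Hadoop stores data", "hadoop processes data"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

The mapper and the reducer each see only one line or one key at a time, so in a real cluster many copies of each could run in parallel on different nodes.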
