Big Data and Hadoop
Big Data Hadoop Training
What is Hadoop?
Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a
distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation.
Hadoop makes it possible to run applications on systems with thousands of nodes involving thousands of terabytes
of storage capacity. Its distributed file system facilitates rapid data transfer rates among nodes and allows the
system to continue operating uninterrupted in case of a node failure. This approach lowers the risk of catastrophic
system failure, even if a significant number of nodes become inoperative.
Why Hadoop?
Large Volumes of Data:
Ability to store and process huge amounts of data of any variety (structured, unstructured and semi-structured), quickly.
With data volumes and varieties constantly increasing, especially from social media and the Internet of Things
(IoT), that’s a key consideration.
Fault Tolerance:
Data and application processing are protected against hardware failure. If a node goes down, jobs are automatically
redirected to other nodes to make sure the distributed computing does not fail. Multiple copies of all data are stored
automatically.
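The replication idea above can be sketched in a few lines of Python. This is a toy model, not HDFS code: the names `place_replicas` and `readable_blocks` are hypothetical, the placement is random rather than rack-aware, and a replication factor of 3 is assumed because it is the usual HDFS default.

```python
import random

def place_replicas(blocks, nodes, replication=3, seed=0):
    """Assign each block to `replication` distinct nodes (toy placement, not HDFS's real policy)."""
    rng = random.Random(seed)
    return {block: set(rng.sample(nodes, replication)) for block in blocks}

def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    failed = set(failed_nodes)
    return {block for block, holders in placement.items() if holders - failed}

nodes = [f"node{i}" for i in range(6)]
blocks = [f"blk_{i}" for i in range(10)]
placement = place_replicas(blocks, nodes)

# With three replicas per block, losing any two nodes leaves every block readable.
survivors = readable_blocks(placement, failed_nodes=["node0", "node1"])
print(len(survivors) == len(blocks))  # True
```

Because every block lives on three distinct nodes, two simultaneous failures can never remove all copies of a block, which is why the check prints True regardless of the random placement.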
Flexibility:
Unlike a traditional relational database, you don’t have to process data before storing it. You can store as much
data as you want and decide how to use it later. That includes unstructured data such as text, images and videos.
Low Cost:
The open-source framework is free and uses commodity hardware to store large quantities of data.
Scalability:
You can easily grow your system to handle more data simply by adding nodes. Little administration is required.
Copyright @ 2018 Learntek. All Rights Reserved.
The following topics will be covered in our Big Data and Hadoop Online Training
Big Data Hadoop Training Topics :
Hadoop Introduction:
Big Data Hadoop Training : Introduction to Data and System
Types of Data
Traditional way of dealing with large data and its problems
Types of Systems & Scaling
What is Big Data
Challenges in Big Data
Challenges in Traditional Application
New Requirements
What is Hadoop? Why Hadoop?
Brief history of Hadoop
Features of Hadoop
Hadoop and RDBMS
Hadoop ecosystem overview
Hadoop Installation :
Installation in detail
Creating Ubuntu image in VMware
Downloading Hadoop
Installing SSH
Configuring Hadoop, HDFS & MapReduce
Download, Installation & Configuration of Hive
Download, Installation & Configuration of Pig
Download, Installation & Configuration of Sqoop
Configuring Hadoop in Different Modes
Hadoop Distributed File System (HDFS) :
File System – Concepts
Blocks
Replication Factor
Version File
Safe mode
Namespace IDs
Purpose of Name Node
Purpose of Data Node
Purpose of Secondary Name Node
Purpose of Job Tracker
Purpose of Task Tracker
HDFS Shell Commands – copy, delete, create directories etc.
Reading and Writing in HDFS
Differences between Unix commands and HDFS commands
Read / Write in HDFS – Internal Process between Client, Name Node & Data Nodes
Accessing HDFS using Java API
Various Ways of Accessing HDFS
Understanding HDFS Java classes and methods
Admin: Commissioning / Decommissioning Data Node
Balancer
Replication Policy
Network Distance / Topology Script
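As a small worked example for the Blocks and Replication Factor topics, the sketch below estimates how many blocks a file occupies and how much raw cluster storage it consumes. The function name `hdfs_footprint` is hypothetical, and the defaults (128 MB blocks, replication factor 3) are assumptions based on common Hadoop 2.x settings; a real cluster may be configured differently.

```python
import math

def hdfs_footprint(file_size_mb, block_size_mb=128, replication=3):
    """Return (number of blocks, total raw storage in MB) for one file.

    The last block only occupies the bytes it actually holds, so raw
    storage is file size times replication, not block count times block size.
    """
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    raw_storage_mb = file_size_mb * replication
    return num_blocks, raw_storage_mb

print(hdfs_footprint(1000))  # (8, 3000)
```

A 1000 MB file is split into seven full 128 MB blocks plus one partial block, and each block is stored three times, for 3000 MB of raw storage.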
Map Reduce Programming :
About MapReduce
Understanding block and input splits
MapReduce Data types
Understanding Writable
Data Flow in MapReduce Application
Understanding MapReduce problem on datasets
MapReduce and Functional Programming
Writing MapReduce Application
Understanding Mapper function
Understanding Reducer Function
Understanding Driver
Usage of Combiner
Understanding Partitioner
Usage of Distributed Cache
Passing the parameters to mapper and reducer
Analyzing the Results
Log files
Input Formats and Output Formats
Counters, Skipping Bad and unwanted Records
Writing joins in MapReduce with 2 input files. Join types.
Execute MapReduce Job – Insights.
Exercises on MapReduce.
Job Scheduling: Type of Schedulers.
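To make the mapper, partitioner and reducer roles listed above concrete, here is a minimal single-process sketch of the MapReduce data flow in plain Python. It does not use the Hadoop API: `mapper`, `partition`, `reducer` and `run_job` are hypothetical names, and the byte-sum partitioner is only a stand-in for a hash partitioner.

```python
from collections import defaultdict

def mapper(line):
    """Emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def partition(key, num_reducers):
    """Pick a reducer index for a key (stable toy stand-in for hash partitioning)."""
    return sum(key.encode("utf-8")) % num_reducers

def reducer(key, values):
    """Sum all counts collected for one key."""
    return key, sum(values)

def run_job(lines, num_reducers=2):
    # Shuffle: route each mapper output to a reducer bucket, grouped by key.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:
        for key, value in mapper(line):
            buckets[partition(key, num_reducers)][key].append(value)
    # Reduce: each bucket is processed independently, keys in sorted order.
    results = {}
    for bucket in buckets:
        for key in sorted(bucket):
            word, total = reducer(key, bucket[key])
            results[word] = total
    return results

counts = run_job(["big data and hadoop", "hadoop and hive", "big data"])
print(counts["hadoop"], counts["big"], counts["hive"])  # 2 2 1
```

In a real job the buckets would live on different machines and the framework would handle the shuffle; the sketch only shows that all values for one key always reach the same reducer.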
Hive
Hive concepts
Schema on Read VS Schema on Write
Hive architecture
Install and configure hive on cluster
Meta Store – Purpose & Type of Configurations
Different type of tables in Hive
Buckets
Partitions
Joins in hive
Hive Query Language
Hive Data Types
Data Loading into Hive Tables
Hive Query Execution
Hive library functions
Hive UDF
Hive Limitations
Pig
Pig basics
Install and configure PIG on a cluster
PIG Library functions
Pig Vs Hive
Write sample Pig Latin scripts
Modes of running PIG
Running in Grunt shell
Running as Java program
PIG UDFs
HBase :
HBase concepts
HBase architecture
Region server architecture
File storage architecture
HBase basics
Column access
Scans
HBase use cases
Install and configure HBase on a multi node cluster
Create database, Develop and run sample applications
Access data stored in HBase using Java API
Sqoop :
Install and configure Sqoop on cluster
Connecting to RDBMS
Installing MySQL
Import data from MySQL to hive
Export data to MySQL
Internal mechanism of import/export
Oozie :
Introduction to OOZIE
Oozie architecture
XML file specifications
Specifying Workflow
Control nodes
Oozie job coordinator
Flume
Introduction to Flume
Configuration and Setup
Flume Sink with example
Channel
Flume Source with example
Complex flume architecture
Zookeeper :
Introduction to Zookeeper
Challenges in distributed Applications
Coordination
ZooKeeper : Design Goals
Data Model and Hierarchical namespace
Client APIs
YARN
Hadoop 1.0 Limitations
MapReduce Limitations
History of Hadoop 2.0
HDFS 2: Architecture
HDFS 2: Quorum based storage
HDFS 2: High availability
HDFS 2: Federation
YARN Architecture
Classic vs YARN
YARN Apps
YARN multitenancy
YARN Capacity Scheduler
Prerequisites :
Knowledge of any programming language, database concepts and the Linux operating system. Core Java or Python
knowledge is helpful.
