Hadoop-Automation-Tool_RamkishorTak
ABOUT THE COMPANY:
• LINUX WORLD: TRAINING AND DEVELOPMENT CENTRE.
• An ISO 9001:2008 certified organization working dedicatedly on Linux
& open-source technologies across Rajasthan.
• The centre with the highest number of students scoring 100% in the
RHCSA & RHCE global exams.
CONTENTS:
• 1. INTRODUCTION TO BIG DATA.
• 2. SOLUTIONS TO THE BIG DATA PROBLEM.
• 3. HADOOP.
• 4. HDFS (DISTRIBUTED STORAGE).
• 5. MAPREDUCE (DISTRIBUTED COMPUTING).
• 6. HADOOP AUTOMATION TOOL FOR LINUX (TUI).
• 7. HADOOP AUTOMATION TOOL FOR iOS (TUI).
• 8. HADOOP AUTOMATION TOOL FOR LINUX (GUI).
• 9. ON-DEMAND CLUSTER.
• 10. OPENSTACK SAVANNA (HADOOP WITH CLOUD).
• 11. INNOVATION.
INTRODUCTION TO HADOOP:
• Hadoop is a free, Java-based framework that supports the processing
of large data sets in a distributed computing environment. It is an
open-source Apache project sponsored by the Apache Software
Foundation.
• Open-source software. Open-source software is created and
maintained by a network of developers from around the globe. It's
free to download, use and contribute to, though more and more
commercial versions of Hadoop are becoming available.
• Framework. In this case, it means that everything you need to
develop and run software applications is provided – programs,
connections, etc.
• Massive storage. The Hadoop framework breaks big data into blocks,
which are stored on clusters of commodity hardware.
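As a concrete illustration, a file copied into HDFS is split into blocks and each block is replicated across datanodes; the blocks and their locations can then be inspected with fsck. A minimal sketch, assuming a running cluster with the hadoop client on the PATH (the file and directory names are placeholders):

```python
# Sketch: copy a local file into HDFS and inspect how it was split into
# replicated blocks. Assumes the hadoop CLI is installed and the cluster is
# running; /data/sample.log and /user/demo are placeholder paths.
import subprocess

# Put the local file into HDFS; blocks are created and replicated here.
subprocess.check_call(["hadoop", "fs", "-put", "/data/sample.log", "/user/demo/"])

# Report the file's blocks, replication factor and datanode locations.
subprocess.check_call(["hadoop", "fsck", "/user/demo/sample.log",
                       "-files", "-blocks", "-locations"])
```

The fsck report lists each block of the file, its replication factor and the datanodes holding a copy.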
BENEFITS OF HADOOP:
• Computing power. Its distributed computing model quickly processes
big data. The more computing nodes you use, the more processing
power you have.
• Flexibility. Unlike traditional relational databases, you don’t have to
preprocess data before storing it. You can store as much data as you
want and decide how to use it later. That includes unstructured data
like text, images and videos.
• Fault tolerance. Data and application processing are protected
against hardware failure. If a node goes down, jobs are automatically
redirected to other nodes to make sure the distributed computing
does not fail. And it automatically stores multiple copies of all data.
• Low cost. The open-source framework is free and uses commodity
hardware to store large quantities of data.
ARCHITECTURE OF DISTRIBUTED STORAGE:
ARCHITECTURE OF DISTRIBUTED COMPUTING:
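The distributed-computing layer in the architecture above is MapReduce: mappers run in parallel on the nodes holding the data blocks, and reducers aggregate the sorted intermediate results. A minimal word-count sketch for Hadoop Streaming, assuming Python is installed on every node (the streaming-jar path in the comment is only an assumption and differs between distributions):

```python
#!/usr/bin/env python
# wordcount.py: acts as the mapper or the reducer for Hadoop Streaming,
# depending on the first command-line argument ("map" or "reduce").
#
# Example submission (the jar path is an assumption; adjust for your install):
#   hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-*.jar \
#       -input /user/demo/sample.log -output /user/demo/wc-out \
#       -mapper "python wordcount.py map" \
#       -reducer "python wordcount.py reduce" \
#       -file wordcount.py
import sys

def mapper():
    # Emit "word<TAB>1" for every word read from stdin.
    for line in sys.stdin:
        for word in line.split():
            sys.stdout.write("%s\t1\n" % word)

def reducer():
    # Streaming sorts by key, so counts for the same word arrive together.
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t", 1)
        if word != current:
            if current is not None:
                sys.stdout.write("%s\t%d\n" % (current, total))
            current, total = word, 0
        total += int(count)
    if current is not None:
        sys.stdout.write("%s\t%d\n" % (current, total))

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "reduce":
        reducer()
    else:
        mapper()
```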
ABOUT THE PROJECT:
• Implements high-performance distributed computing for big data using the Hadoop
framework, running applications on a large cluster.
• The project deals with distributed storage and distributed computing.
• Using this project we can create a supercomputing environment.
• This environment can also be accessed from an iOS device.
HARDWARE REQUIREMENTS:
• PROCESSOR: Pentium 4, i3 or later.
• MEMORY (RAM): 1 GB normally, 512 MB in a virtual machine.
• HARD DRIVE SPACE: 512 MB.
SOFTWARE REQUIREMENTS:
• OPERATING SYSTEM: Red Hat Enterprise Linux 5 or later, Fedora or CentOS;
iOS 8.4 or later for the mobile client.
• Python 2.6 or later.
• JDK.
• Hadoop RPM.
• Pig, Hive and Sqoop RPMs.
• iTunes (for iOS).
• Cydia (for iOS).
• Server Auditor (SSH client, for iOS).
• CLOUDSERVICES (for iOS).
• dialog (dialog-box utility, for the TUI).
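A small sketch of how the automation tool could check these prerequisites on a node before configuring Hadoop; the package names passed to rpm -q are assumptions and depend on the RPMs actually used:

```python
# Sketch: check that the JDK, Hadoop and the ecosystem RPMs are present
# before the automation tool starts configuring the cluster.
# Package names are assumptions; adjust to the RPMs you actually install.
import subprocess

def installed(package):
    """Return True if `rpm -q <package>` reports the package as installed."""
    return subprocess.call(["rpm", "-q", package]) == 0

def java_available():
    """Return True if a java binary can be executed."""
    try:
        return subprocess.call(["java", "-version"]) == 0
    except OSError:
        return False

missing = [p for p in ("hadoop", "pig", "hive", "sqoop") if not installed(p)]
if not java_available():
    missing.append("java (JDK)")

if missing:
    print("Missing prerequisites: %s" % ", ".join(missing))
else:
    print("All prerequisites found; the cluster can be configured.")
```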
TECHNOLOGY USED:
• HADOOP v1 or v2.
• PYTHON v2.
• SHELL SCRIPT.
• iOS HACKING (jailbreaking).
• SERVERS: SSH, FTP, SCP, HTTP, MySQL, NFS, NTP.
• FRAMEWORKS: PIG, HIVE, SQOOP.
• SCHEDULERS: FIFO, FAIR, OOZIE.
• DATABASES: SQL, NoSQL.
• MINIMAL (SINGLE-NODE) AND MULTI-NODE CLUSTERS.
• AMAZON WEB SERVICES, OPENSTACK.
HADOOP AUTOMATION TOOL:
• The tool can be used in two modes: Custom and Typical.
• In Custom mode you build the whole cluster according to your own choices.
• In Typical mode the code builds the whole cluster automatically.
• It also provides the use of frameworks such as Pig, Hive and Sqoop
(a minimal sketch of the two modes follows below).
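A minimal sketch of the Typical/Custom idea: Typical mode applies default values, while Custom mode asks the user before the Hadoop configuration files are written. The property names fs.default.name and dfs.replication are standard Hadoop v1 settings; the configuration directory, default values and hostname are assumptions, not the tool's actual code:

```python
# Sketch of the Typical vs Custom flow: Typical applies defaults,
# Custom prompts the user for each value before the XML files are written.
# The config directory, defaults and hostname are assumptions.
import os

CONF_DIR = "/etc/hadoop"   # assumption; depends on how Hadoop was installed

PROPERTY_XML = "<property><name>%s</name><value>%s</value></property>\n"

def write_site_file(name, properties):
    """Write a minimal *-site.xml file from a dict of property/value pairs."""
    path = os.path.join(CONF_DIR, name)
    with open(path, "w") as f:
        f.write('<?xml version="1.0"?>\n<configuration>\n')
        for key, value in properties.items():
            f.write(PROPERTY_XML % (key, value))
        f.write("</configuration>\n")

def build_cluster(mode, namenode_host):
    if mode == "typical":
        replication = "3"   # default number of copies of each block
    else:
        # Custom mode: ask the user (raw_input is Python 2, as listed
        # in the software requirements).
        replication = raw_input("Replication factor [3]: ") or "3"
    write_site_file("core-site.xml",
                    {"fs.default.name": "hdfs://%s:9000" % namenode_host})
    write_site_file("hdfs-site.xml",
                    {"dfs.replication": replication})

# Placeholder hostname; run as a user that can write to CONF_DIR.
build_cluster("typical", "namenode.example.com")
```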
Access the Automation Tool Using an iOS Device:
• Requirement: the iOS device must be connected to the same network as
the namenode.
• The iOS device must be jailbroken.
• A .pem security file (private key) must be created on the namenode.
• Tweaks must be installed on the device to read and use this security
file.
• Using the .pem file we can access the namenode (see the sketch below).
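A sketch of the key-based access: an RSA key pair is generated on the namenode, the public half is authorised for SSH logins, and the private half is exported as the .pem file that the iOS SSH client (for example Server Auditor) imports. The paths, key name and root user are assumptions:

```python
# Sketch: create the .pem key on the namenode and authorise it for SSH
# logins, so an iOS SSH client holding the .pem can reach the namenode.
# Paths, the key name and the root user are assumptions.
import subprocess

KEY = "/root/ios-access"   # private key; ios-access.pub is the public half

# 1. Generate an RSA key pair with no passphrase.
subprocess.check_call(["ssh-keygen", "-t", "rsa", "-N", "", "-f", KEY])

# 2. Allow logins with this key (assumes /root/.ssh already exists).
with open(KEY + ".pub") as pub, open("/root/.ssh/authorized_keys", "a") as auth:
    auth.write(pub.read())

# 3. Export the private key as the .pem file the iOS device will import.
subprocess.check_call(["cp", KEY, "/root/namenode.pem"])

# On the iOS side the client then connects with, effectively:
#   ssh -i namenode.pem root@<namenode-ip>
```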
CLUSTER-ON-DEMAND:
• A virtual cluster is a group of physical or virtual machines configured
for a common purpose, with associated user accounts and storage
resources.
• Cluster-on-Demand (COD) is a system to enable rapid, automated, on-
the-fly partitioning of a physical cluster into multiple independent
virtual clusters.
• With Cluster-on-Demand you no longer need to build your own compute
cluster in order to tackle high-performance computing projects
(a provisioning sketch follows below).
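As a rough illustration of the idea for a Hadoop cluster (not necessarily how COD itself is implemented), a pool of machines can be partitioned on demand by writing the allocated hosts into Hadoop's slaves file and starting the daemons; the host pool, paths and Hadoop v1 start scripts below are assumptions:

```python
# Sketch: carve an on-demand virtual cluster out of a pool of machines by
# allocating N hosts, writing them into Hadoop's slaves file and starting
# the daemons. The host pool, file location and scripts are assumptions.
import subprocess

POOL = ["node1", "node2", "node3", "node4", "node5", "node6"]  # placeholder hosts
SLAVES_FILE = "/etc/hadoop/slaves"   # Hadoop v1-style location; adjust as needed

def allocate_cluster(size):
    """Take `size` hosts from the pool and start a Hadoop cluster on them."""
    nodes, remaining = POOL[:size], POOL[size:]
    with open(SLAVES_FILE, "w") as f:
        f.write("\n".join(nodes) + "\n")
    # Start HDFS and MapReduce daemons on the namenode plus the chosen slaves
    # (assumes the Hadoop v1 scripts are on the PATH).
    subprocess.check_call(["start-dfs.sh"])
    subprocess.check_call(["start-mapred.sh"])
    return nodes, remaining

allocated, free = allocate_cluster(3)
print("Virtual cluster nodes: %s" % ", ".join(allocated))
```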
INNOVATION:
• Hadoop version 1 does not support replication of the namenode, even
though this is essential for protecting the cluster's metadata; this
tool also supports namenode replication (one possible approach is
sketched below).
• Using an iOS device we can deploy the cluster and install the frameworks;
at present this option is not available elsewhere in the market.
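One plausible way to replicate the namenode (not necessarily the tool's exact mechanism) is to copy the namenode metadata directory, dfs.name.dir, to a standby host at a regular interval so that a replacement namenode can be started from the copy; the path, standby host and interval are assumptions:

```python
# Sketch: periodically replicate the namenode metadata (dfs.name.dir) to a
# standby host with rsync, so the namenode can be rebuilt if it fails.
# The metadata path, standby host and interval are assumptions.
import subprocess
import time

NAME_DIR = "/var/hadoop/dfs/name"   # dfs.name.dir on the namenode
STANDBY = "root@standby-node:/var/hadoop/dfs/name-backup/"

while True:
    # -a preserves permissions and timestamps; --delete keeps the copy exact.
    subprocess.call(["rsync", "-a", "--delete", NAME_DIR + "/", STANDBY])
    time.sleep(300)   # replicate every five minutes
```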