陳柏翰
                CS13 http://about.me/sihalon
Computer System Administration 2011
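Only the sky lies above;
no other mountain stands as tall.
Lift your head, and the red sun is near;
look back, and the white clouds hang low.

Kou Zhun, Song dynasty (Mount Hua)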
Outline
• Existing cloud services
• The concepts behind Hadoop
• Single-node Hadoop installation
• A simple example
What is the cloud?
• Gmail
• YouTube
• Google Docs
…
Simply put:
any application service that you can use over the Internet.
Existing cloud computing services
• Windows
• Google
• Amazon
• Yahoo     What is behind them?
• Plurk
• ……
Hadoop
Hadoop is a software platform that lets one easily write and run
applications that process vast amounts of data
What is Hadoop?

• An open-source cloud platform (framework)
• A solution for computing over massive amounts of data
• Stable and scalable
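Yahoo : Hadoop
• An Apache project that Yahoo funds, develops, and uses
  – 2006: Yahoo began participating in Hadoop.
  – 2008: 2,000 servers.
      Running more than 10,000 Hadoop virtual machines.
      5 petabytes of web content.
      Analyzing 1 trillion web links.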
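Features
•   Massive scale
    – Able to store and process very large amounts of data

•   Economical
    – Runs on clusters built from ordinary PCs

•   Efficient
    – Processes distributed files in parallel to give fast responses

•   Reliable
    – When a node fails, the system automatically and promptly
    retrieves replica data and redeploys computing resources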
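Architecture
• HDFS
  - The file system of the Hadoop project

• MapReduce
  - Parallel processing of data sets at petabyte scale and above

• HBase
  - A database system for massive data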
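Divide and Conquer
• Algorithm:
  – Divide and Conquer: split a problem into parts, solve each part, and combine the results

• As a software architecture in programming, it is well suited to
  large-scale data computation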
Divide and Conquer

Example 1: estimating an area by counting grid cells   Example 2: covering a board with L-shaped tiles
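The grid-counting example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the slides: the shape (a quarter of the unit circle), the grid size, and the function names `count_cells_in_band` and `grid_area` are all hypothetical. The grid is divided into bands of rows, each band is counted independently, and the partial counts are then combined.

```python
import math


def count_cells_in_band(row_start, row_end, n, inside):
    """Count grid cells in rows [row_start, row_end) whose centre is inside the shape."""
    h = 1.0 / n  # side length of one cell on the unit square
    count = 0
    for i in range(row_start, row_end):
        y = (i + 0.5) * h
        for j in range(n):
            x = (j + 0.5) * h
            if inside(x, y):
                count += 1
    return count


def grid_area(n, bands, inside):
    """Divide the n x n grid into bands of rows, count each band, then combine."""
    step = max(1, -(-n // bands))  # rows per band, rounded up
    partial_counts = [
        count_cells_in_band(start, min(start + step, n), n, inside)
        for start in range(0, n, step)
    ]
    # Conquer: add the partial counts and scale by the area of one cell.
    return sum(partial_counts) * (1.0 / n) ** 2


def in_quarter_circle(x, y):
    """Hypothetical test shape: the quarter of the unit circle in the unit square."""
    return x * x + y * y <= 1.0


area = grid_area(400, 4, in_quarter_circle)
print("estimated area:", area, "exact area:", math.pi / 4)
```

Each band could be counted on a different machine, because no band needs the result of another; only the final sum brings the pieces together.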
Divide and Conquer
Input: "I am a tiger, you are also a tiger"

map (three map tasks):
  map 1:     I,1      am,1     a,1
  map 2:     tiger,1  you,1    are,1
  map 3:     also,1   a,1      tiger,1

grouped by key:
  group 1:   a,1  a,1  also,1  am,1  are,1
  group 2:   I,1  tiger,1  tiger,1  you,1

reduce (two reduce tasks):
  reduce 1:  a,2  also,1  am,1  are,1
  reduce 2:  I,1  tiger,2  you,1

Result:      a,2  also,1  am,1  are,1  I,1  tiger,2  you,1
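The word-count flow in the diagram can be imitated on a single machine with plain Python. This is a minimal sketch of the map, group, and reduce steps, not Hadoop's own API; the function names and the choice of three map chunks are assumptions made to mirror the slide.

```python
import re
from collections import defaultdict


def map_words(chunk):
    """Map step: emit a (word, 1) pair for every word in the chunk."""
    return [(word, 1) for word in chunk]


def group_by_key(pairs):
    """Shuffle step: collect all values that share the same key."""
    groups = defaultdict(list)
    for word, one in pairs:
        groups[word].append(one)
    return groups


def reduce_counts(word, ones):
    """Reduce step: add up the ones emitted for a single word."""
    return word, sum(ones)


def word_count(text, num_mappers=3):
    words = re.findall(r"[A-Za-z]+", text)
    # Split the words into roughly equal chunks, one per (simulated) map task.
    size = max(1, -(-len(words) // num_mappers))
    chunks = [words[i:i + size] for i in range(0, len(words), size)]
    mapped = [pair for chunk in chunks for pair in map_words(chunk)]
    grouped = group_by_key(mapped)
    return dict(reduce_counts(word, ones) for word, ones in grouped.items())


counts = word_count("I am a tiger, you are also a tiger")
ordered = sorted(counts.items(), key=lambda kv: kv[0].lower())
print(" ".join(f"{word},{count}" for word, count in ordered))
# prints: a,2 also,1 am,1 are,1 I,1 tiger,2 you,1
```

In a real cluster the chunks would be processed by separate map tasks and the groups by separate reduce tasks; here everything runs in one process so the data flow is easy to follow.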
The Various Roles
Building Hadoop
  Namenode
  JobTracker

  Data   Task          Data   Task          Data   Task
      Java                  Java                  Java
      Linux                 Linux                 Linux
      Node1                 Node2                 Node3
Let's fly up to the cloud together

     - Demo Time
Supported Platforms
 GNU/Linux is supported as a
  development and production platform.
  Hadoop has been demonstrated on
  GNU/Linux clusters with 2000 nodes.
 Win32 is supported as a development
  platform. Distributed operation has not
  been well tested on Win32, so it is not
  supported as a production platform.
Environment
 Ubuntu Linux 10.04 LTS
 Hadoop 0.20.2
 - released in February 2010
Required Software
• Java™ 1.6.x, preferably from Sun, must be installed.

• ssh must be installed and sshd must be running to use the
  Hadoop scripts that manage remote Hadoop daemons.
Sun Java 6
1. Add the Canonical partner repository to your apt repositories:
   $ sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
2. Update the package list:
   $ sudo apt-get update
3. Install sun-java6-jdk:
   $ sudo apt-get install sun-java6-jdk
4. Select Sun's Java as the default on your machine:
   $ sudo update-java-alternatives -s java-6-sun
5. Check that the installation succeeded:
   $ java -version
Configuring SSH
(You can find the ssh software in the Software Center by searching for "ssh". Install it before the test in step 3.)
1. Generate an SSH key for the current user (with an empty passphrase):
   $ ssh-keygen -t rsa -P ""
2. Enable SSH access to your local machine with this newly created key:
   $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
   (>> redirects and appends, as in: cat test1.txt >> test2.txt)
3. Test by connecting to your local machine:
   $ ssh localhost
Disabling IPv6
Add the following lines to /etc/sysctl.conf, then reboot:

   $ sudo joe /etc/sysctl.conf

   # disable ipv6
   net.ipv6.conf.all.disable_ipv6 = 1
   net.ipv6.conf.default.disable_ipv6 = 1
   net.ipv6.conf.lo.disable_ipv6 = 1

   $ sudo reboot

After the reboot, check whether IPv6 is enabled on your machine
(0 means enabled, 1 means disabled):

   $ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
Hadoop Installation
Download Hadoop from the Apache mirrors:
http://www.apache.org/dyn/closer.cgi/hadoop/core

   $ cd /home/csa
   $ wget http://apache.ntu.edu.tw/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
   $ sudo tar xzf hadoop-0.20.2.tar.gz
   $ sudo mv hadoop-0.20.2 hadoop
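Hadoop Package Topology
• bin/      Executables, such as start-all.sh, stop-all.sh, and hadoop.
• conf/     Default configuration directory: environment variables and
            the list of worker nodes (slaves).
• docs/     Hadoop API reference and documentation.
• contrib/  Extra useful packages, such as the Eclipse plug-in.
• lib/      Libraries needed to develop Hadoop projects or compile
            Hadoop programs, such as jetty and kfs.
• src/      Hadoop source code.
• build/    Output directory used when building Hadoop.
• logs/     Default log directory (the path can be changed).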
Update .bashrc for each user who wants to use Hadoop
   $ sudo joe /home/csa/.bashrc

   # Set Hadoop-related environment variables
   export HADOOP_HOME=/home/csa/hadoop
   # Add Hadoop bin/ directory to PATH
   export PATH=$PATH:$HADOOP_HOME/bin
Configuration
Set JAVA_HOME to the Sun JDK/JRE 6 directory

   $ joe /home/csa/hadoop/conf/hadoop-env.sh

   # The java implementation to use. Required.
   export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.24
Configuration
Edit the following files:
• conf/core-site.xml
• conf/mapred-site.xml
• conf/hdfs-site.xml
<!-- In: conf/core-site.xml -->
<property>
          <name>hadoop.tmp.dir</name>
          <value>/app/hadoop/tmp</value>
          <description>A base for other temporary directories.</description>
</property>
<property>
          <name>fs.default.name</name>
          <value>hdfs://localhost:9000</value>
          <description>The name of the default file system. </description>
</property>
<!-- In: conf/mapred-site.xml -->
<property>
          <name>mapred.job.tracker</name>
          <value>localhost:54311</value>
          <description> For MapReduce job tracker </description>
</property>
<!-- In: conf/hdfs-site.xml -->
<property>
          <name>dfs.replication</name>
          <value>1</value>
          <description>Default block replication. The actual number of
replications can be specified when the file is created. The default is used
if replication is not specified in create time. </description>
</property>
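In Hadoop 0.20.x each of these site files wraps its `<property>` elements in a single `<configuration>` root element. The Python sketch below is a hypothetical helper, not part of Hadoop: it parses such a file's text with the standard library and returns the name/value pairs, which can be handy for checking a file for typos before starting the daemons. The `CORE_SITE` string simply repeats the values shown above, and the function name `read_properties` is an assumption.

```python
import xml.etree.ElementTree as ET

# Example content for conf/core-site.xml, using the values from the slides.
CORE_SITE = """<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return a {name: value} dict for every <property> under <configuration>."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    if root.tag != "configuration":
        raise ValueError("root element must be <configuration>")
    properties = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is None or value is None:
            raise ValueError("each <property> needs a <name> and a <value>")
        properties[name.strip()] = value.strip()
    return properties


core = read_properties(CORE_SITE)
for name, value in core.items():
    print(name, "=", value)
```

To check a real file, read its text first, for example with `open("conf/core-site.xml").read()`, and pass the result to `read_properties`.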
Formatting the name node
   $ /home/csa/hadoop/bin/hadoop namenode -format
Starting your single-node cluster

 $ /home/csa/hadoop/bin/start-all.sh
 $ jps
Jps
JobTracker
TaskTracker
NameNode
DataNode
Congratulations!
• You have just set up a single-node cluster
Hadoop Web Interfaces
 http://localhost:50030/
– web UI for MapReduce job tracker(s)
 http://localhost:50060/
– web UI for task tracker(s)
 http://localhost:50070/
– web UI for HDFS name node(s)
Common commands
• Commands for working with the Hadoop file system
   $ bin/hadoop fs -Instruction …
MapReduce Demo
   WordCount
Divide and Conquer
Input: "I am a tiger, you are also a tiger"

map (three map tasks):
  map 1:     I,1      am,1     a,1
  map 2:     tiger,1  you,1    are,1
  map 3:     also,1   a,1      tiger,1

grouped by key:
  group 1:   a,1  a,1  also,1  am,1  are,1
  group 2:   I,1  tiger,1  tiger,1  you,1

reduce (two reduce tasks):
  reduce 1:  a,2  also,1  am,1  are,1
  reduce 2:  I,1  tiger,2  you,1

Result:      a,2  also,1  am,1  are,1  I,1  tiger,2  you,1
Why WordCount?
 Google
 Facebook
References
           Thanks to
   NCHC Cloud Computing Research Group
Thanks for listening

Hadoop

  • 1. 陳柏翰 CS13 http://about.me/sihalon Computer System Administration 2011
  • 3. Outlines  現有雲端服務  Hadoop 背後概念  Hadoop 單節點安裝  簡單範例
  • 5. 簡單來說 即 凡能透過 網際網路 能享受到的 應用服務
  • 6. 現有的雲端運算服務 • Windows • Google • Amazon • Yahoo 他們的背後? • Plurk • ……
  • 7. Hadoop Hadoop is a software platform that lets one easily write and run applications that process vast amounts of data
  • 8. What is Hadoop ?  一種開放源碼雲端平台(框架)  巨量資料計算解決方案  穩定可擴充
  • 9. Yahoo : Hadoop  Apache 項目,Yahoo 資助、開發與運用  2006年 開始參與 Hadoop。  2008年 2千臺伺服器。 執行超過1萬個Hadoop虛擬機器。 5 Petabytes的網頁內容 分析1兆個網路連結
  • 10. Feature • 巨量 – 擁有儲存與處理大量資料的能力 • 經濟 – 可以用在由一般PC所架設的叢集環境內 • 效率 – 平行分散檔案的處理以得到快速的回應 • 可靠 – 當某節點發生錯誤,系統能即時自動的取 得備份資料及佈署運算資源
  • 11. 架構  HDFS - Hadoop 專案中的檔案系統  MapReduce - 平行處理P級別以上的資料集  Hbase - 巨量資料庫系統
  • 12. Divide and Conquer  演算法(Algorithms):  Divide and Conquer  分而治之  在程式設計的軟體架構內,適合使用在大 規模數據的運算中
  • 13. Divide and Conquer 範例一:方格法求面積 範例二:鋪滿 L 形磁磚
  • 14. Divide and Conquer I am a tiger, you are also a tiger a,2 also,1 I,1 a,2 am,1 am,1 a, 1 also,1 are,1 map a,1 am,1 a,1 reduce I,1 also,1 are,1 tiger,2 tiger,1 am,1 you,1 are,1 you,1 map are,1 I,1 tiger,1 I, 1 tiger,1 tiger,2 also,1 you,1 reduce you,1 map a, 1 tiger,1
  • 16. Building Hadoop Namenode JobTracker Data Task Data Task Data Task Java Java Java Linuux Linuux Linuux Node1 Node2 Node3
  • 17. 一起飛上雲端吧 - Demo Time
  • 18. Supported Platforms
      - GNU/Linux is supported as a development and production platform. Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
      - Win32 is supported as a development platform. Distributed operation has not been well tested on Win32, so it is not supported as a production platform.
  • 19. Environment
      - Ubuntu Linux 10.04 LTS
      - Hadoop 0.20.2, released in February 2010
  • 20. Required Software
      - Java™ 1.6.x, preferably from Sun, must be installed.
      - ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons.
  • 21. Sun Java 6
      1. Add the partner repository to your apt repositories:
         $ sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
      2. Update the source list:
         $ sudo apt-get update
  • 22. Sun Java 6
      3. Install sun-java6-jdk:
         $ sudo apt-get install sun-java6-jdk
      4. Select Sun's Java as the default on your machine:
         $ sudo update-java-alternatives -s java-6-sun
  • 23. Sun Java 6
      5. Check that the installation succeeded:
         $ java -version
  • 24. Configuring SSH (you can find the ssh software in Software Center by searching for "ssh")
  • 25. Configuring SSH
      1. Generate an SSH key for the current user:
         $ ssh-keygen -t rsa -P ""
      2. Enable SSH access to your local machine with the newly created key:
         $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
         (>> redirects and appends, as in cat test1.txt >> test2.txt)
  • 27. Configuring SSH
      3. Test by connecting to your local machine (ssh must be installed first):
         $ ssh localhost
  • 29. Disabling IPv6
      Open the sysctl configuration file:
         $ sudo joe /etc/sysctl.conf
      Add these lines:
         #disable ipv6
         net.ipv6.conf.all.disable_ipv6 = 1
         net.ipv6.conf.default.disable_ipv6 = 1
         net.ipv6.conf.lo.disable_ipv6 = 1
      Reboot so the settings take effect:
         $ sudo reboot
  • 30. Disabling IPv6
      Check whether IPv6 is enabled on your machine (0 means enabled, 1 means disabled):
         $ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
  • 33. Hadoop Installation
      Download Hadoop from the Apache mirrors: http://www.apache.org/dyn/closer.cgi/hadoop/core
         $ cd /home/csa
         $ wget http://apache.ntu.edu.tw/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
  • 34. Hadoop Installation
         $ sudo tar xzf hadoop-0.20.2.tar.gz
         $ sudo mv hadoop-0.20.2 hadoop
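  • 35. Hadoop Package Topology
      - bin/: executables such as start-all.sh, stop-all.sh, and hadoop
      - conf/: the default configuration directory, holding environment variable settings and the slaves list of worker nodes
      - docs/: the Hadoop API and documentation
      - contrib/: useful extra packages, such as the Eclipse plug-in
      - lib/: all libraries needed to develop Hadoop projects or compile Hadoop programs, such as jetty and kfs
      - src/: the Hadoop source code
      - build/: the output directory of a Hadoop build
      - logs/: the default directory for log files (the path can be changed)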
  • 36. Update the shell profile of each user who will use Hadoop
         $ sudo joe /home/csa/.bashrc
      Add these lines:
         # Set Hadoop-related environment variables
         export HADOOP_HOME=/home/csa/hadoop
         # Add Hadoop bin/ directory to PATH
         export PATH=$PATH:$HADOOP_HOME/bin
  • 37. Configuration
      Set JAVA_HOME to the Sun JDK/JRE 6 directory:
         $ joe /home/csa/hadoop/conf/hadoop-env.sh
      Edit this entry:
         # The java implementation to use. Required.
         export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.24
  • 38. Configuration
      - In file conf/core-site.xml
      - In file conf/mapred-site.xml
      - In file conf/hdfs-site.xml
  • 39. <!-- In: conf/core-site.xml -->
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/app/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
        <description>The name of the default file system.</description>
      </property>
  • 40. <!-- In: conf/mapred-site.xml -->
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:54311</value>
        <description>For MapReduce job tracker</description>
      </property>
  • 41. <!-- In: conf/hdfs-site.xml -->
      <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
      </property>
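The slides show only the `<property>` elements; in the actual Hadoop configuration files they sit inside a single `<configuration>` root element. The sketch below, which is an illustration rather than part of the original deck, writes out a complete core-site.xml as a string and reads the values back with Python's standard library. The helper name `read_properties` is hypothetical, and the descriptions are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# A complete core-site.xml using the two values from the slide.
CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
"""

def read_properties(xml_text):
    """Return a {name: value} dict for every <property> in a Hadoop config file."""
    root = ET.fromstring(xml_text)
    return {prop.findtext("name"): prop.findtext("value")
            for prop in root.iter("property")}

props = read_properties(CORE_SITE)
for name, value in props.items():
    print(f"{name} = {value}")
```

Parsing the file this way is a quick check that it is well-formed before the daemons are started.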
  • 42. Formatting the name node
         $ /home/csa/hadoop/bin/hadoop namenode -format
  • 43. Starting your single-node cluster
         $ /home/csa/hadoop/bin/start-all.sh
         $ jps
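After start-all.sh, jps should list the five Hadoop 0.20 daemons of a single-node setup: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker. The sketch below checks a captured jps listing for them. The helper `missing_daemons` is hypothetical, and the sample text with its process IDs is made up for illustration, not taken from a real run.

```python
EXPECTED = {"NameNode", "DataNode", "SecondaryNameNode", "JobTracker", "TaskTracker"}

def missing_daemons(jps_output):
    """Return the expected daemons that do not appear in a jps listing."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line has the form "<pid> <class name>".
        if len(parts) >= 2:
            running.add(parts[1])
    return EXPECTED - running

# Illustrative listing only; the process IDs are invented.
sample = """4001 NameNode
4102 DataNode
4203 SecondaryNameNode
4304 JobTracker
4405 TaskTracker
4506 Jps
"""

print(missing_daemons(sample) or "all daemons are running")
```

If a daemon is missing, the files under logs/ are the usual place to look for the reason.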
  • 45. Congratulations! You have just set up a single-node cluster.
  • 46. Hadoop Web Interfaces
      - http://localhost:50030/ : web UI for MapReduce job tracker(s)
      - http://localhost:50060/ : web UI for task tracker(s)
      - http://localhost:50070/ : web UI for HDFS name node(s)
  • 47. Common commands
      Commands for working with the Hadoop file system:
         $ bin/hadoop fs -Instruction …
  • 48. MapReduce Demo: WordCount
  • 49. Divide and Conquer
      Input: "I am a tiger, you are also a tiger"
      Map 1 emits: I,1  am,1  a,1
      Map 2 emits: tiger,1  you,1  are,1
      Map 3 emits: also,1  a,1  tiger,1
      Sorted intermediate pairs: a,1  a,1  also,1  am,1  are,1  I,1  tiger,1  tiger,1  you,1
      Reduce 1 emits: a,2  also,1  am,1  are,1
      Reduce 2 emits: I,1  tiger,2  you,1
      Result: a,2  also,1  am,1  are,1  I,1  tiger,2  you,1
  • 50. Why WordCount? Google, Facebook
  • 51. References. Thanks to … NCHC Cloud Computing Research Group (link here!)
  • 52. Thanks for listening