14. Divide and Conquer
(word-count diagram) The input sentence "I am a tiger, you are also a tiger" is split across three map tasks; each map task emits a (word, 1) pair per word; the shuffled pairs are summed by reduce tasks to give the final counts: a,2 also,1 am,1 are,1 I,1 tiger,2 you,1
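The map/shuffle/reduce flow in the diagram above can be sketched in plain Python (a simulation of the three MapReduce phases on the slide's sentence, not Hadoop API code):

```python
from collections import defaultdict

# Input sentence from the slide, split into three chunks to mimic three map tasks
chunks = ["I am a", "tiger, you are", "also a tiger"]

# Map phase: each task emits a (word, 1) pair per word
mapped = []
for chunk in chunks:
    for word in chunk.replace(",", "").split():
        mapped.append((word, 1))

# Shuffle phase: group the intermediate pairs by key
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: sum the counts for each word
counts = {word: sum(values) for word, values in groups.items()}
print(counts)
# -> {'I': 1, 'am': 1, 'a': 2, 'tiger': 2, 'you': 1, 'are': 1, 'also': 1}
```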
18. Supported Platforms
GNU/Linux is supported as a development and production platform; Hadoop has been demonstrated on GNU/Linux clusters with 2000 nodes.
Win32 is supported as a development platform. Distributed operation has not been well tested on Win32, so it is not supported as a production platform.
20. Required Software
Java™ 1.6.x, preferably from Sun, must be installed.
ssh must be installed and sshd must be running in order to use the Hadoop scripts that manage remote Hadoop daemons.
21. Sun Java 6
1. Add the repository to your apt repositories:
$ sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
2. Update the source list:
$ sudo apt-get update
22. Sun Java 6
3. Install sun-java6-jdk:
$ sudo apt-get install sun-java6-jdk
4. Select Sun's Java as the default on your machine:
$ sudo update-java-alternatives -s java-6-sun
25. Configuring SSH
1. Generate an SSH key for the current user:
$ ssh-keygen -t rsa -P ""
2. Enable SSH access to your local machine with this newly created key:
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
( >> appends output to the target file, as in cat test1.txt >> test2.txt )
27. Configuring SSH
3. Test by connecting to your local machine (ssh must be installed first):
$ ssh localhost
30. Disabling IPv6
Check whether IPv6 is enabled on your machine (0 means enabled, 1 means disabled):
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
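The deck only shows the check; to actually disable IPv6 (a common step in older Hadoop-on-Ubuntu guides, since Hadoop 0.20 can bind poorly when IPv6 is active), the usual approach is to add kernel settings to /etc/sysctl.conf. The exact settings below follow the Ubuntu convention and are an assumption here, not part of the original slides:

```
# In /etc/sysctl.conf -- disable IPv6 (takes effect after a reboot,
# or immediately after running `sudo sysctl -p`)
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
```

Afterwards the check above should print 1.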
33. Hadoop Installation
Download Hadoop from the Apache Mirrors: http://www.apache.org/dyn/closer.cgi/hadoop/core
$ cd /home/csa
$ wget http://apache.ntu.edu.tw/hadoop/core/hadoop-0.20.2/hadoop-0.20.2.tar.gz
36. Update .bashrc for the user who wants to use Hadoop
$ sudo joe /home/csa/.bashrc
# Set Hadoop-related environment variables
export HADOOP_HOME=/home/csa/hadoop
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
37. Configuration
Set the Sun JDK/JRE 6 directory:
$ joe $HADOOP_HOME/conf/hadoop-env.sh
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.24
38. Configuration
In file conf/core-site.xml
In file conf/hdfs-site.xml
In file conf/mapred-site.xml
39. <!-- In: conf/core-site.xml -->
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system. </description>
</property>
41. <!-- In: conf/hdfs-site.xml -->
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication. The actual number of
replications can be specified when the file is created. The default is used
if replication is not specified at create time.</description>
</property>
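Slide 38 also references conf/mapred-site.xml, but its snippet does not survive in this extract. For a pseudo-distributed Hadoop 0.20.2 setup, the property that conventionally goes there is the job tracker address; localhost:9001 is the customary value and an assumption here:

```xml
<!-- In: conf/mapred-site.xml -->
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
<description>The host and port that the MapReduce job tracker runs at.
If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>
```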
46. Hadoop Web Interfaces
http://localhost:50030/
– web UI for MapReduce job tracker(s)
http://localhost:50060/
– web UI for task tracker(s)
http://localhost:50070/
– web UI for HDFS name node(s)
49. Divide and Conquer
(repeats the word-count diagram from slide 14)