1. OSC 2012 Tokyo
openstack
Open source software to build public and private clouds.
Hadoop on OpenStack Swift
- Experiment of using swift as storage for Apache Hadoop
2012.09.08
OpenStack Japan
Zheng Xu
2. Self introduction
●
Software designer/engineer for embedded
systems and web systems
(60% hobby, 40% job).
●
Main interests: OpenStack, Linux, web browsers,
HTML, EPUB, OSS
●
Contact
●
@xz911
●
https://www.facebook.com/xuzheng2001
3. Abstract
●
These slides introduce how to use OpenStack
Swift as the storage service for Apache Hadoop
in place of HDFS (the Hadoop project's own
storage service).
●
These slides are based on
http://bigdatacraft.com/archives/349; many
thanks to Constantine Peresypkin and David
Gruzman for sharing their idea and
implementation.
4. Agenda
●
OpenStack Swift
●
Apache Hadoop and HDFS
●
Experiment: replacing HDFS with OpenStack
Swift
6. What is OpenStack and Swift
[Architecture diagram] User Application → (http) → Proxy Servers → (http) → Account Servers / Container Servers / Object Servers
7. What is OpenStack and Swift
●
Open source, written in Python
●
Diversity
●
Swift can be part of OpenStack or a standalone
service by itself
●
Data is organized into zones, devices, partitions, and replicas
●
No SPOF (single point of failure)
8. Agenda
●
OpenStack Swift
●
Apache Hadoop and HDFS
●
Experiment: replacing HDFS with OpenStack
Swift
16. Experiment (install swift)
●
Install swift based on
http://docs.openstack.org/developer/swift/development_saio.html
●
Do not forget to set bind_ip in proxy-server.conf
●
192.168.0.9 in my case
●
Suppose we have the username "test:tester" with
the password "testing"; the account name is then
AUTH_test, and some containers exist, created
following the steps in the URL above.
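For reference, the bind_ip setting mentioned above lives in the [DEFAULT] section of proxy-server.conf; a minimal sketch for this setup (the port matches the auth URL used later):

```ini
[DEFAULT]
# listen on the host's LAN address instead of 127.0.0.1
bind_ip = 192.168.0.9
bind_port = 8080
```

The SAIO guide's auth check can then be run against it: `curl -v -H 'X-Storage-User: test:tester' -H 'X-Storage-Pass: testing' http://192.168.0.9:8080/auth/v1.0` should return an X-Auth-Token header.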
17. Experiment (cloudfiles)
●
Run "ant compile"
●
Change cloudfiles.properties to the following
# Auth info
auth_url=http://192.168.0.9:8080/auth/v1.0
auth_token_name=X-Auth-Token
#auth_user_header=X-Storage-User
#auth_pass_header=X-Storage-Pass
# user properties
username=test:tester
password=testing
# cloudfs properties
version=v1
connection_timeout=15000
18. Experiment (cloudfiles)
●
Connect cloudfiles to swift (this step is optional)
●
Change cloudfiles.sh as follows and run it to test the
connection to swift
#!/bin/sh
export CLASSPATH=lib/httpcore-4.1.4.jar:lib/commons-cli-1.1.jar:lib/httpclient-4.1.3.jar:lib/commons-lang-2.4.jar:lib/junit.jar:lib/commons-codec-1.3.jar:lib/commons-io-1.4.jar:lib/commons-logging-1.1.1.jar:lib/log4j-1.2.15.jar:dist/java-cloudfiles.jar:.
java com.rackspacecloud.client.cloudfiles.sample.FilesCli "$@"
19. Experiment (cloudfiles)
●
Package java-cloudfiles into a jar for Apache
Hadoop (clone java-cloudfiles to ~/java-cloudfiles)
●
We need to put the *.properties files into java-cloudfiles.jar
$ ant package
$ cd cloudfiles/dist
$ cp ../*.properties .
$ rm java-cloudfiles.jar
$ jar cvf java-cloudfiles.jar ./*
20. Experiment (hadoop)
●
Prepare
●
Download Hadoop to ~/hadoop-1.0.3 (the newest
stable release of stock Hadoop at the time) and git clone
https://github.com/Dazo-org/hadoop-common.git to
~/hadoop-common (an older Hadoop source tree that
includes the swift fs plugin)
●
At ~/hadoop-1.0.3, copy java-cloudfiles and its related
libraries into Hadoop's lib folder:
– cd lib; cp ~/java-cloudfiles/cloudfiles/dist/java-cloudfiles.jar .
– cp ~/java-cloudfiles/lib/httpc* .
21. Experiment (setting hadoop)
●
./hadoop-1.0.3/src/core/core-default.xml
●
Add the following so that Hadoop maps the
"swift://" URI scheme to the SwiftFileSystem class
<property>
<name>fs.swift.impl</name>
<value>org.apache.hadoop.fs.swift.SwiftFileSystem</value>
<description>The FileSystem for swift: uris.</description>
</property>
22. Experiment (hadoop)
●
Copy the swift fs implementation into hadoop-1.0.3
and build it
●
cp -R ../hadoop-common/src/core/org/apache/hadoop/fs/swift ./src/core/org/apache/hadoop/fs
●
ant
23. Experiment (hadoop setting)
●
./conf/core-site.xml (part 1)
●
Add the following property, for example
<property>
<name>fs.swift.userName</name>
<value>test:tester</value>
</property>
24. Experiment (hadoop setting)
●
./conf/core-site.xml (part 2)
●
Add the following properties, for example
<property>
<name>fs.swift.userPassword</name>
<value>testing</value>
</property>
<property>
<name>fs.swift.acccountname</name>
<value>AUTH_test</value>
</property>
25. Experiment (hadoop setting)
●
./conf/core-site.xml (part 3)
●
Add the following properties, for example
<property>
<name>fs.swift.authUrl</name>
<value>http://192.168.0.9:8080/auth/v1.0</value>
</property>
<property>
<name>fs.default.name</name>
<value>swift://192.168.0.9:8080/v1/AUTH_test</value>
</property>
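Assembled, the swift-related section of conf/core-site.xml from parts 1–3 looks like this (same property names and values as on the preceding slides):

```xml
<property>
<name>fs.swift.userName</name>
<value>test:tester</value>
</property>
<property>
<name>fs.swift.userPassword</name>
<value>testing</value>
</property>
<property>
<name>fs.swift.acccountname</name>
<value>AUTH_test</value>
</property>
<property>
<name>fs.swift.authUrl</name>
<value>http://192.168.0.9:8080/auth/v1.0</value>
</property>
<property>
<name>fs.default.name</name>
<value>swift://192.168.0.9:8080/v1/AUTH_test</value>
</property>
```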
26. Experiment (check swift fs)
●
At this point, we should be able to list account
information with the following command
●
./bin/hadoop fs -ls /
●
or ./bin/hadoop fs -put ./conf/core-site.xml
/test_container/core-site.xml (test_container is a test
container created after Swift was installed)
27. Finally
●
We installed Swift as the storage service for Hadoop
●
We built the original java-cloudfiles and packaged it
for Hadoop
●
We copied the fs.swift plugin from
https://github.com/Dazo-org/hadoop-common.git
into a new Hadoop source tree and built Hadoop
●
We set up Hadoop's core-site.xml to connect to
Swift via java-cloudfiles