2. Who am I
• Evans Ye @
• Dumbo Team
• http://dumbointaiwan.blogspot.tw/
12/14/2013
Copyright 2013 Trend Micro Inc.
3. Agenda
• Building your own Hadoop version
• Hadoop Deployment
• Hadoop release engineering
• The development environment
• Bigtop puppet
4. Why build our own version
• Add your own patches at any time
– The community has to take care of backward compatibility,
which takes much more time and effort.
• Backport official patches into the version currently in use
– You may not upgrade your Hadoop version frequently,
but you may still have a specific need for a particular patch.
• Flexibility; features the business needs
8. Brute force
• git clone
• Make some changes
• Build binary tarball
How to do version control?
core-site.xml
hdfs-site.xml
mapred-site.xml
…
10. How Bigtop helps you
• Apache Hadoop App developers:
– Run pseudo-distributed Hadoop cluster to test your code on.
• Vendors:
– Build your own Apache Hadoop distribution, customized from
Apache Bigtop bits.
• Packaging, Deployment, Integration Testing
12. Build
• Build hadoop-common (see BUILDING.txt)
– hadoop-common$ mvn package -Pdist,docs,src,native -Dtar
• Prepare your source tarball in Bigtop
• bigtop$ make hadoop-rpm
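The two commands above can be chained in a small driver script. The following is only a sketch: the directory names `hadoop-common` and `bigtop` are assumed checkout locations, and the "prepare your source tarball" step is left out because the slide does not say how it is done. By default nothing is executed; the planned commands are returned for review.

```python
import shlex
import subprocess

# Commands taken from the slide. The directory names are assumptions
# about where the two source trees are checked out.
BUILD_STEPS = [
    ("hadoop-common", "mvn package -Pdist,docs,src,native -Dtar"),
    ("bigtop", "make hadoop-rpm"),
]


def run_build(steps=BUILD_STEPS, dry_run=True):
    """Run each (directory, command) pair in order, stopping on the first failure.

    With dry_run=True nothing is executed; the planned commands are
    returned so they can be reviewed first.
    """
    planned = []
    for workdir, command in steps:
        argv = shlex.split(command)
        planned.append((workdir, argv))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(argv, cwd=workdir, check=True)
    return planned


if __name__ == "__main__":
    for workdir, argv in run_build():
        print(f"{workdir}$ {' '.join(argv)}")
```

Stopping at the first failure matters here: building RPMs from a tarball left over from a failed Maven run would package stale bits.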
17. Problems to solve
• Lots of nodes need to be configured
• Less human involvement means fewer mistakes
• Configuration changes quite often
– adjust the fair scheduler
– enable/disable short-circuit reads
– try more performance-tuning configurations
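As an illustration of why frequent changes are easier to manage when configuration is generated rather than hand-edited, here is a minimal sketch that renders a property dictionary in the Hadoop `*-site.xml` layout and toggles short-circuit reads. It assumes the Hadoop 2 property names `dfs.client.read.shortcircuit` and `dfs.domain.socket.path`; the helper names and the default socket path are hypothetical and are not taken from the deck.

```python
import xml.etree.ElementTree as ET


def render_site_xml(properties):
    """Render a dict of properties in the Hadoop *-site.xml layout."""
    root = ET.Element("configuration")
    for name, value in sorted(properties.items()):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


def short_circuit_settings(enabled, socket_path="/var/run/hadoop-hdfs/dn_socket"):
    """Properties for toggling HDFS short-circuit local reads (Hadoop 2 names).

    The default socket path is an illustrative assumption.
    """
    settings = {"dfs.client.read.shortcircuit": "true" if enabled else "false"}
    if enabled:
        settings["dfs.domain.socket.path"] = socket_path
    return settings


if __name__ == "__main__":
    print(render_site_xml(short_circuit_settings(enabled=True)))
```

Flipping the feature then becomes a one-argument change that can be pushed to every node, instead of an XML edit repeated by hand.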
19. What is Puppet?
• An IT automation tool that helps system administrators
automate many repetitive tasks
• You only need to define the desired state
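The desired-state idea can be shown with a toy model. This is not how Puppet is implemented; it is only a sketch in which a node is a dictionary of package states, and the package names are examples chosen for illustration.

```python
def plan_changes(desired, current):
    """Return the actions needed to move `current` to `desired`.

    Both arguments map a resource name (here: a package) to "present"
    or "absent". Resources already in the desired state yield no action,
    which is what makes repeated runs safe.
    """
    actions = []
    for name, state in sorted(desired.items()):
        if current.get(name, "absent") != state:
            actions.append(("install" if state == "present" else "remove", name))
    return actions


def apply_changes(actions, current):
    """Apply planned actions to an in-memory model of the node."""
    updated = dict(current)
    for action, name in actions:
        updated[name] = "present" if action == "install" else "absent"
    return updated


if __name__ == "__main__":
    desired = {"hadoop-hdfs-namenode": "present", "telnet": "absent"}
    node = {"telnet": "present"}
    actions = plan_changes(desired, node)
    print(actions)
    node = apply_changes(actions, node)
    # A second run finds nothing left to do.
    print(plan_changes(desired, node))
```

The administrator writes only `desired`; the steps are derived from the difference, so running the same definition twice changes nothing the second time.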
20. What is Hadooppet?
• A general Hadoop cluster deployment tool based on
Puppet
• Kerberos / LDAP configured automatically
• A set of Hadoop / Kerberos management tools
• A set of sanity check scripts for Trend's Hadoop-related
services
• Configuration is managed on the puppetmaster
21. Design
• Abstract environment-specific configurations into a single
configuration file
• setup.sh
– namenode_fqdns=("dev1.example.com" "dev2.example.com")
– namenode_dirs=("/name/1" "/name/2")
– namenode_heap=32g
– map_slots=5
– reduce_slots=3
– …
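To make the single-file idea concrete, the sketch below expands the values from the setup.sh excerpt into per-file Hadoop 1 settings. The mapping is an assumption made for illustration (the deck does not show how Hadooppet does this), as are the function name and the default NameNode port; the property names themselves are the standard Hadoop 1 ones.

```python
# Values mirror the setup.sh excerpt on the slide.
SETTINGS = {
    "namenode_fqdns": ["dev1.example.com", "dev2.example.com"],
    "namenode_dirs": ["/name/1", "/name/2"],
    "namenode_heap": "32g",
    "map_slots": 5,
    "reduce_slots": 3,
}


def derive_hadoop1_config(settings, namenode_port=8020):
    """Expand the shared settings into per-file Hadoop 1 properties.

    The mapping and the default port are assumptions made for this sketch.
    """
    first_namenode = settings["namenode_fqdns"][0]
    return {
        "core-site.xml": {
            "fs.default.name": f"hdfs://{first_namenode}:{namenode_port}",
        },
        "hdfs-site.xml": {
            "dfs.name.dir": ",".join(settings["namenode_dirs"]),
        },
        "mapred-site.xml": {
            "mapred.tasktracker.map.tasks.maximum": str(settings["map_slots"]),
            "mapred.tasktracker.reduce.tasks.maximum": str(settings["reduce_slots"]),
        },
        "hadoop-env.sh": {
            "HADOOP_NAMENODE_OPTS": f"-Xmx{settings['namenode_heap']}",
        },
    }


if __name__ == "__main__":
    for filename, props in derive_hadoop1_config(SETTINGS).items():
        for key, value in props.items():
            print(f"{filename}: {key} = {value}")
```

Keeping every environment-specific value in one place means moving from a development cluster to production only requires swapping the settings, not editing each generated file.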
22. Benefits
• Can be used to set up any kind of Hadoop cluster
• Minimizes downtime during major version upgrades
– hadoop1: NameNode, SecondaryNameNode
– hadoop2: Active/Standby NameNode, JournalNodes, ZKFC
28. give-me-vm
• PyCon 2012
– Small Python Tools for Software Release Engineering
• An automation tool to manage
the VM lifecycle
• Uses the Python XenAPI
• Create temporary VMs for testing
via self-service
• Destroy them when testing
is finished
29. Build auto deployment on Hadooppet
• ./give_me_vm.py
• Set up passphraseless SSH between the VMs
• Set hostnames
• Install Hadooppet on the master
• Run the deployment
• Run sanity checks
• ./destroy_vm.py
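One detail worth getting right in a pipeline like this is that the VMs are destroyed even when a middle step fails. The sketch below shows that structure only; `run_pipeline` and the callables passed to it are hypothetical stand-ins, not the actual give-me-vm or Hadooppet interfaces.

```python
def run_pipeline(create_vms, steps, destroy_vms):
    """Create VMs, run each step in order, and always destroy the VMs.

    `create_vms` returns a list of VM handles, each entry of `steps` is a
    (name, callable) pair that receives the handles, and `destroy_vms`
    releases them. All three are supplied by the caller.
    """
    vms = create_vms()
    completed = []
    try:
        for name, step in steps:
            step(vms)  # an exception here aborts the remaining steps
            completed.append(name)
    finally:
        destroy_vms(vms)  # runs on success and on failure
    return completed


if __name__ == "__main__":
    # Stand-in steps that only print; real ones would call the actual tools.
    step_names = ["setup ssh", "set hostname", "install Hadooppet",
                  "run deployment", "run sanity checks"]
    steps = [(name, lambda vms, name=name: print(f"{name} on {vms}"))
             for name in step_names]
    run_pipeline(lambda: ["vm1", "vm2"], steps,
                 lambda vms: print(f"destroying {vms}"))
```

Putting the teardown in `finally` keeps failed test runs from leaking temporary VMs on the Xen hosts.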
32. For Hadoop service developers…
• Not enough Hadoop clients for every developer
• Developers cannot reach the server side while developing
Hadoop-related services
• Cannot experiment with new technologies such as Impala, Spark,
and Flume
• CI for Hadoop-related services
33. give-me-vm + Hadoop all-in-one VM
• Use Hadooppet to set up a pseudo-distributed Hadoop
VM as a XenServer template
• Get a Hadoop all-in-one VM via give-me-vm
• Services integrate their CI tests with the Hadoop all-in-one VM
35. Bigtop puppet
• Bigtop also provides a set of Puppet scripts for deploying
the Hadoop ecosystem
36. Bigtop puppet
• Preparation:
– A VM with a JDK and Puppet installed
– mkdir -p /data/{1,2}
– git clone https://github.com/apache/bigtop.git
37. Conclusion
• Many great deployment tools already exist
– Ambari, CM, ETU appliance
– Choose a distribution that suits your business needs
• If you want to do it yourself
– Bigtop makes packaging easy
– Leverage the Bigtop Puppet modules for your deployment