1. Status of Hadoop 0.23
Operations at Yahoo!
From Inception to Customer Validation
Charles Wimmer, Staff Site Reliability
Engineer at LinkedIn
2. Summary of This Talk
● Includes
○ Operational changes required to support 0.23
● Does not include
○ Specifics about customer testing
○ Deployment into Research or Production clusters
3. Scope of This Change at Yahoo!
● 42,000+ Hadoop servers
● 20+ clusters
● Three tiers: Sandbox, Research, Production
● 0.20.205.x
4. Overview of the Process
● Provide customers a 0.23 Sandbox cluster
● Provide customers enough data to test their
applications
● Provide developer support to address
application issues quickly
● Upgrade Research and Production clusters
as applications are certified to work with 0.23
5. Test Cluster
● 420 Nodes
● 2 x Westmere 4 core processors
● 24G RAM
● 12 x 2T Disks
● No Federation
14. DataNode/NameNode
start_20(){
. . .
}
start_next(){
. . .
}
if [ -x /home/gs/hadoop/current/bin/hdfs ] ; then
start_next $@
else
start_20 $@
fi
15. SecondaryNameNode
function clean_checkpoint_dir {
CHECKPOINT_DIR=/grid/0/tmp/hadoop-
hdfs/dfs/namesecondary/current
if [ -d "$CHECKPOINT_DIR" ] ; then
DELETE_DIR=`mktemp -p /grid/0/tmp -d delete-XXXXXX`
if [ $? -eq 0 ] ; then
echo "moving $CHECKPOINT_DIR to ${DELETE_DIR}/ "
mv $CHECKPOINT_DIR ${DELETE_DIR}/
cat<<EOF | at now+1min 2>/dev/null
if [ -d $DELETE_DIR ] ; then
rm -rf --preserve-root $DELETE_DIR
fi
EOF
fi
fi
}
16. HistoryServer
case "$1" in
start)
su $HADOOP_USER -s /bin/sh -c
"$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh
--config $HADOOP_CONF_DIR start
historyserver"
RET=$?
;;
17. ResourceManager/NodeManager
case "$1" in
start)
su $HADOOP_USER -s /bin/sh -c
"$HADOOP_PREFIX/sbin/yarn-daemon.sh --config
$HADOOP_CONF_DIR start $PROC"
RET=$?
;;