1. Improving Hadoop Cluster Performance via Linux Configuration
DevIgnition 2014 – Dulles, Virginia
Alex Moundalexis // @technmsg
2. © Cloudera, Inc. All rights reserved.
Tips from a former system administrator
3. Been there, done that.
CC BY 2.0 / Richard Bumgardner
4. Tips from a former system administrator field guy
5. Home sweet home.
CC BY 2.0 / Alex Moundalexis
6. Tips
Easy steps to take…
7. Tips
Easy steps to take… that most people don't.
8. What this talk isn't about
• Deploying
  • Puppet, Chef, Ansible, homegrown scripts, intern labor
• Sizing & Tuning
  • Depends heavily on data and workload
• Coding
  • Unless you count STDOUT redirection
• Algorithms
  • I suck at math, but we'll try some multiplication later
9. "The answer to most Hadoop questions is…
10. "The answer to most Hadoop questions is… it depends."
11. "The answer to most Hadoop questions is… it depends." (helpful, right?)
12. So what ARE we talking about?
• Seven simple things
  • Quick
  • Safe
  • Viable for most environments and use cases
• Identify issue, then offer solution
• Note: Commands run as root or sudo
13. 1. Swapping
Bad news, best not to.
14. Swapping
• A form of memory management
• When OS runs low on memory…
  • write blocks to disk
  • use now-free memory for other things
  • read blocks back into memory from disk when needed
• Also known as paging
15. Swapping
• Problem: Disks are slow, especially to seek
• Hadoop is about maximizing IO
  • spend less time acquiring data
  • operate on data in place
  • large streaming reads/writes from disk
• Memory usage is somewhat limited within JVM
  • we should be able to manage our memory
  • account for JVM overhead
16. Limit swapping in kernel
• Well, as much as possible.
• Immediate:
  # echo 1 > /proc/sys/vm/swappiness
• Persist after reboot:
  # echo "vm.swappiness = 1" >> /etc/sysctl.conf
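The persist step can be made idempotent so re-running a setup script never duplicates the line. A minimal sketch; for illustration it writes to a temporary file, so point `SYSCTL_CONF` at /etc/sysctl.conf (as root) for real use:

```shell
# Persist vm.swappiness = 1 without duplicating the line on re-runs.
# Demo uses a temp file; set SYSCTL_CONF=/etc/sysctl.conf on a real host,
# and also apply immediately with: echo 1 > /proc/sys/vm/swappiness
SYSCTL_CONF=$(mktemp)

set_swappiness() {
    grep -q '^vm.swappiness' "$SYSCTL_CONF" \
        || echo "vm.swappiness = 1" >> "$SYSCTL_CONF"
}

set_swappiness
set_swappiness   # second run is a no-op
grep -c '^vm.swappiness' "$SYSCTL_CONF"   # prints 1
```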
17. Swapping peculiarities
• Behavior varies based on Linux kernel
• CentOS 6.4+ / Ubuntu 10.10+
  • For you kernel gurus, that's Linux 2.6.32-303+
• Prior
  • We don't swap, except to avoid OOM condition.
• After
  • We don't swap, ever.
• Details: http://tiny.cloudera.com/noswap
18. 2. File Access Time
Disable this too.
19. File access time
• Linux tracks access time
  • writes to disk even if all you did was read
• Problem: more disk seeks
• HDFS is write-once, read-many
  • NameNode tracks access information for HDFS
20. Don't track access time
• Mount volumes with noatime option
• In /etc/fstab:
  /dev/sdc /data01 ext3 defaults,noatime 0
• Note: noatime implies nodiratime as well
• What about relatime?
  • Faster than atime but slower than noatime
• No reboot required
  # mount -o remount /data01
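The edit itself can be sketched mechanically; the line below is the slide's example, not a real fstab, and the change only takes effect after it lands in /etc/fstab and the volume is remounted:

```shell
# Append noatime to the options field (field 4) of an fstab-style line.
# Example line from the slide; your real /etc/fstab entries will differ.
line="/dev/sdc /data01 ext3 defaults 0 0"
newline=$(printf '%s\n' "$line" | awk '{ $4 = $4 ",noatime"; print }')
echo "$newline"   # /dev/sdc /data01 ext3 defaults,noatime 0 0
```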
21. 3. Root Reserved Space
Reclaim it, impress your bosses!
22. Root reserved space
• EXT3/4 reserve 5% of disk for root-owned files
• On an OS disk, sure
  • System logs, kernel panics, etc.
23. Disks used to be much smaller, right?
CC BY 2.0 / Alex Moundalexis
24. Do the math
• Conservative
  • 5% of 1 TB disk = 46 GB
  • 5 data disks per server = 230 GB
  • 5 servers per rack = 1.15 TB
• Quasi-aggressive
  • 5% of 4 TB disk = 186 GB
  • 12 data disks per server = 2.23 TB
  • 18 servers per rack = 40.1 TB
• That's a LOT of unused storage!
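The quasi-aggressive column checks out with a one-liner, assuming a marketing 4 TB = 4×10¹² bytes, reported in binary GB the way df and tune2fs do:

```shell
# 5% root reserve on one 4 TB disk, then scaled to a 12-disk server.
summary=$(awk 'BEGIN {
    gib      = 1024 * 1024 * 1024
    per_disk = int(4e12 * 0.05 / gib)   # GB reserved on one disk
    server   = per_disk * 12 / 1000     # 12 data disks, in TB
    printf "%d GB/disk, %.2f TB/server", per_disk, server
}')
echo "$summary"   # 186 GB/disk, 2.23 TB/server
```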
25. Root reserved space
• On a Hadoop data disk, no root-owned files
• When creating a partition:
  # mkfs.ext3 -m 0 /dev/sdc
• On existing partitions:
  # tune2fs -m 0 /dev/sdc
• 0 is safe, 1 is for the ultra-paranoid
26. 4. Name Service Cache
Turn it on, already!
27. Name Service Cache Daemon
• Daemon that caches name service requests
  • Passwords
  • Groups
  • Hosts
• Helps weather network hiccups
• Helps more with high latency LDAP, NIS, NIS+
• Small footprint
• Zero configuration required
28. Name Service Cache Daemon
• Hadoop nodes
  • largely a network-based application
  • on the network constantly
  • issue lots of name lookups, especially HBase & distcp
  • can thrash name servers
• Reducing latency of service requests? Smart.
• Reducing impact on shared infrastructure? Smart.
29. Name Service Cache Daemon
• Turn it on, let it work, leave it alone:
  # chkconfig --level 345 nscd on
  # service nscd start
• Check on it later:
  # nscd -g
• Unless using Red Hat SSSD; modify nscd config first!
  • Don't use nscd to cache passwd, group, or netgroup
  • Red Hat, Using NSCD with SSSD. http://goo.gl/68HTMQ
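For the SSSD caveat, the idea is to leave only the hosts cache enabled. A sketch against a stand-in copy of the config (the real file is /etc/nscd.conf and its directive layout may differ slightly; GNU sed assumed):

```shell
# Disable the passwd, group, and netgroup caches; keep hosts.
CONF=$(mktemp)   # stand-in for /etc/nscd.conf
cat > "$CONF" <<'EOF'
enable-cache passwd yes
enable-cache group yes
enable-cache hosts yes
enable-cache netgroup yes
EOF
for table in passwd group netgroup; do
    sed -i "s/^enable-cache $table yes/enable-cache $table no/" "$CONF"
done
grep -c ' no$' "$CONF"   # prints 3; hosts stays enabled
```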
30. 5. File Handle Limits
Not a problem, until they are.
31. File handle limits
• Kernel refers to files via a handle
  • Also called descriptors
• Linux is a multi-user system
• File handles protect the system from
  • Poor coding
  • Malicious users
  • Poor coding of malicious users
  • Pictures of cats on the Internet
32. java.io.FileNotFoundException: (Too many open files)
33. File handle limits
• Linux defaults usually not enough
• Increase maximum open files (default 1024)
  # echo hdfs - nofile 32768 >> /etc/security/limits.conf
  # echo mapred - nofile 32768 >> /etc/security/limits.conf
  # echo hbase - nofile 32768 >> /etc/security/limits.conf
• Bonus: Increase maximum processes too
  # echo hdfs - nproc 32768 >> /etc/security/limits.conf
  # echo mapred - nproc 32768 >> /etc/security/limits.conf
  # echo hbase - nproc 32768 >> /etc/security/limits.conf
• Note: Cloudera Manager will do this for you.
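The six echo lines above collapse into a loop. The sketch writes to a temporary file so it's harmless to run as-is; swap in /etc/security/limits.conf (as root) for the real thing:

```shell
# Generate nofile and nproc entries for each Hadoop service account.
LIMITS=$(mktemp)   # stand-in for /etc/security/limits.conf
for user in hdfs mapred hbase; do
    for kind in nofile nproc; do
        echo "$user - $kind 32768" >> "$LIMITS"
    done
done
grep -c 32768 "$LIMITS"   # prints 6
```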
34. 6. Dedicated Disks
Don't be tempted to share, even with monster disks.
35. The Situation
1. Your new server has a dozen 1 TB disks
2. Eleven disks are used to store data
3. One disk is used for the OS
  • 20 GB for the OS
  • 980 GB sits unused
4. Someone asks "can we store data there too?"
5. Seems reasonable, lots of space… "OK, why not."
Sound familiar?
36. "I don't understand it, there's no consistency to these run times!"
37. No love for shared disk
• Our quest for data gets interrupted a lot:
  • OS operations
  • OS logs
  • Hadoop logging, quite chatty
  • Hadoop execution
  • userspace execution
• Disk seeks are slow, remember?
38. Dedicated disk for OS and logs
• At install time
  • Disk 0: OS & logs
  • Disk 1-n: Hadoop data
• After install, more complicated effort, requires manual HDFS block rebalancing:
  1. Take down HDFS
    • If you can do it in under 10 minutes, just the DataNode
  2. Move or distribute blocks from disk0/dir to disk[1-n]/dir
  3. Remove dir from HDFS config (dfs.data.dir)
  4. Start HDFS
39. 7. Name Resolution
Sane, both forward and reverse.
40. Name resolution options
1. Hosts file, if you must
2. DNS, much preferred
41. Name resolution with hosts file
• Set canonical names properly
• Right
  10.1.1.1 r01m01.cluster.org r01m01 master1
  10.1.1.2 r01w01.cluster.org r01w01 worker1
• Wrong
  10.1.1.1 r01m01 r01m01.cluster.org master1
  10.1.1.2 r01w01 r01w01.cluster.org worker1
42. Name resolution with hosts file
• Set loopback address properly
• Ensure 127.0.0.1 resolves to "localhost," NOT hostname
• Right
  127.0.0.1 localhost
• Wrong
  127.0.0.1 r01m01
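The right/wrong patterns from the last two slides can be checked mechanically. A sketch whose rules only cover the example entries above (loopback must be localhost; the first name after a 10.x address must be fully qualified):

```shell
# Print OK or BAD for a hosts file, per the two rules above.
check() {
    awk '$1 == "127.0.0.1" && $2 != "localhost" { bad = 1 }
         $1 ~ /^10\./       && $2 !~ /\./       { bad = 1 }
         END { print (bad ? "BAD" : "OK") }' "$1"
}

cat > good.hosts <<'EOF'
127.0.0.1 localhost
10.1.1.1 r01m01.cluster.org r01m01 master1
EOF
cat > bad.hosts <<'EOF'
127.0.0.1 r01m01
10.1.1.1 r01m01 r01m01.cluster.org master1
EOF

good=$(check good.hosts)   # OK
bad=$(check bad.hosts)     # BAD
echo "$good $bad"
rm -f good.hosts bad.hosts
```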
43. Name resolution with DNS
• Forward
• Reverse
• Hostname should match the FQDN in DNS
44. This is what you ought to see
45. Name resolution errata
• Mismatches? Expect odd results.
  • Problems starting DataNodes
  • Non-FQDN in Web UI links
• Security features are extra sensitive to FQDN
• Errors so common that link to FAQ is included in logs!
  • http://wiki.apache.org/hadoop/UnknownHost
• Get name resolution working BEFORE enabling nscd!
46. Summary
Now is the appropriate time to take out your camera phone.
47. A white background is supposedly better for printing.
(who prints things anymore?)
48. A white background is supposedly better for printing.
(but makes for very pale slides)
49. Summary
1. disable vm.swappiness
2. data disks: mount with noatime option
3. data disks: disable root reserved space
4. enable nscd
5. increase file handle limits
6. use dedicated OS/logging disk
7. sane name resolution
http://tiny.cloudera.com/7steps
50. Recommended reading
• Hadoop Operations, http://amzn.to/1ydMrLf
51. Questions?
Preferably related to the talk…
52. Thanks!
Alex Moundalexis | @technmsg
53. 8. Bonus Round
Because we have enough time (or I talked really fast)…
54. Other things to check
• Disk IO
  • hdparm
    # hdparm -Tt /dev/sdc
  • Looking for at least 70 MB/s from 7200 RPM disks
  • Slower could indicate a failing drive, disk controller, array, etc.
  • dd
    • http://romanrm.ru/en/dd-benchmark
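The dd side of that benchmark can look like the sketch below: a sequential write with fdatasync so the page cache doesn't flatter the number (GNU dd on Linux assumed; put the test file on the disk you're measuring):

```shell
# Write 64 MB and report throughput; dd prints its stats on stderr.
TESTFILE=./dd-test.tmp
result=$(dd if=/dev/zero of="$TESTFILE" bs=1M count=64 conv=fdatasync 2>&1 | tail -1)
rm -f "$TESTFILE"
echo "$result"   # e.g. "67108864 bytes (67 MB, 64 MiB) copied, 0.05 s, 1.3 GB/s"
```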
55. Other things to check
• Disable Red Hat Transparent Huge Pages (RH6+ until 6.5)
  • Can reduce elevated CPU usage
• In rc.local:
  echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
  echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
• Reference: Linux 6 Transparent Huge Pages and Hadoop Workloads, http://goo.gl/WSF2qC
56. Other things to check
• Enable Jumbo Frames
  • Only if your network infrastructure supports it!
  • Can easily (and arguably) boost throughput by 10-20%
57. Other things to check
• Enable Jumbo Frames
  • Only if your network infrastructure supports it!
  • Can easily (and arguably) boost throughput by 10-20%
• Monitor and Chart Everything
  • How else will you know what's happening?
  • Nagios
  • Ganglia
58. Questions?
Preferably related to the talk…
59. Thanks!
Alex Moundalexis | @technmsg