1. Improving Hadoop Cluster Performance via Linux Configuration
DevIgnition 2014 – Dulles, Virginia
Alex Moundalexis // @technmsg
2. © Cloudera, Inc. All rights reserved.
Tips from a former system administrator
3. Been there, done that.
CC BY 2.0 / Richard Bumgardner
4. Tips from a former system administrator field guy
5. Home sweet home.
CC BY 2.0 / Alex Moundalexis
6. Tips
Easy steps to take…
7. Tips
Easy steps to take… that most people don't.
8. What this talk isn't about
• Deploying
  • Puppet, Chef, Ansible, homegrown scripts, intern labor
• Sizing & Tuning
  • Depends heavily on data and workload
• Coding
  • Unless you count STDOUT redirection
• Algorithms
  • I suck at math, but we'll try some multiplication later
9. "The answer to most Hadoop questions is…
10. "The answer to most Hadoop questions is… it depends."
11. "The answer to most Hadoop questions is… it depends." (helpful, right?)
12. So what ARE we talking about?
• Seven simple things
  • Quick
  • Safe
  • Viable for most environments and use cases
• Identify issue, then offer solution
• Note: Commands run as root or sudo
13. 1. Swapping
Bad news, best not to.
14. Swapping
• A form of memory management
• When OS runs low on memory…
  • write blocks to disk
  • use now-free memory for other things
  • read blocks back into memory from disk when needed
• Also known as paging
15. Swapping
• Problem: Disks are slow, especially to seek
• Hadoop is about maximizing IO
  • spend less time acquiring data
  • operate on data in place
  • large streaming reads/writes from disk
• Memory usage is somewhat limited within JVM
  • we should be able to manage our memory
  • account for JVM overhead
16. Limit swapping in kernel
• Well, as much as possible.
• Immediate:
  # echo 1 > /proc/sys/vm/swappiness
• Persist after reboot:
  # echo "vm.swappiness = 1" >> /etc/sysctl.conf
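The persist step can be made idempotent so re-running a setup script never duplicates the line. A minimal sketch; for illustration it writes to a temporary file, so point `SYSCTL_CONF` at /etc/sysctl.conf (as root) for real use:

```shell
# Persist vm.swappiness = 1 without duplicating the line on re-runs.
# Demo uses a temp file; set SYSCTL_CONF=/etc/sysctl.conf on a real host,
# and also apply immediately with: echo 1 > /proc/sys/vm/swappiness
SYSCTL_CONF=$(mktemp)

set_swappiness() {
    grep -q '^vm.swappiness' "$SYSCTL_CONF" \
        || echo "vm.swappiness = 1" >> "$SYSCTL_CONF"
}

set_swappiness
set_swappiness   # second run is a no-op
grep -c '^vm.swappiness' "$SYSCTL_CONF"   # prints 1
```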
17. Swapping peculiarities
• Behavior varies based on Linux kernel
• CentOS 6.4+ / Ubuntu 10.10+
  • For you kernel gurus, that's Linux 2.6.32-303+
• Prior
  • We don't swap, except to avoid OOM condition.
• After
  • We don't swap, ever.
• Details: http://tiny.cloudera.com/noswap
18. 2. File Access Time
Disable this too.
19. File access time
• Linux tracks access time
  • writes to disk even if all you did was read
• Problem: more disk seeks
• HDFS is write-once, read-many
  • NameNode tracks access information for HDFS
20. Don't track access time
• Mount volumes with noatime option
• In /etc/fstab:
  /dev/sdc /data01 ext3 defaults,noatime 0
• Note: noatime implies nodiratime as well
• What about relatime?
  • Faster than atime but slower than noatime
• No reboot required
  # mount -o remount /data01
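The edit itself can be sketched mechanically; the line below is the slide's example, not a real fstab, and the change only takes effect after it lands in /etc/fstab and the volume is remounted:

```shell
# Append noatime to the options field (field 4) of an fstab-style line.
# Example line from the slide; your real /etc/fstab entries will differ.
line="/dev/sdc /data01 ext3 defaults 0 0"
newline=$(printf '%s\n' "$line" | awk '{ $4 = $4 ",noatime"; print }')
echo "$newline"   # /dev/sdc /data01 ext3 defaults,noatime 0 0
```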
21. 3. Root Reserved Space
Reclaim it, impress your bosses!
22. Root reserved space
• EXT3/4 reserve 5% of disk for root-owned files
• On an OS disk, sure
  • System logs, kernel panics, etc.
23. Disks used to be much smaller, right?
CC BY 2.0 / Alex Moundalexis
24. Do the math
• Conservative
  • 5% of 1 TB disk = 46 GB
  • 5 data disks per server = 230 GB
  • 5 servers per rack = 1.15 TB
• Quasi-aggressive
  • 5% of 4 TB disk = 186 GB
  • 12 data disks per server = 2.23 TB
  • 18 servers per rack = 40.1 TB
• That's a LOT of unused storage!
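The quasi-aggressive column checks out with a one-liner, assuming a marketing 4 TB = 4×10¹² bytes, reported in binary GB the way df and tune2fs do:

```shell
# 5% root reserve on one 4 TB disk, then scaled to a 12-disk server.
summary=$(awk 'BEGIN {
    gib      = 1024 * 1024 * 1024
    per_disk = int(4e12 * 0.05 / gib)   # GB reserved on one disk
    server   = per_disk * 12 / 1000     # 12 data disks, in TB
    printf "%d GB/disk, %.2f TB/server", per_disk, server
}')
echo "$summary"   # 186 GB/disk, 2.23 TB/server
```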
25. Root reserved space
• On a Hadoop data disk, no root-owned files
• When creating a partition:
  # mkfs.ext3 -m 0 /dev/sdc
• On existing partitions:
  # tune2fs -m 0 /dev/sdc
• 0 is safe, 1 is for the ultra-paranoid
26. 4. Name Service Cache
Turn it on, already!
27. Name Service Cache Daemon
• Daemon that caches name service requests
  • Passwords
  • Groups
  • Hosts
• Helps weather network hiccups
• Helps more with high latency LDAP, NIS, NIS+
• Small footprint
• Zero configuration required
28. Name Service Cache Daemon
• Hadoop nodes
  • largely a network-based application
  • on the network constantly
  • issue lots of name lookups, especially HBase & distcp
  • can thrash name servers
• Reducing latency of service requests? Smart.
• Reducing impact on shared infrastructure? Smart.
29. Name Service Cache Daemon
• Turn it on, let it work, leave it alone:
  # chkconfig --level 345 nscd on
  # service nscd start
• Check on it later:
  # nscd -g
• Unless using Red Hat SSSD; modify nscd config first!
  • Don't use nscd to cache passwd, group, or netgroup
  • Red Hat, Using NSCD with SSSD. http://goo.gl/68HTMQ
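For the SSSD caveat, the idea is to leave only the hosts cache enabled. A sketch against a stand-in copy of the config (the real file is /etc/nscd.conf and its directive layout may differ slightly; GNU sed assumed):

```shell
# Disable the passwd, group, and netgroup caches; keep hosts.
CONF=$(mktemp)   # stand-in for /etc/nscd.conf
cat > "$CONF" <<'EOF'
enable-cache passwd yes
enable-cache group yes
enable-cache hosts yes
enable-cache netgroup yes
EOF
for table in passwd group netgroup; do
    sed -i "s/^enable-cache $table yes/enable-cache $table no/" "$CONF"
done
grep -c ' no$' "$CONF"   # prints 3; hosts stays enabled
```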
30. 5. File Handle Limits
Not a problem, until they are.
31. File handle limits
• Kernel refers to files via a handle
  • Also called descriptors
• Linux is a multi-user system
• File handles protect the system from
  • Poor coding
  • Malicious users
  • Poor coding of malicious users
  • Pictures of cats on the Internet
32. java.io.FileNotFoundException: (Too many open files)
33. File handle limits
• Linux defaults usually not enough
• Increase maximum open files (default 1024)
  # echo hdfs - nofile 32768 >> /etc/security/limits.conf
  # echo mapred - nofile 32768 >> /etc/security/limits.conf
  # echo hbase - nofile 32768 >> /etc/security/limits.conf
• Bonus: Increase maximum processes too
  # echo hdfs - nproc 32768 >> /etc/security/limits.conf
  # echo mapred - nproc 32768 >> /etc/security/limits.conf
  # echo hbase - nproc 32768 >> /etc/security/limits.conf
• Note: Cloudera Manager will do this for you.
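The six echo lines above collapse into a loop. The sketch writes to a temporary file so it's harmless to run as-is; swap in /etc/security/limits.conf (as root) for the real thing:

```shell
# Generate nofile and nproc entries for each Hadoop service account.
LIMITS=$(mktemp)   # stand-in for /etc/security/limits.conf
for user in hdfs mapred hbase; do
    for kind in nofile nproc; do
        echo "$user - $kind 32768" >> "$LIMITS"
    done
done
grep -c 32768 "$LIMITS"   # prints 6
```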
34. 6. Dedicated Disks
Don't be tempted to share, even with monster disks.
35. The Situation
1. Your new server has a dozen 1 TB disks
2. Eleven disks are used to store data
3. One disk is used for the OS
  • 20 GB for the OS
  • 980 GB sits unused
4. Someone asks "can we store data there too?"
5. Seems reasonable, lots of space… "OK, why not."
Sound familiar?
36. "I don't understand it, there's no consistency to these run times!"
37. No love for shared disk
• Our quest for data gets interrupted a lot:
  • OS operations
  • OS logs
  • Hadoop logging, quite chatty
  • Hadoop execution
  • userspace execution
• Disk seeks are slow, remember?
38. Dedicated disk for OS and logs
• At install time
  • Disk 0: OS & logs
  • Disk 1-n: Hadoop data
• After install, more complicated effort, requires manual HDFS block rebalancing:
  1. Take down HDFS
    • If you can do it in under 10 minutes, just the DataNode
  2. Move or distribute blocks from disk0/dir to disk[1-n]/dir
  3. Remove dir from HDFS config (dfs.data.dir)
  4. Start HDFS
39. 7. Name Resolution
Sane, both forward and reverse.
40. Name resolution options
1. Hosts file, if you must
2. DNS, much preferred
41. Name resolution with hosts file
• Set canonical names properly
• Right
  10.1.1.1 r01m01.cluster.org r01m01 master1
  10.1.1.2 r01w01.cluster.org r01w01 worker1
• Wrong
  10.1.1.1 r01m01 r01m01.cluster.org master1
  10.1.1.2 r01w01 r01w01.cluster.org worker1
42. Name resolution with hosts file
• Set loopback address properly
• Ensure 127.0.0.1 resolves to "localhost," NOT hostname
• Right
  127.0.0.1 localhost
• Wrong
  127.0.0.1 r01m01
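The right/wrong patterns from the last two slides can be checked mechanically. A sketch whose rules only cover the example entries above (loopback must be localhost; the first name after a 10.x address must be fully qualified):

```shell
# Print OK or BAD for a hosts file, per the two rules above.
check() {
    awk '$1 == "127.0.0.1" && $2 != "localhost" { bad = 1 }
         $1 ~ /^10\./       && $2 !~ /\./       { bad = 1 }
         END { print (bad ? "BAD" : "OK") }' "$1"
}

cat > good.hosts <<'EOF'
127.0.0.1 localhost
10.1.1.1 r01m01.cluster.org r01m01 master1
EOF
cat > bad.hosts <<'EOF'
127.0.0.1 r01m01
10.1.1.1 r01m01 r01m01.cluster.org master1
EOF

good=$(check good.hosts)   # OK
bad=$(check bad.hosts)     # BAD
echo "$good $bad"
rm -f good.hosts bad.hosts
```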
43. Name resolution with DNS
• Forward
• Reverse
• Hostname should match the FQDN in DNS
44. This is what you ought to see
45. Name resolution errata
• Mismatches? Expect odd results.
  • Problems starting DataNodes
  • Non-FQDN in Web UI links
• Security features are extra sensitive to FQDN
• Errors so common that link to FAQ is included in logs!
  • http://wiki.apache.org/hadoop/UnknownHost
• Get name resolution working BEFORE enabling nscd!
46. Summary
Now is the appropriate time to take out your camera phone.
47. A white background is supposedly better for printing.
(who prints things anymore?)
48. A white background is supposedly better for printing.
(but makes for very pale slides)
49. Summary
1. disable vm.swappiness
2. data disks: mount with noatime option
3. data disks: disable root reserved space
4. enable nscd
5. increase file handle limits
6. use dedicated OS/logging disk
7. sane name resolution
http://tiny.cloudera.com/7steps
50. Recommended reading
• Hadoop Operations, http://amzn.to/1ydMrLf
51. Questions?
Preferably related to the talk…
52. Thanks!
Alex Moundalexis | @technmsg
53. 8. Bonus Round
Because we have enough time (or I talked really fast)…
54. Other things to check
• Disk IO
  • hdparm
    # hdparm -Tt /dev/sdc
  • Looking for at least 70 MB/s from 7200 RPM disks
  • Slower could indicate a failing drive, disk controller, array, etc.
  • dd
    • http://romanrm.ru/en/dd-benchmark
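The dd side of that benchmark can look like the sketch below: a sequential write with fdatasync so the page cache doesn't flatter the number (GNU dd on Linux assumed; put the test file on the disk you're measuring):

```shell
# Write 64 MB and report throughput; dd prints its stats on stderr.
TESTFILE=./dd-test.tmp
result=$(dd if=/dev/zero of="$TESTFILE" bs=1M count=64 conv=fdatasync 2>&1 | tail -1)
rm -f "$TESTFILE"
echo "$result"   # e.g. "67108864 bytes (67 MB, 64 MiB) copied, 0.05 s, 1.3 GB/s"
```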
55. Other things to check
• Disable Red Hat Transparent Huge Pages (RH6+ until 6.5)
  • Can reduce elevated CPU usage
• In rc.local:
  echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
  echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
• Reference: Linux 6 Transparent Huge Pages and Hadoop Workloads, http://goo.gl/WSF2qC
56. Other things to check
• Enable Jumbo Frames
  • Only if your network infrastructure supports it!
  • Can easily (and arguably) boost throughput by 10-20%
57. Other things to check
• Enable Jumbo Frames
  • Only if your network infrastructure supports it!
  • Can easily (and arguably) boost throughput by 10-20%
• Monitor and Chart Everything
  • How else will you know what's happening?
  • Nagios
  • Ganglia
58. Questions?
Preferably related to the talk…
59. Thanks!
Alex Moundalexis | @technmsg