2. BIG DATA IS A STRATEGIC DECISION: CAPEX AND OPEX
[Diagram labels: Devices, Apps, Cloud]
2 | RIGHT SIZING HADOOP DEPLOYMENTS
3. HIGHER PERFORMANCE, LESS POWER AND SPACE
Hadoop Technology Stack
• Data Warehouse
• Data Analytics
• Management
• Data Access
• Data Processing
• Data Storage
4. SEAMICRO SM15000™ ACCELERATES APACHE™ HADOOP™ DEPLOYMENTS
• Superior high availability
  ‒ Active/standby NameNode
  ‒ Active/standby JobTracker
  ‒ Highly resilient fabric for inter-node east-west traffic
• Reduced downtime
  ‒ Remap or rezone disks to recover data
  ‒ Hot-swappable upgrades or component replacements
• Hardware redundancy
  ‒ Power supplies
  ‒ Network I/O
5. SM15000 OVERVIEW
64 HDDs/SSDs
• Share drives across all servers
• Assign one server to one or more drives as needed
• In-service upgrades as needed
64 Industry standard x86 servers
• AMD Opteron™, Intel Xeon®, Atom™
• Energy efficient processor
• 20 Gbps per socket, 16X traditional servers
960 terabytes Fabric Storage
• Extends supercompute fabric to external storage
• Up to 3.84 PB storage capacity; up to 960 3.5” SAS/SATA drives
• Map to any CPU, same as internal drives
160 Gbps Network I/O
• Share network I/O across all servers
• Eliminate TOR switch
• Minimize cabling
• In-service upgrades as needed
6. SEAMICRO FREEDOM™ FABRIC ASIC PROVIDES MASSIVE PERFORMANCE, REDUCES POWER AND SPACE
[Diagram labels: Freedom™, SeaMicro, IOVT, TIO, Freedom Supercompute Fabric]
BENEFITS
• Eliminates 90% of the components on a motherboard, shrinking power used, cost, and space
• Reduces the power used by any CPU by consolidating and shutting off unused functionality
• Provides massive bandwidth while eliminating power-hungry top-of-rack switches
7. FS 4060-L FABRIC STORAGE ENCLOSURE WITH ZONING CAPABILITY
• High-density, power-optimized 4U enclosure with 60 3.5” drives
• Up to 16 enclosures per SM15000, 960 drives, and 3.84 PB storage capacity
• Redundant controllers, ports, fans, and PSUs
• Supports cost-optimized SATA HDDs rated for 24x7 operation, for high-density Big Data and Object Storage deployments
• Optional configuration to logically partition an enclosure into two 30-drive (3.5”) enclosures
• Balanced disk-to-core ratio (1:1) for optimizing Hadoop performance
• Field configurable to provide utmost flexibility to balance density and performance
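The capacity figures in the list above follow from simple multiplication. The sketch below reproduces them in Python; the 4 TB drive size is an assumption borrowed from the reference architecture slide later in the deck, and the function name is illustrative only.

```python
# Sketch of the capacity arithmetic quoted for the FS 4060-L.
# ASSUMPTION: 4 TB per drive (from the reference architecture slide).

DRIVES_PER_ENCLOSURE = 60
MAX_ENCLOSURES_PER_SM15000 = 16
ASSUMED_DRIVE_TB = 4


def fabric_storage_totals(enclosures: int, drive_tb: int = ASSUMED_DRIVE_TB) -> tuple[int, int]:
    """Return (drive count, raw capacity in TB) for a number of enclosures."""
    if not 0 <= enclosures <= MAX_ENCLOSURES_PER_SM15000:
        raise ValueError("an SM15000 supports at most 16 enclosures")
    drives = enclosures * DRIVES_PER_ENCLOSURE
    return drives, drives * drive_tb


drives, raw_tb = fabric_storage_totals(MAX_ENCLOSURES_PER_SM15000)
print(drives, raw_tb / 1000)  # 960 drives, 3.84 PB in decimal units
```

With 4 enclosures the same function gives 240 drives and 960 TB raw.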
8. FREEDOM FABRIC DISAGGREGATES SERVER RESOURCES: PROVIDES FLEXIBILITY FOR EXPANSION AND INFRASTRUCTURE OPTIMIZATION
• SM15000 provides independent scaling of Compute, Storage, and Network
• Centrally managed provisioning of storage and network resources to compute nodes, enabled by CLI and API interfaces
[Diagram: SeaMicro SM15000 Server with a Compute and Memory Pool (CPUs), Fabric Interconnect, Shared Storage Pool, and Network Pool]
[Diagram: Hadoop Optimization cycle, labeled Deploy, Run and Analyze, Tune and Optimize, Cost and Performance]
9. SM15000 FLEXIBLE STORAGE ALLOWS ITERATIVELY OPTIMIZING APACHE® HADOOP™ DEPLOYMENT
• Flexible shared storage with commodity hardware, enabled by SeaMicro fabric technology
• Decoupled from Compute and Network to grow storage independently
• Iteratively optimize Hadoop disk-to-core ratio as application needs evolve
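One way to picture the iterative tuning described above is as a small calculation: given a core count, how many 60-drive enclosures are needed to reach a target disk-to-core ratio? The sketch below is illustrative only. The function names are hypothetical, and the core counts (256 and 512) are taken from the deployment table elsewhere in the deck.

```python
import math

DRIVES_PER_ENCLOSURE = 60  # FS 4060-L drive count, from the deck


def disk_to_core_ratio(drives: int, cores: int) -> float:
    """Disks per CPU core for a given configuration."""
    return drives / cores


def enclosures_for_ratio(cores: int, target_ratio: float = 1.0) -> int:
    """Smallest number of enclosures whose drives meet the target ratio."""
    return math.ceil(cores * target_ratio / DRIVES_PER_ENCLOSURE)


# Starting point: 4 enclosures (240 drives) behind 256 cores.
print(disk_to_core_ratio(240, 256))  # 0.9375
print(enclosures_for_ratio(256))     # 5
print(enclosures_for_ratio(512))     # 9
```

Because storage is decoupled from compute, moving from one ratio to another in this model means adding or reassigning enclosures rather than replacing servers.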
[Diagram: Traditional Rackmount (captive DAS with rigid storage-to-compute ratio) versus Freedom Fabric (flexible scale-out Fabric Storage up to 5 PB); Intel/AMD x86 servers]
10. APACHE® HADOOP™ DEPLOYMENT ON THE SEAMICRO SM15000™
[Diagram: Hadoop roles mapped onto one SM15000]
• Redundant NameNode and JobTracker: two HDFS NameNodes and two MapReduce JobTrackers, with ZooKeeper
• Up to 160 Gb/s bandwidth for data ingestion
• 60 DataNode/TaskTracker nodes
• 10 Gb/s network bandwidth per node
• 8 GB/s storage bandwidth
• Flexible storage capacity
11. SEAMICRO REFERENCE ARCHITECTURE FOR APACHE® HADOOP™
• Hadoop HA with NameNode and JobTracker in Active/Standby configuration
• Up to 60 DataNode/TaskTracker nodes, 512 x86 cores, 960 TB raw capacity, 10 Gb/s internode bandwidth, and 160 Gb/s uplink bandwidth in 28 RU and 5.8 kW
[Diagram labels: Highly Available Active/Standby NameNode; Highly Available Active/Standby JobTracker; DataNode/TaskTracker; SM15000 Internal Drives for NameNode and JobTracker]
Solution Components
• SeaMicro SM15000 chassis with AMD Opteron or Intel Xeon (Ivy Bridge) CPUs
• 32 GB memory per node
• 2 10GbE Network cards
• 8 Storage Controller cards
• 4 FS 4060-L enclosures with SAS zoning enabled
• 60 4TB 3.5” SAS or SATA drives per enclosure
• Any Hadoop distribution (CDH, HDP, MapR, Apache Hadoop, etc.)
• ZooKeeper for NameNode and JobTracker HA
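As a rough illustration of the active/standby NameNode arrangement, the sketch below assembles a subset of the HDFS high-availability properties documented for Apache Hadoop 2.x. It is not taken from the deck: the nameservice name, host names, and ports are hypothetical, required settings such as the shared edits directory and fencing method are omitted, and a given distribution may configure HA differently. JobTracker HA is distribution-specific and is not covered here.

```python
def namenode_ha_properties(nameservice, namenodes, zk_hosts):
    """Build a subset of HDFS HA properties for an active/standby pair.

    nameservice: logical cluster name (hypothetical value in the example)
    namenodes:   dict of NameNode ID -> RPC address (host:port)
    zk_hosts:    ZooKeeper servers used for automatic failover
    """
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        "dfs.ha.automatic-failover.enabled": "true",
        # ha.zookeeper.quorum normally lives in core-site.xml
        "ha.zookeeper.quorum": ",".join(zk_hosts),
    }
    for nn_id, rpc_address in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = rpc_address
    return props


# All names below are placeholders, not values from the reference architecture.
example = namenode_ha_properties(
    "hacluster",
    {"nn1": "nn1.example.com:8020", "nn2": "nn2.example.com:8020"},
    ["zk1.example.com:2181", "zk2.example.com:2181", "zk3.example.com:2181"],
)
for key, value in sorted(example.items()):
    print(f"{key} = {value}")
```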
12. SM15000 HADOOP DEPLOYMENT
Solution Components        SM15000 Intel Xeon       SM15000 AMD Opteron
Use case                   Performance Optimized    Cost Optimized
Racks                      <1                       <1
Servers                    64                       64
Cores                      256                      512
DRAM                       2 TB                     4 TB
Hard Drives                240 / 960 TB             240 / 960 TB
Cable Management           0 RU                     0 RU
ToR Switches               None                     None
Downlink Network Cables    None                     None
Data Nodes                 60                       60
Bandwidth per node         10 Gb/s                  10 Gb/s
Uplink bandwidth           Up to 160 Gb/s           Up to 160 Gb/s
Storage bandwidth          8 GB/s                   8 GB/s
13. SM15000 – INDUSTRY’S ONLY FLEXIBLE SYSTEM FOR OPTIMIZING HADOOP CLUSTERS
[Diagram: MapReduce job phases and the resource each one stresses]
• HDFS Input: Storage Intensive
• Map and Intermediate Data Write: Compute Intensive
• Shuffle: Network Intensive
• Reduce: Compute Intensive
• HDFS Output: Storage Intensive
SM15000 resources
• Compute: up to 512 x86 cores with 4 TB DRAM per Fabric Server in 10 RU
• Storage: flexible scale-out storage with over 1400 spindles and 5 petabytes of capacity
• Network: 10 Gbps inter-node bandwidth per server; 160 Gbps shared uplink for inter-rack traffic
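The bandwidth figures quoted for the SM15000 can be put in proportion with a little arithmetic. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes the 60 data nodes and 8 GB/s storage bandwidth quoted elsewhere in the deck, and it treats every node as transmitting at line rate at the same time, which real shuffle traffic rarely does. Function names are illustrative.

```python
def uplink_oversubscription(nodes: int, per_node_gbps: float, uplink_gbps: float) -> float:
    """Ratio of aggregate inter-node bandwidth to the shared uplink."""
    return nodes * per_node_gbps / uplink_gbps


def storage_mb_per_s_per_node(storage_gbytes_per_s: float, nodes: int) -> float:
    """Even share of storage bandwidth per node, in MB/s (decimal units)."""
    return storage_gbytes_per_s * 1000 / nodes


# Figures from the deck: 60 nodes, 10 Gbps per node, 160 Gbps uplink, 8 GB/s storage.
print(uplink_oversubscription(60, 10, 160))        # 3.75
print(round(storage_mb_per_s_per_node(8, 60), 1))  # 133.3
```

Under these assumptions, traffic that stays inside the chassis (such as shuffle between nodes on the fabric) sees the full 10 Gbps per node, while only inter-rack traffic contends for the shared uplink.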
14. SM15000 HADOOP PERFORMANCE BETTER THAN COMPETITIVE OFFERINGS
• 77% less power per node
• 30% less power per core
• 63% more data sorted per second per Watt than Large Vendor
                       SM15000             Large Vendor
Terasort Completion    7 min 13 seconds    8 min 33 seconds
Nodes                  62 incl. HA         18
CPU Cores              248                 216
Power                  5800 W              7200 W
MB/s per Watt          0.4                 0.24
MB/s per CPU core      9.3                 9.0
Watts/Node             94 W                400 W
Watts/Core             23 W                33 W
SeaMicro SM15000 with 64 nodes based on Intel Xeon® (Ivy Bridge) 1265L v2 CPU, 32 GB memory, and 2 3.5” 3TB SAS drives per node
Competitive solution consists of dual socket 2U rack-mount servers with Intel Xeon E5-2667 2.9 GHz octal core CPUs with 64 GB memory, 16 disks, and 4 GbE network links per node
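The per-node and per-core power figures, and the first two headline percentages, can be re-derived from the node, core, and power rows of the comparison. The sketch below does that in Python using only numbers from the slide; the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class TerasortRun:
    name: str
    nodes: int
    cores: int
    power_w: float

    @property
    def watts_per_node(self) -> float:
        return self.power_w / self.nodes

    @property
    def watts_per_core(self) -> float:
        return self.power_w / self.cores


def percent_less(ours: float, theirs: float) -> float:
    """How much smaller `ours` is than `theirs`, as a percentage."""
    return (1 - ours / theirs) * 100


sm15000 = TerasortRun("SM15000", nodes=62, cores=248, power_w=5800)
vendor = TerasortRun("Large Vendor", nodes=18, cores=216, power_w=7200)

print(round(sm15000.watts_per_node), round(vendor.watts_per_node))  # 94 400
print(round(sm15000.watts_per_core), round(vendor.watts_per_core))  # 23 33
print(round(percent_less(sm15000.watts_per_node, vendor.watts_per_node)))  # 77
print(round(percent_less(sm15000.watts_per_core, vendor.watts_per_core)))  # 30
```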
[Chart: Terasort sort rate per Watt (MB/s per Watt), SM15000 vs. Large Vendor]
15. WAYFAIR.COM: PERSONALIZED SHOPPING EXPERIENCE FOR “A ZILLION THINGS”
Applications: Apache® Hadoop™, SQL Server, PHP
• Challenge
  ‒ Space and power constraints hindered availability of “shared nothing” servers for development
  ‒ Traditional servers were too costly in space and power to supply the number of servers required to test application performance accurately
• Solution
  ‒ SeaMicro SM high density server
  ‒ 256 Intel® Xeon® cores in 10 RU system
  ‒ 64 servers, 1.28 Tbps SeaMicro Freedom™ Supercompute Fabric
• Results
  ‒ Reduced development cycles and shortened time to market for new products
  ‒ Increased productivity of development engineers by providing abundant access to “shared nothing” servers versus developing on virtualized server farms
  ‒ Eliminated unnecessary equipment such as top-of-rack switches and terminal servers; simplified network and power cabling
“The SeaMicro SM server is helping us operate at a large scale and fast pace. The key benefits are reduced operating costs and increased efficiency for our big data development infrastructure. It provides the highest density and flexibility while slashing energy consumption: 256 Intel Xeon cores, 64 hosts. It consumes 50 percent less power and doubled our computing capacity...”
Ben Clark, Director of Software Engineering
16. EHARMONY: INCREASE COMPUTING WHILE REDUCING TOTAL COST OF OWNERSHIP
Applications: Apache® Hadoop™
• Challenge
  ‒ Provide cost-effective computing platform for Apache Hadoop
  ‒ Reduce costs incurred from external cloud computing
• Solution
  ‒ SeaMicro SM10000-64 high density server
  ‒ 512 Intel® Atom™ cores in 10 RU system
• Results
  ‒ Reduce TCO by more than 74 percent
  ‒ Save thousands per month spent on cloud computing service
  ‒ Utilize computing resources 7 x 24
“We purchased SeaMicro servers and immediately reduced our operating expenses…The system has been in place for over two years, and we have had zero down time.”
Cormac Twomey, Data Center Operations
17. AMD SEAMICRO PARTNERS WITH INDUSTRY LEADERS
[Partner logo categories: Hadoop Partner, OS/Hypervisor, OpenStack]
19. FS 4060-L SAS ZONING
• SAS zoning allows the logical partitioning of an FS 4060-L enclosure into two 30-disk enclosures in 4U
  ‒ Provides a fully independent end-to-end path to all 30 drives in each zone
  ‒ Up to 2 S-cards connected to a storage enclosure
[Diagram: two SM15000 S-cards, one connected to Zone 1 and one to Zone 2 of an FS 4060-L with SAS zoning]
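The partitioning described above can be modeled as splitting 60 drive slots into two disjoint groups of 30, each reached through its own S-card. The sketch below is a conceptual model only: the slot numbering (0 to 59) and the S-card labels are hypothetical, and it does not represent the SM15000 management CLI or API.

```python
def zone_drives(total_drives: int = 60, zones: int = 2) -> dict[int, range]:
    """Split drive slots into equal, non-overlapping zones (numbered from 1)."""
    if total_drives % zones:
        raise ValueError("drives must divide evenly across zones")
    per_zone = total_drives // zones
    return {z + 1: range(z * per_zone, (z + 1) * per_zone) for z in range(zones)}


zones = zone_drives()

# Hypothetical labels: one S-card per zone gives each zone its own path.
s_card_for_zone = {1: "s-card-A", 2: "s-card-B"}

for zone_id, slots in zones.items():
    print(f"Zone {zone_id}: slots {slots.start}-{slots.stop - 1} via {s_card_for_zone[zone_id]}")
```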