3. Operating Systems
• Manage and schedule machine resources
  • CPU
  • RAM
  • Memory
• Provide abstractions and APIs
  • Files = stream of bytes
  • Process = instructions + private memory space
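
The two abstractions above can be illustrated with a minimal Python sketch. It is not from the slides: the helper names `round_trip_bytes` and `run_child_process` are hypothetical, and it assumes a Python 3.7+ interpreter reachable via `sys.executable`. The first helper treats a file as nothing more than a stream of bytes; the second runs instructions in a separate process whose variables do not touch the parent's.

```python
import os
import subprocess
import sys
import tempfile


def round_trip_bytes(payload, chunk_size=4):
    """Write bytes to a temporary file, then read them back as a stream."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(payload)
        path = handle.name
    try:
        chunks = []
        with open(path, "rb") as stream:
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:  # an empty bytes object marks end of stream
                    break
                chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.remove(path)


def run_child_process():
    """Run instructions in a separate process with its own memory space."""
    value = 1  # lives in the parent's memory
    # The child is a new interpreter; its `value` is unrelated to ours.
    child = subprocess.run(
        [sys.executable, "-c", "value = 2; print(value)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return value, int(child.stdout)
```

The file has no structure beyond the order of its bytes, and the child's assignment never changes the parent's `value`, which is the sense in which the memory space is private.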
4. Distributed Operating System
• Same thing, but over a cluster of networked servers
• Additional concerns:
  • Inter-process and inter-machine communication
  • Data locality
  • Data availability
  • Data processing availability
5. Hadoop
• De facto Distributed Operating System
• Apache HDFS
• Apache MapReduce and Apache YARN
6. Ecosystem
• Key Value Stores
• High Level Batch Languages
• Low Latency SQL Engine
• Graph Processing
23. System Logs
• Id
  • Unique id for an action
• Timestamp
  • Time the action occurred
• Actor
  • User or system performing the action
• Action
  • The action taken
• Object
  • The object of the action
• Info
  • Free-form information (e.g. success/failure, attribute value, etc.)
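
The six fields above map naturally onto a small record type. The following Python sketch is illustrative only: the class name `LogEvent`, the field names `event_id` and `obj`, and the tab-separated line layout are assumptions, not something the slides specify.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class LogEvent:
    event_id: str        # Id: unique id for the action
    timestamp: datetime  # Timestamp: time the action occurred
    actor: str           # Actor: user or system performing the action
    action: str          # Action: the action taken
    obj: str             # Object: the object of the action
    info: str = ""       # Info: free-form details (success/failure, etc.)

    def to_line(self) -> str:
        """Serialize as one tab-separated line (a hypothetical layout)."""
        return "\t".join([
            self.event_id,
            self.timestamp.isoformat(),
            self.actor,
            self.action,
            self.obj,
            self.info,
        ])

    @classmethod
    def from_line(cls, line: str) -> "LogEvent":
        """Parse a line produced by to_line(); info may itself contain tabs."""
        event_id, ts, actor, action, obj, info = line.rstrip("\n").split("\t", 5)
        return cls(event_id, datetime.fromisoformat(ts), actor, action, obj, info)


# Example values are made up for illustration.
example = LogEvent(
    "42",
    datetime(2013, 6, 1, 12, 0, tzinfo=timezone.utc),
    "joey",
    "login",
    "webserver01",
    "success",
)
```

Keeping Info as the last, free-form field means new details can be recorded without changing the positions of the other five.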
30. Recap
• Accumulo 1.3.x, 1.4.x, and 1.5.x all work with CDH3
• Accumulo 1.5.x should work with CDH4
31. Cloudera Support
• Naturally, Cloudera has tested and packaged Accumulo 1.5…
• But 1.5 is rather bleeding edge…
• So, we instead backported Hadoop 2.0 support from 1.5 onto 1.4.3
37. Recap
• What’s available today
  • Beta release of Accumulo 1.4.3 on CDH4.3
  • Beta release of Accumulo 1.4.3 Pig integration
• Semi-private beta
  • Contact me (joey@cloudera.com) if you’re interested in trying out the bits
39. What next?
• Download Hadoop!
• CDH available at www.cloudera.com
• Cloudera provides pre-loaded VMs
  • https://ccp.cloudera.com/display/SUPPORT/Cloudera+QuickStart+VM
• Reach out to me (joey@cloudera.com) if you want to try out the Accumulo beta
• Instructions to replicate the demos pending
40. My personal preference
• Cloudera Manager
  • https://ccp.cloudera.com/display/SUPPORT/Downloads
• Free up to unlimited nodes!
41. Shout Out
• Jason Trost
  • @jason_trost
• covert.io blog posts
  • http://www.covert.io/post/18414889381/accumulo-nutch-and-gora
  • http://www.covert.io/post/18605091231/accumulo-and-pig