44. Resources...
Declare a description of the state a part of the node should be in
‣ Have a type
‣ Have a name
‣ Have parameters
‣ Take action to put the resource in the declared state

package "apache2" do
  version "2.2.11-2ubuntu2.6"
  action :install
end

template "/etc/apache2/apache2.conf" do
  source "apache2.conf.erb"
  owner "root"
  group "root"
  mode 0644
  action :create
end

http://www.flickr.com/photos/xiaming/382205902/sizes/l/
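The anatomy above (a type, a name, parameters, and an action) can be sketched in plain Ruby outside of Chef. This is an illustrative model only, not Chef's actual class hierarchy:

```ruby
# Illustrative sketch only -- not Chef's real Resource implementation.
Resource = Struct.new(:type, :name, :parameters, :action)

apache_pkg = Resource.new(
  :package,                          # type
  "apache2",                         # name
  { version: "2.2.11-2ubuntu2.6" },  # parameters
  :install                           # action
)

puts "#{apache_pkg.type}[#{apache_pkg.name}] -> #{apache_pkg.action}"
```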
50. Providers...
Know how to actually perform the actions specified by a resource.
‣ Multiple providers per resource type: Apt, Yum, RubyGems, Portage, MacPorts, FreeBSD Ports, etc.
http://www.flickr.com/photos/affableslinky/562950216/
57. Recipes...
Apply resources in the order they are specified
‣ Evaluates resources in the order they appear
‣ Adds each resource to the Resource Collection

package "apache2" do
  version "2.2.11-2ubuntu2.6"
  action :install
end

template "/etc/apache2/apache2.conf" do
  source "apache2.conf.erb"
  owner "root"
  group "root"
  mode 0644
  action :create
end

[
  "package[apache2]",
  "template[/etc/apache2/apache2.conf]"
]

http://www.flickr.com/photos/roadsidepictures/2478953342/sizes/o/
I’ve been a developer and system administrator for well over 10 years. In that time I’ve worked in a number of environments, from mom & pop startups to huge enterprise software shops. I’ve built fully automated infrastructures for internal and external use and hacked on everything in between. Now, I do training, services and evangelism for Opscode.
Why did you come today, what do you hope to learn?
DevOps, more than just a buzzword. It’s Developers and Operations working together. That might sound obvious, but it’s not.
To quote Tim O’Reilly, DevOps is the ability to create and deploy reliable software to an unreliable platform that scales horizontally.
DevOps is a cultural movement in Development and Operations. It’s Agile realized at the business level, not just in Development. It’s about building trust between Dev and Ops. Development can’t throw code over the fence and expect it to just work anymore; they need to be responsible for performance (and get those guys pagers). Operations can’t justify “uptime” above the business; they need to work with Development to make sure the business is rolling out features. Put them in the same space and they’re on the same team.
Once you have people working together, you’ve got to trust them to get things done. Enable each member of your team to have a voice and you’ll get better results.
Back it up with metrics. Don’t just monitor for health, monitor for production. Once you’ve got numbers you can make steady change and understand your results.
You need to be thinking in terms of automating everything you can, so value can be derived from development and operations and you can get down to business instead of tweaking and tinkering. Hand-tuning a dozen machines should not be your business’ edge.
Your infrastructure is not a unique snowflake. With very few exceptions, there is no secret sauce in building servers. Let’s focus on deploying applications in a repeatable, continuous fashion. Infrastructure as Code means that you can tear down and replace your business from version control, data backups and bare metal resources. Want to run on Rackspace instead of EC2? Let’s do it in an hour instead of weeks. How are you going to make this happen?
At a high level, Chef is a Ruby library for managing infrastructure primitives. It is a systems integration platform built for scale.
Chef gives you the primitives to answer the question: how do you want to model your data? It configures your systems, integrates them together, and gives you an API you can use to work with your infrastructure.
Idempotent.
Data driven means...
Most users start with the default configurations, because they’re field-tested and peer-reviewed.
Apache licensed, with well over 200 external contributors. Thriving and active user base.
There’s More Than One Way To Do It.
It’s a Perl motto, but it holds true. We give you the tools, you decide how to work it.
Let’s talk about how Chef works.
An agent executable wrapping the libraries; it configures your system with those libraries.
The Chef Server is a publishing system. You store data on the server, and it provides an API to access and search the data.
We use CouchDB because it stores JSON and has a nice REST API.
Chef is open source, and we have a product called the Opscode Platform. It has the same API as the Open Source Chef Server.
A node is an abstraction of a server. With the Chef Server, node state data is persisted between runs. The edge node does all the heavy lifting.
Attributes == data.
Roles are another abstraction that describe a set of configuration functionality for nodes: webserver, loadbalancer, database master, etc.
Resources are an abstraction we feed data into. When you write recipes in Chef, you create resources for the things you want to configure.
The abstraction over the commands or API calls that will configure the resource to be in the state you have defined.
These actions are relevant to the provider: commands or API calls made to configure the resource. Package resources can have many different providers.
Providers can be platform specific. Resources are mapped via the platform to the correct provider.
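The platform-to-provider mapping can be sketched in plain Ruby. The lookup table below is illustrative of the idea only; Chef’s real mapping covers more platforms and resource types:

```ruby
# Illustrative platform -> provider lookup for the package resource type.
# This is a sketch of the concept, not Chef's actual resolution code.
PACKAGE_PROVIDERS = {
  "ubuntu"   => :apt,
  "debian"   => :apt,
  "centos"   => :yum,
  "gentoo"   => :portage,
  "mac_os_x" => :macports,
  "freebsd"  => :freebsd_ports
}

def provider_for(resource_type, platform)
  raise ArgumentError, "only :package sketched here" unless resource_type == :package
  PACKAGE_PROVIDERS.fetch(platform)  # raises KeyError for unknown platforms
end

# The same package resource converges via different providers per platform.
puts provider_for(:package, "ubuntu")
puts provider_for(:package, "centos")
```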
The order of resources in a recipe, and the order of the recipes applied in run lists.
Sidebar about the “Why Order Matters: Turing Equivalence in Automated Systems Administration” paper and RPM installation.
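The two-phase model (evaluate the recipe to build an ordered Resource Collection, then act on it in that order) can be sketched in plain Ruby; the names here are illustrative, not Chef internals:

```ruby
# Illustrative two-phase run: compile in order, then converge in order.
resource_collection = []

# Compile phase: evaluating the recipe appends resources as they appear.
resource_collection << "package[apache2]"
resource_collection << "template[/etc/apache2/apache2.conf]"

# Converge phase: take action on each resource in the same order.
resource_collection.each do |resource|
  puts "Converging #{resource}"
end
```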
Cookbooks encapsulate all the components that recipes need to configure the infrastructure.
Cookbooks are a directory of code components. Recipes are the core libraries you use to configure something, but cookbooks can contain other libraries used in recipes.
Find and share cookbooks on cookbooks.opscode.com.
Bags and items in the bags. Anyone play D&D, NWN, etc.? Bag of holding!
Users, application information, network info, cabinet/rack locations. Describe components of your infrastructure with data, and use that data to configure systems.
Freeform, describes a user.
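A data bag item is freeform JSON. A hypothetical item in a “users” bag might look like this (all field names besides `id` are illustrative, not a required schema):

```json
{
  "id": "alice",
  "comment": "Alice Example",
  "uid": 2001,
  "shell": "/bin/bash",
  "groups": ["sysadmin", "deploy"]
}
```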
You can use data bags in recipes!
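In a recipe you would typically fetch the bag and declare one resource per item. Here is a minimal plain-Ruby sketch of that loop, with the bag contents inlined since there is no Chef Server in this example (the item shape and the generated strings are illustrative):

```ruby
# Inlined stand-in for the items a "users" data bag might return.
users_bag = [
  { "id" => "alice", "shell" => "/bin/bash" },
  { "id" => "bob",   "shell" => "/bin/zsh" }
]

# The same shape as a recipe loop that declares one user resource per item.
declared = users_bag.map do |item|
  "user[#{item["id"]}] shell=#{item["shell"]}"
end

puts declared
```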
Knife is the “swiss army knife” tool of Chef. It primarily works with the Chef Server API, but it can also interact with other APIs such as cloud providers.
Knife can be used for many things; search is one of them.
Now that we’ve covered the basics of Chef, let’s see how to use Chef to automate deploying Hadoop clusters to Amazon EC2 with Cluster Chef.
If you’re not familiar with Hadoop... well, you came to the right place. HDFS is a distributed file system that provides high-throughput access to application data, and MapReduce is a programming framework for writing applications that rapidly process vast amounts of data in parallel.
A typical Hadoop cluster has two master components: the NameNode and the JobTracker. The NameNode manages the file system metadata and the DataNodes store the actual data. You can have 1 or more DataNodes in your cluster. The Secondary NameNode cleans up the metadata; it’s deprecated and has been replaced by other node types in more recent versions of Hadoop. Flip can tell you more about that later.
The JobTracker manages the job queue, scheduling and organizing work for the TaskTracker nodes. You can have 1 or more TaskTracker nodes in your cluster.
Knife is ready to go; we set up Cluster Chef as outlined in the prerequisites.
We download the cookbooks that were shared on the cookbook site, but we upload them to the Chef Server. These are discrete and separate; the nodes running Chef don’t talk to the cookbooks site. The Cluster Chef repository bundles up several cookbooks from Opscode’s repository and provides a number of its own in “site-cookbooks”. You may remember editing this in the Prerequisites.
Cluster Chef has cookbooks for Hadoop, Cassandra, HBase, R, Hive, Pig and more. Cookbooks contain recipes; recipes are how systems are configured. You add recipes to Roles or the run_list to get the behavior you want.
Roles contain attributes and our run_lists. You add a role to your nodes to get the behavior you want. Ordering is important!
Cluster Chef adds a layer over Chef’s Roles, managing the creation and naming of the nodes and ensuring enough of them are created. Let’s just focus on the Roles though. The “master” facet uses the “hadoop_master” role, making our master a combination of the namenode, secondary namenode and jobtracker. For our example, our master is also a “hadoop_worker”. This works for our small-scale demo, but you could easily put different Hadoop components on different nodes as you scale up and need dedicated servers for each service.
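A Chef role is just a small Ruby file declaring a name and an ordered run_list. A hypothetical sketch of what a “hadoop_master” role could look like (the recipe names below are illustrative of the pattern, not necessarily Cluster Chef’s actual files):

```ruby
# Hypothetical roles/hadoop_master.rb -- recipe names are illustrative.
name "hadoop_master"
description "Runs the NameNode, Secondary NameNode and JobTracker"
run_list(
  "recipe[hadoop_cluster::namenode]",
  "recipe[hadoop_cluster::jobtracker]"
)
```

Because run_lists are ordered, the namenode recipe here would converge before the jobtracker recipe on every node holding this role.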
Provisioning is the first step. Usually Chef is going to launch the machines individually, but Cluster Chef allows you to launch them in bulk. We need some computers on the internet; for our demonstration they’re going to be a Hadoop master and worker nodes, but they could easily be load balancers, webservers, database servers or whatever. We launch those with a cloud API. Every cloud has one. Chef talks to clouds via the Fog library.
Test our knife cluster command. If all of our prerequisites are in place, this is going to work just fine.
Kinda exciting, isn’t it? Let’s take a look at the output and see what’s going on... Cluster Chef is going to create the EC2 Security Groups we need, get our vanilla Ubuntu 10.04 AMI launched, and bootstrap it with Chef. It extends the functionality of “knife ec2 server create”.
knife ec2 server create is the typical way to create our servers; we’re letting Cluster Chef manage them for us instead.
For some reason, the initial startup is still finicky, but it’s at least down to only two passes for Hadoop. Flip can talk about this if he wants; for now it’s up so you can run it. We’re going to use knife to search for our hadoop_master, stop Hadoop, fix some permissions and re-run chef-client.
Now let’s get our workers working and ready to go. We’re going to use Cluster Chef to launch 2 workers, as outlined in our demohadoop.rb cluster file.
We need to open up the proxy server so we can spelunk a bit on the cluster.
We now have our 3-node cluster up and running, with minimal touch. The really exciting thing here is that this is easy to deploy and expand; it’s predictable and repeatable. We could add further instrumentation to automatically start working on our data. For now, Flip’s going to give us a couple minutes of hands-on demonstration.