44. Resources...
Declare a description of the state a part of the node should be in
‣ Have a type
‣ Have a name
‣ Have parameters
‣ Take action to put the resource in the declared state

package "apache2" do
  version "2.2.11-2ubuntu2.6"
  action :install
end

template "/etc/apache2/apache2.conf" do
  source "apache2.conf.erb"
  owner "root"
  group "root"
  mode 0644
  action :create
end

http://www.flickr.com/photos/xiaming/382205902/sizes/l/
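The anatomy above (a type, a name, parameters, and an action) can be sketched in plain Ruby outside of Chef. This is an illustrative model only, not Chef's actual class hierarchy:

```ruby
# Illustrative sketch only -- not Chef's real Resource implementation.
Resource = Struct.new(:type, :name, :parameters, :action)

apache_pkg = Resource.new(
  :package,                          # type
  "apache2",                         # name
  { version: "2.2.11-2ubuntu2.6" },  # parameters
  :install                           # action
)

puts "#{apache_pkg.type}[#{apache_pkg.name}] -> #{apache_pkg.action}"
```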
50. Providers...
Know how to actually perform the actions specified by a resource.
‣ Multiple providers per resource type: Apt, Yum, RubyGems, Portage, MacPorts, FreeBSD Ports, etc.
http://www.flickr.com/photos/affableslinky/562950216/
57. Recipes...
Apply resources in the order they are specified
‣ Evaluates resources in the order they appear
‣ Adds each resource to the Resource Collection

package "apache2" do
  version "2.2.11-2ubuntu2.6"
  action :install
end

template "/etc/apache2/apache2.conf" do
  source "apache2.conf.erb"
  owner "root"
  group "root"
  mode 0644
  action :create
end

[
  "package[apache2]",
  "template[/etc/apache2/apache2.conf]"
]

http://www.flickr.com/photos/roadsidepictures/2478953342/sizes/o/
I’ve been a developer and system administrator for well over 10 years. In that time I’ve worked in a number of environments, from mom & pop startups to huge enterprise software shops. I’ve built fully automated infrastructures for internal and external use and hacked on everything in between. Now, I do training, services and evangelism for Opscode.
Why did you come today, what do you hope to learn?
DevOps, more than just a buzzword. It’s Developers and Operations working together. That might sound obvious, but it’s not.
To quote Tim O’Reilly, DevOps is the ability to create and deploy reliable software to an unreliable platform that scales horizontally.
DevOps is a cultural movement in Development and Operations. It’s Agile realized at the business level, not just in Development. It’s about building trust between Dev and Ops. Development can’t throw code over the fence and expect it to just work anymore; they need to be responsible for performance (and get those guys pagers). Operations can’t justify “uptime” above the business; they need to work with Development to make sure the business is rolling out features. Put them in the same space and they’re on the same team.
Once you have people working together, you’ve got to trust them to get things done. Enable each member of your team to have a voice and you’ll get better results.
Back it up with metrics. Don’t just monitor for health, monitor for production. Once you’ve got numbers you can make steady change and understand your results.
You need to be thinking in terms of automating everything you can, so value can be derived from development and operations and you can get down to business instead of tweaking and tinkering. Hand-tuning a dozen machines should not be your business’ edge.
Your infrastructure is not a unique snowflake. With very few exceptions, there is no secret sauce in building servers. Let’s focus on deploying applications in a repeatable, continuous fashion. Infrastructure as Code means that you can tear down and replace your business from version control, data backups and bare metal resources. Want to run on Rackspace instead of EC2? Let’s do it in an hour instead of weeks. How are you going to make this happen?
At a high level, Chef is a Ruby library for managing infrastructure primitives. It is a systems integration platform built for scale.
Chef gives you the primitives to answer the question: how do you want to model your data? It configures your systems, integrates them together, and gives you an API you can use to work with your infrastructure.
Idempotent.
Data driven means...
Most users start with the default configurations, because they’re field-tested and peer-reviewed.
Apache licensed, with well over 200 external contributors. Thriving and active user base.
There’s More Than One Way To Do It.
It’s a Perl motto, but it holds true. We give you the tools, you decide how to work it.
Let’s talk about how Chef works.
An agent executable wrapping the libraries; it configures your system with those libraries.
The Chef Server is a publishing system. You store data on the server, and it provides an API to access and search the data.
We use CouchDB because it stores JSON and has a nice REST API.
Chef is open source, and we have a product called the Opscode Platform. It has the same API as the Open Source Chef Server.
A node is an abstraction of a server. With the Chef Server, node state data is persisted between runs. The edge node does all the heavy lifting.
Attributes == data.
Roles are another abstraction that describe a set of configuration functionality for nodes: webserver, loadbalancer, database master, etc.
Resources are an abstraction we feed data into. When you write recipes in Chef, you create resources for the things you want to configure.
The abstraction over the commands or API calls that will configure the resource to be in the state you have defined.
These actions are relevant to the provider: commands or API calls made to configure the resource. Package resources can have many different providers.
Providers can be platform specific. Resources are mapped via the platform to the correct provider.
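The platform-to-provider mapping can be sketched in plain Ruby. The lookup table below is illustrative of the idea only; Chef’s real mapping covers more platforms and resource types:

```ruby
# Illustrative platform -> provider lookup for the package resource type.
# This is a sketch of the concept, not Chef's actual resolution code.
PACKAGE_PROVIDERS = {
  "ubuntu"   => :apt,
  "debian"   => :apt,
  "centos"   => :yum,
  "gentoo"   => :portage,
  "mac_os_x" => :macports,
  "freebsd"  => :freebsd_ports
}

def provider_for(resource_type, platform)
  raise ArgumentError, "only :package sketched here" unless resource_type == :package
  PACKAGE_PROVIDERS.fetch(platform)  # raises KeyError for unknown platforms
end

# The same package resource converges via different providers per platform.
puts provider_for(:package, "ubuntu")
puts provider_for(:package, "centos")
```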
The order of resources in a recipe, and the order of the recipes applied in run lists.
Sidebar about the “Why Order Matters: Turing Equivalence in Automated Systems Administration” paper and RPM installation.
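The two-phase model (evaluate the recipe to build an ordered Resource Collection, then act on it in that order) can be sketched in plain Ruby; the names here are illustrative, not Chef internals:

```ruby
# Illustrative two-phase run: compile in order, then converge in order.
resource_collection = []

# Compile phase: evaluating the recipe appends resources as they appear.
resource_collection << "package[apache2]"
resource_collection << "template[/etc/apache2/apache2.conf]"

# Converge phase: take action on each resource in the same order.
resource_collection.each do |resource|
  puts "Converging #{resource}"
end
```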
Cookbooks encapsulate all the components that recipes need to configure the infrastructure.
Cookbooks are a directory of code components. Recipes are the core libraries you use to configure something, but cookbooks can contain other libraries used in recipes.
Find and share cookbooks on cookbooks.opscode.com.
Bags and items in the bags. Anyone play D&D, NWN, etc.? Bag of holding!
Users, application information, network info, cabinet/rack locations. Describe components of your infrastructure with data, and use that data to configure systems.
Freeform, describes a user.
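A data bag item is freeform JSON. A hypothetical item in a “users” bag might look like this (all field names besides `id` are illustrative, not a required schema):

```json
{
  "id": "alice",
  "comment": "Alice Example",
  "uid": 2001,
  "shell": "/bin/bash",
  "groups": ["sysadmin", "deploy"]
}
```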
You can use data bags in recipes!
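In a recipe you would typically fetch the bag and declare one resource per item. Here is a minimal plain-Ruby sketch of that loop, with the bag contents inlined since there is no Chef Server in this example (the item shape and the generated strings are illustrative):

```ruby
# Inlined stand-in for the items a "users" data bag might return.
users_bag = [
  { "id" => "alice", "shell" => "/bin/bash" },
  { "id" => "bob",   "shell" => "/bin/zsh" }
]

# The same shape as a recipe loop that declares one user resource per item.
declared = users_bag.map do |item|
  "user[#{item["id"]}] shell=#{item["shell"]}"
end

puts declared
```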
Knife is the “swiss army knife” tool of Chef. It primarily works with the Chef Server API, but it can also interact with other APIs such as cloud providers.
Knife can be used for many things; search is one of them.
Now that we’ve covered the basics of Chef, let’s see how to use Chef to automate deploying Hadoop clusters to Amazon EC2 with Cluster Chef.
If you’re not familiar with Hadoop... well, you came to the right place. HDFS is a distributed file system that provides high-throughput access to application data, and MapReduce is a programming framework for writing applications that rapidly process vast amounts of data in parallel.
A typical Hadoop cluster has two master components: the NameNode and the JobTracker. The NameNode manages the file system metadata and the DataNodes store the actual data. You can have 1 or more DataNodes in your cluster. The Secondary NameNode cleans up the metadata; it’s deprecated and has been replaced by other node types in more recent versions of Hadoop. Flip can tell you more about that later.
The JobTracker manages the job queue, scheduling and organizing work for the TaskTracker nodes. You can have 1 or more TaskTracker nodes in your cluster.
Knife is ready to go; we set up Cluster Chef as outlined in the prerequisites.
We download the cookbooks that were shared on the cookbook site, but we upload them to the Chef Server. These are discrete and separate; the nodes running Chef don’t talk to the cookbooks site. The Cluster Chef repository bundles up several cookbooks from Opscode’s repository and provides a number of its own in “site-cookbooks”. You may remember editing this in the Prerequisites.
Cluster Chef has cookbooks for Hadoop, Cassandra, HBase, R, Hive, Pig and more. Cookbooks contain recipes; recipes are how systems are configured. You add recipes to Roles or the run_list to get the behavior you want.
Roles contain attributes and our run_lists. You add a role to your nodes to get the behavior you want. Ordering is important!
Cluster Chef adds a layer over Chef’s Roles, managing the creation and naming of the nodes and ensuring enough of them are created. Let’s just focus on the Roles though. The “master” facet uses the “hadoop_master” role, making our master a combination of the namenode, secondary namenode and jobtracker. For our example, our master is also a “hadoop_worker”. This works for our small-scale demo, but you could easily put different Hadoop components on different nodes as you scale up and need dedicated servers for each service.
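A Chef role is just a small Ruby file declaring a name and an ordered run_list. A hypothetical sketch of what a “hadoop_master” role could look like (the recipe names below are illustrative of the pattern, not necessarily Cluster Chef’s actual files):

```ruby
# Hypothetical roles/hadoop_master.rb -- recipe names are illustrative.
name "hadoop_master"
description "Runs the NameNode, Secondary NameNode and JobTracker"
run_list(
  "recipe[hadoop_cluster::namenode]",
  "recipe[hadoop_cluster::jobtracker]"
)
```

Because run_lists are ordered, the namenode recipe here would converge before the jobtracker recipe on every node holding this role.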
Provisioning is the first step. Usually Chef is going to launch the machines individually, but Cluster Chef allows you to launch them in bulk. We need some computers on the internet; for our demonstration they’re going to be a Hadoop master and worker nodes, but they could easily be load balancers, webservers, database servers or whatever. We launch those with a cloud API. Every cloud has one. Chef talks to clouds via the Fog library.
Test our knife cluster command. If all of our prerequisites are in place, this is going to work just fine.
Kinda exciting, isn’t it? Let’s take a look at the output and see what’s going on... Cluster Chef is going to create the EC2 Security Groups we need, get our vanilla Ubuntu 10.04 AMI launched, and bootstrap it with Chef. It extends the functionality of “knife ec2 server create”.
knife ec2 server create is the typical way to create our servers; we’re letting Cluster Chef manage them for us instead.
For some reason, the initial startup is still finicky, but it’s at least down to only two passes for Hadoop. Flip can talk about this if he wants; for now it’s up so you can run it. We’re going to use knife to search for our hadoop_master, stop Hadoop, fix some permissions and re-run chef-client.
Now let’s get our workers working and ready to go. We’re going to use Cluster Chef to launch 2 workers, as outlined in our demohadoop.rb cluster file.
We need to open up the proxy server so we can spelunk a bit on the cluster.
We now have our 3-node cluster up and running, with minimal touch. The really exciting thing here is that this is easy to deploy and expand; it’s predictable and repeatable. We could add further instrumentation to automatically start working on our data. For now, Flip’s going to give us a couple minutes of hands-on demonstration.