Open source projects like OpenShift Origin make heavy use of Ansible for successful and repeatable deployments. See the best practices they've developed for implementing maintainable Ansible at scale and under public scrutiny.
Ansible Best Practices at Scale
1. ANSIBLE BEST PRACTICES AT SCALE
LEARNING THE 10 BEST PRACTICES USED
BY LEADING OSS THAT DEPEND ON
ANSIBLE
Keith Resar
@KeithResar
2. Keith Resar: Bio
@KeithResar Keith.Resar@RedHat.com
Wear many hats
● Coder
● Open Source Contributor and Advocate
● Infrastructure Architect
3. ANSIBLE WAS MADE TO HELP MORE
PEOPLE EXPERIENCE THE POWER OF
AUTOMATION SO THEY COULD WORK
BETTER AND FASTER TOGETHER
4. The Open Source Container Application
Platform.
Built around a core of Docker container
packaging and Kubernetes container cluster
management, Origin is also augmented by
application lifecycle management functionality
and DevOps tooling. Origin provides a
complete open source container application
platform.
6. ANSIBLE FILES SHOULD NOT USE JSON
(USE PURE YAML INSTEAD)
RULE 1
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#Ansible-files-SHOULD-NOT-use-JSON-use-pure-YAML-instead
YAML
- foo
- bar:
  - baz
  - kwa
  - 1.0
  - 2
JSON
[
  "foo",
  {
    "bar": [
      "baz",
      "kwa",
      1.0,
      2
    ]
  }
]
YAML is a superset of JSON, which means that Ansible allows JSON
syntax to be interspersed. Even though YAML (and by extension Ansible)
allows for this, JSON SHOULD NOT be used.
Reasons:
● Ansible is able to give clearer error messages when the files are pure
YAML
● YAML makes for nicer diffs as YAML tends to be multi-line, whereas
JSON tends to be more concise
● YAML reads more nicely (opinion?)
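As a rough illustration of the rule inside an actual task, the same module arguments can be written with JSON-style flow syntax or as pure YAML. This is a sketch, not taken from the openshift-ansible repository, and the package names are placeholders chosen for the example.

```yaml
# Hypothetical task for illustration only; httpd and mod_ssl are placeholder package names.

# ✘ BAD: JSON flow syntax mixed into the playbook
- name: Install web server packages
  yum: {"name": ["httpd", "mod_ssl"], "state": "present"}

# ✔ GOOD: pure YAML block syntax
- name: Install web server packages
  yum:
    name:
      - httpd
      - mod_ssl
    state: present
```

Adding a third package to the pure YAML form touches a single line in a diff, while the flow form rewrites the whole line.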
8. 3 OR MORE PARAMETERS TO ANSIBLE
MODULES SHOULD USE THE YAML
DICTIONARY FORMAT
RULE 2
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#Parameters-to-Ansible-modules-SHOULD-use-the-Yaml-dictionary-format-when-3-or-more-parameters-are-being-passed
# ✘ BAD
- file: src=/file/to/link/to dest=/path/to/symlink owner=foo group=foo state=link
# ✔ GOOD
- file:
    src: /file/to/link/to
    dest: /path/to/symlink
    owner: foo
    group: foo
    state: link
When a module has several parameters passed in on one line, it’s
hard to see exactly what value each parameter is getting.
It is preferred to use the Ansible YAML syntax to pass in parameters so
that it’s clearer what values are being passed for each parameter.
10. PARAMETERS TO ANSIBLE MODULES
SHOULD USE THE DICTIONARY FORMAT IF
LINES WOULD EXCEED 120 CHARACTERS
RULE 3
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#Parameters-to-Ansible-modules-SHOULD-use-the-Yaml-dictionary-format-when-the-line-length-exceeds-120-characters
# ✘ BAD
- get_url: url=http://example.com/path/file.conf dest=/etc/foo.conf sha256sum=b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c
# ✔ GOOD
- get_url:
    url: http://example.com/path/file.conf
    dest: /etc/foo.conf
    sha256sum: b5bb9d8014a0f9b1d61e21e796d78dc...d32812f4850b878ae4944c
Long lines quickly become a wall of text that isn’t easy to parse.
It is preferred to use the Ansible YAML syntax to pass in parameters so
that it’s clearer what values are being passed for each parameter.
12. THE ANSIBLE COMMAND MODULE SHOULD
BE USED INSTEAD OF THE ANSIBLE SHELL
MODULE
RULE 4
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#The-Ansible-command-module-SHOULD-be-used-instead-of-the-Ansible-shell-module
# ✘ POOR
- name: Bare shell execution
  shell: cat myfile
# BETTER
- name: Quoting templated variable to avoid injection
  shell: cat {{ myfile | quote }}
# ✔ BEST
- name: Using command so the value is never interpreted by a shell
  command: cat {{ myfile }}
If you want to execute a command securely and predictably, it may be
better to use the command module instead, using the shell module only
when explicitly required.
The Ansible shell module can run most commands that can be run from a
bash CLI. This makes it extremely powerful, but it also opens our
playbooks up to being exploited by attackers.
When running ad-hoc commands, use your best judgement.
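To make "only when explicitly required" concrete, the following sketch contrasts a task that needs no shell features with one that genuinely needs a pipe. The variable name process_name and both commands are assumptions made for illustration; they do not come from the slides.

```yaml
# Hypothetical tasks; process_name is an assumed variable used only for illustration.

# command: no shell is involved, so pipes, redirects and globbing are not interpreted
- name: Show the kernel version
  command: uname -r

# shell: used here only because the pipeline requires it; the templated value is quoted
- name: Count processes matching a name
  shell: ps -ef | grep {{ process_name | quote }} | wc -l
```

If a task can be expressed without pipes, redirects, or shell expansion, the command form is the safer default.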
14. ANSIBLE PLAYBOOKS MUST BEGIN WITH
CHECKS FOR ANY VARIABLES THAT THEY
REQUIRE
RULE 5
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#Ansible-playbooks-MUST-begin-with-checks-for-any-variables-that-they-require
---
- hosts: localhost
  gather_facts: no
  tasks:
  - fail: msg="Playbook requires g_env to be set and non empty"
    when: g_env is not defined or g_env == ''
---
# tasks/main.yml
- fail: msg="Role requires arl_env to be set and non empty"
  when: arl_env is not defined or arl_env == ''
If an Ansible playbook or role requires certain variables to be set, it’s best
to check for these up front before any other actions have been performed.
In this way, the user knows exactly what needs to be passed into the
playbook.
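The same up-front check can also be written with Ansible's assert module, which lets several conditions be listed together. This is an alternative sketch rather than the form shown in the slides; it reuses the g_env variable from the example above and places the check in pre_tasks so that it runs before any role.

```yaml
---
- hosts: localhost
  gather_facts: no
  pre_tasks:
    # Alternative sketch using the assert module instead of fail/when.
    - name: Verify required variables before doing anything else
      assert:
        that:
          - g_env is defined
          - g_env != ''
        msg: "Playbook requires g_env to be set and non empty"
```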
16. ANSIBLE TASKS SHOULD NOT BE USED IN
ANSIBLE PLAYBOOKS. INSTEAD, USE
PRE_TASKS AND POST_TASKS
RULE 6
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#Ansible-tasks-SHOULD-NOT-be-used-in-ansible-playbooks-Instead-use-pre_tasks-and-post_tasks
# ✘ BAD
- hosts: localhost
  tasks:
  - name: Executes AFTER the example_role, so it’s confusing
    debug: msg="in tasks list"
  roles:
  - role: example_role
# ✔ GOOD
- hosts: localhost
  pre_tasks:
  - name: Executes BEFORE the example_role, so it makes sense
    debug: msg="in pre_tasks list"
  roles:
  - role: example_role
An Ansible play is defined as a YAML dictionary, and because of that
Ansible doesn’t know whether the play’s tasks list or roles list was specified first.
Therefore, Ansible always runs tasks after roles.
This can be quite confusing if the tasks list is defined in the playbook
before the roles list, because people assume in-order execution in
Ansible.
Therefore, we SHOULD use pre_tasks and post_tasks to make it
clearer when the tasks will be run.
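A play that needs work both before and after its roles might be laid out as in the sketch below. The role name example_role is the placeholder already used in the slides, and the debug messages are illustrative only.

```yaml
# Sketch only: example_role is the placeholder role name used in the slides.
- hosts: localhost
  pre_tasks:
    - name: Runs first, before any role
      debug: msg="in pre_tasks list"
  roles:
    - role: example_role
  post_tasks:
    - name: Runs last, after every role
      debug: msg="in post_tasks list"
```

Written this way, the order of the sections in the file matches the order in which Ansible executes them.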
18. ALL TASKS IN A ROLE SHOULD BE TAGGED
WITH THE ROLE NAME
RULE 7
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#All-tasks-in-a-role-SHOULD-be-tagged-with-the-role-name
# roles/example_role/tasks/main.yml
- debug: msg="in example_role"
  tags:
  - example_role
Ansible tasks can be tagged, and then these tags can be used to either
run or skip the tagged tasks using the --tags and --skip-tags
ansible-playbook options respectively.
This is very useful when developing and debugging new tasks. It can also
significantly speed up playbook runs if the user specifies only the roles
that changed.
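For a role with more than one task, the convention might look like the sketch below. The package and service names are placeholders, and site.yml is assumed to be the master playbook from the directory layout shown for Rule 8.

```yaml
# roles/example_role/tasks/main.yml
# Sketch: every task in the role carries the role name as a tag.
# example-package and example-service are placeholder names.
- name: Install the package
  package:
    name: example-package
    state: present
  tags:
    - example_role

- name: Start the service
  service:
    name: example-service
    state: started
  tags:
    - example_role

# Run only this role's tasks:   ansible-playbook site.yml --tags example_role
# Run everything except them:   ansible-playbook site.yml --skip-tags example_role
```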
20. THE ANSIBLE ROLES DIRECTORY MUST
MAINTAIN A FLAT STRUCTURE
RULE 8
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#The-Ansible-roles-directory-MUST-maintain-a-flat-structure
production              # inventory file for production servers
staging                 # inventory file for staging environment
group_vars/
host_vars/
site.yml                # master playbook
webservers.yml          # playbook for webserver tier
dbservers.yml           # playbook for dbserver tier
roles/
    common/             # this hierarchy represents a "role"
        tasks/, handlers/, templates/, files/, vars/, defaults/, meta/
The purpose of this rule is to:
● Comply with the upstream best practices
● Make it familiar for new contributors
● Make it compatible with Ansible Galaxy
22. ANSIBLE ROLES SHOULD BE NAMED
TECH_COMPONENT[_SUBCOMPONENT]
RULE 9
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#Ansible-Roles-SHOULD-be-named-like-technology_component_subcomponent
roles/
    # this hierarchy represents a "role"
    common/
    # ✘ BAD
    database/
    # ✔ GOOD
    mysql_slave/
For consistency, role names SHOULD follow the above naming pattern. It
is important to note that this is a recommendation for role naming, and
follows the pattern used by upstream.
Many times the technology portion of the pattern will line up with a
package name. It is advised that whenever possible, the package name
should be used.
25. THE DEFAULT FILTER SHOULD REPLACE
EMPTY STRINGS, LISTS, ETC
RULE 10
@KeithResar
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#The-default-filter-SHOULD-replace-empty-strings-lists-etc
When using the Jinja2 default filter, unless the variable is a boolean,
specify true as the second parameter. This will cause the default filter to
replace empty strings, lists, etc. with the provided default, rather than
replacing only undefined variables.
This is because it is preferable to have a sane default set than to
have an empty string, list, etc. For example, it is preferable to have a
config value set to a sane default than to have it simply set as an empty
string.
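The difference the second parameter makes can be seen in a small sketch. The variables app_log_level and app_debug are hypothetical and exist only to illustrate the rule.

```yaml
# Sketch with hypothetical variables; app_log_level and app_debug are not from the slides.
- hosts: localhost
  gather_facts: no
  vars:
    app_log_level: ''      # defined, but empty
    app_debug: false       # a real boolean that must be preserved
  tasks:
    # Without the second parameter the empty string is kept, because the variable is defined.
    - debug:
        msg: "log level is '{{ app_log_level | default('info') }}'"

    # With true as the second parameter, empty values are replaced as well.
    - debug:
        msg: "log level is '{{ app_log_level | default('info', true) }}'"

    # Booleans are the exception: with true as the second parameter, a false value
    # would be treated as empty and replaced by the default.
    - debug:
        msg: "debug is {{ app_debug | default(false) }}"
```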