Ansible Galaxy Collections and Source Control
TL;DR
The location used for storing and loading Ansible collections can be changed with COLLECTIONS_PATH. I use this to store Ansible collections in source control.
Background
I configure at least 20 of my own Linux hosts with Ansible, including personal laptops. I maintain one playbook[1] with many roles. This is stored in a self-hosted Git repository.
Rather than install Ansible, I check out Ansible as a Git submodule[2].
This means I can run Ansible by installing the prerequisite Python modules (which get installed on my laptops etc. by Ansible) and simply running git clone --recurse-submodules mygitserver:myansiblerepository.
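Pinning the submodule to a release rather than the default branch needs a tag name. The following is a minimal sketch of one way to pick it, not necessarily how I do it: latest_stable_tag is a hypothetical helper, and it assumes release tags look like v2.10.5 while pre-releases carry a suffix such as rc1 or b1.

```shell
#!/bin/bash
# Hypothetical helper: read tag names on stdin and print the newest
# stable one. Assumes release tags are "v" plus dotted numbers and that
# pre-release tags carry a suffix (a1, b1, rc1), so they are filtered out.
latest_stable_tag() {
    grep -E '^v[0-9]+(\.[0-9]+)+$' | sort -V | tail -n 1
}

# Intended usage from the root of the configuration repository, where the
# submodule lives in ./ansible (not run here):
#   tag=$(git -C ansible tag --list | latest_stable_tag)
#   git -C ansible checkout "$tag"
#   git add ansible && git commit -m "Pin Ansible Core to $tag"
```

The version sort (sort -V) matters here because a plain lexical sort would place v2.9.x after v2.10.x.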
In the lead-up to the launch of Ansible 3.0, the Ansible project has moved away from including every Ansible plugin and module within the core project. Now nearly all plugins and modules live in Ansible Galaxy collections.
This means that it is no longer sufficient for me to simply use Ansible (now Ansible Core) as a Git submodule.
Ansible Galaxy Collections
Ansible Galaxy is the official hub for sharing Ansible content.
Collections are a distribution format for packaging and distributing Ansible Content, for example playbooks, roles, modules and plugins.
On upgrading to Ansible version 2.10, a number of modules I use in many roles had been migrated from Ansible Core into Ansible Galaxy Collections.
By way of example, I use the synchronize module (a wrapper for rsync) in scenarios where a number of files have to be copied onto a host. It is faster than using the Ansible copy module.
- name: copy cinnamon configuration
  synchronize:
    src: "files/home/myuser/.cinnamon"
    dest: /home/myuser/
    recursive: yes
    perms: yes
    owner: yes
    group: yes
    use_ssh_args: true
    rsync_opts:
      - "--no-motd"
  when: inventory_hostname not in groups['ltspServers']
In order to use this module now, I execute ansible-galaxy collection install ansible.posix before running Ansible.
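When more than one collection is needed, the same install step can read the names from a file instead of the command line. This is a sketch under assumptions: write_requirements is a hypothetical helper, and requirements.yml is only the conventional file name. Only ansible.posix is listed because it is the only collection named in this post.

```shell
#!/bin/bash
# Hypothetical helper: write a collection requirements file to the path
# given as the first argument. Further collections would be added as
# extra "- name:" entries.
write_requirements() {
    cat > "$1" <<'EOF'
---
collections:
  - name: ansible.posix
EOF
}

# Intended usage (not run here, needs network access):
#   write_requirements requirements.yml
#   ansible-galaxy collection install -r requirements.yml
```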
Relying on Ansible Collections
I understand why the Ansible developers have made these changes. Maintaining potentially hundreds of plugins within the core codebase of a project is not something that will scale well, especially when that project is successful and becoming widely adopted.
There are two minor drawbacks I see with this change.
Drawbacks
1. Trust
I had a higher degree of trust in any Ansible module that was included in Ansible, if only because Ansible (the company) was acquired by Red Hat in 2015. Any code accepted into the Ansible code base should be subject to a degree of scrutiny, if only in theory to protect Red Hat’s reputation.
That said, blind trust of code in any configuration management tool is probably a bad idea, subject to the threat model any person or organisation is working to. Auditing the code used in any Ansible modules utilised in a role/playbook is probably a good idea.
2. Installing Collections on Multiple Hosts/User Accounts
By default, Ansible Galaxy installs collections into ~/.ansible/collections. That means the collections have to be re-installed for each new host or user account.
In all, I use Ansible modules from four different collections. I don’t want to be in a position where I have to manually install them (or update them) each time I run Ansible from a new or freshly rebuilt host.
Ansible, as dependably as ever, provides a solution to the above problem.
Storing Ansible Collections in Source Control
The locations Ansible and Ansible Galaxy use for saving and loading collections are configurable.
I use a shell script (env.sh), which is stored at the root of my Ansible configuration repository, to set up my Ansible runtime environment. Today, this looks something like:
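#!/bin/bash
# Meant to be dot-sourced, so these options persist in the calling shell.
set -eu
# Directory containing this script (the root of the repository).
x=$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")
# Set up the environment for running Ansible from the submodule checkout.
. "${x}/ansible/hacking/env-setup"
export PATH="${PATH}:${x}/ansible/bin"
export ANSIBLE_INVENTORY="${x}/hosts"
export ANSIBLE_HOST_KEY_CHECKING=True
export ANSIBLE_REMOTE_USER=daniel
export ANSIBLE_BECOME=True
export ANSIBLE_BECOME_METHOD=sudo
export ANSIBLE_BECOME_USER=root
export ANSIBLE_BECOME_ASK_PASS=True
export ANSIBLE_ASK_VAULT_PASS=True
export ANSIBLE_SSH_PIPELINING=True
# Install and load collections from inside the repository.
export ANSIBLE_COLLECTIONS_PATH="${x}/collections"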
$x is set dynamically at run time to the directory that contains this shell script, e.g. ~/ansible.
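The reason this works even when the script is dot-sourced is that BASH_SOURCE[0] names the file being sourced, whereas $0 would name the calling shell. A small self-contained sketch of that behaviour follows; demo_dir and where.sh are hypothetical names used only for the demonstration.

```shell
#!/bin/bash
# Sketch: a sourced file locating its own directory.
# demo_dir and where.sh are throwaway names for this demo only.
demo_dir=$(mktemp -d)

cat > "${demo_dir}/where.sh" <<'EOF'
# Same expression as in env.sh, with the expansions quoted.
script_dir=$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")
EOF

# Source the file from an unrelated working directory inside a child bash,
# so the current shell's directory and variables are left untouched.
resolved=$(cd / && bash -c '. "$1" && echo "$script_dir"' _ "${demo_dir}/where.sh")

# Prints the temporary directory, not "/" and not the shell's own path.
echo "${resolved}"
```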
The very last line in the shell script sets the Ansible environment variable for COLLECTIONS_PATH.
In my testing, I dot-source the above script (. ~/ansible/env.sh) into my environment and then run ansible-galaxy collection install ansible.posix.
The ansible.posix collection is now installed to ~/ansible/collections/ansible_collections/ansible/posix/ instead of ~/.ansible/collections/ansible_collections/ansible/posix/.
After installing the collection, I add it to version control:
$ git add ~/ansible/collections/ansible_collections/ansible/posix
$ git commit -m 'synchronize module moved to ansible.posix collection'
$ git push
In future, when I check out my Ansible configuration Git repository, it includes both the Ansible Core submodule and the Ansible Galaxy collections I rely on.
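After a fresh checkout it can be useful to confirm which collections actually came along with the repository. The following sketch relies only on the ansible_collections/<namespace>/<name> layout shown above; list_vendored_collections is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the name (namespace.name) of every collection
# found under the given collections path, one per line.
list_vendored_collections() {
    local root="$1" dir
    for dir in "$root"/ansible_collections/*/*/; do
        # An unmatched glob stays literal, so skip anything that is
        # not a real directory.
        [ -d "$dir" ] || continue
        dir=${dir%/}
        printf '%s.%s\n' "$(basename "$(dirname "$dir")")" "$(basename "$dir")"
    done
}

# Intended usage from the repository root (not run here):
#   list_vendored_collections ./collections
```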
Using Fully Qualified Collection Names
If using an Ansible Core version greater than or equal to 2.10, I think it is a good idea to start using the FQCN (Fully Qualified Collection Name) in playbooks whenever applicable.
Although Ansible 2.10 included a feature to map modules specified by short name to their respective FQCN, I suspect that as modules get merged, broken out and renamed, older plugin routes are not going to be maintained indefinitely.
By way of example, in the earlier demonstrated use of the synchronize module, synchronize becomes ansible.posix.synchronize:
- name: copy cinnamon configuration
  ansible.posix.synchronize:
    src: "files/home/myuser/.cinnamon"
    ...
This means in future, when I’m wondering whether a collection is still used, I can easily search my repository and find all usages of that collection.
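That search could look something like the sketch below. find_collection_usages is a hypothetical helper, and it assumes GNU grep. It skips any directory named collections, because the collections stored in the repository reference their own FQCNs and would otherwise drown out the real usages.

```shell
#!/bin/bash
# Hypothetical helper: list every YAML line under a directory that refers
# to a module or plugin from the given collection by FQCN.
find_collection_usages() {
    local collection="$1" search_root="${2:-.}"
    local pattern
    # Escape the dots so "ansible.posix" is matched literally.
    pattern=$(printf '%s' "$collection" | sed 's/\./\\./g')
    # Exits non-zero when nothing matches, like grep itself.
    grep -rn --include='*.yml' --include='*.yaml' \
        --exclude-dir=collections "${pattern}\." "$search_root"
}

# Intended usage from the repository root (not run here):
#   find_collection_usages ansible.posix
```

An empty result is a reasonable hint that the collection can be removed from the repository, provided every task already uses the FQCN rather than the short name.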
Conclusion
Ansible collections provide a convenient way of packaging and sharing Ansible content including Ansible modules. Ansible Galaxy provides many useful collections.
Storing the Ansible content that is relied on for Ansible configurations in source control is not only convenient, but it provides auditability over the configuration management of hosts.
References
- Ansible Installation Guide - Installing and running the devel branch from source
- Git Book - 7.11 Git Tools - Submodules
- Ansible Documentation - Developer Guide - Developing Collections
- Ansible Documentation - Ansible Configuration Settings - Collections Path
[1] Using a single playbook is not necessarily best practice. It suits me because I purposefully try to keep the configuration of my hosts as simple as possible, and there's a great deal of consistency between roles. I may in future change tack and start maintaining different playbooks for on-prem and VPS-hosted services.
[2] When adding the Ansible Core repository as a Git submodule, the default branch is devel. Using this branch is not recommended unless modifying Ansible Core or trying out features under development. I recommend switching to the latest stable tag.