This episode is actually why I started this series in the first place. I am an active Docker user and Docker fan, and I like containers and DevOps topics in general. I am a moderator on the official Docker forums, and I see that people often struggle with the installation process of Docker CE or Docker Desktop. Docker Desktop starts a virtual machine, and the GUI manages the Docker CE instance inside that virtual machine, even on Linux. Even though I prefer not to use a GUI for creating containers, I admit it can be useful in some situations, but you always need to be ready to use the command line, where all the commands are available. In this episode I will use Ansible to install Docker CE in the previously created virtual machine, and I will also install a web-based graphical interface, Portainer.
If you want to run the playbook called playbook-lxd-install.yml, you will need to configure a physical or virtual disk, which I wrote about in The simplest way to install LXD using Ansible. If you don't have a usable physical disk, look for truncate -s 50G <PATH>/lxd-default.img in that post to create a virtual disk.
How you activate the virtual environment depends on how you created it. The episode The first Ansible playbook describes how to create and activate the virtual environment using the "venv" Python module, and in the episode The first Ansible role we created helper scripts as well. If you haven't created it yet, you can create the environment by running
./create-nix-env.sh venv
Optionally, start an SSH agent:
ssh-agent $SHELL
and activate the environment with
source homelab-env.sh
Small improvements before we start today's main topic
You can skip this part too if you joined the tutorial at this episode and don't want to improve the other playbooks.
We discussed facts before in the "Using facts and the GitHub API in Ansible" episode, but we left this setting at its default in the other playbooks. Let's quickly add gather_facts: false to all playbooks, except playbook-hello.yml, as that was meant to demonstrate how a playbook runs.
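gather_facts is a play-level keyword, so in each playbook it goes next to hosts, for example (a generic play header, not one of our actual playbooks):

- name: Some play
  hosts: all
  gather_facts: false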
Now that we have a separate group for the virtual machines that run Docker, we can also create a new group for LXD, so when we add more machines, we will not install LXD on every single machine, and we will not remove it from a machine on which it was not installed. Let's add the following to the inventory.yml
lxd_host_machines:
  hosts:
    YOURHOSTNAME:
NOTE: Replace YOURHOSTNAME with your actual hostname which you used in the inventory under the special group called "all".
In my case, it is the following:
lxd_host_machines:
  hosts:
    ta-lxlt:
And now, replace hosts: all in playbook-lxd-install.yml and playbook-lxd-remove.yml with hosts: lxd_host_machines.
Reload ZFS pools after removing LXD to make the playbook more stable
When I wrote the "Remove LXD using Ansible" episode, the playbook worked for me every time. Since then, I noticed that sometimes it cannot delete the ZFS pool because the pool appears to be missing. I couldn't actually figure out why it happens, but a workaround can be implemented to make the playbook more stable. We have to restart the zfs-import-cache systemd service, which will reload the ZFS pools so the next task can delete the pool and the disks can be wiped as well.
Open roles/zfs_destroy_pool/tasks/main.yml and look for the following task:
# To fix the issue of missing ZFS pool after uninstalling LXD
- name: Restart ZFS import cache
  become: true
  ansible.builtin.systemd:
    state: restarted
    name: zfs-import-cache
Since we already configured our SSH client for the newly created virtual machine last time, we could create a new inventory group and add the virtual machine to that group. But sometimes you don't want to do that, or you can't. That's why I wanted to show you a way to create a new inventory group without changing the inventory file. Then we can add a host to this group. Since last time we also used Ansible to get the IP address of the new virtual machine, we can add that IP to the inventory group. The next task will show you how you can do that.
# region task: Add Docker VM to Ansible inventory
- name: Add Docker VM to Ansible inventory
  changed_when: false
  ansible.builtin.add_host:
    groups: _lxd_docker_vm
    hostname: "{{ vm_inventory_hostname }}"
    ansible_user: "{{ config_lxd_docker_vm_user }}"
    ansible_become_pass: "{{ config_lxd_docker_vm_pass }}"
    # ansible_host is not necessary, inventory hostname will be used
    ansible_ssh_private_key_file: "{{ vm_ssh_priv_key }}"
    # ansible_ssh_common_args: "-o StrictHostKeyChecking=no"
    ansible_ssh_host_key_checking: false
# endregion
You need to add the above task to the "Create the VM" play in the playbook-lxd-docker-vm.yml playbook. The built-in add_host module is what we need. Despite what you see in the task, it has only two parameters. Everything else is a variable which you could use in the inventory file as well. The groups and name parameters have aliases, and I thought that using the hostname alias of name would be better, as we indeed add a hostname or an IP as its value. groups can be a list or a string. I defined it as a string, as I have only one group to which I will add the VM.
I mentioned before that I like to start the name of helper variables with an underscore. The name of the group is not a variable, but I start it with an underscore, so I will know it is a temporary, dynamically created inventory group. We have some variables that we defined in the previous episode.
vm_inventory_hostname: The inventory hostname of the VM, which is also used in the SSH client configuration. It actually comes from the value of config_lxd_docker_vm_inventory_hostname with a default in case it is not defined.
config_lxd_docker_vm_user: The user that we created using cloud init and with which we can SSH into the VM.
config_lxd_docker_vm_pass: The sudo password of the user. It comes from a secret.
vm_ssh_priv_key: The path of the SSH private key used for the SSH connection. It is just a short alias for config_lxd_docker_vm_ssh_priv_key which can be defined in the inventory file.
Using these variables, we could configure the SSH connection parameters like ansible_user, ansible_become_pass and ansible_ssh_private_key_file. We also have a new variable, ansible_ssh_host_key_checking. When we first SSH to a remote server, we need to accept the fingerprint of the server's SSH host key, which means we know and trust the server. Since we dynamically created this virtual machine and detected its IP address, we would need to accept the fingerprint every time we recreate our VM, so I just disable host key checking by setting the value to the boolean false.
We have a new inventory group, but we still don't use it. Now we need a new play in the playbook which will actually use this group. Add the following play skeleton to the end of playbook-lxd-docker-vm.yml.
# play: Configure the OS in the VM
- name: Configure the OS in the VM
  hosts: _lxd_docker_vm
  gather_facts: false
  pre_tasks:
  roles:
# endregion
Even though we forced Ansible to wait until the virtual machine gets an IP address, having an IP address doesn't mean the SSH daemon is ready in the VM. So we need the following pre task:
- name: Waiting for SSH connection
  ansible.builtin.wait_for_connection:
    timeout: 20
    delay: 0
    sleep: 3
    connect_timeout: 2
The built-in wait_for_connection module can be used to retry connecting to the servers. We start checking it immediately, so we set the delay to 0. If we are lucky, it will be ready right away. If it does not connect in 2 seconds (connect_timeout), Ansible will "sleep" for 3 seconds and try again. If the connection is not made in 20 seconds (timeout), the task will fail.
While I was testing the almost finished playbook, I realized that sometimes the installation of some packages failed as if they were not in the APT cache yet, so I added a new pre task to update the APT cache before installing anything in the VM. We already discussed this module in Using facts and the GitHub API in Ansible.
If you don't want to add it, you can just rerun the playbook and it will probably work. We also have an already written Ansible role which we will want to use here too, which is cli_tools. So this is how our second play looks in playbook-lxd-docker-vm.yml:
# play: Configure the OS in the VM
- name: Configure the OS in the VM
  hosts: _lxd_docker_vm
  gather_facts: false
  pre_tasks:
    - name: Waiting for SSH connection
      ansible.builtin.wait_for_connection:
        timeout: 20
        delay: 0
        sleep: 3
        connect_timeout: 2
    - name: Update APT cache
      become: true
      changed_when: false
      ansible.builtin.apt:
        update_cache: true
  roles:
    - role: cli_tools
# endregion
Now you could delete the virtual machine if you already created it last time, and run the playbook with the new play again to create the virtual machine and immediately install the command line tools in it.
I admit that there are already great existing roles we could use to install Docker, for example the one made by Jeff Geerling, which supports multiple Linux distributions, so feel free to use it. But we are still practicing writing our own roles, so I made a simple one for you, although it works only on Ubuntu. On the other hand, I will add something that even Jeff Geerling didn't do.
We will create some default variables in roles/docker/defaults/main.yml:
docker_version: "*.*.*"
docker_sudo_users: []
The first, docker_version, defines which version we want to install. When you are just starting to play with Docker but don't want to use Play with Docker, you probably want to install the latest version. That's why the default value is "*.*.*", which means the latest major, minor and patch version of Docker CE. You will see the implementation soon. The second variable is docker_sudo_users, which is an empty list. We will be able to add users to the list who should be able to use Docker. We will discuss it later in more detail.
As I say very often, we should always start with the official documentation. It starts with uninstalling old and unofficial packages. Our role will not include that, so you will do it manually or write your own role as homework. Then it updates the APT cache, which we just did as a pre task, so we will install the dependencies first. The official documentation says:
sudo apt-get install ca-certificates curl
In Ansible, in roles/docker/tasks/main.yml, it could look like this (the exact task is in the final source code on GitHub):
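- name: Install dependencies of Docker
  become: true
  ansible.builtin.apt:
    name:
      - ca-certificates
      - curl
    state: present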
Note: We know that the "cli_tools" role already installed curl, but we don't care, because when we create a role, we try to make it without depending on other roles. So even if we decide later not to use the "cli_tools" role, our "docker" role will still work perfectly.
The official documentation continues with creating a folder, /etc/apt/keyrings. It uses the
sudo install -m 0755 -d /etc/apt/keyrings
command, but it really just creates a folder this time, which looks like this in Ansible:
- name: Make sure the folder of the keyrings exists
  become: true
  ansible.builtin.file:
    state: directory
    mode: "0755"
    path: /etc/apt/keyrings
The next step is downloading the APT key for the repository. Previously, the documentation used the apt-key command which was deprecated on Ubuntu, so it was replaced with the following:
- name: Install the APT key of Docker's APT repo
  become: true
  ansible.builtin.get_url:
    url: https://download.docker.com/linux/ubuntu/gpg
    dest: /etc/apt/keyrings/docker.asc
    mode: a+r
Then the official documentation shows how you can add the repository to APT depending on the CPU architecture and Ubuntu release code name, like this:
# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
In the "Using facts and the GitHub API in Ansible" episode we already learned to get the architecture. We also need the release code name, so we gather the distribution_release subset of Ansible facts as well.
- name: Get distribution release fact
  ansible.builtin.setup:
    gather_subset:
      - distribution_release
      - architecture
Before we continue with the next task, we will add some variables to roles/docker/vars/main.yml.
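The exact variables are in the final source code on GitHub; a minimal sketch could look like the following, assuming an illustrative helper variable name and a small mapping from Ansible's architecture fact (for example x86_64) to the Debian architecture name (amd64) that APT expects:

# roles/docker/vars/main.yml -- illustrative names, not necessarily the final ones
_docker_deb_arch_map:
  x86_64: amd64
  aarch64: arm64
_docker_apt_repo: >-
  deb [arch={{ _docker_deb_arch_map[ansible_facts['architecture']] }}
  signed-by=/etc/apt/keyrings/docker.asc]
  https://download.docker.com/linux/ubuntu
  {{ ansible_facts['distribution_release'] }} stable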
Again, this is very similar to what we have done before to separate our helper variables from the tasks, and now we can generate docker.list under /etc/apt/sources.list.d/. To do that, we add a new task in roles/docker/tasks/main.yml; a sketch of it, reusing the illustrative helper variable from the previous block, could look like this:
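- name: Add Docker's APT repository
  become: true
  ansible.builtin.apt_repository:
    repo: "{{ _docker_apt_repo }}"
    filename: docker
    state: present
    update_cache: true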
The built-in apt_repository module can also update APT cache after adding the new repo. The filename is automatically generated if we don't set it, but the "filename" parameter is actually a name without the extension, so do not add .list at the end of the name.
The official documentation recommends using the following command to list the available versions of Docker CE on Ubuntu.
apt-cache madison docker-ce | awk '{ print $3 }'
It shows the full package name which includes more than just the version of Docker CE. Fortunately, the actual version number can be parsed like this:
apt-cache madison docker-ce \
| awk '$3 ~ /^([0-9]+:)([0-9]+\.[0-9]+\.[0-9]+)(-[0-9]+)?(~.*)$/ {print $3}'
Where the version number is the second expression in parentheses.
[0-9]+\.[0-9]+\.[0-9]+
We can replace it with an actual version number, but keeping the backslashes:
26\.1\.3
Let's search for only that version:
apt-cache madison docker-ce \
| awk '$3 ~ /^([0-9]+:)26\.1\.3(-[0-9]+)?(~.*)$/ {print $3}'
Output:
5:26.1.3-1~ubuntu.22.04~jammy
This is what we will implement in Ansible:
- name: Get full package version for {{ docker_version }}
  changed_when: false
  ansible.builtin.shell: |
    apt-cache madison docker-ce \
      | awk '$3 ~ /^([0-9]+:){{ docker_version | replace('*', '[0-9]+') | replace('.', '\.') }}(-[0-9]+)?(~.*)$/ {print $3}'
  register: _docker_versions_command
I hope it now starts to make sense why the default value of docker_version was *.*.*. It is because we replace each asterisk with a regular expression. We also escape all dots, as otherwise a dot would mean "any character" in the regular expression. This solution allows us to install the latest version unless we override the default value with an actual version number. Even if we override it, we can use a version like 26.0.* to get a list of the available patch versions of Docker CE 26.0 instead of the latest version. Of course, this is still a list of versions unless we set a specific version number, but we can take the first line in the next task. According to the official documentation, we would install Docker CE and related packages like this:
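sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

In Ansible, a sketch of the install task could look like this (the exact task is in the final source code on GitHub; I assume here that the first line of the registered output is the version we want to pin):

- name: Install Docker CE and related packages
  become: true
  vars:
    # assumption: the first matching line from apt-cache madison is the newest version
    _docker_full_version: "{{ _docker_versions_command.stdout_lines | first }}"
  ansible.builtin.apt:
    name:
      - "docker-ce={{ _docker_full_version }}"
      - "docker-ce-cli={{ _docker_full_version }}"
      - docker-ce-rootless-extras
      - containerd.io
      - docker-buildx-plugin
      - docker-compose-plugin
    state: present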
What is not mentioned in the documentation is marking Docker CE packages as held. In the terminal it would be like this:
apt-mark hold docker-ce docker-ce-cli docker-ce-rootless-extras containerd.io docker-compose-plugin
It will be almost the same in Ansible as we need to use the built-in command module:
- name: Hold Docker CE packages
  become: true
  ansible.builtin.command: apt-mark hold docker-ce docker-ce-cli docker-ce-rootless-extras containerd.io docker-compose-plugin
  changed_when: false
Note: I never actually saw an upgraded containerd causing problems, but this is a very important component of Docker CE, so I decided to hold that too. If it causes any problem, you can "unhold" that any time by running the following command:
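sudo apt-mark unhold containerd.io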
The Docker daemon binds to a Unix socket, not a TCP port. By default it's the root user that owns the Unix socket, and other users can only access it using sudo. The Docker daemon always runs as the root user.
If you don't want to preface the docker command with sudo, create a Unix group called docker and add users to it. When the Docker daemon starts, it creates a Unix socket accessible by members of the docker group. On some Linux distributions, the system automatically creates this group when installing Docker Engine using a package manager. In that case, there is no need for you to manually create the group.
Of course, using the docker group is not really secure, which is also mentioned right after the previous quote in the documentation:
The docker group grants root-level privileges to the user. For details on how this impacts security in your system, see Docker Daemon Attack Surface.
If we want a slightly more secure solution, we don't use the docker group, which can directly access the Docker socket; instead, we create another group like docker-sudo, and we allow all users in this group to run the docker command as root by using sudo docker without a password. It would involve creating a new rule in /etc/sudoers.d/docker like:
%docker-sudo ALL=(root) NOPASSWD: /usr/bin/docker
This would force users to always run sudo docker, not just docker, and they would often forget it and get an error message. We could add an alias like
alias docker='sudo \docker'
to ~/.bash_aliases, but that would work only when the user uses the bash shell. Instead of that, we can add a new script at /usr/local/bin/docker, which usually takes precedence over /usr/bin/docker on the PATH, and add this command to the script:
#!/usr/bin/env sh
exec sudo /usr/bin/docker "$@"
This is what we will do with Ansible, so our new script will be executed, which will call the original docker command as an argument of sudo. Now even when we use Visual Studio Code's Remote Explorer to connect to the remote virtual machine and use Docker in the VM from VSCode, /var/log/auth.log on Debian-based systems will contain exactly which docker commands were executed. If you don't find this file, it may be called /var/log/secure on your system. Even browsing files in containers from VSCode shows up as docker commands in this log, and you can search for them, for example, like this:
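sudo grep 'COMMAND=/usr/bin/docker' /var/log/auth.log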
This is useful when you want to be able to investigate accidental damage when a Docker user executes a command that they should not have executed, and they don't even know what they executed. It will not protect you from intentional harm as an actual hacker could also delete the logs. On the other hand, if you have a remote logging server where you collect logs from all machines, you will probably have the logs to figure out what happened.
Now let's configure this in Ansible.
First we will create the docker-sudo group:
- name: Ensure group "docker-sudo" exists
  become: true
  ansible.builtin.group:
    name: docker-sudo
    state: present
Now we can finally use our docker_sudo_users variable which we defined in roles/docker/defaults/main.yml and check if there is any user who doesn't exist.
- name: Check if docker sudo users are existing users
  become: true
  ansible.builtin.getent:
    database: passwd
    key: "{{ item }}"
  loop: "{{ docker_sudo_users }}"
If the user exists, it returns the passwd record of the user, and it fails otherwise. Now let's add the groups to the users:
- name: Add users to the docker-sudo group
  become: true
  ansible.builtin.user:
    name: "{{ item }}"
    # users must be added to the docker-sudo group without removing them from other groups
    append: true
    groups:
      - docker-sudo
  loop: "{{ docker_sudo_users }}"
We used the built-in user module to add the docker-sudo group to the users defined in docker_sudo_users. We are close to the end. The next step is creating the script, using a solution similar to the one we used in the hello_world role at the beginning of the series.
- name: Create a sudo wrapper for Docker
  become: true
  ansible.builtin.copy:
    content: |
      #!/usr/bin/env sh
      exec sudo /usr/bin/docker "$@"
    dest: /usr/local/bin/docker
    mode: "0755"
And finally, using the same method, we create the sudoers rule:
- name: Allow running /usr/bin/docker as root without a password
  become: true
  ansible.builtin.copy:
    content: |
      %docker-sudo ALL=(root) NOPASSWD: /usr/bin/docker
    dest: /etc/sudoers.d/docker
We could now run the playbook to install Docker in the virtual machine:
./run.sh playbook-lxd-docker-vm.yml
Install Portainer CE, the web-based GUI for containers
Installing Portainer CE is the easy part, actually. We will need to use the built-in pip module to install a Python dependency for Ansible to be able to manage Docker, and then we will also use a community module called docker_container. Let's create the tasks file first at roles/portainer/tasks/main.yml. The exact tasks are in the final source code on GitHub; a minimal sketch could look like this (the image tag and published port follow Portainer's own documentation):
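- name: Install the Docker SDK for Python
  become: true
  ansible.builtin.pip:
    name: docker

- name: Run the Portainer CE container
  become: true
  community.docker.docker_container:
    name: portainer
    image: portainer/portainer-ce:latest
    restart_policy: always
    published_ports:
      - "9443:9443"
    volumes:
      # the Docker socket so Portainer can manage the local Docker environment
      - /var/run/docker.sock:/var/run/docker.sock
      # a named volume for Portainer's own data
      - portainer_data:/data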
When you have finished installing Portainer, you need to open it in a web browser on port 9443 fairly quickly. If it runs on a machine that is directly reachable on your LAN, you can simply open it like https://192.168.4.58:9443. In this tutorial, our virtual machine needs an SSH tunnel like the one below, so you can use https://127.0.0.1:9443:
ssh -L 9443:127.0.0.1:9443 -N docker.lxd.ta-lxlt
Your hostname will be different. If you already have other containers with a forwarded port, you can add more ports to the tunnel, for example with a second, hypothetical port 8080:
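ssh -L 9443:127.0.0.1:9443 -L 8080:127.0.0.1:8080 -N docker.lxd.ta-lxlt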
When you can finally open Portainer in your browser, create your first user and configure the connection to the local Docker environment. If you wait too long, the web interface will show an error message, and you will need to go to the terminal in the virtual machine and restart Portainer:
docker restart portainer
After that, you can start the configuration. This way, if you install Portainer on a publicly available server, there is less time for others to log in before you do, and after you have initialized Portainer, it is no longer possible to log in without a password.
I hope this episode helped you to install Docker and allow non-root users to use the docker commands in a little more secure way than the documentation suggests. Now you can have a web-based graphical interface for containers; however, Portainer is definitely not Docker Desktop, so you will not have extra features like Docker Desktop extensions. Using Ansible can help you deploy your entire dev environment, destroy it and recreate it any time. Using containers, you can have pre-built and pre-configured applications that you can try out and learn more about to customize the configuration for your needs. When you have a production environment, you need to focus much more on security, but now that you have the tools to begin with a dev environment, you can take that next step more easily.
The final source code of this episode can be found on GitHub:
Source code to create a home lab. Part of a video tutorial