Infrastructure@Home: Using Ansible to Install the Service Monitoring Software Consul

Sebastian - May 4 '20 - Dev Community

This is the 4th article in my series about infrastructure monitoring. Let’s briefly recap: I discussed the motivation for automating my infrastructure and defined three sets of requirements: infrastructure management, application management and service discovery. The first two articles explained my infrastructure, the steps to install the basic OS, and how to use Ansible for system management and package installation.

This article is a tutorial about writing a complex Ansible role that will install the service monitoring and discovery software Hashicorp Consul. Consul will fulfill the following application management requirement:

  • AM4: An interface in which I see the status of each application (health, exposed endpoints)

This article originally appeared at my blog.

Ansible Roles

Roles are a concept that enables reusability. A role defines a set of tasks that are executed against a host to achieve a desired state. This is not different from what a playbook does, but roles provide a standard structure and can be reused in different playbooks or even by other users [^1].

A playbook names the roles that should be executed on its hosts. For example, if we want to install new_role on all hosts, we can define the playbook as follows [^2].

- name: Install new role
  hosts: all
  roles:
    - new_role

Roles have a basic directory structure. By executing ansible-galaxy role init new_role, you will get the following directory layout.

new_role
├── README.md
├── defaults
├── files
├── handlers
├── meta
├── tasks
├── templates
├── tests
└── vars

What goes into these directories? tasks and handlers contain the concrete steps that executing your role encompasses, and the handlers those steps notify. In files, you put static configuration files, and in templates, you put files into which variables will be rendered. vars holds the dynamic values that your role needs, and defaults contains variables that other roles or playbooks may overwrite. If you want to share your role on Ansible Galaxy, put the relevant information into meta.
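As a concrete example, the handler that the configuration tasks later in this article notify could live in handlers/main.yml. A minimal sketch; the handler name must match the notify entries used further below:

# handlers/main.yml - a minimal sketch
- name: Restart consul
  systemd:
    name: consul
    daemon_reload: true   # pick up changes to the unit file
    state: restarted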

Now let’s apply this knowledge and write an Ansible role that installs Consul on all our nodes.

Consul

HashiCorp’s Consul is service monitoring and discovery software. It helps us define, monitor, and connect services that run on our nodes. A service is any piece of software, for example a web server or an application, and it can run on the node itself or inside a container (e.g. Docker). Consul provides a command line interface, HTTP API endpoints, and a web UI to monitor all systems.

Install Consul with a Playbook

In order to install Consul, we need to perform these steps on each node:

  • Determine the correct Linux distribution and hardware
  • Create consul group, user and directories
  • Download and link the consul binary
  • Copy configuration files
  • Enable Consul with systemd

All these steps will be fully automated with Ansible!
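Before we look at the individual tasks, here is a sketch of how the finished role could be wired into a playbook. The role name consul and the group name consul_servers are assumptions; the consul_role variable is evaluated later to decide which nodes receive the server configuration.

# consul.yml - a sketch; role and group names are assumptions
- name: Install consul on all nodes
  hosts: all
  become: true
  roles:
    - consul

consul_role could then be set to server in the group_vars of the consul_servers group, while the role’s defaults set it to client.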

Step 1 - Determine the correct Linux Distribution and Hardware

We actually need to differentiate three versions of the Consul binary, depending on each node’s Linux distribution and CPU architecture.

Here is the corresponding Ansible task for the armv7l case. The Consul version is configured as a role variable. Then, for each matching host, I define the source file name, the download URL, and the checksum file URL, to use them in later tasks.

- name: Set vars when architecture is armv7l
  set_fact:
    consul_src_file: "consul_{{ consul_version }}_linux_armhfv6.zip"
    consul_src_url: "https://releases.hashicorp.com/consul/{{ consul_version }}/consul_{{ consul_version }}_linux_armhfv6.zip"
    consul_src_hash: "https://releases.hashicorp.com/consul/{{ consul_version }}/consul_{{ consul_version }}_SHA256SUMS"
  when: ansible_facts.architecture == 'armv7l'
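Analogous set_fact tasks (not shown here) cover the remaining two architectures. The version itself, like the other knobs used throughout this role, lives in defaults/main.yml. A sketch, where consul_base_directory and consul_main_server_ip are assumptions consistent with the rest of this article:

# defaults/main.yml - a sketch; adjust values to your cluster
consul_version: "1.7.1"                  # matches the build shown by `consul members` below
consul_datacenter: infrastructure_at_home
consul_base_directory: /etc/consul       # assumption: also the consul user's home
consul_role: client                      # overwritten with "server" for server nodes
consul_main_server_ip: 192.168.2.202     # placeholder: IP of one server node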

Step 2 - Create Consul Group, User and Directories

For security reasons, it is advisable to create a dedicated user that will own the Consul configuration files and binaries. We can use the Ansible modules group and user for this. Finally, we create the Consul directory.

- name: Create consul group
  group:
    name: consul
    state: present
    system: true

- name: Create consul user
  user:
    name: consul
    group: consul
    shell: /bin/false
    home: /etc/consul/
    state: present

- name: Create consul dir
  file:
    state: directory
    path: "{{ consul_base_directory }}"
    owner: consul
    mode: 0755

Step 3 - Download and Link the Consul Binary

We will download the binary to a temporary directory, unzip it to the target directory, and provide a symlink. When downloading, Ansible provides an option to verify the download against a SHA256 checksum, which we will also use.

- name: Get consul binary
  get_url:
    url: "{{ hostvars[inventory_hostname].consul_src_url }}"
    dest: /tmp
    checksum: sha256:{{ hostvars[inventory_hostname].consul_src_hash }}

- name: Unzip consul binary
  unarchive:
    src: /tmp/{{ hostvars[inventory_hostname].consul_src_file }}
    dest: "{{ consul_base_directory }}"
    remote_src: true
    mode: 0744

- name: Create symlink
  file:
    src: "{{ consul_base_directory }}/consul"
    dest: /usr/local/bin/consul
    state: link
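After these tasks, the binary is available on the node’s PATH. A quick sanity check (output abbreviated):

$ consul version
Consul v1.7.1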

Step 4 - Copy Configuration Files

In the config file, we specify important information like the name of the datacenter, the data directory, the node name, and the IP address of the central server that the nodes will join. Here is the template file, into which concrete values are rendered when the tasks are executed.

{
  "datacenter": "{{ consul_datacenter }}",
  "data_dir": "{{ consul_base_directory }}",
  "log_level": "INFO",
  "enable_local_script_checks": true,
  "enable_syslog": true,
  "node_name": "{{ ansible_facts.hostname }}",
  "retry_join": ["{{ consul_main_server_ip }}"]
}

Consul reads all configuration files from a directory in alphabetical order. We will copy the basic configuration config.json to all nodes, and additionally a server.json to the server nodes.

- name: Copy consul config
  template:
    src: config.json.j2
    dest: "{{ consul_base_directory }}/config.json"
    mode: 0644
  notify: Restart consul

- name: Copy consul server config
  template:
    src: server.json.j2
    dest: "{{ consul_base_directory }}/server.json"
    mode: 0644
  when: consul_role == 'server'
  notify: Restart consul
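The server.json.j2 template is not reproduced in this article. A minimal sketch of what it might contain, based on Consul’s documented server options; the bootstrap_expect value of 3 matches the three server nodes in my cluster:

{
  "server": true,
  "bootstrap_expect": 3,
  "ui": true
}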

Step 5 - Enable Consul with Systemd

Consul is just a binary. If we want it to start at boot, we need to provide a systemd service file. We will use the Consul service file template, copy it to the host, and enable the service.

- name: Copy consul service file
  template:
    src: consul.service.j2
    dest: /etc/systemd/system/consul.service
    mode: 0644
  notify: Restart consul

- name: Set file permissions
  file:
    path: "{{ consul_base_directory }}"
    owner: consul
    group: consul
    recurse: true

- name: Enable consul service
  systemd:
    service: consul.service
    enabled: true
  notify: Restart consul
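The consul.service.j2 template itself is not shown in this article. A sketch modeled on HashiCorp’s documented example unit, reusing the role’s variables; treat it as a starting point rather than the exact file:

# consul.service.j2 - a sketch based on HashiCorp's example unit
[Unit]
Description=Consul agent
Wants=network-online.target
After=network-online.target

[Service]
User=consul
Group=consul
ExecStart=/usr/local/bin/consul agent -config-dir={{ consul_base_directory }}
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target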

Running the Playbook

We can finally run the playbook. Executed on my machine, it looks like this:

[Screenshot: output of running the Consul playbook]
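Assuming the playbook file is called consul.yml and the inventory lives in hosts (both names are assumptions), the invocation is something like:

$ ansible-playbook -i hosts consul.yml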

Consul is now automatically installed on all nodes. The very first time I ran this playbook, I understood the power of Ansible to apply complex installation steps to an arbitrary number of nodes! My infrastructure consists of only 5 nodes, but the playbook could configure 50 just as well.

Using Consul

Now that Consul is running, let’s look at some of its capabilities. In a Consul cluster, you should define 3 or 5 server nodes, and any number of client nodes. Server nodes take on the role of monitoring and exchanging the cluster state: things like the number of nodes, the services running on those nodes, and more.

Consul offers a rich set of command line commands. For now, I will only show how nodes connect to each other to form the cluster.

Let’s manually start a Consul agent on the command line. We see important properties like the node name, datacenter, and IP address.

$ consul agent -config-dir=/etc/consul
==> Starting Consul agent…
           Version: 'v1.5.2'
           Node ID: '094163f6-9215-6f2c-c930-9e84600029da'
         Node name: 'raspi-3-1+'
        Datacenter: 'infrastructure_at_home' (Segment: '<all>')
            Server: true (Bootstrap: false)
       Client Addr: [127.0.0.1] (HTTP: 8500, HTTPS: -1, gRPC: 8502, DNS: 8600)
      Cluster Addr: 127.0.0.1 (LAN: 8301, WAN: 8302)
           Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false, Auto-E

Another command shows all members of the same cluster.

$ consul members

Node       Address             Status  Type    Build  Protocol  DC                      Segment
raspi-3-2  192.168.2.202:8301  alive   server  1.7.1  2         infrastructure_at_home  <all>
raspi-4-1  192.168.2.203:8301  alive   server  1.7.1  2         infrastructure_at_home  <all>
raspi-4-2  192.168.2.204:8301  alive   server  1.7.1  2         infrastructure_at_home  <all>
raspi-3-1  192.168.2.201:8301  alive   client  1.7.1  2         infrastructure_at_home  <default>

Because of the config file that we deployed, all nodes are already connected. A node can also join a cluster manually by using consul join <ipaddress>.

Finally, Consul also provides a web UI that shows the nodes in the cluster, their health status, and the services running on them. It is reachable via the HTTP port 8500 of a server node, under the /ui path.

Conclusion

This article explained how to write a complex Ansible role for installing Consul. We walked through the five installation steps: detecting each node’s architecture, creating the required group and user, downloading and linking the binary, copying templated config files (differentiating server and client nodes), and enabling the service with systemd.

Consul provides the management view of all nodes in my cluster. For now, we barely scratched the surface of what is possible. In the next post, we will continue to set up the infrastructure by installing Docker and Hashicorp Nomad, a job scheduler.

Footnotes

[1]: On Ansible Galaxy you can search and download roles for a vast set of tasks and operating systems.

[2]: Important note: Roles are executed before any other tasks in your playbook! So, in this example, any tasks defined alongside the roles declaration would run after the role. If you need tasks to run beforehand, use the pre_tasks declaration.
