Since Nomad 1.3.4, HashiCorp has shipped Nomad with built-in Service Discovery (SD). Hardly anyone on the net talks about it; most people prefer to use Nomad with Consul. We will see how I managed to get it working.

First, the infrastructure

To make it happen, I used only 3 Ubuntu machines, and all 3 have the same role: master AND worker. That's because my main computer, which runs Proxmox to host all my virtual machines (VMs), has only 12 GB of RAM, which is pretty restrictive. But nothing is impossible, and Nomad is truly well optimized, so I won't complain about it here.

The 3 virtual machines each have:

  • 1 GB of RAM
  • 1 vCPU
  • 10 GB of disk
  • the rest is not relevant

The Ansible role to deploy Nomad

To deploy Nomad on the 3 nodes, I used Ansible. This is my role, which implements the installation process from their website:

tasks/main.yml

# ============================
# Nomad service installation
# ============================
- name: Add system user nomad
  become: yes
  ansible.builtin.user:
    name: "{{ group.nomad }}"
    shell: /bin/false
    home: /etc/nomad.d
    system: yes
    state: present
    create_home: no
    groups: "{{ group.docker }}"
    append: yes
    comment: Nomad system user account

- name: Create path volume directory
  become: yes
  ansible.builtin.file:
    state: directory
    path: "{{ item }}"
    owner: "{{ group.nomad }}"
    group: "{{ group.nomad }}"
    mode: 0755
  loop:
    - "{{ nomad.data_dir }}"
    - /etc/nomad.d

- name: Unarchive nomad
  become: yes
  ansible.builtin.unarchive:
    src: "https://releases.hashicorp.com/nomad/{{ nomad.version }}/nomad_{{ nomad.version }}_linux_amd64.zip"
    dest: "/usr/local/bin"
    remote_src: yes
  register: nomad_unarchived_action
  ignore_errors: yes

- name: Add autocomplete
  become: yes
  ansible.builtin.command: nomad -autocomplete-install
  when: nomad_unarchived_action.changed
  ignore_errors: yes

##
# Config
##
- name: Inject conf file
  become: yes
  ansible.builtin.template:
    src: "templates/{{ item }}.hcl.j2"
    dest: "/etc/nomad.d/{{ item }}.hcl"
    owner: "{{ group.nomad }}"
    group: "{{ group.nomad }}"
    mode: 0755
  loop:
    - nomad
    - plugin
    - server
    - client

- name: Add nomad as a service
  become: yes
  ansible.builtin.template:
    src: nomad.service
    dest: /etc/systemd/system/nomad.service
    owner: root
    group: root
    mode: 0600

- name: Enable nomad service
  become: yes
  ansible.builtin.systemd:
    enabled: yes
    state: restarted
    daemon_reload: yes
    name: nomad

templates/client.hcl.j2

client {
  enabled = true
  servers = [{{ groups['nomaded'] | map('extract', hostvars, ['ansible_host']) | map('regex_replace', '^(.*)$', '"\\1"') | join(', ') }}]

  server_join {
    retry_join = [{{ groups['nomaded'] | map('extract', hostvars, ['ansible_host']) | map('regex_replace', '^(.*)$', '"\\1"') | join(', ') }}]
  }
  
  options {
    "driver.raw_exec.enable" = "1"
    "docker.privileged.enabled" = "true"
  }
}
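The `map('extract', hostvars, ...)` / `regex_replace` / `join` pipeline in the template simply turns the IPs of the `nomaded` group into a comma-separated list of quoted strings. A quick Python sketch of what it renders (the IPs below are made-up placeholders standing in for your `ansible_host` values):

```python
# Simulate what the Jinja2 filter chain in client.hcl.j2 renders to.
# Placeholder inventory data (assumption: your real hostvars differ).
hostvars = {
    "nomad1": {"ansible_host": "192.168.1.11"},
    "nomad2": {"ansible_host": "192.168.1.12"},
    "nomad3": {"ansible_host": "192.168.1.13"},
}
group = ["nomad1", "nomad2", "nomad3"]

# map('extract', hostvars, ['ansible_host']) -> each host's IP
ips = [hostvars[h]["ansible_host"] for h in group]
# regex_replace('^(.*)$', '"\\1"') -> wrap each IP in double quotes
quoted = [f'"{ip}"' for ip in ips]
# join(', ') -> the final comma-separated list
rendered = ", ".join(quoted)

print(f"servers = [{rendered}]")
# servers = ["192.168.1.11", "192.168.1.12", "192.168.1.13"]
```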

templates/server.hcl.j2

server {
  enabled = true
  bootstrap_expect = {{ groups['nomaded'] | length }}

  server_join {
    retry_join = [{{ groups['nomaded'] | map('extract', hostvars, ['ansible_host']) | map('regex_replace', '^(.*)$', '"\\1"') | join(', ') }}]
  }
}

templates/nomad.hcl.j2

datacenter = "{{ nomad.datacenter_name }}"
data_dir = "{{ nomad.data_dir }}"

templates/plugin.hcl.j2

plugin "docker" {
  config {
    allow_privileged = true
    volumes {
      enabled = true
    }
  }
}

templates/nomad.service uses the template from their website, but with USER and GROUP replaced by root: as the template itself notes, clients need to run Nomad as root.

[Unit]
Description=Nomad
Documentation=https://www.nomadproject.io/docs/
Wants=network-online.target
After=network-online.target

[Service]

# Nomad server should be run as the nomad user. Nomad clients
# should be run as root
User=root
Group=root

ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/local/bin/nomad agent -config /etc/nomad.d
KillMode=process
KillSignal=SIGINT
LimitNOFILE=65536
LimitNPROC=infinity
Restart=on-failure
RestartSec=2

TasksMax=infinity
OOMScoreAdjust=-1000

[Install]
WantedBy=multi-user.target

If you use this role, you will need to have in your Ansible inventory:

  • an inventory group nomaded that contains your nodes
  • some variables

inventory/hosts.yml

nomaded:
    hosts:
        nomad1:
            ansible_host: IP1
        nomad2:
            ansible_host: IP2
        nomad3:
            ansible_host: IP3

inventory/group_vars/all.yml

nomad:
    version: 1.6.1 # must be an exact release: the download URL interpolates it
    datacenter_name: dc1
    data_dir: /mnt/nomad
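To wire the role to the inventory group, you also need a playbook. A minimal sketch (assuming the role directory is named `nomad`; adjust to your layout):

```yaml
# site.yml - apply the Nomad role to every node in the nomaded group
- hosts: nomaded
  roles:
    - nomad
```

Then run it with `ansible-playbook -i inventory/hosts.yml site.yml`.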

Run the playbook and voilà: go to http://NOMAD_IP1:4646 and, if the role didn't throw any errors, you should see the incredible Nomad user interface (UI).

Deploy your first job in Nomad

For the first example, I chose the image traefik/whoami because it is easy to deploy: it's a simple web server that replies with the headers of your request.
whoami.hcl

job "whoami" {
  datacenters = ["dc1"]
  type        = "service"

  group "whoami" {
    count = 1

    network {
      port "http" {
        to = 80
      }
    }
  
    service {
      name = "whoami"
      port = "http"
      provider = "nomad"
    }

    task "whoami" {
      driver = "docker"

      config {
        image = "traefik/whoami"
        ports = ["http"]
      }

      resources {
        cpu = 10
        memory = 10
      }
    }
  }
}

The job definition follows the Nomad job spec. The service block is the interesting part; it declares:

  • the name of the service
  • the tags (used later for Traefik routing)
  • the provider, i.e. which service discovery keeps track of the service's location
  • the port, which the network block binds inside the container

The rest is self-explanatory.

To deploy it, I use this Terraform plugin, but you can also deploy your jobs from the UI; I'll keep that for another article.
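For reference, a minimal sketch of such a Terraform setup, using the official Nomad provider's `nomad_job` resource (the address placeholder is an assumption, matching the UI URL above):

```hcl
# main.tf - deploy whoami.hcl through Terraform (sketch)
provider "nomad" {
  address = "http://NOMAD_IP1:4646"
}

resource "nomad_job" "whoami" {
  jobspec = file("${path.module}/whoami.hcl")
}
```

A `terraform apply` then submits the job exactly as the UI or CLI would.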

Deploy Traefik as reverse proxy

traefik.hcl

job "traefik" {
  datacenters = ["dc1"]
  type        = "service"

  constraint {
    attribute = "${attr.unique.network.ip-address}"
    value     = "IP1" # Put a static IP here to avoid having one Traefik instance per client machine
  }

  group "traefik" {
    count = 1

    network {
      port "http" {
        static = 80
        to = 80
      }
      port "admin" {
        static = 8080
        to = 8080
      }
    }

    service {
      name = "traefik"
      provider = "nomad"
      port = "http"
      tags = [
         "traefik.http.routers.traefik-http.service=api@internal",
         "traefik.http.routers.traefik-http.rule=Host(`traefik.exemple.lan`)",
      ]
    }

    task "traefik" {
      driver = "docker"

      config {
        image = "traefik:v2.9.6"
        force_pull = true
        ports = ["admin", "http"]
        args = [
          "--api.dashboard",
          "--api.insecure",
          "--serversTransport.insecureSkipVerify",
          "--entrypoints.web.address=:80",
          "--entrypoints.traefik.address=:8080",
          "--providers.nomad",
          "--providers.nomad.endpoint.address=http://IP_NOMAD:4646",
          "--providers.nomad.stale",
          "--providers.nomad.defaultRule=Host(`{{ .Name }}.exemple.lan`)"
        ]
      }

      resources {
        cpu        = 20
        memory     = 50
        memory_max = 100
      }
    }
  }
}

After that, you need to update your local DNS so that *.exemple.lan points to NOMAD_IP1.
This is the minimal Nomad Traefik job to get the service discovery working. Once the job is deployed, you can go to traefik.exemple.lan to access the Traefik dashboard,
and to whoami.exemple.lan to access Whoami.
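One way to get that wildcard record, if dnsmasq serves your LAN's DNS (an assumption; any resolver with wildcard support works), is a single line in its configuration:

```
# /etc/dnsmasq.conf - resolve every *.exemple.lan name to the Traefik node
address=/exemple.lan/NOMAD_IP1
```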

I hope this tutorial helps you get started with Nomad without Consul.