This may be a controversial opinion…

… but you don’t need --- at the start of a YAML file in Ansible.

What does the Ansible documentation say?

I know, I know, if you look at the official documentation on docs.ansible.com, then all of the examples start with ---. And if the official examples do it, then everyone should just blindly copy that without thinking, right?

Wrong! The Ansible documentation on YAML syntax says:

There’s another small quirk to YAML. All YAML files (regardless of their association with Ansible or not) can optionally begin with --- and end with .... This is part of the YAML format and indicates the start and end of a document.

© Copyright Ansible project contributors.

I’ve added the emphasis: optionally. They then continue with one example with --- at the start and ... at the end. The funny thing is, that’s about the only example on the Ansible documentation site (that I could find) that ends with .... So the end marker ... is clearly optional. What about the start marker ---?

What does the YAML specification say?

Ansible uses version 1.2 of the YAML specification and unless you are doing something really exotic, that’s the only version you should care about. Revision 1.2.0 was published in July 2009 and revision 1.2.2 in October 2021. That last revision doesn’t make any changes to the specification, it only corrects some errors and adds clarity.

Chapter 9 of the YAML spec introduces two concepts: documents and streams.

A stream can contain zero or more documents. It’s called a (character) stream because it can be something else than a file on your hard disk, for example some data that’s sent over a network connection. So your Ansible playbook file with extension .yml or .yaml is not a YAML document, it’s a YAML stream.

A document can have several parts:

  • Document prefix: optional character encoding and optional comment lines.
    Seriously, it’s 2022, are you going to make life hard for yourself and use any other encoding than ASCII or UTF-8? The default encoding that every YAML processor, inclusing Ansible, must support is UTF-8. So You Ain’t Gonna Need It.
    Comments can be placed anywhere, so don’t worry.
  • Document directives: these are instructions to the YAML processor and aren’t part of the data structure. The only directive I’ve occasionally seen in the wild is %YAML 1.2, to indicate the version of YAML used. That’s the default version for Ansible anyway, so You Ain’t Gonna Need It.
  • Document markers: a parser needs some way to know where directives stop and document content begins. That’s the directives end marker, ---. There is also a document end marker, ..., which tells a parser to stop looking for content and start scanning for directives again. If there are no markers and the first line doesn’t start with % (a directive), then a parser knows that everything is content. In real life you probably won’t ever have multiple documents in the same stream (file), instead you’ll organize your Ansible code in separate .yaml files, with playbooks and roles and tasks etc.
  • Document content: that’s the only really interesting stuff you care about.

YAML knows 3 types of documents:

  • Bare documents: don’t begin with directives or marker lines. Such documents are very “clean” as they contain nothing other than the content. This is the kind of YAML documents I prefer for Ansible.
  • Explicit documents: begin with an explicit directives end maker (---) but have no directives. This is the style that many people use if they just copy/paste examples from Stack Overflow.
  • Directives documents: start with some directives, followed by an explicit directives end marker. You don’t need directives for Ansible.

Configuring yamllint

I use ansible-lint and yamllint in a pre-commit hook to check the syntax of my Ansible files. This is currently my .yamllint.yml:

rules:
  document-start:
    present: false
  truthy:
    allowed-values: ['true', 'false', 'yes', 'no']

document-start makes sure that there is no --- at the start of a file. I also have opinions on truthy: an Ansible playbook is supposed to be readable both by machines and humans, and then it makes sense to allow the more human-readable values yes and no.

Do you also have opinions that make you change the default configuration of your linters?

How big is a clean install of Ubuntu Jammy Jellyfish (22.04)?

Because curiosity killed the cat, not because it’s useful! 😀

Start with a clean install in a virtual machine

I start with a simple Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/jammy64"
  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "playbook.yml"
  end
end

This Ansible playbook updates all packages to the latest version and removes unused packages.

- name: Update all packages to the latest version
  hosts: all
  remote_user: ubuntu
  become: yes

  tasks:

  - name: Update apt cache
    apt:
      update_cache: yes
      cache_valid_time: 3600
      force_apt_get: yes

  - name: Upgrade all apt packages
    apt:
      force_apt_get: yes
      upgrade: dist

  - name: Check if a reboot is needed for Ubuntu boxes
    register: reboot_required_file
    stat: path=/var/run/reboot-required get_md5=no

  - name: Reboot the Ubuntu box
    reboot:
      msg: "Reboot initiated by Ansible due to kernel updates"
      connect_timeout: 5
      reboot_timeout: 300
      pre_reboot_delay: 0
      post_reboot_delay: 30
      test_command: uptime
    when: reboot_required_file.stat.exists

  - name: Remove unused packages
    apt:
      autoremove: yes
      purge: yes
      force_apt_get: yes

Then bring up the virtual machine with vagrant up --provision.

Get the installation size

I ssh into the box (vagrant ssh) and run a couple of commands to get some numbers.

Number of installed packages:

$ dpkg-query --show | wc --lines
592

Size of the installed packages:

$ dpkg-query --show --showformat '${Installed-size}\n' | awk '{s+=$1*1024} END {print s}' | numfmt --to=iec-i --format='%.2fB'
1.14GiB

I need to multiply the package size with 1024 because dpkg-query outputs size in kilobytes.

Total size:

$ sudo du --summarize --human-readable --one-file-system /
1.9G	/

Get the installation size using Ansible

Of course, I can also add this to my Ansible playbook, and then I don’t have to ssh into the virtual machine.

  - name: Get the number of installed packages
    shell: dpkg-query --show | wc --lines
    register: package_count
    changed_when: false
    failed_when: false
  - debug: msg="{{ package_count.stdout }}"

  - name: Get the size of installed packages
    shell: >
      dpkg-query --show --showformat '${Installed-size}\n' 
      | awk '{s+=$1*1024} END {print s}' 
      | numfmt --to=iec-i --format='%.2fB'
    register: package_size
    changed_when: false
    failed_when: false
  - debug: msg="{{ package_size.stdout }}"

  - name: Get the disk size with du
    shell: >
      du --summarize --one-file-system /
      | numfmt --to=iec-i --format='%.2fB'
    register: du_used
    changed_when: false
    failed_when: false
  - debug: msg="{{ du_used.stdout }}"

The output is then:

TASK [Get the number of installed packages] ************************************
ok: [default]

TASK [debug] *******************************************************************
ok: [default] => {
    "msg": "592"
}

TASK [Get the size of installed packages] **************************************
ok: [default]

TASK [debug] *******************************************************************
ok: [default] => {
    "msg": "1.14GiB"
}

TASK [Get the disk size with du] ***********************************************
ok: [default]

TASK [debug] *******************************************************************
ok: [default] => {
    "msg": "1.82MiB /"
}