Simple log aggregation for Docker containers

The Why

The Docker composition on one of my side projects has recently grown to 6 containers: SMTP-in, SMTP-out, app, subscription, website, and certbot. Since I deployed version 0.1.0 to DigitalOcean a couple of weeks ago, I’ve caught myself feeling increasingly anxious about losing logs every time I deploy a new version of a container.

Every container is streaming its logs to STDOUT. I wanted to have a persistent log file for each container, which Docker almost does out-of-the-box, with the “little” caveat that those files are removed every time the container is rebuilt. 🙃

One of the things that I’ve challenged myself to do on this side project is to keep it as self-contained as possible. This is why I’ve set up my own SMTP server instead of buying a transactional email subscription. 😎

This is the kind of simplicity I was referring to when I said “simple log aggregation” in the title: no third-party services. I wanted it to be a Docker composition that I can start on my laptop as easily and as cleanly as on a VM in some cloud.

At the end of the day, all I needed was to have those logs (plain text or JSON) stored somewhere so that I can grep or jq them whenever I want to see what happened when.

After googling around for a few mornings, I ended up deploying an additional logger container that runs Syslog, and then instructed all the other containers to stream their logs into it. I’ve seen this called “the sidecar pattern” in other places. The logger container’s files live in a mounted directory, so they have a life outside Docker and survive across the container life cycle.

Some dirty details

The logger

I started with the mumblepins/syslog-ng-alpine image for the proof of concept, but then I saw a warning about an old log format, or something. On closer inspection, I realized that it hadn’t been updated in 4 years. Hm… Looking at its Dockerfile on GitHub, I found that it was manually compiling Syslog from source. Umm… OK. The next day, inspired by that, I created my own Dockerfile, which uses Alpine’s built-in Syslog package.

I extracted a copy of its config file from the running container, added the bits from mumblepins that made it listen on the network, and with that I had it both up to date and working. Open Source is awesome. 😎
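For reference, here is a rough sketch of what the logger service itself could look like in the compose file. The build path, the host directory, and the in-container log path are placeholders I made up for illustration, not the exact values from my setup. The one part that matters is the published port: the syslog logging driver runs inside the Docker daemon on the host, not inside the container network, so the logger has to be reachable at 127.0.0.1:514 from the host.

```yaml
services:
  logger:
    # Hypothetical path to the directory holding the custom Alpine-based Dockerfile
    build: ./logger
    container_name: logger
    ports:
      # Publish on the loopback interface only, so the Docker daemon's
      # syslog driver can connect while the port stays closed to the outside
      - "127.0.0.1:514:514"
    volumes:
      # Hypothetical host directory; the in-container path depends on
      # where the Syslog config writes its files
      - ./logs:/var/log
```

Binding to 127.0.0.1 rather than all interfaces is a deliberate choice here: nothing outside the machine needs to talk to the logger.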

The docker-compose config

Telling a container to stream its logs to Syslog is well documented and quite straightforward:

app:
  # ...more settings...
  depends_on: [logger]
  logging:
    driver: syslog
    options:
      syslog-address: tcp://127.0.0.1:514
      syslog-format: rfc3164
      tag: subscription

This worked, but repeating it for 6 services looked too non-DRY for my taste, so I googled around for a morning or two and found out about the YAML merge type, which is described under the Extension fields section of the Docker Compose reference, and which gave me this:

x-logging: &logging
  depends_on: [logger]
  logging:
    driver: syslog
    options:
      syslog-address: tcp://127.0.0.1:514
      syslog-format: rfc3164
      tag: "{{.Name}}"

I added the tag: option to have the symbolic container name in the logs instead of its hex ID. This snippet can now be included in every service’s config like this:

smtp-in:
  container_name: smtp-in
  # ...
  <<: *logging

This is neat. 🤓
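To make the merge less magical, this is roughly what the smtp-in service looks like once the merge key is resolved. It is only an illustration of the expansion, assuming the x-logging block shown earlier; running docker compose config prints the resolved file if you want to check your own.

```yaml
# smtp-in after the "<<: *logging" merge has been applied
smtp-in:
  container_name: smtp-in
  # ...
  depends_on: [logger]
  logging:
    driver: syslog
    options:
      syslog-address: tcp://127.0.0.1:514
      syslog-format: rfc3164
      # Docker replaces this template with the container name at runtime
      tag: "{{.Name}}"
```

One thing to keep in mind: the merge is shallow. If a service declares its own depends_on or logging key, that key replaces the shared one entirely instead of being combined with it.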