How to run Docker and get more sleep than I did

Bugs in Gusto’s codebase mean that people don’t get paid. Mortgage payments, college tuition, and child support all rely on accurate, timely paychecks. That’s why we need tens of thousands of tests, across Ruby, Python, and Go, to run on every commit.

Our heavily Dockerized continuous integration (CI) pipeline helps us do this at scale for all of our repositories.

I started writing the best-practices list below after hundreds of hours and many late nights spent staring at logfiles, fixing Docker breakages, and troubleshooting buggy code. It’s been pretty helpful to other Gusto engineers, and I hope it is for you too.

Where this list began

I thought I was seeing things one night.

I was debugging ActiveRecordDoc’s integration with our Docker-based CI pipeline. ActiveRecordDoc is a homegrown Ruby gem that validates PII-related fields we have on most of our database tables. Unfortunately, the CI step that ran the gem’s Rake tasks was randomly hanging, which made the test suite extremely flaky for the entire engineering team, reduced iteration velocity, and harmed confidence in deploys.

After spending most of the evening digging through logs, I decided to try running the exact same script locally. I eventually saw my local terminal hang the same way as the CI step.

I tried to repeat the process—but it worked fine, over and over again.

Was I hallucinating?

I tried nine more times. The script finally hung right before it was supposed to execute a special Ruby function. I kept my hands off the keyboard, and the script stayed hung.

I cautiously hit the “Up” arrow.

Execution continued.

Huh.

Reading through the code, it looked like the hanging Ruby expression needed to output the name of every single loaded Rails model class on a single line. This was easily 300,000 characters sent to STDOUT at once. I wondered if the raw quantity of characters on a single line was causing an issue. I stopped logging the line, and the flake mode stopped.

Welp.

My best guess is that some buffer in the kernel wasn’t getting flushed normally, which effectively served as backpressure on the Docker process’ execution as a whole. Somehow, sending any input flushed the buffer and continued execution. Some Googling revealed a few similar bugs (here, here, and here) that were never fully resolved.

I got around this for ActiveRecordDoc by adding a SafeLog Ruby function that would chunk the output if a single line exceeded 10,000 bytes. The flakes were fixed. Great!
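
The SafeLog helper itself is Ruby and isn't reproduced here. As a rough illustration of the same idea, here is a minimal Python sketch. The names chunk_line and safe_log are hypothetical, and the 10,000-byte limit is simply the figure mentioned above, not a verified threshold.

```python
import sys

MAX_CHUNK_BYTES = 10_000  # figure quoted in the article; the real threshold is unknown


def chunk_line(line: str, max_bytes: int = MAX_CHUNK_BYTES) -> list[str]:
    """Split one long line into pieces of at most max_bytes UTF-8 bytes each."""
    chunks, current, size = [], [], 0
    for char in line:
        char_size = len(char.encode("utf-8"))
        # Start a new chunk before this character would push us over the limit.
        if current and size + char_size > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(char)
        size += char_size
    if current:
        chunks.append("".join(current))
    return chunks


def safe_log(line: str, stream=sys.stdout) -> None:
    """Write each chunk as its own line and flush, so no single write is huge."""
    for chunk in chunk_line(line) or [""]:
        stream.write(chunk + "\n")
        stream.flush()
```

Splitting on byte size rather than character count keeps the limit meaningful for non-ASCII output, at the cost of breaking a long line at arbitrary positions.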

Dramatic recreation: When we find a bug

Basics

Docker provides a mechanism for defining, instantiating, and maintaining isolated execution environments. We think of Docker containers as heavier weight than processes, but lighter weight than true virtual machines. They excel at helping us run CI, generate reproducible environments, and better utilize host resources for deployed apps. Because they are not completely isolated (for example, they share UIDs/GIDs with the host system on hosts like Linux that support them), they can be challenging to run reliably when orchestrated.

Terminology

Image: An immutable binary blob stored on disk or in Dockerhub that can be instantiated into “containers”. List them with docker images.

Container: An instantiated image that runs arbitrary commands. List them with docker container ps.

Host System: The "underlying" system that is running a docker container. If you're running Docker containers on your laptop, the laptop is the host system.

Volume: A filesystem that a docker container is able to access. A container can share volumes with other containers and with the host filesystem. List them with docker volume ls.

docker-compose: An orchestration mechanism for Docker containers. docker-compose ingests a provided .yml file and generates raw Docker commands. In theory, everything you can do with docker-compose, you can do with docker.

Versioning

We once had a CI outage because the default mysql image on Dockerhub got upgraded from 5.7 to 8. The Ruby installation relied on 5.7, which didn’t exist anymore on newer hosts. Some best practices we derived:

  • When inheriting from a base image, don't leave the base image unversioned (FROM ruby). Always version it (FROM ruby:2.3.4).
  • In your docker-compose.yml file, always explicitly specify the Compose file format version with the version key. There are major differences between versions 2, 3, etc.
  • Make sure the version of the docker engine you’re using is itself fixed across your fleet. Upgrade it periodically. The Docker binary is sometimes upgraded in backwards-incompatible ways.
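
One way to enforce the first rule is a small lint step in CI. The sketch below is a hypothetical helper (unpinned_base_images is not part of Docker or of our tooling) that scans Dockerfile text and reports FROM lines without an explicit tag or digest. It assumes simple FROM lines and does not expand ARG variables.

```python
def unpinned_base_images(dockerfile_text: str) -> list[str]:
    """Return base images referenced by FROM without a fixed tag or digest."""
    unpinned = []
    stage_names = set()  # aliases introduced by "FROM image AS name"
    for raw_line in dockerfile_text.splitlines():
        parts = raw_line.strip().split()
        if not parts or parts[0].upper() != "FROM":
            continue
        # Ignore flags such as --platform=linux/amd64.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        image = args[0]
        is_stage_ref = image.lower() in stage_names
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2].lower())
        if image.lower() == "scratch" or is_stage_ref:
            continue
        # Look for the tag only in the last path component, so a registry
        # port (localhost:5000/mysql) is not mistaken for a version.
        name = image.rsplit("/", 1)[-1]
        has_digest = "@" in image
        has_fixed_tag = ":" in name and not name.endswith(":latest")
        if not (has_digest or has_fixed_tag):
            unpinned.append(image)
    return unpinned
```

Treating :latest as unpinned is a choice made for this sketch, since that tag moves whenever the upstream image is republished.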

Layering

Layers are simply files stored on the host operating system that are composed by Docker’s Union Filesystem driver to generate the final image. Intelligent layer usage can simplify deployment and debugging, while copy-pasting or misunderstanding layers can lead to a lot of needless work.

  • Each instruction (logical line) in a Dockerfile generates a "layer" for the image that's ultimately generated.
  • Each layer depends on every previous layer. Each layer is cached, which can significantly speed up your Docker builds. A Docker image is represented by a linear collection of layers.
  • The lower the churn on a line in a Dockerfile, the earlier in the file it should be listed. Higher-churn lines should be near the bottom of the file. This will speed up builds by letting the Docker image builder rely on caching.
  • If you find yourself copy-pasting the same set of layers between different Dockerfiles, take a look at multi-stage builds. They can help eliminate redundancy across your codebase.

Isolation

When we were initially setting up our CI system, we would regularly see tests fail in non-reproducible ways. After a lot of digging, it turned out that a new build’s containerized databases would sometimes connect to an old build’s containerized databases, proceed normally, and promptly read polluted state. We now emphasize isolation pretty heavily with all of our Docker usage to prevent this sort of failure mode from recurring.

  • Use the --project-name flag with docker-compose to prefix all generated containers with something reliably unique, like BUILDKITE_JOB_ID or a PRNG-derived number.
  • Explicitly run docker-compose down or docker stop after your code finishes.
  • If you have to manually orchestrate docker containers, name volumes and networks uniquely so that there is no cross-run pollution.
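
The first two points can be combined in a small wrapper. This is a minimal sketch rather than our actual CI script: compose_project_name and run_isolated are hypothetical names, the "ci-" prefix is arbitrary, and the sanitizing step assumes Compose project names should stick to lowercase letters, digits, dashes, and underscores.

```python
import os
import re
import secrets
import subprocess


def compose_project_name(env=None) -> str:
    """Build a unique project name from BUILDKITE_JOB_ID, or a random fallback."""
    env = os.environ if env is None else env
    job_id = env.get("BUILDKITE_JOB_ID") or secrets.token_hex(8)
    # Assumption: keep the name to lowercase letters, digits, "-" and "_".
    return "ci-" + re.sub(r"[^a-z0-9_-]", "-", job_id.lower())


def run_isolated(test_command: list[str], compose_file: str = "docker-compose.yml") -> int:
    """Bring services up under a unique project, run tests, always tear down."""
    project = compose_project_name()
    base = ["docker-compose", "--project-name", project, "--file", compose_file]
    try:
        subprocess.run(base + ["up", "--detach"], check=True)
        return subprocess.run(test_command).returncode
    finally:
        # Runs even if the tests crash, so old containers cannot leak into
        # later builds. --volumes also removes the project's volumes.
        subprocess.run(base + ["down", "--volumes"], check=False)
```

Putting the teardown in a finally block is what guarantees the docker-compose down step happens even when the test command fails or raises.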

Resource Utilization

As we ramped up our Docker usage, we got unexpected “No space left on device” errors on long-running CI hosts. We discovered that Docker images, volumes, and networks are automatically persisted to disk, grow quickly, and are not automatically garbage-collected. We also found that one live container on a host could easily take up too many resources and crowd out other containers if unchecked.

  • By default, a single docker container has no resource constraints and is only bottlenecked by what the kernel / host OS will give it.
  • docker and docker-compose support special flags to restrict a container's memory, CPU, and swap consumption.
  • Run docker system prune --volumes periodically to clear out images, volumes, networks, etc. that are unassociated with currently-running containers. To avoid race conditions with live containers, you can use a date filter, for example docker container prune --filter "until=24h", to only prune containers that are at least a day old.
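
As an illustration, the sketch below builds the corresponding command lines as argument lists that could be passed to subprocess.run. The function names and the default limits (512m, 1g, one CPU) are placeholders chosen for the example, not values we recommend.

```python
def limited_run_command(image, command, memory="512m", memory_swap="1g", cpus="1.0"):
    """Build a docker run invocation with memory, swap, and CPU limits."""
    return [
        "docker", "run", "--rm",
        "--memory", memory,            # hard memory limit
        "--memory-swap", memory_swap,  # total of memory plus swap
        "--cpus", cpus,                # number of CPUs the container may use
        image, *command,
    ]


def prune_commands(older_than="24h"):
    """Build prune invocations that only touch objects older than the cutoff."""
    until = f"until={older_than}"
    return [
        ["docker", "container", "prune", "--force", "--filter", until],
        ["docker", "image", "prune", "--all", "--force", "--filter", until],
    ]
```

The --force flag skips the interactive confirmation prompt, which is needed when these commands run unattended, for example from a periodic job on each CI host.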

Deploying

We initially used Dockerhub to store and retrieve all of our images. Unfortunately, Dockerhub writes failed regularly and were a significant cause of CI flakiness. It was also easy to accidentally overwrite images, with no audit trail present. There are SaaS alternatives like Docker Trusted Registry, Artifactory, and others, but they are extremely expensive and we don’t yet need all their features.

  • We currently self-host an instance of the open-source Docker Registry to serve as a write-around cache and eliminate Dockerhub’s reliability concerns on reads.
  • We believe that relying solely on image_name:latest to serve the latest instance of an image is an anti-pattern, since the old image will be overwritten.
  • Instead, make sure there is always a copy of your image in Dockerhub. For example, if you push up an image named, say, gusto/zenpayroll:latest, make sure the image with a tag of its raw MD5/SHA also exists in Dockerhub (gusto/zenpayroll:abc123).
  • Docker images & layers are just stored as files on the filesystem, and are straightforward to bake into bootstrapping scripts. This speeds up startup time (eliminates network reads with docker pull) and can reduce requests to your artifact stores.
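
The tagging rule above can be sketched as follows. image_refs and publish_commands are hypothetical helpers, and the short commit SHA stands in for whatever content hash you use. The point is only that every push of :latest is paired with a push of a tag that never moves.

```python
def image_refs(repository: str, commit_sha: str) -> list[str]:
    """Return the immutable reference first, then the moving :latest one."""
    if not commit_sha:
        raise ValueError("an immutable tag is required alongside :latest")
    return [f"{repository}:{commit_sha}", f"{repository}:latest"]


def publish_commands(local_image: str, repository: str, commit_sha: str) -> list[list[str]]:
    """Build docker tag / docker push invocations for both references."""
    commands = []
    for ref in image_refs(repository, commit_sha):
        commands.append(["docker", "tag", local_image, ref])
        commands.append(["docker", "push", ref])
    return commands
```

Pushing the immutable tag before :latest means that if the second push fails, the registry still holds a retrievable copy of the new build.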

Dramatic recreation: When we fix a bug

Volumes

When we started, we naively assumed that Docker volumes were a simple solution to a broad class of problems we had around speed, development, and secrets. Unfortunately, we found they turn out to have meaningful challenges around permissioning, speed, and troubleshooting.

  • If you’re developing in Docker, be sure to volume in code from your host filesystem, instead of ADD’ing or COPY'ing it, to drastically shorten iteration cycles.
  • Never bake secrets and private cryptographic materials into an image. If you can't retrieve secrets from a service, volume them in from the host filesystem.
  • Prefer named volumes to anonymous/unnamed volumes.
  • It's easy for undeletable root-owned files to be created on the host filesystem. When possible, execute container operations as a user that does not have root permissions on-disk.
  • If you volume in files from the host system, they keep the exact same numeric UIDs/GIDs, even if there are no equivalent UIDs / GIDs in the container’s execution context, or they map elsewhere. Tread carefully with your ADD directives!
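
One common way to avoid root-owned files on the host is to run the container as the invoking user while mounting the code in. The sketch below builds such a command. dev_run_command is a hypothetical helper, /app is an arbitrary mount point, and the default uid/gid lookup assumes a POSIX host.

```python
import os


def dev_run_command(image, command, host_dir, container_dir="/app", uid=None, gid=None):
    """Build a docker run invocation that mounts host code as a non-root user."""
    # Default to the caller's own ids so files written into the mounted
    # directory stay owned (and deletable) by that user on the host.
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return [
        "docker", "run", "--rm",
        "--user", f"{uid}:{gid}",
        "--volume", f"{os.path.abspath(host_dir)}:{container_dir}",
        "--workdir", container_dir,
        image, *command,
    ]
```

Note that the numeric ids are passed through unchanged. If the image has no matching user, the process still runs with that uid, but tools that look up a user name or home directory inside the container may complain.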

Miscellaneous Oddities

There are a lot of corner cases with Docker. Stay vigilant and keep a lookout for weirdness.

  • Docker will nondeterministically hang, forever, if you try to print out too much output in a single call to STDOUT. This is a long-standing Docker bug, is astonishingly easy to replicate, and has been the source of several weird flaky tests for us.
  • To get around this, make sure you split up extremely long lines across successive STDOUT calls. We don't know what the exact threshold looks like yet (and it may be system-dependent), but a few thousand characters on a single line has been a good ballpark for us to stay below.
  • Volumes on OS X are incredibly slow. If you’re running a Rails server from a Docker container, and trying to volume in hundreds of assets, it will be extremely slow due to how serialization of open() calls works in Docker for Mac. This limitation does not exist when running Docker on Linux, since Docker there uses the host OS’s kernel directly, while Docker for Mac runs an Alpine Linux VM behind the scenes.

Debugging Tips

It’s pretty common to need to debug running docker containers. Since we’re running Rails tests pretty commonly, we might need to, for example, modify RSpec files to insert pry statements, or rebuild images on a CI host with different package versions installed. We discovered a few tips to make debugging easier.

  • Ensure a bare-bones editing environment (vim/nano, curl, etc.) exists as part of your Dockerfile, even for images derived from lightweight base images like Alpine. If you need to debug/modify/reboot a live container, it'll be significantly easier.
  • Exploit layers and the FROM directive heavily. One of the most expensive parts of a Dockerfile will usually be the apt-get install MyCustomPackages && MoreCustomPackages. If you place these into their own layers, you can modify the rest of the file without needing to rebuild everything.
  • Sometimes, if you're unsure why something is installed, the fastest approach is to cut out all the layers except the one you think is introducing the bug, then drop into a shell in a running container to observe the environment.
  • While docker run spins up a new docker container, docker exec -it runs a command inside an existing, running container. Use docker exec -it containerid123 bash liberally for debugging broken containers.
  • docker ps shows you all containers that are currently running, along with their container ids. Use in conjunction with docker exec -it containerid123 bash to debug.
  • Get comfortable with the --verbose option. Although it can feel overwhelming, it can help debug many gnarly orchestration issues, with docker-compose and raw Docker.
  • Get good at reading long log lines. Most of the bugs I’ve found were discovered by reading endless amounts of logs and seeing what was missing, or what shouldn’t have been there in the first place.

Conclusion

Docker has been a great tool to support our self-hosted CI/CD pipeline. It's unlocked significant cost savings (30%+) over cloud solutions, saved engineering time, and helped scale our release engineering workflows. It does have its quirks, but if you manage them effectively, it’s entirely possible to keep your infrastructure stable—and still get to bed on time.