This past summer, I published a blog post called “Laying the Cultural and Technical Foundation for Big Rails” and spoke about it at RailsConf.

Since then, we’ve made a lot of progress, and in this blog post I want to share the latest public iteration of the RubyAtScale Modularization Toolchain. We use it at Gusto to modularize our large Ruby on Rails application, and it is under active development.

This post is mostly meant as a how-to guide to effectively use these solutions, but in a future post I’ll dive more into the problem of large, entangled codebases and how modularization principles can be used to address that problem.

Packs and their Extensions

The toolchain starts with packs, a pretty simple specification for a Ruby package, or “pack” as we call it. By itself, packs doesn’t do much. Instead, it’s extended by a modular suite of tools that can be adopted gradually.

For example…

  • packs-rails can be dropped into your Rails application to ensure Rails Autoload Paths are set up correctly. This also includes helpers to make rspec and factory_bot integrate with your packs.
  • rubocop-packs contains cops that are intended to help improve boundaries between your packs.
  • packwerk, from Shopify, can be used to define architectural rules about pack boundaries. More on packwerk below.
  • packwerk-extensions uses packwerk’s extensible API to provide other types of pack boundaries.
  • danger-packwerk provides automated inline pull request comments related to architecture boundary violations.
  • code_ownership can be used to specify ownership of packs and integrate it into various developer tools.
  • use_packs exposes a CLI, bin/packs, that makes it easy to create new packs, move files between packs, and more.
  • pack_stats makes it easy to send metrics about pack adoption and modularization to your favorite metrics provider, such as DataDog (which has built-in support).

How Is a Pack Different From a Gem?

A Ruby gem is the Ruby community’s solution for packaging and distributing Ruby code. A gem is a great place to start new projects, and a great end state for code that’s been extracted from an existing codebase. Packs are intended to help gradually modularize an application that has some conceptual boundaries, but is not yet ready to be factored into gems.

How Packwerk Works

Packwerk does the following three things in order:

  • Parses a graph of “pack” nodes, creating edges based on references to Ruby Constants.
  • Provides a simple and extensible YAML-based declarative language, called package.yml, for constraining that graph in various ways (e.g. dependencies and public API usage).
  • Outputs the “diff” between the actual graph and the constrained graph as another YAML file called package_todo.yml.

Packwerk analyzes each reference to a Ruby constant with something called a “checker.” A checker takes in a reference to a constant, which includes information about which pack is referencing the constant as well as which pack defines it. Each checker can define arbitrary rules for whether the reference is a “violation” or not. For example, the “dependency” checker considers a reference a violation if one pack references another without listing it as an explicit dependency.
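To make the checker idea concrete, here is a toy model in plain Ruby. It is a sketch of the concept only, not packwerk’s real API: the Reference, Pack, and DependencyChecker names and the pack names are all made up for illustration.

```ruby
# Toy model of the idea described above; this is NOT packwerk's real API.
# A reference records which pack uses a constant and which pack defines it.
Reference = Struct.new(:constant_name, :source_pack, :defining_pack, keyword_init: true)

# A pack plus the dependencies it declares (as it would in package.yml).
Pack = Struct.new(:name, :dependencies, keyword_init: true)

# Hypothetical "dependency" checker: a cross-pack reference is a violation
# unless the referencing pack lists the defining pack as a dependency.
class DependencyChecker
  def initialize(packs)
    @packs_by_name = packs.to_h { |pack| [pack.name, pack] }
  end

  def violation?(reference)
    # References within the same pack are never dependency violations.
    return false if reference.source_pack == reference.defining_pack

    declared = @packs_by_name.fetch(reference.source_pack).dependencies
    !declared.include?(reference.defining_pack)
  end
end

# Hypothetical packs: only payroll declares a dependency (on employees).
PACKS = [
  Pack.new(name: 'packs/payroll', dependencies: ['packs/employees']),
  Pack.new(name: 'packs/employees', dependencies: []),
  Pack.new(name: 'packs/benefits', dependencies: [])
].freeze

REFERENCES = [
  Reference.new(constant_name: '::Employees::Employee',
                source_pack: 'packs/payroll', defining_pack: 'packs/employees'),
  Reference.new(constant_name: '::Payroll::Run',
                source_pack: 'packs/benefits', defining_pack: 'packs/payroll')
].freeze

checker = DependencyChecker.new(PACKS)
VIOLATIONS = REFERENCES.select { |reference| checker.violation?(reference) }

VIOLATIONS.each do |v|
  puts "#{v.source_pack} references #{v.constant_name} without depending on #{v.defining_pack}"
end
```

Only the second reference is reported, because packs/benefits uses a constant from packs/payroll without declaring that dependency. A different checker would keep the same input (a reference) and swap in a different rule.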

How to Get Started

The toolchain here has developed alongside Gusto’s journey to modularize its Rails application. Along the way, because we believe in gradual and incremental improvement, we wanted to make sure we always left a path for new applications to adopt this framework in a way that adds value incrementally. Here’s how I’d go about advising an organization to use these tools to modularize their app.

Step 1: Break up your code into domain-based packs

Before getting into the process of using packwerk or rubocop-packs to improve system boundaries, the simplest way to start is to move code files into packs. Check out the blog post linked in the introduction for a before and after of what this means. use_packs exposes a helpful CLI, bin/packs, which developers at Gusto use to create new packs and move files into them.

There is no silver bullet for how your application should be broken up. The best way I’ve found is to sit down with subject matter experts (including product) and make a best first attempt at breaking up your application into smaller domains. Feel free to allow imperfection. Your first attempt at domain boundaries will never be perfect, and that’s okay!

The important thing to remember is that this is easy to change if you get it wrong, since you can always merge your packs back into a mega-pack or move files around again. In Rails, moving a file between packs only changes “autoload paths,” so the way other code references the constant that file defines stays the same. At Gusto, this means developers can experiment with boundaries inexpensively and with low risk, and they regularly move files around as their understanding of boundaries improves.
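Here is a small sketch of why a move is so cheap. It is a simplified stand-in for Rails’ path-to-constant inference, not the real autoloader, and it assumes packs live one level under packs/ with each packs/*/app/* directory acting as an autoload root. The pack and file names are hypothetical.

```ruby
# Simplified stand-in for Rails' path-to-constant inference (not the real
# autoloader). Assumes every packs/<pack>/app/<kind>/ directory is an
# autoload root, so everything before that point is ignored.
AUTOLOAD_ROOT = %r{\Apacks/[^/]+/app/[^/]+/}.freeze

def constant_name_for(path)
  relative = path.sub(AUTOLOAD_ROOT, '').delete_suffix('.rb')
  relative
    .split('/')
    .map { |segment| segment.split('_').map(&:capitalize).join }
    .join('::')
end

# The same file before and after moving it to a different (hypothetical) pack.
BEFORE_MOVE = constant_name_for('packs/payroll/app/models/tax_form.rb')
AFTER_MOVE  = constant_name_for('packs/tax_filing/app/models/tax_form.rb')

puts BEFORE_MOVE
puts AFTER_MOVE
```

Both calls produce the same constant name, which is why callers don’t need to change when a file moves between packs.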

Step 2: Find owners for packs

The next step would be to use code_ownership to create code teams within your application and assign them to packs. Having packs be assigned to owners creates accountability, which is useful for everything from creating appropriate context in automated developer feedback and observability tools to just making sure folks know who they can talk to about changing code they are unfamiliar with.
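As a rough illustration of pack-level ownership, here is a toy lookup in plain Ruby. It is not code_ownership’s implementation or API: the owner key, the team names, and the owner_for helper are assumptions for the sketch, so check the gem’s README for the exact configuration it expects.

```ruby
require 'yaml'

# Hypothetical package.yml contents, keyed by pack directory. The `owner`
# key and the team names are assumptions for illustration.
PACKAGE_YMLS = {
  'packs/payroll' => <<~YAML,
    enforce_dependencies: true
    owner: Payroll Team
  YAML
  'packs/benefits' => <<~YAML
    enforce_dependencies: false
    owner: Benefits Team
  YAML
}.freeze

# Toy lookup: a file is owned by whoever owns the pack that contains it.
def owner_for(file_path)
  _pack_dir, yml = PACKAGE_YMLS.find { |dir, _| file_path.start_with?("#{dir}/") }
  return nil unless yml # file is not inside any known pack

  YAML.safe_load(yml)['owner']
end

puts owner_for('packs/payroll/app/models/payroll/run.rb')
puts owner_for('app/models/user.rb').inspect
```

The point is that once ownership is declared per pack, every tool that knows a file path can answer “who do I talk to about this?” without any per-file bookkeeping.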

Step 3: Begin enforcing your pack dependency structure

One thing packwerk can do is enforce the dependencies between your packs. By having each pack specify what it depends on, you can use packwerk to create a package_todo.yml file, which represents “violations” between packs, such as the use of a pack without a stated dependency. This tool makes it easy to start creating a technically enforced architecture diagram.

Begin using this by setting enforce_dependencies to true on a couple of packs you feel the most confident about. For example, you might feel very confident that your feature flags or authorization framework should not depend on your domain, but packwerk may reveal through “dependency violations” that they do. Fix these by aligning the code with the design. Once all dependency violations are resolved, you can set enforce_dependencies to strict. This locks in your progress and prevents architecture regressions.
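To see how much work is left on a pack, it can help to summarize its package_todo.yml. The sketch below embeds a hypothetical package.yml and package_todo.yml as strings and counts dependency violations per defining pack. The pack names, constants, and file paths are invented, and the todo layout is my understanding of the general shape packwerk writes, so treat it as an assumption and compare it against a file generated in your own app.

```ruby
require 'yaml'

# Hypothetical packs/payroll/package.yml: opt in to dependency enforcement
# and declare the one dependency we intend to have.
PACKAGE_YML = <<~YAML
  enforce_dependencies: true
  dependencies:
    - packs/employees
YAML

# Hypothetical packs/payroll/package_todo.yml (assumed layout):
# defining pack -> constant -> violation types and referencing files.
PACKAGE_TODO_YML = <<~YAML
  packs/benefits:
    "::Benefits::Plan":
      violations:
        - dependency
      files:
        - packs/payroll/app/services/payroll/deductions.rb
  packs/time_off:
    "::TimeOff::Balance":
      violations:
        - dependency
      files:
        - packs/payroll/app/services/payroll/accruals.rb
        - packs/payroll/app/models/payroll/run.rb
YAML

# Count files with dependency violations, grouped by the pack being referenced.
def dependency_violation_counts(todo_yaml)
  YAML.safe_load(todo_yaml).to_h do |defining_pack, constants|
    count = constants.sum do |_constant, entry|
      entry['violations'].include?('dependency') ? entry['files'].size : 0
    end
    [defining_pack, count]
  end
end

puts YAML.safe_load(PACKAGE_YML)['dependencies'].inspect
puts dependency_violation_counts(PACKAGE_TODO_YML).inspect
```

A summary like this makes it easy to pick the smallest pile of violations to fix first, before flipping a pack to strict.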

Fixing dependency violations first helps ensure code is in the right place. Once we feel confident code is in the right place, we can move on to improving the API to that code.

A quick note on packwerk:

The capabilities in this blog post are shipping in packwerk major version 3. Besides various bug fixes and performance improvements, it will also ship with the ability to extend checkers (constraints around constant references), validators, output formatters, and more! If you’d like to use packwerk 3 today, as the toolchain already does, you can build off of main in your Gemfile:

gem 'packwerk', github: 'Shopify/packwerk', branch: 'main'

Step 4: Begin enforcing public API boundaries

Next, packwerk-extensions supports the idea of a “privacy” checker. To use this, find packs that you believe should have a clear, well-abstracted API. Set enforce_privacy to true in those packs. Run bin/packwerk update-todo and try to fix most or all of the violations before shipping. This is important so consumers have a public API to reach for when they get a privacy violation.
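Here is a sketch of what “a public API to reach for” can look like. The file paths in the comments, the Payroll namespace, and both classes are hypothetical, and I’m assuming the pack’s public code lives in an app/public folder, so confirm the public path your privacy checker is configured with.

```ruby
# Hypothetical file: packs/payroll/app/services/payroll/gross_pay_calculator.rb
# Private implementation detail: it lives outside the pack's public folder,
# so references to it from other packs would be reported as privacy violations.
module Payroll
  class GrossPayCalculator
    def initialize(hourly_rate_cents:, hours:)
      @hourly_rate_cents = hourly_rate_cents
      @hours = hours
    end

    def call
      @hourly_rate_cents * @hours
    end
  end
end

# Hypothetical file: packs/payroll/app/public/payroll/api.rb
# Public entry point: the one constant other packs are meant to use.
module Payroll
  module Api
    # Returns gross pay in cents for an hourly employee.
    def self.gross_pay_cents(hourly_rate_cents:, hours:)
      GrossPayCalculator.new(hourly_rate_cents: hourly_rate_cents, hours: hours).call
    end
  end
end

puts Payroll::Api.gross_pay_cents(hourly_rate_cents: 2_500, hours: 40)
```

Consumers that used to reach for Payroll::GrossPayCalculator directly now have an obvious replacement, which is what makes the privacy violations fixable rather than just annoying.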

Step 5: Harden the good boundaries

Lastly, once your dependency structure is clean and the APIs between packs make the application simpler to understand, you can harden those boundaries with rubocop-packs. Cops such as Packs/RootNamespaceIsPackName ensure that every pack establishes exactly one top-level namespace equal to the pack’s name. Packs/DocumentedPublicApi ensures your public API is documented!
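To illustrate the namespace rule, here is a toy path check in plain Ruby. It is not the cop’s implementation: the regular expression, the helper name, and the example paths are made up, and it assumes the packs/<pack>/app/<kind>/ layout used elsewhere in this post.

```ruby
# Toy version of the "one top-level namespace per pack" rule. This is NOT
# the Packs/RootNamespaceIsPackName implementation, only the idea behind it.
PACK_FILE = %r{\Apacks/(?<pack>[^/]+)/app/[^/]+/(?<rest>.+)\.rb\z}.freeze

def root_namespace_matches_pack?(path)
  match = PACK_FILE.match(path)
  return true unless match # not a pack file, so there is nothing to check

  # The first path segment under the autoload root determines the
  # top-level constant, so it should be the pack's own name.
  match[:rest].split('/').first == match[:pack]
end

puts root_namespace_matches_pack?('packs/payroll/app/services/payroll/calculator.rb')
puts root_namespace_matches_pack?('packs/payroll/app/services/calculator.rb')
```

The first path defines something under Payroll and passes; the second would define a top-level Calculator constant from inside the payroll pack, which is exactly the kind of leak the rule is meant to catch.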

What’s next?

There’s a lot of detail missing in this post about the problems we are trying to solve, the symptoms of an entangled, monolithic codebase, how to go about determining domain boundaries, technical strategies for refactoring towards better boundaries, management and cultural techniques and changes that need to be made to support this work, and more.

There’s also so much more potential for the packs ecosystem and for Ruby Packages as a concept in general. At Gusto, our vision is for the packs ecosystem to support gradual modularization so that packs can be built, tested, and even deployed independently, as determined automatically based on boundary improvement progress. For example, at Gusto we use a feature we call “conditional builds” to run only the subset of tests that could possibly fail, based on the system package graph.

Join us!

We’d love for you to try out this ecosystem and provide us feedback or contribute! I’d also love to hear about your and your organization’s approaches to modularization – I’m very supportive of all efforts to improve packaging and modularization capabilities in the Ruby language and Rails framework.

If you’d like to chat more, please reach out to me here or in the Ruby/Rails Modularity Slack Server. I’d be happy to talk!