Roy Lindauer

Execute Workflows with Path Filtering in CircleCI

As I often love to tell people, I love monorepos, and I use one for my own projects. But there is no real out-of-the-box solution for monorepos, so you often end up writing scripts to glue everything together. Sometimes it really does feel like wadding up a bunch of projects into a loose ball, then duct-taping and hot-gluing them together into a much bigger ball. Over time, though, the glue gets stronger, the tape gets more pleasing to the eye, and the techniques become more sophisticated.

Some of the glue required for a monorepo is around building only the projects that have changes. As the monorepo grows, you are going to run into issues where every minor change builds, tests, and deploys every single project. This can become quite costly in both money and time. My solution is to use CircleCI "pipeline parameters" and path filtering to selectively run workflows. This way we only run builds for projects that have changes.

The basic life cycle of a build in my monorepo looks like this:

  1. A commit to the GitHub repository triggers a build on CircleCI
  2. The CircleCI build runs a single workflow that calculates which pipeline parameters to enable based on path filtering, then makes an API call to CircleCI to create a new build
  3. The new build runs only the workflows enabled by the pipeline parameters
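The API call in step 2 could be sketched roughly as follows. This is not the script from the demo project. It assumes CircleCI's v2 "trigger a new pipeline" endpoint and a personal API token sent in the Circle-Token header; the method names and the project slug, branch, and token arguments are hypothetical placeholders.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build (but do not send) the request that asks CircleCI for a new pipeline.
# project_slug, branch, and token are placeholders supplied by the caller.
def build_pipeline_request(project_slug:, branch:, parameters:, token:)
  uri = URI("https://circleci.com/api/v2/project/#{project_slug}/pipeline")
  request = Net::HTTP::Post.new(uri)
  request['Circle-Token'] = token
  request['Content-Type'] = 'application/json'
  # The parameters hash carries the pipeline parameters to enable.
  request.body = JSON.generate(branch: branch, parameters: parameters)
  request
end

# Send a request built above and return the HTTP response.
def trigger_pipeline(request)
  Net::HTTP.start(request.uri.host, request.uri.port, use_ssl: true) do |http|
    http.request(request)
  end
end
```

Keeping the request construction separate from the sending makes it easy to inspect the payload without touching the network.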

This is very specific to my monorepo, so it does not handle forked pull requests, but you could easily update the script to support those builds as well.

I've created a demo project on GitHub and on CircleCI.

The glue of all of this is path filtering. To get a list of files that have changed, we have to determine our base revision and then perform a git diff between the current commit and that base revision. The logic I have settled on is this: if the current branch is a main branch (develop, staging, or prod in my monorepo), set the base revision to the commit of the last successful build for this branch. If it's not a main branch, try to find the first commit that is shared with other branches, or the parent commit of the current branch. If that cannot be identified, default the base revision to HEAD~1. We can override this entirely by setting the ENV var BASE_REVISION to the exact base revision we want to use. With a base revision in hand, we perform a git diff to get a list of paths, then iterate over the paths to match them against our path filtering map. That gives us the list of pipeline parameters to enable, which we use to make an API call to CircleCI to create a new build.
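A minimal sketch of that fallback order, not the actual script from the demo project. The method and argument names are hypothetical, and the caller is assumed to have already looked up the last successful build's commit and the shared ancestor commit. Falling back to HEAD~1 for a main branch with no successful build yet is also my assumption.

```ruby
MAIN_BRANCHES = %w[develop staging prod].freeze

# Pick the base revision following the fallback order described above.
# last_successful_sha and merge_base_sha are resolved by the caller and
# may be nil when they cannot be identified.
def base_revision(branch:, last_successful_sha: nil, merge_base_sha: nil, env: ENV)
  # An explicit BASE_REVISION always wins.
  override = env['BASE_REVISION']
  return override unless override.nil? || override.empty?

  candidate = MAIN_BRANCHES.include?(branch) ? last_successful_sha : merge_base_sha
  candidate || 'HEAD~1'
end

# List the files changed between the base revision and the current commit.
def changed_paths(base)
  IO.popen(['git', 'diff', '--name-only', base, 'HEAD'], &:read).split("\n")
end
```

Passing env as an argument keeps the override testable without touching the real environment.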

The map is pretty simple. Each key is a path pattern we want to match, and each value is a hash of the pipeline parameter and the value to set for it.

@path_filter_mapping ||= {
  '.circleci/.*' => { 'run-ops-workflow' => true },
  'bin/.*' => { 'run-ops-workflow' => true },
  'build/.*' => { 'run-ops-workflow' => true },
  'infrastructure/.*' => { 'run-infrastructure-workflow' => true },
  'src/project1/.*' => { 'run-project1-workflow' => true },
  'src/project2/.*' => { 'run-project2-workflow' => true }
}
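Matching the changed paths against a map like this might look something like the sketch below. The method name is hypothetical, and anchoring each pattern to the whole path is my assumption rather than something the demo scripts are known to do.

```ruby
# Return the merged pipeline parameters for every mapping entry whose
# pattern matches at least one changed path. Each pattern is treated as a
# regular expression anchored at both ends of the path.
def pipeline_parameters(paths, mapping)
  mapping.each_with_object({}) do |(pattern, params), enabled|
    regex = /\A#{pattern}\z/
    enabled.merge!(params) if paths.any? { |path| regex.match?(path) }
  end
end
```

Called with the output of the git diff and the mapping, this returns a hash that can be sent as the parameters of the new build. A change that matches no pattern yields an empty hash, so no project workflows are enabled.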

There is more glue that we need to add to our wadded-up monorepo ball. CircleCI has a feature called "auto-cancel redundant builds". You will need to disable this feature and use a custom script to take over that responsibility. The reason is that CircleCI cancels redundant builds for the entire branch, when what we want is to cancel only redundant workflows. We may want 5 builds happening for prod, but each build is for a different project.
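The selection step of such a custom script could be sketched like this. It is only an illustration: the method name and the hash shape (:id, :name, :branch, :created_at) are hypothetical, and fetching the running workflows and cancelling them through the CircleCI API is left to the caller.

```ruby
# Given workflows as hashes with :id, :name, :branch, and :created_at,
# return the ids of every workflow that has a newer sibling with the same
# name on the same branch. Those are the candidates for cancellation.
def redundant_workflow_ids(workflows)
  workflows
    .group_by { |w| [w[:branch], w[:name]] }
    .values
    .flat_map { |group| group.sort_by { |w| w[:created_at] }[0...-1] }
    .map { |w| w[:id] }
end
```

Grouping by branch and workflow name is what lets several prod builds for different projects run side by side while still cancelling a stale run of the same project.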

Definitely check out the demo project on GitHub and on CircleCI to see each of the scripts and the CircleCI configuration.

Example:

I committed a change to the main branch, to a file under bin. The ci workflow determined that the only workflow we needed to run was ops.

I then made a change to two projects. The ci workflow determined that we needed to run the project1 and project2 workflows.

The build failed, so I fixed the issue. Because I committed to the main branch, the ci workflow set the base revision to the commit of the last successful build for the main branch, which was the build prior to the one that failed. It therefore determined that it needed to run three workflows: ops, project1, and project2.