Monorepo and long builds

We recently converted to a monorepo (lerna). The root is our application, and there are a couple of other projects in a packages directory. Both are component libraries that are used by the root project but are also published to npm for use by other projects. Everything is working, but this has almost doubled our build time. The libraries' code changes fairly infrequently, so it would be nice to cache the builds/deps of the projects in the packages directory. It’s not clear to me from the documentation on the new monorepo support whether there is anything I can do. Separate toml files with build.ignore? That doesn’t sound like quite what I need. Anyone have ideas?

Hi, @alisman, does your site build have a way of detecting the cached libraries?

If so, there is a cache directory which is saved at the end of each build, and Netlify restores some of its contents automatically at the start of the next one. If you store anything there that is not among the directories Netlify caches and restores automatically, you will need to save and restore that directory (or directories) manually.

The repository for a site is cloned to /opt/build/repo. The cache directory is found at /opt/build/cache (or ../cache relative to the base of the repo).

I’m going to take an example site build which uses the following build command:

yarn run build

And, in this example, let’s pretend that this build process creates something we want to cache in a directory named example (found in the base directory of the cloned repo)

Then your new build command might become:

cp -R ../cache/example . ; yarn run build ; cp -R example ../cache/

Breaking that down:

  • restore cache => cp -R ../cache/example .
    • this step will output an error in the build log if the directory doesn’t exist (because, for example, the build cache was cleared), but the build will continue even if this copy fails
  • the original build command => yarn run build
  • save the directory to the cache directory for next time => cp -R example ../cache/
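One thing to keep in mind with the `;` form: the exit status of the whole line is that of the last `cp`, so a failed build followed by a successful copy would still report success. A minimal sketch of the same three steps that returns the build's own status instead is shown here. `CACHE_DIR`, `BUILD_CMD`, and `run_cached_build` are hypothetical names used for illustration; the defaults match the example above.

```shell
#!/bin/bash
# Sketch only: restore, build, save, while preserving the build's exit status.
# CACHE_DIR and BUILD_CMD are hypothetical knobs, not Netlify settings.
CACHE_DIR="${CACHE_DIR:-../cache}"
BUILD_CMD="${BUILD_CMD:-yarn run build}"

run_cached_build() {
  # restore the cached directory; a missing cache is not treated as an error
  if [ -d "$CACHE_DIR/example" ]; then
    cp -R "$CACHE_DIR/example" .
  fi

  # run the build (left unquoted on purpose so "yarn run build" splits into words)
  $BUILD_CMD
  local status=$?

  # save to the cache only if the build succeeded and produced the directory
  if [ "$status" -eq 0 ] && [ -d example ]; then
    mkdir -p "$CACHE_DIR"
    cp -R example "$CACHE_DIR/"
  fi

  return "$status"
}
```

To use it as a build script, you would end the file with `run_cached_build` followed by `exit $?`, so that a failing build is still reported as a failure.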

You might even put all of this (and more) in a bash script and make that script your Netlify build command instead. The script below checks for the presence of the directory and creates it if it is missing.

#!/bin/bash

# create the cached directory if it is missing (-p also creates ../cache if needed)
mkdir -p ../cache/example

# restore our "example" directory
cp -R ../cache/example .

# build
yarn run build

# save the "example" directory
cp -R example ../cache/

You can then check this script into your repo and track it in git with the rest of your code.

Regarding the build.ignore option, it can be used to detect if something has changed and therefore if there should be a build or not. You can use a custom command here to do this check and the build will be cancelled if this check returns a zero exit code. (If there is a non-zero exit there is a difference and therefore the build should be continued - not cancelled. Zero means no difference and we cancel on that.)

How you do this check is up to you. I believe our default check is this:

git diff --quiet HEAD^ HEAD

However, if you only care if directory example-2 has changed, then make the build.ignore something like so:

git diff --quiet HEAD^ HEAD example-2/
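To make the exit-code convention concrete, here is a small sketch that wraps the same check in a function accepting any number of paths. The function name `paths_unchanged` is made up for this example; only `git diff --quiet` itself comes from the commands above.

```shell
#!/bin/bash
# Sketch of a build.ignore-style check for one or more paths.
# Exit status 0    -> nothing changed in those paths -> the build is cancelled.
# Non-zero status  -> something changed (or git itself failed) -> the build continues.
paths_unchanged() {
  git diff --quiet HEAD^ HEAD -- "$@"
}
```

For example, `paths_unchanged example-2/` behaves like the command above, and `paths_unchanged example-2/ packages/` would cancel the build only when neither path changed in the last commit.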

So, you might use the cache directory, the build.ignore, or both to prevent unwanted build usage. If more than one directory needs to be cached, then adjust the commands to accomplish this (or add additional ones).
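If several directories need caching, one way to adjust the commands is to loop over a list. This is only a sketch under stated assumptions: `CACHED_DIRS`, `CACHE_DIR`, `restore_dirs`, and `save_dirs` are hypothetical names, and the list is assumed to contain top-level directory names without spaces.

```shell
#!/bin/bash
# Sketch only: restore and save several top-level directories.
# CACHE_DIR and CACHED_DIRS are hypothetical; adjust them for your repo.
CACHE_DIR="${CACHE_DIR:-../cache}"
CACHED_DIRS="${CACHED_DIRS:-example example-2}"

restore_dirs() {
  local dir
  for dir in $CACHED_DIRS; do
    # skip quietly when a directory has not been cached yet
    if [ -d "$CACHE_DIR/$dir" ]; then
      cp -R "$CACHE_DIR/$dir" .
    fi
  done
}

save_dirs() {
  local dir
  mkdir -p "$CACHE_DIR"
  for dir in $CACHED_DIRS; do
    # only save directories the build actually produced
    if [ -d "$dir" ]; then
      cp -R "$dir" "$CACHE_DIR/"
    fi
  done
}
```

A build script would then call `restore_dirs`, run `yarn run build`, and call `save_dirs` afterwards. Nested paths (a directory inside packages, say) would need extra handling, because `cp -R` places each directory directly under the cache directory by its base name.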

Please let us know if there are questions about either method and we’ll be happy to go into more detail about it.

@alisman, I want to mention that I made some edits above which corrected a small but important error. The build is cancelled on a zero exit from the build.ignore, not on a non-zero exit. Non-zero means there is a difference, so the build should continue, not cancel.

I had mistakenly gotten it backwards before. I added clarification about this in the post above as well.

Last, but not least, welcome to our Netlify community site as well. :+1:

Hey Luke, thanks so much for your super helpful response.


Hi Luke,

I wrote the following script, which works locally but not on Netlify. On Netlify, I do get the “cache primed” log, so I think the artifact is being stored in the cache folder, but when I rerun the build, the same check, [ -d "$CACHE_DIR/vendor_$vendorKey" ], fails to find the artifact. It’s as if the cache is getting cleared.

Could you link us to a deploy where the cache appears to get cleared, please?

Here is the evidence for site (Netlify App)

You can see the script here: Cachbuild by alisman · Pull Request #3013 · cBioPortal/cbioportal-frontend · GitHub

Evidence:

Log from build 1:
5:34:11 PM: cache primed for /opt/build/cache/packages-root/vendor_78ddbcaa7a3beeabe73a39f11ea0c3da

Log from build 2, where I expect to find the cache primed (this runs the same command as was used above for the cache primed log):
5:44 no cache detected for /opt/build/cache/packages-root/vendor_78ddbcaa7a3beeabe73a39f11ea0c3da

That deploy you linked definitely had a cache used, as you can see at the top of the build logs:

2:30:51 PM: Starting to download cache of 627.9MB

I should have asked for you to link the second one too since you build so much and I can’t easily find it - hopefully you can and share it?

1st one, where I log that session is primed

2nd one, where I expect the cache to be primed but it is not: Netlify App

Note that in my script I run the SAME test both after priming and before building the module. The test resolves true after I prime the cache, but not when the next build starts. That’s why it seems to me that the cache is getting cleared.

if [ -d "$CACHE_DIR/vendor_$vendorKey" ];

And thanks very much for your help!

Hmm…second one shows we started with a cache as well:

2:41:37 PM: Starting to download cache of 628.1MB

Couldn’t really say what your code is doing exactly; have you tried running locally to try to understand better?
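One way to narrow this down, locally or in a build, might be to print what is actually in the cache directory at the very start and very end of the script and compare the two build logs. The helper below is a hypothetical sketch, not part of the script in the pull request; `report_cache` is a made-up name and the `CACHE_DIR` default is taken from the path mentioned earlier in this thread.

```shell
#!/bin/bash
# Debugging sketch: log the contents of the cache directory with a label.
# CACHE_DIR defaults to the path mentioned earlier in this thread.
CACHE_DIR="${CACHE_DIR:-/opt/build/cache}"

report_cache() {
  local label="$1"
  if [ -d "$CACHE_DIR" ]; then
    echo "[$label] contents of $CACHE_DIR:"
    ls -la "$CACHE_DIR"
  else
    echo "[$label] $CACHE_DIR does not exist"
  fi
}
```

Calling `report_cache "start"` before the cache check and `report_cache "end"` after priming would show whether the directory is missing at the start of the next build or whether the key in its name simply differs.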

GitHub - netlify/build-image: This is the build image used for running automated builds (this specifically describes how to save and use a cache)