Hugo build fails without clearing cache

I have a site with several sub-branches, all of them integrated with a GitHub
CI/CD pipeline to auto-build and deploy. The site (with its branches) has been
running for almost 3 years. Recently I started getting a persistent error on
every build. The build can’t complete and gives a similar error every time:

5:12:01 PM: Build ready to start
5:12:03 PM: build-image version: d84c79427e8f83c1ba17bcdd7b3fe38059376b68
5:12:03 PM: build-image tag: v3.6.1
5:12:03 PM: buildbot version: 28b3358b4c94ec0a3a8c81b155535f3f8c2745f1
5:12:03 PM: Fetching cached dependencies
5:12:03 PM: Starting to download cache of 287.0MB
5:12:06 PM: Finished downloading cache in 3.066177938s
5:12:06 PM: Starting to extract cache
5:12:15 PM: Finished extracting cache in 9.140952106s
5:12:15 PM: Finished fetching cache in 12.308016169s
5:12:15 PM: Starting to prepare the repo for build
5:12:16 PM: Preparing Git Reference refs/heads/tech
5:12:20 PM: Error fetching branch: https://github.com/SerhatTeker/serhatteker.com refs/heads/tech
5:12:20 PM: Failing build: Failed to prepare repo
5:12:20 PM: Failed during stage 'preparing repo': exit status 1
5:12:20 PM: Finished processing build request in 17.17616741s

OK, the branch can’t be fetched, but why? Netlify already has access, and the
log doesn’t show any detail.

After Clear Cache and Deploy, the build works and deploys the site.

How can we resolve this issue?

  • My static site builder: Hugo
  • My site: friendly-mcclintock-e11d29.netlify.app
  • Branch/Subdomain : tech

And you can see my netlify.toml below:

[build]
command = "hugo --gc --minify"
publish = "public"

[dev]
  autoLaunch = false

[context.production.environment]
HUGO_VERSION = "0.80.0"
HUGO_ENV = "production"
HUGO_ENABLEGITINFO = "true"

[context.split1]
command = "hugo  --gc --minify --enableGitInfo"

[context.split1.environment]
HUGO_VERSION = "0.80.0"
HUGO_ENV = "production"

[context.deploy-preview]
command = "hugo  --gc --minify --buildFuture -b $DEPLOY_PRIME_URL"

[context.deploy-preview.environment]
HUGO_VERSION = "0.80.0"

[context.branch-deploy]
command = "hugo  --gc --minify"

[context.branch-deploy.environment]
HUGO_VERSION = "0.80.0"

[context.next.environment]
HUGO_ENABLEGITINFO = "true"

Btw, I read the guide below, but none of its options apply to my site: I build with Hugo,
so I don’t have node_modules, package.json, or yarn.lock.

I also can’t clear the cache from netlify.toml. I tried the suggestions below:

Hello there, @serhat :wave: !

Sorry for the delayed response on our end. I have sent this question over to another member of my team so that they can have a look. If you have made any additional progress in the past 11 days, please include that here!

I am hoping to have some next steps for you soon, and thanks again for your patience.

Hi @hillary!

Thank you for getting back to me.

Unfortunately, there has been no progress since then; I’m still looking for a fix or workaround.

Hi, @serhat. I’m seeing the following error occurring in our backend:

Error fetching branch: https://github.com/SerhatTeker/serhatteker.com refs/heads/tech: From https://github.com/SerhatTeker/serhatteker.com
 * branch            tech       -> FETCH_HEAD
Fetching submodule themes/hello-friend
Fetching submodule themes/hello-friend-ng
Warning: Permanently added 'github.com' (RSA) to the list of known hosts.
fatal: remote error: upload-pack: not our ref 1d438b9303c6b16a61f0061e48281bdf32762227
From https://github.com/SerhatTeker/hugo-theme-hello-friend
 * branch            0b3e41ba1bc76ca2bcd13007d30c32f1425bb002 -> FETCH_HEAD
 * branch            768227b8f0113c3beddea6109bc9d94847f5707a -> FETCH_HEAD
 * branch            77c36a2bbb8c5ad2476fb5088faa9dc4ea3af887 -> FETCH_HEAD
 * branch            d397311c45483fd3a7c39dd04ac5c6b646a3e8aa -> FETCH_HEAD
Errors during submodule fetch:
	themes/hello-friend-ng

Note, if that is a private submodule, the following support guide covers how to enable access:

If there are other questions after reading that support guide or if the issue is still not resolved, please let us know.

Hi @luke,

Thank you for your answer.
However, I believe private repo access is not the issue, at least on my side, because:

  1. I already added an SSH Deploy Key long ago, as described in the link you shared.
    Key added date: 13.01.2021. As you can see in the screenshot below, it has already been used.

  2. I can deploy the site without any error after Clear Cache and Deploy.
    If Netlify couldn’t access my private repo, the deploy shouldn’t succeed after Clear Cache and Deploy either, should it?

P.S:
I added a new SSH Deploy Key today (19.02.2021) and deleted the old one, but I still get the same error as above. You can see the new key was used after Clear Cache and Deploy. Screenshot:

I think that the problem Luke quotes is not about repo access, @serhat. It’s a problem in your codebase around submodule references, which you’ll need to fix.

This Stack Overflow article has more details:

Thank you @fool.

It leads to an interesting direction.

The article lists three possible causes:

  1. either the tag in the submodule was never pushed
    Neither the submodule nor the superproject has any tags.
  2. or (since it is working from other sources) XXX does not have access to the submodule - private dependencies
    We saw that Netlify can already access the private repo.
  3. or it has some cached version of that submodule
    Yes, this seems to be our cause. After “Clear cache and deploy” the build works.
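
A "not our ref" error means the remote was asked for a commit it does not have, so one way to investigate is to compare the commit the superproject pins for a submodule with what the submodule's repository actually contains. The sketch below is only an illustration of that check: it builds two throwaway local repositories as hypothetical stand-ins for the site and the theme, and it reuses the SHA from the log line above purely as an example value.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
# Pass a throwaway identity so commits work on a machine without git config.
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

# Hypothetical stand-in for the theme repository (the submodule's remote).
g init -q "$work/theme"
g -C "$work/theme" commit -q --allow-empty -m "theme commit"
known_sha=$(g -C "$work/theme" rev-parse HEAD)

# Hypothetical stand-in for the site: pin the submodule path to the SHA from
# the "not our ref" log line, which the stand-in theme repository never had.
g init -q "$work/site"
g -C "$work/site" update-index --add \
  --cacheinfo 160000,1d438b9303c6b16a61f0061e48281bdf32762227,themes/hello-friend-ng
g -C "$work/site" commit -q -m "pin theme"

# Read the pinned SHA back from the superproject's tree (third field).
pinned_sha=$(g -C "$work/site" ls-tree HEAD themes/hello-friend-ng | awk '{print $3}')

# Report whether a given commit exists in the theme repository.
has_commit() {
  if g -C "$work/theme" cat-file -e "$1^{commit}" 2>/dev/null; then
    echo present
  else
    echo missing
  fi
}

pin_status=$(has_commit "$pinned_sha")
known_status=$(has_commit "$known_sha")
echo "pinned $pinned_sha: $pin_status"
echo "known  $known_sha: $known_status"
rm -rf "$work"
```

In a real checkout, running `git ls-tree HEAD themes/hello-friend-ng` inside the site repository shows the pinned commit. If that commit exists in the theme repository, a stale cached copy of the submodule becomes the more likely explanation.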

I don’t mind using an automatic cache-clearing solution. However, I tried the one below before opening this topic, and it doesn’t clear the cache.

For instance, look at deploy 6037e159e814fd1a435f6edd. Deploy 6037e2390b77810008db2ecd failed after it.

And if I use sudo, it gives the error below:

5:15:06 PM: ────────────────────────────────────────────────────────────────
5:15:06 PM:   "build.command" failed                                        
5:15:06 PM: ────────────────────────────────────────────────────────────────
5:15:06 PM: ​
5:15:06 PM:   Error message
5:15:06 PM:   Command failed with exit code 127: hugo --minify && sudo rm -rf $NETLIFY_CACHE_DIR/*
​
  Error location
  In build.command from netlify.toml:
  hugo --minify && sudo rm -rf $NETLIFY_CACHE_DIR/*
​
  Resolved config
5:15:06 PM:   build:
5:15:06 PM:     command: hugo --minify && sudo rm -rf $NETLIFY_CACHE_DIR/*
    commandOrigin: config
    environment:
      - HUGO_VERSION
    publish: /opt/build/repo/public
Caching artifacts

deploy ID : 601d52bf5610f900ea4b188a
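
Exit code 127 is the status a POSIX shell returns when a command cannot be found, which would be consistent with `sudo` not being on the PATH in the build image (the log itself does not confirm this). A minimal sketch that reproduces the status locally, using the hypothetical name `definitely-not-installed-cmd` in place of `sudo`:

```shell
#!/bin/sh
# `definitely-not-installed-cmd` is a hypothetical name standing in for `sudo`
# on an image that does not ship it.
status=0
sh -c 'true && definitely-not-installed-cmd true' 2>/dev/null || status=$?
echo "exit status: $status"

# Checking availability first avoids the failure; `command -v` is the POSIX test.
if command -v definitely-not-installed-cmd >/dev/null 2>&1; then
  availability="available"
else
  availability="not available"
fi
echo "definitely-not-installed-cmd is $availability"
```

Because the commands are chained with `&&`, the non-zero status from the missing command becomes the status of the whole build command, even though the step before it succeeded.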

Hi, @serhat. For this deploy:

https://app.netlify.com/sites/friendly-mcclintock-e11d29/deploys/6037e159e814fd1a435f6edd

the build command actually used was:

9:41:55 AM: Different build command detected, going to use the one specified in the Netlify configuration file: 'git submodule update --force --init --recursive --depth=1 && hugo --gc --minify' versus 'hugo --gc --minify' in the Netlify UI

So the manual cache clearing wasn’t attempted. Would you be willing to test the rm -rf $NETLIFY_CACHE_DIR/* addition again? I don’t think it actually ran in your previous test.

Hi @luke,

It seems I didn’t add it to the branch/subdomain deploy. I have now added it and made sure the build command runs rm -rf $NETLIFY_CACHE_DIR/*:

https://app.netlify.com/sites/friendly-mcclintock-e11d29/deploys/604e17e75052f602b921e3b7#L91

5:05:22 PM: $ git submodule update --force --init --recursive --depth=1 && hugo --gc --minify && rm -rf $NETLIFY_CACHE_DIR/*
Submodule path 'themes/hello-friend-ng': checked out 'd50d87c311d6969c4ec6714066b6a6ad50aae1e5'

However, it gives permission errors like the ones below:

5:05:22 PM: $ git submodule update --force --init --recursive --depth=1 && hugo --gc --minify && rm -rf $NETLIFY_CACHE_DIR/*
Submodule path 'themes/hello-friend-ng': checked out 'd50d87c311d6969c4ec6714066b6a6ad50aae1e5'
5:05:22 PM: Submodule path 'themes/hugo-coder': checked out '4fea900ddc8e6973bc98b56e8d1ce98e362fcb22'
5:05:22 PM: Submodule path 'themes/hugo-devresume-theme': checked out 'c20034ec0a31ba255dec6b0983ecc1ea8804ef1b'
5:05:22 PM: Submodule path 'themes/hugo-notice': checked out 'ff7aa2a40c2f122ceb502150d77b0c55f4ce7878'
5:05:22 PM: Start building sites …
5:05:23 PM:                    | EN
5:05:23 PM: -------------------+------
5:05:23 PM:   Pages            | 332
5:05:23 PM:   Paginator pages  |  17
5:05:23 PM:   Non-page files   |   1
5:05:23 PM:   Static files     |  59
5:05:23 PM:   Processed images |   0
5:05:23 PM:   Aliases          |   1
5:05:23 PM:   Sitemaps         |   1
5:05:23 PM:   Cleaned          |   0
5:05:23 PM: Total in 962 ms
5:05:23 PM: rm: cannot remove '/bin/pwd': Permission denied
5:05:23 PM: rm: cannot remove '/bin/lsblk': Permission denied
5:05:23 PM: rm: cannot remove '/bin/bzexe': Permission denied
5:05:23 PM: rm: cannot remove '/bin/grep': Permission denied
5:05:23 PM: rm: cannot remove '/bin/zgrep': Permission denied
5:05:23 PM: rm: cannot remove '/bin/findmnt': Permission denied
5:05:23 PM: rm: cannot remove '/bin/dir': Permission denied
5:05:23 PM: rm: cannot remove '/bin/date': Permission denied
...
...

After a very long run of permission errors (about 85K lines), the build failed:

...
...
5:13:52 PM: rm: cannot remove '/var/backups': Permission denied
5:13:52 PM: Caching artifacts
5:13:52 PM: chmod: cannot access '/opt/buildhome/.gimme_cache': No such file or directory
5:13:52 PM: mkdir: cannot create directory ‘/opt/build/cache/node_version’: No such file or directory
5:13:52 PM: mv: cannot stat '/opt/buildhome/.nvm/versions/node/*': No such file or directory
5:13:52 PM: mkdir: cannot create directory ‘/opt/build/cache/ruby_version’: No such file or directory
5:13:52 PM: mv: cannot stat '/opt/buildhome/.rvm/rubies/ruby-2.3.6': No such file or directory
5:13:52 PM: Cached ruby version 2.3.6
5:13:52 PM: Build failed due to a user error: Build script returned non-zero exit code: 2
5:13:52 PM: Failing build: Failed to build site
5:13:52 PM: Finished processing build request in 9m26.699202884s

Hi Serhat! Sorry - that should have worked, but $NETLIFY_CACHE_DIR is somehow not set during your build, so you are removing the root directory - including your just-built site! So even if it had run without errors, your site would not deploy in a useful way.
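
The failure mode described here is general shell behaviour: an unset variable expands to an empty string, so `$NETLIFY_CACHE_DIR/*` becomes `/*`. A minimal sketch of the problem and of one common guard, the `${VAR:?message}` expansion, follows. It uses `echo` in place of `rm -rf` so nothing is deleted, and it is an illustration rather than a recommended Netlify build command.

```shell
#!/bin/sh
# Simulate the situation from the thread: the variable is not set in the build.
unset NETLIFY_CACHE_DIR

# Unguarded: an unset variable expands to nothing, so the path starts at "/".
unguarded_target="$NETLIFY_CACHE_DIR/*"
echo "unguarded rm target: $unguarded_target"

# Guarded: ${VAR:?message} aborts the (sub)shell when VAR is unset or empty,
# so the command that uses it never runs. `echo` stands in for `rm -rf` here.
guard_status=0
( echo "would remove: ${NETLIFY_CACHE_DIR:?is not set}/*" ) 2>/dev/null || guard_status=$?
echo "guarded command exit status: $guard_status"
```

With the guard in place, an unset variable fails the step immediately with a non-zero status instead of letting the deletion walk the whole filesystem.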

We did some testing and this should work better as a build command for you:

git submodule update --force --init --recursive --depth=1 && hugo --gc --minify && rm -rf /opt/build/cache

Let me know how it goes!

Hi Chris!

I see. Thanks a lot for the details and clarification.

I have integrated the new command, rm -rf /opt/build/cache, but unfortunately it didn’t work. There is no permission error this time, but the next build loaded the cache again.

The failed build can be found here:
https://app.netlify.com/sites/friendly-mcclintock-e11d29/deploys/605b6e586e629a0007805cb2

The first build that uses the new rm command can be found here:
https://app.netlify.com/sites/friendly-mcclintock-e11d29/deploys/605b6d4edc2fe000b516bd54#L92

P.S: these two are consecutive builds.

Well, I guess we can stop trying to hack that into place. Sorry we didn’t find a good solution for you.

Our dev team is working on an improvement to our webhooks to allow cache clearing as part of them, which will be your solution once it is shipped. I do not have a firm ETA on this work, but will report back here when it is ready.

Good enough. Thanks a lot for your time and interest.

Till then, best…
