Sporadic deployment failures for Large Media site

Hi,

Our Netlify Large Media-enabled site (afs-media.netlify.app) fails deployment about once every two weeks with the following error:

6:09:12 PM: ────────────────────────────────────────────────────────────────
6:09:12 PM:   Internal error during "Deploy site"                           
6:09:12 PM: ────────────────────────────────────────────────────────────────
6:09:12 PM: ​
6:09:12 PM:   Error message
6:09:12 PM:   Error: Deploy did not succeed: Failed to execute deploy: [PUT /sites/{site_id}/deploys/{deploy_id}][500] updateSiteDeploy default  &{Code:0 Message:}
6:09:12 PM: ​
6:09:12 PM:   Error location
6:09:12 PM:   During Deploy site
6:09:12 PM:       at handleDeployError (file:///opt/buildhome/node-deps/node_modules/@netlify/build/src/plugins_core/deploy/buildbot_client.js:87:18)
6:09:12 PM:       at deploySiteWithBuildbotClient (file:///opt/buildhome/node-deps/node_modules/@netlify/build/src/plugins_core/deploy/buildbot_client.js:68:12)
6:09:12 PM:       at processTicksAndRejections (node:internal/process/task_queues:96:5)
6:09:12 PM:       at async coreStep (file:///opt/buildhome/node-deps/node_modules/@netlify/build/src/plugins_core/deploy/index.js:45:5)
6:09:12 PM:       at async fireCoreStep (file:///opt/buildhome/node-deps/node_modules/@netlify/build/src/steps/core_step.js:39:9)
6:09:12 PM:       at async tFireStep (file:///opt/buildhome/node-deps/node_modules/@netlify/build/src/time/main.js:20:59)
6:09:12 PM:       at async runStep (file:///opt/buildhome/node-deps/node_modules/@netlify/build/src/steps/run_step.js:88:7)
6:09:12 PM:       at async pReduce.index (file:///opt/buildhome/node-deps/node_modules/@netlify/build/src/steps/run_steps.js:91:11)
6:09:12 PM:       at async runSteps (file:///opt/buildhome/node-deps/node_modules/@netlify/build/src/steps/run_steps.js:51:7)
6:09:12 PM:       at async runBuild (file:///opt/buildhome/node-deps/node_modules/@netlify/build/src/core/main.js:610:7)
6:09:12 PM: ​
6:09:12 PM:   Resolved config
6:09:12 PM:   build:
6:09:12 PM:     base: /opt/build/repo/media
6:09:12 PM:     command: npm run build-media
6:09:12 PM:     commandOrigin: ui
6:09:12 PM:     environment:
6:09:12 PM:       - INCOMING_HOOK_BODY
6:09:12 PM:       - INCOMING_HOOK_TITLE
6:09:12 PM:       - INCOMING_HOOK_URL
6:09:12 PM:       - NETLIFY_LFS_ORIGIN_URL
6:09:12 PM:       - ONEGRAPH_AUTHLIFY_TOKEN
6:09:12 PM:     publish: /opt/build/repo/media/dist
6:09:12 PM:     publishOrigin: ui

The site consists only of image files, but there are a lot of them (~9000 / 2GB). While the site is connected to a GitHub repo, we keep builds deactivated until we need one: we activate builds, call a build hook, then deactivate them again. This avoids needlessly triggering builds when no images are affected. Most of the time this works perfectly, but every so often this error occurs.

Is there something we can do to avoid this error? Any help would be much appreciated!

Hi, @gsjen123. I wish I could determine the root cause, but I cannot. Our developers will need to investigate to find it.

I’ve filed an issue to track this and we will follow up here to let you know if the issue is resolved. In the meantime, triggering a new deploy is the only workaround. While I agree this workaround is far from ideal, we don’t count build minutes for failed deploys, so it won’t impact your build minutes usage.
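If it helps, the "trigger a new deploy" workaround could be automated with a small retry loop. The following is only a minimal sketch under stated assumptions: `trigger_build` and `wait_for_result` are hypothetical callables you would supply yourself (for example, one that calls your build hook and one that checks how the deploy ended), and the sketch assumes `wait_for_result` returns the string `"ready"` on success. None of these names come from Netlify's tooling.

```python
import time
from typing import Callable


def deploy_with_retry(
    trigger_build: Callable[[], None],    # hypothetical: starts a build, e.g. via a build hook
    wait_for_result: Callable[[], str],   # hypothetical: blocks, then returns the final deploy state
    max_attempts: int = 3,
    backoff_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Trigger a build and re-trigger it if the deploy does not succeed.

    Returns the number of attempts used.
    Raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        trigger_build()
        state = wait_for_result()
        # Assumption: "ready" marks a successful deploy; anything else is a failure.
        if state == "ready":
            return attempt
        # Wait a little longer after each failed attempt before retrying.
        if attempt < max_attempts:
            sleep(backoff_seconds * attempt)
    raise RuntimeError(f"deploy still failing after {max_attempts} attempts")
```

Keeping the two callables as parameters means the retry logic can be tried out with stubs before wiring it to a real site, and the attempt cap keeps a persistent failure from retrying forever.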

I do see this happening more frequently for sites with many files (as you mentioned, this site has over 9000). However, it also happens for sites with fewer than 50 files (but far less often).

I do think the error itself is happening randomly and isn’t specific to what your site does. For example, we see this error for sites not using Large Media, so Large Media isn’t causing it.

Larger sites tend to spend longer uploading (more files take longer to upload), and I believe this is why the error is seen more frequently for larger sites. It is a random error, so the longer a deploy spends in this stage, the more likely the error is to occur.
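To illustrate that reasoning, here is a rough sketch that assumes the error strikes at a constant rate per second of upload time. That constant-rate model and the numbers used with it are assumptions for illustration only, not measurements from our systems.

```python
import math


def failure_probability(upload_seconds: float, failures_per_second: float) -> float:
    """Chance of at least one failure during the upload.

    Assumes failures occur at a constant rate, independent of the site itself.
    """
    return 1.0 - math.exp(-failures_per_second * upload_seconds)


# Illustrative, made-up rate: one failure per 10,000 seconds of uploading.
rate = 1e-4
small_site = failure_probability(30, rate)    # short upload
large_site = failure_probability(600, rate)   # long upload
print(f"small site: {small_site:.2%}, large site: {large_site:.2%}")
```

With this made-up rate, a 30-second upload fails roughly 0.3% of the time and a 600-second upload roughly 5.8% of the time, which matches the pattern of larger sites seeing the error more often without anything being wrong with the site itself.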

Again, I don’t have a fix today but I wanted to let you know it is being tracked now. If there are other questions or concerns, please let us know.