From what I can tell you’re serving page views directly via function calls at runtime, is that correct?
That’s correct. I think it’s Node 20.x hosting a SvelteKit site using a standard SvelteKit adapter. The Netlify UI doesn’t give out much technical information, but it says “Functions level 0”. The site “endpoint” is /.netlify/functions/sveltekit-render
The Mongo access comes from auth.js using a standard MongoDB adapter, which connects to Mongo as a client at startup. I have one “scheduled function”, which uses an identical MongoDB client, but that works fine and never errors. I can’t log the error to MongoDB because it happens while that connexion is being made, but I do have Netlify’s server logs.
Typically Netlify “wake from idle” works fine; I can see the smooth startup in the Netlify server logs there. The issue occurs for me only when upgrading the system.
What Works
Here’s a typical upgrade scenario, showing fault-free behaviour:
That shows my site’s startup preamble with a truncated Mongo connexion string, the hash and date of the build etc. The first startup attempt fails with the Error 128, which is coming from Mongo. The failure looks like it may take a couple of minutes in the log, although the timeout is presumably 30s.
What Doesn’t Work
The user symptom is that the first HTTP request to the Netlify cluster (?) after an upgrade typically returns the 128 error instead of the expected site HTML; there’s a black screen with a Netlify error box in the middle of it. Refresh that page, and all is well again. Upgrades are pulled by Netlify from GitHub on the main branch.
Looking at those Netlify logs, the extract below is typical. The error comes from MongoDB, and that’s something I need to work out with them. However, an error like this could always occur in operational conditions for whatever reason, so I need to understand how to catch it a bit more gracefully if I’m to use this for paying customers.
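To make “catch it more gracefully” concrete, here is a minimal sketch of one possible approach: connect lazily on first use, retry a few times, and always attach a handler so a failed connexion can’t surface as an unhandled promise rejection. The names `connectWithRetry` and `lazyConnection` are hypothetical helpers of mine, not part of auth.js, the MongoDB driver or Netlify, and the retry counts and delays are guesses.

```typescript
// Hypothetical helpers, not part of any library.
type Connect<T> = () => Promise<T>;

// Try `connect` up to `attempts` times, pausing a little longer after each failure.
async function connectWithRetry<T>(
  connect: Connect<T>,
  attempts = 3,
  delayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      console.error(`Connection attempt ${i} of ${attempts} failed`, err);
      if (i < attempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs * i));
      }
    }
  }
  throw lastError;
}

// Create the connection on first use and cache the promise.
// A failure clears the cache, so the next request starts a fresh attempt
// instead of reusing a rejected promise.
function lazyConnection<T>(connect: Connect<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (!cached) {
      cached = connectWithRetry(connect).catch((err) => {
        cached = undefined;
        throw err;
      });
    }
    return cached;
  };
}
```

In the real site, `connect` might be something like `() => new MongoClient(uri, { serverSelectionTimeoutMS: 5000 }).connect()`, with the resulting promise awaited inside a try/catch in the request path so a failure can render a friendly error page. Whether the auth.js adapter accepts a lazily created client like this depends on the adapter version, so treat that wiring as an assumption to check.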
I suppose I could change my deploy so that I push from GitHub and always hit the site a couple of times after deploy; that would get around this issue.
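A rough sketch of that post-deploy warm-up, assuming a script run from the deploy pipeline. `warmUp` and `Probe` are hypothetical names of mine, and the try count and pause are guesses; the probe is passed in so the same function works with Node’s built-in `fetch` or a stub.

```typescript
// Hypothetical warm-up helper: request the site until it answers OK.
type Probe = (url: string) => Promise<{ ok: boolean; status: number }>;

async function warmUp(
  url: string,
  probe: Probe,
  maxTries = 5,
  pauseMs = 2000,
): Promise<number> {
  for (let attempt = 1; attempt <= maxTries; attempt++) {
    try {
      const res = await probe(url);
      if (res.ok) return attempt; // number of requests it took
      console.warn(`Warm-up ${attempt}: HTTP ${res.status}`);
    } catch (err) {
      console.warn(`Warm-up ${attempt} failed`, err);
    }
    if (attempt < maxTries) {
      await new Promise((resolve) => setTimeout(resolve, pauseMs));
    }
  }
  throw new Error(`Site did not respond OK after ${maxTries} tries`);
}

// Possible usage on Node 18+, where fetch is built in (the URL is a placeholder):
// warmUp("https://your-site.example/", (u) => fetch(u));
```

This only hides the cold-start failure from the first real visitor; it doesn’t explain why the first connexion after a deploy times out.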
My Mongo connexion is failing, and that looks like it’s causing the two subsequent errors below. A simple site refresh and Mongo is all happy again, so the site starts and continues to work. I’m trying to work out if there’s anything I can do to make that a bit smoother, especially for the one user who will get the black screen error message thing.
Apr 23, 03:22:27 PM: f526cad1 ERROR Unhandled Promise Rejection {"errorType":"Runtime.UnhandledPromiseRejection",
"errorMessage":"MongoServerSelectionError: Server selection timed out after 30000 ms",
"reason":{"errorType":"MongoServerSelectionError",
"errorMessage":"Server selection timed out after 30000 ms",
"reason":{"type":"ReplicaSetNoPrimary","servers"
...
at process.<anonymous> (file:///var/runtime/index.mjs:1276:17)","
at process.emit (node:events:518:28)","
at emit (node:internal/process/promises:150:20)","
at processPromiseRejections (node:internal/process/promises:284:27)","
at processTicksAndRejections (node:internal/process/task_queues:96:32)","
at runNextTicks (node:internal/process/task_queues:64:3)","
at listOnTimeout (node:internal/timers:540:9)","
at process.processTimers (node:internal/timers:514:7)"]}
Apr 23, 03:22:27 PM: f526cad1 ERROR [ERROR] [1713882147748] LAMBDA_RUNTIME Failed to post handler success response. Http response code: 400.
Apr 23, 03:22:27 PM: f526cad1 ERROR RequestId: f526cad1-6302-47a5-b765-c2208f7be8fa Error: Runtime exited with error: exit status 128
Apr 23, 03:22:27 PM: f526cad1 ERROR Runtime.ExitError