Hello
I am processing Paddle webhooks with a Netlify function and see silent failures with a 503 status code.
The function validates the webhook, saves the purchase to the DB, and sends an email to the customer, in that order. If any step fails, the function returns Response(500) and the webhook is retried.
In the last week, the webhook started to fail often with 503 errors according to the Paddle logs (I never return that code myself). The customer receives an email for every attempt, and the Netlify function logs show only the successful attempts.
It looks like on a cold start Netlify responds with 503 but still processes the request, without logging anything. The second webhook attempt is usually successful, but for some requests I see up to 4 failures in a row.
This problem did not exist before Jul 6, 2024.
Any advice on how to fix it or how to investigate it further is greatly appreciated.
Yup, I am seeing 503 responses with no response text in the Paddle webhook logs for every failed attempt. There are usually 2 attempts per request, but sometimes 3 or 4.
Paddle might treat a request as failed if the endpoint does not respond within 5 seconds, but the documentation does not specify how that is recorded in the logs. It retries after 60 s, using exponential backoff with a 1.1 multiplier.
According to the logs I see in Netlify, my code finishes in well under 2 seconds. Is it possible that the cold start and network take more than 3 s?
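To make the retry timing mentioned above concrete, here is a rough sketch of the schedule. It assumes the delay before retry n is 60 s multiplied by 1.1^(n - 1); that reading of "exponential backoff with a 1.1 multiplier" is an assumption, and the function names are hypothetical.

```typescript
// Delay in seconds before retry number `attempt` (1-based).
// ASSUMPTION: delay = initial * multiplier^(attempt - 1); the real schedule may differ.
function retryDelaySeconds(attempt: number, initial = 60, multiplier = 1.1): number {
  return initial * Math.pow(multiplier, attempt - 1);
}

// Delays for the first `retries` retries, in order.
function retrySchedule(retries: number): number[] {
  return Array.from({ length: retries }, (_, i) => retryDelaySeconds(i + 1));
}
```

Under that assumption, the first four retries would come roughly 60, 66, 73, and 80 seconds apart, so a request with 4 failed attempts in a row stays unresolved for several minutes.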
3s of cold start is unlikely - or at least not something I’ve seen before, even for the heaviest websites. With that being said, 5 seconds of wait time is fairly low. Does increasing the wait time on Paddle solve the issue?
Do you have any error logs from Paddle? Specifically the response we served or at least the response headers? Finding an error masked by some other service as a different error is difficult.
The error logs there are pretty minimalistic. They basically show the number of attempts and the error code. No headers, no body.
Anyway, I solved my problem by recording the successful attempt in the DB and checking if there was one before sending the email to customers. I still get the errors; they just matter less.
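For anyone hitting the same thing, the workaround described above could look roughly like the sketch below. It is not the actual code: the `Deps` interface, `handlePurchaseEvent`, and the in-memory stand-ins are all hypothetical names, and the real version would talk to the database and the email service instead.

```typescript
// Hypothetical dependencies, injected so the sketch stays self-contained.
interface Deps {
  wasProcessed(eventId: string): Promise<boolean>; // was there a successful attempt before?
  savePurchase(eventId: string): Promise<void>;    // assumed safe to repeat (e.g. an upsert)
  sendEmail(eventId: string): Promise<void>;
  markProcessed(eventId: string): Promise<void>;   // record the successful attempt
}

// Returns the HTTP status the function would respond with.
async function handlePurchaseEvent(eventId: string, deps: Deps): Promise<number> {
  try {
    // A retry of an event that already succeeded: acknowledge it and do nothing else.
    if (await deps.wasProcessed(eventId)) return 200;

    await deps.savePurchase(eventId);
    await deps.sendEmail(eventId);
    await deps.markProcessed(eventId);
    return 200;
  } catch {
    // Any failed step: respond with 500 so the webhook is retried.
    return 500;
  }
}

// In-memory stand-ins for illustration only; they do not persist between invocations.
function createInMemoryDeps() {
  const processed = new Set<string>();
  const sentEmails: string[] = [];
  const deps: Deps = {
    wasProcessed: async (id) => processed.has(id),
    savePurchase: async () => {},
    sendEmail: async (id) => { sentEmails.push(id); },
    markProcessed: async (id) => { processed.add(id); },
  };
  return { deps, sentEmails };
}
```

With this shape, a retried webhook for an event that already went through returns 200 without sending a second email. Two attempts arriving at the same moment could still both pass the check, but with retries spaced about a minute apart that seems unlikely here.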