NextJS middleware strange errors in the edge functions log

Site id: ba2febd5-460e-4cc1-9476-36a9938140ce

Hi!

I’ve noticed some issues with certain crawlers being blocked on a few of my sites using tools such as ahrefs or https://validator.schema.org/. Rebuilding the site seems to have solved the issue, however, I got another site (6bde3e56-8b56-4747-b487-d346311e0d97) that I haven’t rebuilt that is still experiencing the issue.

Example log output from ba2febd5-460e-4cc1-9476-36a9938140ce:

Jan 15, 02:51:10 PM: 01JHN3PY error  SyntaxError: Unexpected end of JSON inputJan 15, 02:51:10 PM: 01JHN3PY error      at parse (<anonymous>)Jan 15, 02:51:10 PM: 01JHN3PY error      at packageData (ext:deno_fetch/22_body.js:394:14)Jan 15, 02:51:10 PM: 01JHN3PY error      at consumeBody (ext:deno_fetch/22_body.js:260:12)Jan 15, 02:51:10 PM: 01JHN3PY error      at eventLoopTick (ext:core/01_core.js:175:7)Jan 15, 02:51:10 PM: 01JHN3PY error      at async Object.<anonymous> (file:///root/.netlify/edge-functions/___netlify-edge-handler-src-middleware/server/src/middleware.js:10920:22)Jan 15, 02:51:10 PM: 01JHN3PY error      at async Object.<anonymous> (file:///root/.netlify/edge-functions/___netlify-edge-handler-src-middleware/server/src/middleware.js:10738:35)Jan 15, 02:51:10 PM: 01JHN3PY error      at async adapter (file:///root/.netlify/edge-functions/___netlify-edge-handler-src-middleware/server/src/middleware.js:2161:20)Jan 15, 02:51:10 PM: 01JHN3PY error      at async handleMiddleware (file:///root/.netlify/edge-functions/___netlify-edge-handler-src-middleware/edge-runtime/middleware.ts:58:20)Jan 15, 02:51:10 PM: 01JHN3PY error      at async FunctionChain.runFunction (file:///root/src/bootstrap/function_chain.ts:489:22)Jan 15, 02:51:10 PM: 01JHN3PY error      at async FunctionChain.run (file:///root/src/bootstrap/function_chain.ts:359:20)

Is there any way for you to tell the cause of the error? The mw is responsible of redirects, that’s all that it does. It doesn’t consider any user agents, it fetches the redirects from the database and matches it to the current path and performs a redirect if necessary. We’ve made no changes to this functionality within the last couple of months.

The crawlers seemingly being blocked might just be a coincidence, however it started to work again after having rebuilt the site and the continuous log output stopped.

Thanks in advance. Let me know if you need me to elaborate on anything.

To add to this, this issue seems to go away after having rebuilt the site. Then it comes back after a while. This morning after having waited for 16 hours, the issue remains.

Trying the ahrefs audit tool here: What is AhrefsSiteAudit | Learn about Ahrefs' Web Crawler - it returns a 403 for the robots.txt file.

Using robots.txt Validator and Testing Tool | TechnicalSEO.com with Ahrefs set as the user agent, it should be allowed.

Improving my mw and changing the cache headers seems to have solved the issue. Regarding the 403s, looks like this is the norm at least when using the ahrefs tool.

Strange though how this hasn’t been an issue for ~2 years.