I have a Next.js site deployed at qbcentre.netlify.app. It uses the custom domain qbcentre.org.uk and the Essential Next.js plugin is active.
The ___netlify-handler function has had over 116,000 requests in the past two weeks. The requests are a mix of POST requests to ‘/autodiscover/autodiscover.xml’, accompanied by GET requests for URLs from an old website with a different domain (houseofillustration.org.uk) that don’t exist in the Next.js site, e.g. ‘/whats-on/past-exhibitions/start-date/21-11-2018/end-date/21-11-2019/start-date/05-05-2018/end-date/05-05-2019/event-type/exhibition’.
I’ve tried deploying the site without the custom domain and it doesn’t get any of the POST/GET requests to
I don’t know how these requests are being generated (search crawlers?) but would like to stop or limit them if possible. The requests come in a steady stream of a few per minute, 24 hours a day. Has anyone had a similar experience? Is there anything I can change in Netlify DNS or Google Search Console?
Hiya @humour and sorry to hear about the trouble! Our team looked into this and there are a few things going on:
/autodiscover/autodiscover.xml is a service discovery path used by clients such as Skype, Microsoft Outlook, and more. This article describes it in some detail: “How Does Autodiscover Work?” (AC Brown's IT World). For your purposes, you should either:
- deploy the intended content so that Exchange/Outlook will work right for your domains (this is probably not something you use or need, but I don’t know your network/services)
- or, deploy a blank page at that path, which will be served in preference to your functions, saving you and us money.
That change shouldn’t impact any of the rest of your site and is a typical pattern that a large percentage of our customers see for their custom domains.
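For the blank-page option, here is a minimal sketch of a Node script that creates the empty file. It assumes the site serves static files from a directory such as Next.js's `public/`. The function name `createBlankPage` and its parameters are made up for illustration and are not part of any Netlify or Next.js API.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Creates an empty static file for a URL path inside the site's static directory.
// `staticDir` and `urlPath` are illustrative parameters (assumptions), not a real API.
function createBlankPage(staticDir: string, urlPath: string): string {
  // Drop leading slashes so the URL path resolves inside staticDir.
  const relative = urlPath.replace(/^\/+/, "");
  const target = path.join(staticDir, relative);

  // Make sure the intermediate folder (e.g. "autodiscover") exists.
  fs.mkdirSync(path.dirname(target), { recursive: true });

  // Write a zero-byte file; its content is intentionally blank.
  fs.writeFileSync(target, "");
  return target;
}
```

Calling `createBlankPage("public", "/autodiscover/autodiscover.xml")` before a build would leave an empty file at `public/autodiscover/autodiscover.xml`. Committing an empty file at that location by hand achieves the same thing.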
Next, for the requests targeting houseofillustration.org.uk, if they aren’t supposed to get that content, I suppose you should configure things differently. We did a quick scan and the top requesters are all crawlers, and not abusive ones:
- Ahrefs bot
- and Bing
Together these made up well over 95% of the requests to that domain in the past 3 hours, and there were thousands of requests over that timespan.
I don’t know what your intentions with that domain are, but to prevent unnecessary function runs, you have a few options:
- perhaps that should be a separate site, so that all requests go to a codebase intended to handle them? That codebase could have no functions, or at least serve reasonable content for bad requests that doesn’t use a function invocation?
- assuming you do need this site to handle these requests for some reason, you could use a domain level redirect to point the traffic to a static page or something else that doesn’t run unwanted functions
- you could also consider using robots.txt to block access to bots - all 3 of those bots should respect it - but since robots.txt applies to all domains on a site, you’d certainly want to move that name to a different site before you did that so you don’t impact indexing on your primary domain.
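As a rough illustration of the robots.txt option, the sketch below generates rules asking the two crawlers named above to stay away from the whole site. The helper `buildRobotsTxt` is hypothetical. `AhrefsBot` and `bingbot` are the user-agent tokens those crawlers publish, and the output is assumed to be saved as `robots.txt` at the root of the separate site that holds the old domain.

```typescript
// Builds robots.txt content that disallows the whole site for each named crawler.
// `buildRobotsTxt` is a hypothetical helper written for this example.
function buildRobotsTxt(userAgents: string[]): string {
  return userAgents
    .map((agent) => `User-agent: ${agent}\nDisallow: /\n`)
    .join("\n"); // blank line between groups
}

// Assumption: only the crawlers named in this thread are listed.
const robotsTxt = buildRobotsTxt(["AhrefsBot", "bingbot"]);
console.log(robotsTxt);
```

Crawlers that are not listed remain unaffected, which is why this only makes sense once the old domain lives on its own site.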
Hopefully those details help you handle the traffic in a better way.
Hi @fool thanks for the detailed reply and looking into this so quickly.
I’ve redeployed with a blank page at the /autodiscover/autodiscover.xml path and am optimistic that’s solved the issue. Checking the real-time logs for the ___netlify-handler function, it’s slowed down from hundreds to just two GET requests in the past ten minutes. What a relief!
I’ve also moved the old domain to a separate site for the time being.