Bug (and workaround): 404 page returns HTTP 200

My point was that heuristics are prone to failure, so they can’t be relied upon.

I agree, and that’s exactly why I created this thread: so that Netlify aligns with web standards, which it currently does not. (Obviously some people in this thread disagree with me about said standards, but I have yet to see authoritative evidence in support of their arguments.)

I care about my website having a good user experience, regardless of whether a human or a bot uses it.

You have no control over that beyond your own websites though, hence why it’s not a good solution to prevent search engine indexing despite a good probability of success. That’s one way search engines may find pages disallowed by robots.txt. And besides, it doesn’t resolve the core issue anyway.

Returning HTTP 410 in Netlify for the 404 file requires the same workaround from my initial post to be effective, so that doesn’t help. As for the “noindex” solution, I would like to point out that it doesn’t work together with a robots.txt Disallow directive: a crawler that is not allowed to fetch the page never sees the “noindex”. It’s also just a worse workaround for the soft 404 problem than the one I proposed in my initial post, as it only applies to search engines and has other negative SEO consequences.
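To illustrate why the two don’t combine, here is a minimal sketch using Python’s standard urllib.robotparser. The robots.txt content, the example.com URLs and the helper name crawler_may_fetch are assumptions made up for illustration; they are not taken from any real site or from Netlify’s configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows the 404 page (assumed path).
ROBOTS_TXT = """\
User-agent: *
Disallow: /404.html
"""


def crawler_may_fetch(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if a robots.txt-respecting crawler is allowed to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved crawler skips the disallowed page entirely, so any
    # "noindex" placed in that page is never read.
    print(crawler_may_fetch(ROBOTS_TXT, "https://example.com/404.html"))
    print(crawler_may_fetch(ROBOTS_TXT, "https://example.com/index.html"))
```

A crawler that honours the Disallow rule never downloads the page, so a “noindex” inside it has no chance of being read.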

I have no idea how anybody would ever believe such a thing, as it should be obvious from the name itself that robots.txt only applies to robots. This is not what I was referring to, and I’m pretty sure that’s not what the Google documentation I linked to in my previous post meant either. The big red warning box there clearly explains that the misconception is about hiding pages from search results, not from the public.

That was never my intention. SEO is just one aspect of the soft 404 problem and the most obvious one to web administrators, which is why I gave it as an example. The Internet Archive relies on HTTP response codes for archival. API users rely on HTTP response codes in their business logic. Security assessment tools partially rely on HTTP response codes for analysis. Returning the wrong response code may have far-reaching and unexpected consequences for many systems. That’s what I want Netlify to fix.
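For anyone who wants to check their own site, a soft 404 can be detected with a small probe: request a path that should not exist and see whether the server answers with a 2xx status. The following is only a rough sketch using Python’s standard library; the function names fetch_status and probe_soft_404 and the random probe path are my own assumptions, not part of any Netlify tooling.

```python
import urllib.error
import urllib.request
import uuid


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for a GET request to `url`."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still available.
        return error.code


def probe_soft_404(base_url: str) -> bool:
    """Return True if a path that should not exist is served with a 2xx status."""
    # Random path that is extremely unlikely to exist on the site (assumption).
    missing_url = f"{base_url.rstrip('/')}/{uuid.uuid4().hex}"
    status = fetch_status(missing_url)
    return 200 <= status < 300
```

Calling probe_soft_404 with the base URL of a site returns True when the “not found” page is served as a success, which is exactly the situation that misleads clients relying on the status code.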