I confirm, Facebook crawler doesn’t work anymore.
Interesting. How are you testing? I just tested myself and it worked as expected:
- I got prerendered content when I use a user agent of facebookexternalhit/1.1 using curl - and their link sharing debugger (Sharing Debugger - Meta for Developers) showed reasonable output.
So, I am pretty sure our prerender service is working, and what you are seeing is more likely a config error that this article was written to address. What site are you working with, so I can check out the settings, and what debugging have you already done?
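For anyone who wants to repeat that user agent test from a script instead of curl, here is a minimal sketch using only the Python standard library. The helper names (build_crawler_request, fetch_as_crawler) and the example URL are hypothetical, chosen for illustration; they are not part of any Netlify tooling.

```python
import urllib.request

# User agent string quoted in the test described above.
CRAWLER_UA = "facebookexternalhit/1.1"


def build_crawler_request(url: str, user_agent: str = CRAWLER_UA) -> urllib.request.Request:
    """Build a GET request that identifies itself with a crawler user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_as_crawler(url: str, user_agent: str = CRAWLER_UA, timeout: float = 30.0) -> str:
    """Fetch the page the way the crawler would and return the decoded HTML."""
    request = build_crawler_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Example usage (placeholder URL, replace with a page on your own site):
#   html = fetch_as_crawler("https://example.com/")
#   print("og:title present:", "og:title" in html)
```

If the prerendered response already contains your Open Graph tags, the service is probably doing its job and the problem is more likely in the site configuration.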
We’ve set up Netlify’s prerendering service at usebubbles.com but are having issues where the service runs too slowly, causing Slack to often not show any previews at all.
When running our app locally in Chrome, window.prerenderReady = true is set in about 3-5 seconds.
However, when running the prerender service locally, it takes about 80 seconds:
2020-06-07T20:25:39.483Z getting https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj
2020-06-07T20:26:55.145Z S3 GET failed error="MissingRequiredParameter: Missing required key 'Bucket' in params" url="https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj"
2020-06-07T20:26:59.611Z got 200 in 80128ms for https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj
2020-06-07T20:26:59.613Z method=GET status=200 url=/https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj prerender_url=https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj cache_type=CACHE_MISS timing=80126 referrer="-" user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36" request_id=unknown
This seems to be causing issues where Slack, Facebook messenger, and other services sometimes just give up on Netlify’s prerendering service and fail to show the opengraph unfurled preview of our links.
What can we do to debug this?
Hi, @tommedema, we need some way to identify the slow HTTP response.
Often the simplest way to do this is to send us the x-nf-request-id header which we send with every HTTP response.
There is more information about this header here:
If that header isn’t available for any reason, please send the information it replaces (or as many of these details as possible). Those details are:
- the complete URL requested
- the IP address for the system making the request
- the IP address for the CDN node that responded
- the day of the request
- the time of the request
- the timezone the time is in
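A small sketch of how those details could be collected at the moment a slow response comes back. It assumes you already have the response headers as a mapping; describe_response is a hypothetical helper name, and the IP addresses from the list above are left out because they are not available from the headers alone.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional


def describe_response(url: str, headers: Mapping[str, str],
                      when: Optional[datetime] = None) -> dict:
    """Collect the URL, the x-nf-request-id header and a UTC timestamp.

    `when` must be timezone-aware; it defaults to the current time.
    """
    when = when or datetime.now(timezone.utc)
    # Header names are case-insensitive, so compare in lower case.
    request_id = next(
        (value for name, value in headers.items() if name.lower() == "x-nf-request-id"),
        None,
    )
    return {
        "url": url,
        "x-nf-request-id": request_id,
        "time_utc": when.astimezone(timezone.utc).isoformat(),
    }
```

Recording the time in UTC covers the day, time and timezone items in one field.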
@fool Semrush has responded to me.
They said they would like us to add all of their crawlers so that all their tools work with the site. Depending on how you “match” user-agents, adding “SemrushBot” might be enough. Otherwise you can add all of the user-agents from their site (Semrush Bot | Semrush).
Thank you!
That wasn’t actually the question I asked. I asked which of the user agents expected prerendered content - not “which ones they would like us to add”
Could you confirm that they all NEED prerendered content, please? That they are used in social sharing, and cannot parse javascript, would be what might mean “needing”…
Ok forwarded your question “which bots can parse JavaScript” and got this new reply from them:
I’m afraid, none of our bots are capable of rendering JS.
thanks,
thomas
This pre-rendering service has been working great for us, however there’s an error in this documentation that bit us in a pretty big way.
In the 2nd caveat, you mention that if window.prerenderReady is not set a snapshot is taken after 10 seconds regardless of the state of the page. However, this doesn’t seem to be strictly true. If there are no ongoing network requests, a snapshot is taken before 10 seconds.
Just wanted to give a heads up in case anyone else runs into this issue.
Appreciate that, thanks for the intel!
Have you guys seen a case where prerendering appears to not work on a particular site? I’ve tried disabling, reenabling and re-deploying on a particular site, but it appears that prerendering isn’t completing properly. We’re using the same JS code on another site, where everything seems to be working just fine.
Hi, @dcastro, there is a site setting to enable prerendering so it is possible that the setting just needs to be activated for that site. Also, once the setting is changed, a new deploy is needed before it will take effect. (I don’t know if this is the root cause but it is a possibility.)
If this has already been done, please let us know which site this is for and we’ll be happy to take a close look to see why it isn’t working.
Thanks. And yes, the setting is already configured and I’ve disabled / re-enabled and re-deployed with no change.
If you email me direct, I can give you the site that is having issues.
I’ve escalated your case to our helpdesk since you are a Pro customer and could have contacted us there too, where you can tell us your site name so we can look.
Is there a list of UA strings that will trigger prerendering?
I have working OG tags and unfurling works pretty much everywhere but not when sharing via WhatsApp.
Hiya @mrazzari!
Yup. The list changes somewhat frequently, but today it is this CASE INSENSITIVE SUBSTRING REGULAR EXPRESSION:
(baiduspider|twitterbot|facebookexternalhit| facebot|rogerbot|linkedinbot|embedly|quora link preview| showyoubot|SocialFlow|Net::Curl::Simple|Snipcart|Googlebot|outbrain|pinterestbot|pinterest/0|slackbot|vkShare|W3C_Validator|redditbot|Mediapartners-Google|AdsBot-Google|parsely|DuckDuckBot|whatsapp|Hatena|Screaming Frog SEO Spider|bingbot|Sajaribot|DashLinkPreviews|Discordbot|RankSonicBot|lyticsbot|YandexBot/|YandexWebmaster/|naytev-url-scraper|newspicksbot/|Swiftbot/|mattermost|Applebot/|snapchat|viber)
I just tested “whatsapp - 1234” as a user agent and got prerendered content. Have you enabled prerendering on your site, and redeployed after doing so to activate it? It’s in the build & deploy settings page near the bottom
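To illustrate what a case-insensitive substring match means in practice, here is a rough Python sketch. The pattern is only an excerpt of the list above, wants_prerender is a hypothetical name, and how the match is actually applied on Netlify's side is an assumption on my part.

```python
import re

# Excerpt of the list quoted above; the real list is longer and changes over time.
PRERENDER_UA = re.compile(
    r"(twitterbot|facebookexternalhit|linkedinbot|slackbot|whatsapp|Discordbot)",
    re.IGNORECASE,
)


def wants_prerender(user_agent: str) -> bool:
    """True if any listed token appears anywhere in the user agent string."""
    return PRERENDER_UA.search(user_agent) is not None
```

Because it is a substring match, a token only has to appear somewhere in the string, which is why a made-up user agent like “whatsapp - 1234” still qualifies while an ordinary desktop browser string does not.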
Hi @fool, thanks for that list.
I should have tested that curl myself
After doing so, and verifying prerender is working for this UA, I realized the problem was Gatsby inlining all CSS in the <head> before all the <meta> tags. WhatsApp only downloads the first N bytes of the HTML response, so it wasn’t even reaching those meta tags. Quick fix: made the CSS external.
Thanks!
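In case it helps others check for the same problem, here is a minimal Python sketch that reports how far into the HTML the first Open Graph tag starts. The function names are hypothetical, and the byte limit is something you have to supply yourself since the actual N for each crawler is not known here. It only checks the first og: tag, which is a simplification.

```python
import re
from typing import Optional

# Matches the start of an Open Graph meta tag such as <meta property="og:image" ...>.
OG_META = re.compile(rb"<meta[^>]+property=[\"']og:", re.IGNORECASE)


def first_og_offset(html: bytes) -> Optional[int]:
    """Byte offset of the first Open Graph meta tag, or None if there is none."""
    match = OG_META.search(html)
    return match.start() if match else None


def og_tags_within(html: bytes, limit: int) -> bool:
    """True if the first Open Graph tag starts within the first `limit` bytes."""
    offset = first_og_offset(html)
    return offset is not None and offset < limit
```

Running this against the prerendered HTML (fetched with a crawler user agent) before and after moving the CSS out of the <head> should show the offset dropping sharply.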
Dang, I hadn’t even considered that as a failure mode! I will update the doc that is the beginning of this thread to mention that. TBQH, that sounds like a broken workflow to me (the way WhatsApp does the partial download, that is) - that’s a very hard to discover gotcha that can slowly impact you without a clear cause.
Thanks so much for following up to let us know that detail, you have already taught one fool a new trick, but hopefully together, we can teach future explorers
Turns out it’s not that uncommon. It just depends on what “N bytes” means for each crawler. E.g. for Facebook it’s the first 1MB and the rest gets cut off. I’m pretty sure others are doing the same, but it’s hard to find conclusive info. E.g. Google’s cutoff is said to be ‘a few hundred MB’.
I am learning so much this week - thank you again!
A megabyte (or more) feels pretty unlikely to be a problem (or, put another way, your webpage has a problem if it’s bigger than that in html, even prerendered, before the OG tags) - but smaller N could have wide impact, particularly if N is not documented publicly.
However, I don’t get to tell them how to do business so for now I super appreciate you taking the time to share all these details so others can discover it as well, hopefully with less head scratching!
Is there a way to force a rerender of the prerendered site? Let’s say the only thing that I’ve updated is an OpenGraph image. Or is the only solution currently to wait out the 24h (~48h) cache period?