I have contacted SEMRush and pointed them to this thread so hopefully they can chime in here directly.
I am not sure what downsides there would be to adding too many user-agents in this case. Their tools are all meant to emulate a search engine and report on what the search engine sees. I think just adding SemrushBot would be fine.
So, I am pretty sure our prerender service is working, and what you are seeing is more likely a config error that this article was written to address. What site are you working with, so I can check out the settings, and what debugging have you already done?
We’ve set up Netlify’s prerendering service at usebubbles.com, but we’re having issues where the service runs too slowly, causing Slack to often not show any previews at all.
When running our app locally in Chrome, window.prerenderReady = true is set in about 3-5 seconds.
However, when running the prerender service locally, it takes about 80 seconds:
```
2020-06-07T20:25:39.483Z getting https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj
2020-06-07T20:26:55.145Z S3 GET failed error="MissingRequiredParameter: Missing required key 'Bucket' in params" url="https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj"
2020-06-07T20:26:59.611Z got 200 in 80128ms for https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj
2020-06-07T20:26:59.613Z method=GET status=200 url=/https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj prerender_url=https://app.usebubbles.com/fHJhWgxnWdKDHFzT3hiQWj cache_type=CACHE_MISS timing=80126 referrer="-" user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36" request_id=unknown
```
This seems to be causing issues where Slack, Facebook Messenger, and other services sometimes just give up on Netlify’s prerendering service and fail to show the unfurled Open Graph preview of our links.
They said they would like us to add all of their crawlers so that all their tools work with the site. Depending on how you “match” user-agents, adding “SemrushBot” might be enough. Otherwise you can add all of the user-agents from their site (https://www.semrush.com/bot/).
That wasn’t actually the question I asked. I asked which of the user agents expected prerendered content - not “which ones they would like us to add”.
Could you confirm that they all NEED prerendered content, please? Being used in social sharing and unable to parse JavaScript is what would count as “needing” it…
This pre-rendering service has been working great for us; however, there’s an error in this documentation that bit us in a pretty big way.
In the 2nd caveat, you mention that if window.prerenderReady is not set, a snapshot is taken after 10 seconds regardless of the state of the page. However, this doesn’t seem to be strictly true: if there are no ongoing network requests, a snapshot can be taken before the 10 seconds are up.
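A minimal sketch of the usual guard, assuming the service waits while the flag is explicitly false (renderApp here is just a stand-in for whatever bootstraps your page):

```js
// Sketch: define the flag as false before anything renders, so an early "network idle"
// moment can't trigger a premature snapshot; flip it to true only once the page is done.
window.prerenderReady = false;

renderApp()                        // stand-in for whatever bootstraps your page
  .then(() => {
    window.prerenderReady = true;  // the snapshot should only be taken after this
  });
```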
Just wanted to give a heads up in case anyone else runs into this issue.
Have you guys seen a case where prerendering appears to not work on a particular site? I’ve tried disabling, re-enabling, and re-deploying on that site, but it appears that prerendering isn’t completing properly. We’re using the same JS code on another site, where everything seems to be working just fine.
Hi, @dcastro, there is a site setting to enable prerendering, so it is possible that the setting just needs to be activated for that site. Also, once the setting is changed, a new deploy is needed before it will take effect. (I don’t know if this is the root cause but it is a possibility.)
If this has already been done, please let us know which site this is for and we’ll be happy to take a close look to see why it isn’t working.
I’ve escalated your case to our helpdesk since you are a Pro customer and could have contacted us there too; there you can tell us your site name so we can take a look.
Is there a list of UA strings that will trigger prerendering?
I have working OG tags and unfurling works pretty much everywhere but not when sharing via WhatsApp.
Yup. The list changes somewhat frequently, but today it is this CASE INSENSITIVE SUBSTRING REGULAR EXPRESSION:
```
(baiduspider|twitterbot|facebookexternalhit|facebot|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|SocialFlow|Net::Curl::Simple|Snipcart|Googlebot|outbrain|pinterestbot|pinterest/0|slackbot|vkShare|W3C_Validator|redditbot|Mediapartners-Google|AdsBot-Google|parsely|DuckDuckBot|whatsapp|Hatena|Screaming Frog SEO Spider|bingbot|Sajaribot|DashLinkPreviews|Discordbot|RankSonicBot|lyticsbot|YandexBot/|YandexWebmaster/|naytev-url-scraper|newspicksbot/|Swiftbot/|mattermost|Applebot/|snapchat|viber)
```
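To illustrate how that matching behaves, here is a small sketch with a heavily abbreviated copy of the expression (the authoritative list is the one above, and it changes over time):

```js
// Illustrative only – an abbreviated copy of the list above, matched case-insensitively.
const PRERENDER_UA = /twitterbot|facebookexternalhit|slackbot|whatsapp|bingbot|discordbot/i;

console.log(PRERENDER_UA.test("WhatsApp/2.21.4.22 A"));   // true  -> served the prerendered snapshot
console.log(PRERENDER_UA.test(
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 Chrome/83.0.4103.97 Safari/537.36"
));                                                        // false -> served the normal client-side app
```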
I just tested “whatsapp - 1234” as a user agent and got prerendered content. Have you enabled prerendering on your site, and redeployed after doing so to activate it? It’s in the Build & deploy settings page, near the bottom.
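If you want to run that kind of check yourself, something along these lines works (Node 18+, run as an ES module; the URL and path are placeholders for your own site):

```js
// Sketch: request a page with a crawler-style User-Agent and see what HTML comes back.
const res = await fetch("https://your-site.netlify.app/some-page", {  // placeholder URL
  headers: { "User-Agent": "whatsapp - 1234" },                       // matches the "whatsapp" entry above
});
const html = await res.text();
// With prerendering active, the response should already contain markup your client-side JS normally renders.
console.log(res.status, html.includes("og:title"));
```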
After doing so, and verifying prerender is working for this UA, I realized the problem was Gatsby inlining all CSS in the <head> before all the <meta> tags. And WhatsApp only downloads the first N bytes of the HTML response, so it wasn’t even reaching those meta tags. Quick fix: I made the CSS external.
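If you want to check whether something like this is affecting you, a rough sketch (Node 18+, ES module; placeholder URL) is to measure how far into the prerendered HTML the first og: tag sits:

```js
// Rough check (sketch): how many bytes into the HTML the first og: meta tag appears.
const res = await fetch("https://your-site.netlify.app/some-page", {  // placeholder URL
  headers: { "User-Agent": "facebookexternalhit/1.1" },               // spoof a link-preview crawler
});
const html = await res.text();
const idx = html.indexOf('property="og:');
console.log(idx === -1
  ? "no og: meta tags found"
  : `first og: tag at roughly byte ${Buffer.byteLength(html.slice(0, idx))}`);
// If megabytes of inlined CSS come first, a crawler that only reads the first N bytes never sees it.
```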
Dang, I hadn’t even considered that as a failure mode! I will update the doc at the beginning of this thread to mention that. TBQH, that sounds like a broken workflow to me (the way WhatsApp does the partial download, that is) - that’s a very hard-to-discover gotcha that can slowly impact you without a clear cause.
Thanks so much for following up to let us know that detail; you have already taught one fool a new trick, but hopefully, together, we can teach future explorers.
Turns out it’s not that uncommon; it just depends on what “N bytes” means for each crawler. E.g. for Facebook it’s the first 1 MB and the rest gets cut off. I’m pretty sure others are doing the same, but it’s hard to find conclusive info. E.g. Google’s cutoff is said to be ‘a few hundred MB’.