Proxying to another service fails with CORS issue and a 302 response

I’m following the advice in the docs around proxying to another service. In essence, I have a route /cors-proxy/* which lets my client-side code fetch resources from elsewhere on the web (I’ve detailed this setup on my blog). Here are the lines from my netlify.toml:

[[redirects]]
  from = "/cors-proxy/*"
  to = ":splat"
  status = 200
  force = true

This has worked great in a lot of ways. But now I am encountering a strange bug that—to the best of my knowledge—is resulting from something on Netlify’s side and not my own (maybe some weird rate-limiting thing?)

Background & Problem

In my client-side JS, I have an array of about 10 image URLs. On the client, I’m iterating through each one and fetching that image from the internet. Because the images are on other domains, I can’t just do:

fetch("https://somedomain.com/path/to/image.png")

So I use a server-side proxy (via Netlify) that gets the image for me and returns it. Given my Netlify configuration described above, I can do this on the client:

fetch("/cors-proxy/https://somedomain.com/path/to/image.png")
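
To keep the prefix in one place, the proxied URL can be built by a small helper. This is only a sketch: toProxiedUrl and PROXY_PREFIX are hypothetical names, not part of Netlify or of the project, and the only thing taken from the post is the "/cors-proxy/" route.

```javascript
// Hypothetical helper: prefix a remote URL with the "/cors-proxy/" route
// defined by the redirect rule in netlify.toml.
const PROXY_PREFIX = "/cors-proxy/";

function toProxiedUrl(remoteUrl) {
  // new URL() throws a TypeError if remoteUrl is not an absolute URL.
  const parsed = new URL(remoteUrl);
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  // The remote URL is appended as-is, matching the ":splat" rule.
  return PROXY_PREFIX + remoteUrl;
}

// Usage in the browser:
// fetch(toProxiedUrl("https://somedomain.com/path/to/image.png"))
```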

This works well in a lot of scenarios. For example, in one of my use cases, I’m fetching 85 images from out on the web and it’s working just fine (a screenshot from my console):

What’s strange is that for a specific scenario, this is failing.

What’s even more strange is that it works on localhost, but when I ship to prod on readlists.jim-nielsen.com it works for about the first three images, then starts failing on the other seven. Here’s an example screenshot from my console:

Note that the first couple images fetched just fine from the proxy, but subsequent ones failed due to CORS?

What’s even stranger is that, for a given URL that failed, if I run a fetch() on that same url right in the console, it works just fine:

Note in the screenshot above that the fetch("/cors-proxy/https://...") call failed when executing client-side code, but when I ran that same fetch("/cors-proxy/https://...") right in my browser’s console, it worked just fine.

Looking closer at the calls being made, the ones that succeed are giving back 200:

But the ones that fail are returning HTTP 302 with a location header that’s pointing to the image URL.

Additional notes:

  • Doing a cURL on any of these images (ones that succeed with 200 and ones that fail with a 302 in the browser) results in no obvious difference. They all succeed with cURL.
  • I’m using netlify dev on localhost, which (in theory) means it should behave the same locally and in prod. It’s strange that it works locally, but when I deploy to my live URL some of those image fetches (not all) begin to fail.

Try It Yourself

You can try this yourself by visiting the live project: https://readlists.jim-nielsen.com/

Then import a readlist from this remote url: https://cdn.jim-nielsen.com/readlists/lean-ux.json

Then open your browser console and hit the “export to epub” link:

As an alternative test to see something working as expected, you can import this URL - https://cdn.jim-nielsen.com/readlists/shape-up.json - and hit “export to epub” and you’ll notice that it fetches all the images and everything works

Thoughts?

I can’t understand why this is failing. And not just failing, but failing only on some images while the proxy seems to continue to work for others.

Hi there, sorry to be slow to respond. We haven’t forgotten about you - I will try and get some :eyes: on this.

Hey @jimniels,
In going through your repro (which… thanks so much for laying it all out for us like that- makes life 1000x easier!), I’m seeing that for the list that’s failing, 10 images load and then no more. Is that the number you’re seeing on your end as well?

The ones that succeed have an x-nf-request-id in the browser response headers, which means they made it to us. The ones that fail didn’t show an x-nf-request-id in the browser on my end (screenshot is for lux2_0409.png):

so I thought you were onto something re: being rate-limited. To test, I did a curl loop:

for ((i=1;i<=20;i++)); do curl -v "https://readlists.jim-nielsen.com/cors-proxy/https://www.oreilly.com/library/view/lean-ux-2nd/9781491953594/assets/lux2_0305.png"; done

expecting to not see x-nf-request-id’s after 10 requests. But that’s not what I saw. All the curls returned request-ids, although our logs for those requests show ERROR_CLIENT_ABORT, even though I did not abort the requests- another mystery. I was also not able to get rate-limited by directly curling https://www.oreilly.com/library/view/lean-ux-2nd/9781491953594/assets/lux2_0305.png in a loop. So we’ll have to escalate this internally. Do you have another example of a failure you could share so we could compare failures:failures?

@Jen ooooh interesting. All very intriguing, especially that ERROR_CLIENT_ABORT when proxying to O’Reilly through Netlify (but curl works just fine when hitting O’Reilly directly?)

Unfortunately I have not encountered any other failures like this. As mentioned in my post, when fetching images from other domains (i.e. basecamp.com) I fetched up to 85 images in a single go without any issue.


As I’m looking at this again with fresh eyes, it seems the calls aren’t doing the same things they did for me when I first posted this?

For me now, all calls except one are failing. Note in the screenshot that every call except one returns a 302 (the image lux2_0306.png returns a 200, all others a 302 and then a CORS error)

Here’s the one with the 200.

All the others, when I hit the proxy /cors-proxy/URL they are returning a 302 for the URL and so it appears the browser is then looking at the returned location header and trying to download that URL, but that is gonna give a CORS error because it’s hitting that URL directly (rather than using the proxy)

I’m not an HTTP expert, but I guess this is expected behavior:

An HTTP response with [a 302] status code will additionally provide a URL in the header field Location. This is an invitation to the user agent (e.g. a web browser) to make a second, otherwise identical, request to the new URL specified in the location field. The end result is a redirection to the new URL.

Which makes sense. If I include those images via the <img> tag, they work because the browser is handling the cross site request for me.

But since I’m making these requests with fetch, I get a CORS error instead.

When I hit my Netlify proxy (/cors-proxy/URL) asking for this URL:

https://learning.oreilly.com/library/view/lean-ux-2nd/9781491953594/assets/lux2_0701.png

Netlify returns a 302 with the location header pointing to:

https://www.oreilly.com/library/view/lean-ux-2nd/9781491953594/assets/lux2_0701.png

Which the browser tries to fetch (since it was executed from JS) and hence I get a CORS error.

I guess this is expected? Even though my redirect is telling Netlify to force return a 200?

[[redirects]]
  from = "/cors-proxy/*"
  to = ":splat"
  status = 200
  force = true

Now that I see this, maybe the Netlify proxy is actually doing the right thing? I guess I was expecting a 200 regardless, since I had set force = true.

If I curl that learning.oreilly.com image URL, it gives me a 302 as well pointing to the www.oreilly.com URL

❯ curl --verbose https://learning.oreilly.com/library/view/lean-ux-2nd/9781491953594/assets/lux2_0701.png
...
> GET /library/view/lean-ux-2nd/9781491953594/assets/lux2_0701.png HTTP/2
> Host: learning.oreilly.com
> User-Agent: curl/7.64.1
> Accept: */*
> 
...
< HTTP/2 302 
< content-type: text/html
< location: https://www.oreilly.com/library/view/lean-ux-2nd/9781491953594/assets/lux2_0701.png
< accept-ranges: bytes
< date: Sat, 10 Apr 2021 03:10:44 GMT
< via: 1.1 varnish
< x-client-ip: 74.211.35.34
< x-served-by: cache-bur17539-BUR
< x-cache: MISS
< x-cache-hits: 0
< x-timer: S1618024245.801617,VS0,VE142
< content-length: 0
< 
* Connection #0 to host learning.oreilly.com left intact
* Closing connection 0

If this is the expected behavior, it seems there’s no real solution for me in my case. Does that seem right? I just have to accept that those URLs are returning a 302 and Netlify’s proxy can’t help me (i.e. by making the request to what is specified in the location header and giving me that back instead of the 302)?


Yes, great walkthrough! To highlight a point that you laid out that trips up many people: when you create a 200 rule, we proxy the request and return whatever (headers, status code, etc.) the proxy destination returns. So even though the rule says status = 200, the destination could return a 404 or in your case a 302, so that’s what we’ll return to the browser.
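
Since the proxy passes through whatever the destination returns, client code may want to check the response before treating it as an image. A minimal sketch, assuming a fetch-style response object; isUsableImageResponse is a hypothetical helper, not something Netlify or the project provides:

```javascript
// Hypothetical check: accept a proxied response only when the upstream
// status is 2xx and the Content-Type header says it is an image.
function isUsableImageResponse(res) {
  const contentType = res.headers.get("content-type") || "";
  return res.ok && contentType.toLowerCase().startsWith("image/");
}

// Usage in the browser:
// const res = await fetch("/cors-proxy/https://somedomain.com/path/to/image.png");
// if (!isUsableImageResponse(res)) { /* skip or report this image */ }
```

This only covers responses that fetch actually resolves with, such as a 404 or an HTML error page. It would not catch the 302 case in this thread, where the browser follows the redirect itself and fetch rejects before any response is available.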

Still! This does not really make sense to me. If you do the export to epub thing, then grab a path that fails, then go to it in the browser with your proxy path in front, i.e. https://readlists.jim-nielsen.com/cors-proxy/https://www.oreilly.com/library/view/lean-ux-2nd/9781491953594/assets/lux2_0701.png… then it works. Refresh again and again… still works. Export to epub again and it’s available- a new path fails. Reliably, 10 succeed and the rest fail :face_with_monocle:

I wonder if there’s an encoding thing happening? When I curl through your proxy, I see

< content-type: image/png
...
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
* Failed writing body (0 != 16384)
* stopped the pause stream!
* Connection #0 to host readlists.jim-nielsen.com left intact
* Closing connection 0

Seems bad but exporting to a .png file works:

% curl -v -o ~/Desktop/proxied.png "https://readlists.jim-nielsen.com/cors-proxy/https://www.oreilly.com/library/view/lean-ux-2nd/9781491953594/assets/lux2_0801.png"

When I curl in a for loop to oreilly, I see:

< content-type: text/html; charset=utf-8

with no warning at the bottom.

but if I make a single curl request to oreilly:

curl -v -o ~/Desktop/not-proxied.png "https://www.oreilly.com/library/view/lean-ux-2nd/9781491953594/assets/lux2_0801.png"

I see:

< content-type: image/png

I wonder if there’s a way to force your filesaver thing to save as png instead of text, even though the browser says it’s a text file?

Ha wow, this is a rabbit hole. But I’m learning a lot as I go.

As far as I can tell, there’s no way for fetch to know that a response is a 302. This is an interesting article

fetch cannot capture 302 , the browser will retrieve the data from the Location header of the 302 response.

So, since I’m proxying the request for the image through netlify, which is returning the O’Reilly response (a 302), the browser handles that by grabbing what’s in location and then trying that request. In my case, that’s failing because that 2nd request is not being proxied so it’s a CORS error.

As far as I can tell, there’s no way for me to catch a 302 using fetch, i.e. something like this would not work:

const res = await fetch("/cors-proxy/https://...oreilly/302/url/response.png");
if (res.status === 302) {
  const newLocation = res.headers.get("location");
  const newUrl = await fetch(`/cors-proxy/${newLocation}`);
}
// handle response...

The browser throws an error before I can ever get to res.status === 302 because in that first fetch it sees the 302 response, then tries to get the new location itself, which runs into a CORS error because that request is not being proxied.

Doesn’t seem like there’s a way for me to get around this. That said, I need to do a better job in my client side code of catching the error earlier on. If an image fails to fetch, I’ll just exclude it from the ebook. At least then the ebook still gets generated, just without the proper image.
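
One way to sketch that "skip the images that fail" idea is with Promise.allSettled, so a single rejected fetch does not abort the whole export. The function name fetchImagesSkippingFailures and the injectable fetchImpl parameter are assumptions for illustration; only the "/cors-proxy/" prefix comes from the setup described above.

```javascript
// Hypothetical helper: fetch every image through the proxy and keep only
// the ones that succeed. fetchImpl is injectable so the function can be
// exercised without a network; in the browser it defaults to fetch.
async function fetchImagesSkippingFailures(urls, fetchImpl = fetch) {
  const results = await Promise.allSettled(
    urls.map(async (url) => {
      // A blocked cross-origin redirect makes this call reject.
      const res = await fetchImpl(`/cors-proxy/${url}`);
      if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
      return { url, blob: await res.blob() };
    })
  );

  const loaded = [];
  const failed = [];
  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      loaded.push(result.value);
    } else {
      failed.push({ url: urls[i], reason: String(result.reason) });
    }
  });
  return { loaded, failed };
}
```

The caller can then build the ebook from loaded and, if it wants, list the failed URLs so the reader knows which images are missing.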


Maybe you can use this “manual” redirect option?

Not sure exactly what it does but seems like it could allow you to tell the browser not to follow the 302.
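
For what it’s worth, here is a rough sketch of what that option looks like in use. As far as I understand the Fetch standard, with redirect: "manual" a browser does not follow the 302 and instead resolves with an "opaque redirect" response: its type is "opaqueredirect", its status reads as 0, and its headers are hidden. So it can tell you a redirect happened without a CORS error, but it does not expose the Location value needed to re-proxy the new URL. probeRedirect and the injectable fetchImpl are hypothetical names for illustration.

```javascript
// Hypothetical probe: ask fetch not to follow redirects and report what
// the page's JavaScript is allowed to see about the response.
async function probeRedirect(proxiedUrl, fetchImpl = fetch) {
  const res = await fetchImpl(proxiedUrl, { redirect: "manual" });
  return {
    // In browsers a non-followed redirect has type "opaqueredirect".
    wasRedirect: res.type === "opaqueredirect",
    status: res.status,
    // Expected to be null in browsers, because the headers are filtered.
    location: res.headers.get("location"),
  };
}

// Usage in the browser:
// const info = await probeRedirect("/cors-proxy/https://somedomain.com/path/to/image.png");
// if (info.wasRedirect) { /* skip this image or report it */ }
```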

At any rate, glad you have some workable next steps. Thanks for the rabbit hole!
