Our production traffic broke overnight on Nov 13-14 (CET) without us making any changes to our systems. After a lot of debugging we isolated the outage to be cause by Netlify percent-encoding query parameters when proxying them to our backend.
We’ve had the proxy rule in place for a couple of months without any problems, so we’re pretty sure this was changed around the time of our outage.
For a bit more detail: our backend protects certain resources using Nginx Secure Link. This involves adding an expires
(irrelevant for this issue) and an auth
query parameter. The auth
param is a filename-safe base64 encoded md5 hash, basically using the dash (-
) and underscore (_
) characters instead of the more common plus (+
) and slash (/
).
Now here is our problem, the Nginx secure_link module doesn’t decode the query parameters, and therefore detects a mismatch of the md5 hash, returning 403 (Unauthorized).
I think percent-encoding the -
and _
characters is not necessarily the right move, as they belong to the “unreserved set” of characters. The Wikipedia article on percent-encoding states:
Characters from the unreserved set never need to be percent-encoded.
Moreover, JavaScript’s encodeURIComponent()
doesn’t encode -
and _
either:
encodeURIComponent('hello-there') // 'hello-there'
We figured out that our frontend (running on Netlify) is not encoding the query parameters, we’ve confirmed this by running the frontend locally, and proxying to the backend using http-proxy
. Our Nginx logs showed unencoded strings coming from the locally hosted frontend. So really all signs point to the Netlify redirects being the culprit here.
We’ve implemented a workaround on our end, but wanted to open this issue regardless, if anything to make you guys aware of the breaking change that was likely introduced.