I’m archiving an old PHP website and turning it into a static site. I want all the URLs (`some-page.php?someParam=this&otherParam=2` etc.) to remain exactly as they were, so that old links and bookmarks all still work.
I spidered the whole live site with `wget --recursive --adjust-extension --restrict-file-names=windows`, which gave me the static files I want to serve.
The `--adjust-extension` part adds `.html` at the end of filenames which didn’t already have it.
The `--restrict-file-names=windows` part alters filenames in a few ways, the most important of which is that it replaces `?` with `@`. This is needed since Netlify doesn’t let me deploy files with `?` in the name.
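The effect of those two flags on a URL can be sketched roughly as follows. This is a simplified model covering only the two transformations described above. `saved_filename` is a hypothetical helper name, and wget's `windows` mode also alters other characters, which is not modelled here.

```python
def saved_filename(url_path_and_query: str) -> str:
    """Approximate the filename wget saved for a legacy URL.

    Assumption: only the two changes discussed above are modelled,
    not wget's full algorithm. '?' becomes '@', and '.html' is
    appended when the name does not already end with it.
    """
    name = url_path_and_query.lstrip("/").replace("?", "@")
    if not name.endswith(".html"):
        name += ".html"
    return name


print(saved_filename("/some-page.php?someParam=this&otherParam=2"))
# some-page.php@someParam=this&otherParam=2.html
```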
But I want the old URLs to still work, so I’ve got a `_redirects` file.
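To serve as an example, there were legacy URLs of the form `/photos.php`, `/photos.php?group=<g>` and `/photos.php?group=<g>&pic=<p>`.
These are saved as `photos.php.html`, `photos.php@group=<g>.html`, `photos.php@group=<g>&pic=<p>.html`,
and so on.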
So having read the documentation, I have things like this in my `_redirects`:

    /photos.php  group=:g pic=:p  /photos.php@group=:g&pic=:p.html  200
    /photos.php  group=:g  /photos.php@group=:g.html  200
    /photos.php  /photos.php.html  200
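Rules of this shape could be generated from the saved filenames rather than written by hand. The following is only a sketch under assumptions: `redirect_rule` and `rules_for` are hypothetical helper names, the placeholder names are taken from the query keys (so `:group` rather than `:g`), and the `group`/`pic` values in the example are made up. Rules with more query parameters are listed first, matching the ordering used above.

```python
def redirect_rule(filename: str) -> str:
    """Build one rewrite rule from a saved filename (illustrative only)."""
    stem = filename[: -len(".html")] if filename.endswith(".html") else filename
    path, sep, query = stem.partition("@")
    # Keep only the parameter names; the values become placeholders.
    keys = [pair.split("=", 1)[0] for pair in query.split("&") if pair]
    conditions = " ".join(f"{k}=:{k}" for k in keys)
    target_query = "&".join(f"{k}=:{k}" for k in keys)
    target = f"/{path}{sep}{target_query}.html"
    return " ".join(part for part in (f"/{path}", conditions, target, "200") if part)


def rules_for(filenames):
    """Deduplicate rules and put the most specific (most tokens) first."""
    rules = {redirect_rule(name) for name in filenames}
    return sorted(rules, key=lambda rule: (-len(rule.split()), rule))


# Hypothetical filenames; the parameter values are invented for the example.
files = [
    "photos.php.html",
    "photos.php@group=1.html",
    "photos.php@group=2.html",
    "photos.php@group=1&pic=2.html",
]
print("\n".join(rules_for(files)))
```

Deduplicating matters because every saved page with the same set of parameter names collapses to the same rule.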
Unfortunately this is not behaving as I expect. For all of the legacy URLs listed above I’m getting `/photos.php.html` served up to me.
I noticed that other similar rewrites were working just fine. I boiled it down until I found the one difference: in the above case, if I just remove the `/photos.php.html` file, all the rewrites to the other `/photos.php*.html` files suddenly work just fine.
I ended up with this test case: https://chipper-platypus-8de0b0.netlify.app/
Source code here: GitHub - tremby/netlify-redirect-test
In this test case, the links menu is the same on every page. There are two sets of URLs: test1 and test2. There are identical redirects written for both sets:

    /test1.php  x=:x y=:y  /test1.php@x=:x&y=:y.html  200
    /test1.php  x=:x  /test1.php@x=:x.html  200
    /test1.php  /test1.php.html  200
    /test2.php  x=:x y=:y  /test2.php@x=:x&y=:y.html  200
    /test2.php  x=:x  /test2.php@x=:x.html  200
    /test2.php  /test2.php.html  200

(That last line exists just to show that it’s not the redirect line itself which is the problem; read on…)
Then these files exist:

    test1.php.html
    test1.php@x=0.html
    test1.php@x=0&y=1.html
    test1.php@x=0&y=2.html
    test2.php@x=0.html
    test2.php@x=0&y=1.html
    test2.php@x=0&y=2.html

(There’s also an index.html, just to serve as an entry point.)
Note that both sets have a similar set of files, except there’s no `test2.php.html`.
When clicking through the links, you’ll find that all the test2 links work just fine: each one loads its own corresponding file. But the test1 links do not work as expected: each one loads `test1.php.html`, and the other `test1.php*.html` files are never served.
This strikes me as a bug. I think there is possibly some (undocumented?) “magic” happening to do with the presence of a file named as the requested URL with the query string stripped, which is affecting how the rewriting logic works.
Working from that assumption I experimented and found that if I rename my “query-string-free” file to `test1.php@.html`, and adjust the rewrite rule accordingly, things work fine. This is what I’ll do for now.
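For a site with many such files, the rename and the matching rule could be derived mechanically. This is a minimal sketch, with `workaround` as a hypothetical helper name, and it assumes the adjusted rule simply points the bare URL at the renamed file.

```python
def workaround(base_filename: str) -> tuple:
    """Return (new_filename, adjusted_rule) for a query-string-free file.

    Assumes the file is named '<path>.html', e.g. 'test1.php.html'.
    """
    if not base_filename.endswith(".html"):
        raise ValueError("expected a filename ending in .html")
    stem = base_filename[: -len(".html")]
    # Insert '@' so the file no longer matches the bare URL path.
    new_filename = f"{stem}@.html"
    rule = f"/{stem} /{new_filename} 200"
    return new_filename, rule


print(workaround("test1.php.html"))
# ('test1.php@.html', '/test1.php /test1.php@.html 200')
```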