Weird behavior by netlify-identity-widget and role based access control / _redirects

Dear all,

I have been facing some issues with Identity over the last 16 hours. Below are the issues I found and my investigation notes.

Build process:
I am using 11ty to build static pages, with an eleventyConfig.addPassthroughCopy('./_site/_redirects'); directive.

Widget:
Using the netlify-identity-widget

Functions:
Created a simple function to set role default to customer when signing up.

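exports.handler = async (event) => {
  // The event body carries the user who is signing up (parsed here, not otherwise used).
  const { user } = JSON.parse(event.body);
  // Respond with the default role to attach to the new user.
  return {
    statusCode: 200,
    body: JSON.stringify({
      app_metadata: {
        roles: ['customer'],
      },
    }),
  };
};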

Restricting sensitive data (not yet sensitive and you may feel free to test):
Restricted a few pages using the redirect rules below:

/houses-for-sale/* 200! Role=customer
/gated-community/* 200! Role=customer
/plots-for-sale/* 200! Role=customer
/tags/* 200! Role=customer
/gated-community/* /no-access/ 403!
/houses-for-sale/* /no-access/ 403!
/plots-for-sale/* /no-access/ 403!
/tags/* /no-access/ 403!

The /no-access/ page is where one is redirected if the user has no role.

Build checks

  • Redirect rules and header rules processing (no redirect rules in the header)

Testing the web app

  • Sign up successful
  • Login successful and email / name shows up in the footer
  • Sign in successful and functions showing the right details

Problems I am facing:

  • Problem 1 - When logged out, with no sign-up JWT left in the browser localStorage or cookies, gated content can still be displayed, even after a hard refresh.
    I do not have a screenshot of this; I missed it while making this post. (most weird part)

  • Problem 2 - Gated content not showing up even if I have all the right roles in place.

I am not sure whether I am making a mistake at this point. I thought I had figured this out a while ago, but these sessions have left me confused. Am I missing something?

To test, you could try the app yourself. There is no sensitive info on the app yet, as it is in the testing stage.

zen-lewin-04367e.netlify.app

Greetings @sachinsancheti1! :wave:t2:

And welcome to The Community :netliheart:

Thanks for providing all that context and information. That’s super helpful and usually takes a few more back-and-forths to get to, so I appreciate your diligence there :slight_smile: From here on out I’m going to use “RBAC” instead of typing out “Role based access control” which is what the Role=xyz stuff in _redirects is. Just alerting you of the acronym ahead of time :stuck_out_tongue:

For the record, is this to imply that everything was working fine before and is now failing, or that you’re building a new site and haven’t quite gotten things working yet? The context you added later suggests the latter but I do want to clarify that this never actually reached a ‘fully working’ state yet.

In short, RBAC works by parsing and validating the JWT inside the cookie that’s sent for any given route requested. If the logout action doesn’t clear the cookie, RBAC will continue to work. The JWT / stateless auth system doesn’t really maintain the premise of whether or not you’re “logged in” — distributed / stateless auth systems sort of work that way. It’s very much just “if the cookie is present and valid, let the person through.” Now, that said, the logout() action from netlify-identity-widget should clear out that cookie (in a manner of speaking) so that RBAC doesn’t pass through.
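
To see what the redirect engine is actually being handed, it can help to decode the payload of the JWT in the nf_jwt cookie. Here is a minimal Node sketch, not anything official: inspectJwt is a hypothetical helper name, and it assumes the roles sit under app_metadata.roles (as in the signup function earlier in this thread) or under app_metadata.authorization.roles. It only decodes the token; it does not verify the signature.

```javascript
// Hypothetical helper: decode (NOT verify) a JWT and report expiry and roles.
function inspectJwt(token, nowMs = Date.now()) {
  const parts = token.split('.');
  if (parts.length !== 3) {
    throw new Error('not a JWT: expected three dot-separated parts');
  }
  // JWT segments are base64url; map them to plain base64 before decoding.
  const base64 = parts[1].replace(/-/g, '+').replace(/_/g, '/');
  const payload = JSON.parse(Buffer.from(base64, 'base64').toString('utf8'));
  const meta = payload.app_metadata || {};
  // Assumption: roles live in one of these two places.
  const roles =
    meta.roles || (meta.authorization && meta.authorization.roles) || [];
  return {
    roles,
    // `exp` is expressed in seconds since the epoch.
    expired: typeof payload.exp === 'number' ? payload.exp * 1000 <= nowMs : false,
  };
}
```

If expired comes back true, or roles lacks the role named in the _redirects rule, a 403 from a gated path would be the expected outcome rather than a bug.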

Elaborating on my prior paragraph, while it’s good to check the local storage value for the gotrue.user, the gotrue-js library itself can be quirky — actually any JavaScript that attempts to finagle cookies that are HttpOnly can be quirky lol. Not getting into that but my point here is don’t watch the local storage keys/values for how a request behaved. Watch the Network log (with “preserve log” enabled❗).

Now, beyond the theory — let me dig into your site :+1:t2:

So here’s a few things I did for debugging.

I always like to start with the command line; it’s the most “true” way to see what the redirects engine is doing. I use a tool called HTTPie, so the command I use here is just https, which fires off a basic GET request.

 ~ https https://zen-lewin-04367e.netlify.app/houses-for-sale/
GET /houses-for-sale/ HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: zen-lewin-04367e.netlify.app
User-Agent: HTTPie/2.3.0



HTTP/1.1 403 Forbidden
Age: 0
Cache-Control: public,max-age=300
Connection: keep-alive
Content-Type: text/html; charset=UTF-8
Date: Fri, 12 Feb 2021 17:40:59 GMT
Etag: "2fca8a1083ab4bf690b9086844222bff-ssl"
Server: Netlify
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Transfer-Encoding: chunked
X-Frame-Options: SAMEORIGIN
X-NF-Request-ID: 70885c53-8056-40dc-b76c-856069614619-53932943
X-Xss-Protection: 1; mode=block

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <meta http-equiv="Content-Security-Policy" content="">
etc...

So the role-gate for /houses-for-sale/ is working. I sent a GET with no JWT and got kicked out — with a 403 response and the /no-access/ page being rendered as the content of the page. Working great :+1:t2:

I did make an account on the site real quick, so I’m going to grab the JWT that gave me and use that in another command-line request to the same path, using the JWT as the appropriate cookie header. I want to make sure that the role-gate allows me through with a valid JWT. Obfuscating my JWT for the sake of your site security :stuck_out_tongue:

 ~ https https://zen-lewin-04367e.netlify.app/houses-for-sale/ Cookie:nf_jwt='eyJhbInR5cCI6IkpXVCJ9.eyJleHAiOjE2MTMxNTUxMzMsInN1YiI6IjhjZTA3ODY3LTNlMmYtNGMyN5In19.ixsIuHniUZ8lk79oS_P7Tft_tVR1a9w'
GET /houses-for-sale/ HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: nf_jwt=eyJhbGciOiJ9.eyJleHAiOjE2MTMxNWV0YWRhdGEiOnsiZnVsbF9uYW1lIjoidGVzdGVyIHN1bGx5In19.ixsPnmP7Tft_tVR1a9w
Host: zen-lewin-04367e.netlify.app
User-Agent: HTTPie/2.3.0



HTTP/1.1 200 OK
Age: 1
Cache-Control: public,max-age=300
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html; charset=UTF-8
Date: Fri, 12 Feb 2021 17:47:51 GMT
Etag: "4691a63d0bf91d06e0c8a0bc4ae7004e-ssl-df"
Server: Netlify
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Transfer-Encoding: chunked
Vary: Accept-Encoding
X-Frame-Options: SAMEORIGIN
X-NF-Request-ID: e55d6fdf-23ed-4d03-a917-8bc5e15611f3-4538712
X-Xss-Protection: 1; mode=block

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <meta http-equiv="Content-Security-Policy" content="">
    <link rel="preload" as="style" href="https://fonts.googleapis.com/css?family=Poppins:400,500&display=swap" />
    <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Poppins:400,500&display=swap" media="print" onload="this.media='all'" />
    <title>Houses for Sale in Coonoor, Ooty and Kotagiri, The Nilgiris</title>
    <meta content="Villa and Houses for Sale in Coonoor, Ooty and Kotagiri. We deal wi....
etc. (the correct page) :+1:t2:

So with a valid JWT the page is allowed through. The _redirects are working great. Generally speaking, that means something’s going on with the javascript-level / client-side code that’s running on your site.
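
The two command-line checks above can also be scripted. This is a rough sketch assuming Node 18+ (for the global fetch); checkGate is a hypothetical helper name, and the fetchImpl parameter exists only so the helper can be exercised without the network. Browsers do not let page scripts set the Cookie header, so this is meant for Node.

```javascript
// Hypothetical helper: request a gated path with or without a JWT cookie
// and report the status code the redirect engine answers with.
async function checkGate(url, jwt, fetchImpl = fetch) {
  const headers = jwt ? { Cookie: `nf_jwt=${jwt}` } : {};
  // redirect: 'manual' keeps a 3xx rule visible instead of following it.
  const res = await fetchImpl(url, { headers, redirect: 'manual' });
  return res.status;
}

// Usage (placeholders; substitute your own site and token):
// checkGate('https://example.netlify.app/houses-for-sale/').then(console.log);
// checkGate('https://example.netlify.app/houses-for-sale/', token).then(console.log);
```

In the session above, the request without a token answered 403 and the one with a valid token answered 200; a script like this makes it easy to repeat that comparison after each deploy.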

I popped open your site and hit the /houses-for-sale/ route in the browser. Here’s what I saw in the Network tools:

There’s a pretty big flag in there. “Source: Service Worker”.

If I go ahead and log out, I can see in the network log that the correct logout request was made by netlify-identity-widget to your Netlify Identity instance — a POST to https://zen-lewin-04367e.netlify.app/.netlify/identity/logout

But again, the service worker is playing a middle-role that it really shouldn’t be. Using a service worker on sites that have both public and private content can be really tricky. If it’s not absolutely critical to your development (which I won’t pass opinions on but lots of folks have been fine without SW’s for a long time :stuck_out_tongue: ) I would advise you to just not use service worker(s) on the site. That should shore up any of the oddness where your local storage and cookies may be clear but you’re getting gated content back anyway. In that case, the SW is probably giving you back the private content because it has no idea about auth and that you logged out :stuck_out_tongue:
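
Browsers that already registered the worker may keep using it, so it can be worth unregistering it explicitly from page code. A minimal sketch follows, using the standard navigator.serviceWorker.getRegistrations() and registration.unregister() calls. unregisterAllServiceWorkers is a hypothetical name, and anything the worker stored through the Cache API would need clearing separately.

```javascript
// Hypothetical helper: unregister every service worker for this origin.
// `nav` defaults to the browser's navigator; it is a parameter only so the
// function can be exercised outside a browser.
async function unregisterAllServiceWorkers(nav = globalThis.navigator) {
  if (!nav || !nav.serviceWorker) return 0; // unsupported or non-browser environment
  const registrations = await nav.serviceWorker.getRegistrations();
  const results = await Promise.all(registrations.map((r) => r.unregister()));
  // unregister() resolves to true when the registration was found and removed.
  return results.filter(Boolean).length;
}
```

Calling unregisterAllServiceWorkers() once on page load, for as long as old visitors might still carry the worker, should be enough; the return value is the number of registrations removed.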

Hope that’s helpful!

–
Jon

Dear Jon,

Thank you for the warm welcome. I did read other community topics and discussions and then realized that presenting like this would be most helpful for anyone to assist me on this. Also, thank you for the detailed information and debugging report after signing up to the site. This is very insightful and I am sure it will be helpful whenever I am taking up such projects, and surely others in the community as well.

Yes, this is a new project. I did use RBAC for another project but that was not so consequential and had only a single page which needed it. Rightly said, this is not a “fully working” site just yet. In my previous tests, I did not quite come across this, probably because I was focusing on other aspects of the project and running it locally.

Httpie - Thank you for the tip, I shall try working with such a tool too.

First and foremost - Removing the service worker. Done, completed, deployed and unregistered the sw, cleared cache etc.

Additional step - I changed the cache setting in the _headers to no-store, max-age=0

Now testing all the steps in the following sequence.

  • Enter the website home page (no RBAC here) without logging in
  • Trying to access a webpage which needs RBAC
  • Logging in
  • Reloading the same webpage
  • Navigating to other RBAC pages
  • Logging out
  • Reloading to RBAC pages

Here are the results to all the above.

  • Enter the website home page (no RBAC here) without logging in - All is well here. No jwt or cookie in place.

  • Trying to access a webpage which needs RBAC - All is well here too. Booted out with a 403 and returned the /no-access/ page

  • Logging in - All seems well here too. Got my authentication token and other things in place.

  • Reloading the same webpage - Got hit with 403 though there is a cookie with the jwt. I am not sure what went wrong here. Is there something I am missing?

  • Navigating to other RBAC pages - Voila! Here it works! This is probably because the browser is visiting this page for the first time. The content of the page is also loading correctly, as seen with the jpeg images loaded shown below the selected html request.

  • Logging out - Logout took place and the authentication and cookies were removed

  • Reloading to RBAC pages - The logout took place and the page returned 403

I hope this helps further to understand what could be going wrong here. Thank you once again for the prompt response.

Nice! Looks like just about everything across the board is working as planned, short of the first load after logging in sputtering for some reason :thinking: is that repeatable? You’ve tested that workflow multiple times with the same results?

–
Jon

Hi Jon,

Thanks for the reply. Yes, I tried similar workflows multiple times. In fact, after logging out and then trying to log back in, I find that the pages are still not showing, even after a hard refresh. I am unable to figure out what could be going wrong here yet.

hi @lauramiller199846 , I do not know what is a treat, but definitely if you could guide me on how to clear the treat, it may help.

hi there, i think that was a spammer who was just randomly copy/pasting unrelated content :frowning:

Unfortunately, I don’t have any further information for you, but i will try and find someone who does!

Dear @perry ,

Thank you. I am yet to figure out what could be done to fix such a thing.

Dear All,

I have been trying and reading other forum messages, but I am not quite able to figure out what needs to be done to avoid the redirects even when I have a valid login.

@sachinsancheti1, one thing I noticed is that you are using a 403 status code. Our documentation recommends that the fallback redirect use a 401 status code. Could you try changing to that and see if that works better?

Dear @Dennis ,

Thank you for the message. I tried with 401 but it still did not work. I checked in the console and it still responds with a 401 after completing the sign-in. The deployment is on here

Hi! If anyone has a clue, that would really help me and the future projects I undertake with netlify identity.

@sachinsancheti1 I have a similar setup and am having very similar problems on https://80northseries.com.

I personally can’t replicate the issue, but I’ve had hundreds of users email me with bug reports saying that they’re getting redirected to the payment page after they’ve already paid or are stuck in a redirect loop. Often someone will sign up, pay, watch 2 episodes, and then come back a few days later and get redirected to the payment page. In most cases, if they sign out and sign back in the issue is fixed.

Here are the relevant parts of my _redirects file:

# Show paid users the goods
/watch /watch 200! Role=paid
/watch/* /watch/:splat 200! Role=paid
/ /watch 302! Role=paid
/payment /watch 302! Role=paid

# Show unpaid users the payment screen
/watch /payment 302! Role=user
/watch/* /payment 302! Role=user
/ /payment 302! Role=user
/payment /payment 200! Role=user

# Show anonymous users the login/homepage
/watch /login 302!
/watch/* /login?to=/watch/:splat 302!
/ / 200!

I have two functions that manage roles.

  1. identity-signup.js marks all new users with the user role

    // Empty handler to allow signup without confirming email, this is handled in payment step
    exports.handler = async (event, context) => {
      return {
        statusCode: 200,
        body: JSON.stringify({"app_metadata": {"roles": ["user"]}})
      }
    }
    
  2. payment.js handles payment and replaces role with paid

    exports.handler = async (event, context) => {
      const { identity, user } = context.clientContext;
    
      // Handle payments
    
      // Mark user as paid
      const res = await fetch(`${identity.url}/admin/users/${user.sub}`, {
        method: "PUT",
        headers: { Authorization: `Bearer ${identity.token}` },
        body: JSON.stringify({ app_metadata: { roles: ['paid'] }})
      });
    
      if(res.ok) {
        return {
          statusCode: 200,
          body: JSON.stringify({"app_metadata": {"roles": ["user"]}})
        }
      } else {
        return {
          statusCode: 422,
          body: "Error"
        }
      }
    }
    

Initially I was using netlify-identity-widget to handle signup, but after a few weeks of being unable to identify the issue, I thought I could get better logging/error handling if I wrote my own. So I’m now using gotrue-js directly and implemented my own views to manage sign up, sign in, etc.

I attempt to refresh the jwt before it expires with client-side code that looks something like this:

const auth = new GoTrue({
  APIUrl: '/.netlify/identity',
  audience: '',
  setCookie: true,
});

const user = auth.currentUser();

async function refresh(force = false) {
  await user.jwt(force);
  const ttl = +new Date(user.token.expires_at) - new Date()
  setTimeout(() => refresh(true), ttl);
}

refresh();

I also attempt to refresh to the local user data to get the updated roles after calling either of the functions above:

await user.getUserData()

I’m at a loss for what else to do.

I’ve been a happy Netlify customer for years (I used to work with the founder at GitHub), but I am about to move off Netlify because this experience has caused me so much embarrassment with my customers.

After 6 months of struggling with Netlify Identity, it feels like an alpha product at best. The docs are incomplete, it doesn’t work with netlify dev, and there are a ton of gotchas. But maybe I’m just doing something wrong. :man_shrugging:

Hey @bkeepers :wave:t2:

And welcome to The Forums :netliheart:

I’m not a Netlify Employee but I’ve had a lot of experience with Netlify Identity, wrote a pure-react-wrapper (and Gatsby) because I too was dissatisfied with gotrue-js and have read the GoTrue source itself many times over to wrap my brain around the process end-to-end. I’ve headed a number of discussions around Netlify Identity + Role-Based Access Control via _redirects, and, while my personal general disposition on the matter is to not use it (and instead use a front-end-based gating methodology using N-ID + Functions), hopefully I can shed some light on your particular situation.

I’m sorry to hear this. But I also know too well this feeling. Luckily I dug really, really deep into NID (and built my own library) before shipping it live for any of my clients, but I empathize with you on the “it sounds super easy but once you get into it suddenly it doesn’t really work right” vibe.


Anyway, there’s a consistent / recurring issue I see folks fall into with RBAC that I imagine you’re feeling, and one more specific issue I’m seeing in your particular implementation that may be causing headaches too.

Expired JWT in cookie => redirector

The more pervasive issue is the simpler one: you wrote front-end code to automatically refresh JWTs (which is actually similar to what my react library does) but that doesn’t help anybody that doesn’t actively have a tab open. If someone logs in on your site then closes the tab and waits an hour (the default JWT TTL) then comes back to a gated page, they’ll be kicked out to (reading your RBAC rules) /login (possibly with the ?to=...etc depending on where said user was attempting to load). That’s because the Netlify Redirector considers an expired JWT to be no JWT at all.

Once that happens and /login loads, do your gotrue-js scripts run to refresh the JWT?
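
One way to handle that landing, sketched under assumptions: on the /login page, ask gotrue-js for a fresh JWT and, if that succeeds, send the visitor back to the page they wanted. It relies only on auth.currentUser() and user.jwt(true) as used in the snippet earlier in this thread, with setCookie: true so the refreshed token also reaches the cookie. resumeSession and the navigate callback are hypothetical names.

```javascript
// Hypothetical helper for the /login page: if gotrue-js still has a user in
// local storage, refresh the JWT and return to the originally requested path.
async function resumeSession(auth, target, navigate) {
  const user = auth.currentUser();
  if (!user) return false; // nobody to resume; show the login form
  try {
    await user.jwt(true); // force a refresh using the stored refresh token
  } catch (err) {
    return false; // refresh was rejected; fall back to a normal login
  }
  navigate(target);
  return true;
}

// Usage in the browser (the `to` parameter comes from the redirect rule):
// const to = new URLSearchParams(location.search).get('to') || '/watch';
// resumeSession(auth, to, (path) => location.assign(path));
```

Since the to value comes from the URL, it would be sensible to check that it is a same-site path before navigating to it.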

gotrue-js handles the JWT/Refresh Token and User objects separately

So, even though the JWT contains the User object / data, gotrue-js handles and treats them as separate objects. That’s not a bad thing per se (my library does too) but the issue comes in this unfortunate reality: .refreshToken() doesn’t update the user object even though the new/refreshed JWT contains the updated user data. It just refreshes the in-memory / Localstorage JWT string.

Which leads to this situation: if a user is logged in and you have a Function that updates that user’s Roles, then client-side you run a user.jwt(true), it won’t actually update the user object in the client-side Javascript context with the new roles. :neutral_face: You would need to call user.getUserData() to force gotrue-js to update its user data.
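
Put together, the client-side sequence after a role-changing Function call might look like the following. It is a sketch rather than a flow prescribed by gotrue-js: it uses only user.jwt(true) and user.getUserData() as they appear in this thread, and syncAfterRoleChange and callFunction are hypothetical names.

```javascript
// Hypothetical helper: run a role-changing call, then bring both the JWT
// and the local user object up to date.
async function syncAfterRoleChange(user, callFunction) {
  await callFunction();     // e.g. a fetch() to the payment Function
  await user.jwt(true);     // force a new JWT so the token carries the new roles
  await user.getUserData(); // refresh the local user object as well
  return user;
}
```

Doing both steps every time means neither the cookie the redirect engine reads nor the user object the page reads is left holding the old roles.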

A Side Note

You mention refreshing the local user data with user.getUserData() after calling either of the functions.

Which is good, but then why do the functions return body: JSON.stringify({"app_metadata": {"roles": ["user"]}})? My opinion here would be that the function shouldn’t return that data; the refresh to the user’s object data should provide that. So, client side you’d want to call the function .then(_ => user.getUserData()). This situation is what I’ve described in the docs for my react-library as an “external alteration”, meaning that the data on a User’s object has been altered outside of actions that specific user took (e.g. executed via the client-side library). Your Function updating their role is exactly one of these “external alterations”. That is specifically why my library has .refreshUser():

This method is a utility to forcibly refresh the local user’s information and authorization. While not ostensibly the most useful functionality, it presents a particular use case for when you know a user’s data has been altered externally. This typically isn’t the case - a user’s own identity.user data tends to only be changed by that user but if the user kicks off a process that externally alters the user data, this method can be useful.

The demo site exhibits this use-case for clarity - when clicking the “Make me a member!” or “Make me an admin!” buttons, the Netlify Function that runs behind the scenes makes a change to the user data - it adds or removes role(s). Since we know that’s what’s happening, we can use .refreshUser() once the button execution has completed in order to refresh the user and pull down the new role(s).

(The ‘demo site’ it’s referencing is here — feel free to sign up. It’s built around the premise of Functions changing the user Roles then forcibly updating the user data client-side, so we’re talking apples to apples here :+1:t2:)

Not bringing that up for no reason, just supporting that what you’re doing isn’t an uncommon thing and should be handled more gracefully / better.

A Hunch

Again, I can’t actually see much of your site’s code (I imagine this is a private repo) but based on the snippets you shared I’m wondering if you’re using the response from the function that contains the new role somehow more directly to un-gate content, maybe in some kind of session storage or just memory(?) then when the user comes back a few hours later (after closing any tabs) perhaps your refresh code isn’t running?

Mulling through this quite a bit but it’s a little tough to postulate without making a user / playing with roles on the site etc. Clearly there’s an issue in the client-side JWT not being fresh (either by TTL or by Roles) by the time it gets sent to the Redirect engine that handles RBAC.

Anyway, other bits:

Totally true. RBAC doesn’t work with Netlify Dev, but FWIW, if you do end up taking a client-side-gating approach in future projects, that does work in local development, you just have to use your production Identity instance. That has its own pros and cons, but it does work and that makes for a much nicer development process for sure. (One could also spin up a second Netlify Site and enable Identity on the second site to get a “second instance”. Close enough :stuck_out_tongue: )


Sorry for the mouthful. I’ll leave it there for now. Let’s iterate on this, I would love to help you find a properly-working solution. Even if that’s not re-writing in React and using my library :stuck_out_tongue_winking_eye: I’m more than happy to look at specific site code and/or create (an) account(s) on the site to help debug things through if you’re up for it.

–
Jon

Wow! @jonsully - that is a lot of information here. I shall definitely go through the same and see what all may help me.

@bkeepers Thank you for your insights. Glad to hear you set up 80northseries. I see the logic you have used and that you finally used gotrue-js directly.

I too wish the ntl dev or netlify dev command would replicate a proper Identity scenario locally. Creating a duplicate repo or a second instance, as jonsully put it, is not an ideal way of working on a project.

I have created a public repo that reproduces the issue I am facing, so that everyone can check it out and help me with the problem. I have done my testing and face the same issues as I indicated earlier.

Netlify App - https://nifty-payne-1700e9.netlify.app/
Github Repo - https://github.com/sachinsancheti1/netlify-identity-redirect-check/

@sachinsancheti1 I really appreciate you recreating your case as a public example / repository. Helps so much :sweat_smile: I started looking through it for just a moment. Here’s what I’m seeing

  1. You’ve got redirects in both your netlify.toml and a _redirects file — definitely want to put all redirects in one or the other.
  2. I am seeing some oddness where I log in and then /houses-for-sale/ still shows the “Access Forbidden” screen, even though I can manually get to /houses-for-sale/title-2 and requesting /houses-for-sale/ with my JWT via CLI brings back the proper /houses-for-sale/ markup. I’m wondering if this is some odd HTTP caching due to headers. For the sake of the experiment, could you remove your _headers file? Beyond this test scope, I don’t recommend folks override Netlify’s caching headers (Netlify sets them well), but if somehow that’s causing the content of the /houses-for-sale/ path to be browser-cached then that could be the issue.

–
Jon

Dear @jonsully

I have done the following:

  • removed the redirect from the netlify.toml file as suggested
  • removed the _headers file completely

I seem to be facing the same behaviour pattern as before. Have I made a mistake in the function/script, or is there something I have missed? I am getting the feeling that I may be lost right now, though I have tried to follow all the guides, a good number of public GitHub repos that use Netlify Identity and the widget, and the redirect documentation.

@jonsully: thanks for all the info.

@sachinsancheti1 thanks for creating the demo app. Hopefully someone will be able to duplicate the issue and show us what we’re doing wrong.

I also had a _headers file that looked like this:

/watch*
  Cache-Control: max-age=0,no-cache,no-store,must-revalidate

I added that because I was having issues where users would visit a page they’d been to before, appear to be logged in, and then get redirected away because they couldn’t be authenticated with GoTrue. By setting the header, at least they were redirected to sign in right away.

I just removed _headers and will try again.