Hands-On with PostgreSQL Authorization – Part 2 – Row-Level Security (tangramvision.com)
95 points by grschafer on March 16, 2022 | 32 comments


I discovered row-level security when I started using PostgREST [1] [2].

It was eye-opening for me. In every traditional codebase I've worked on, this is usually handled in such a slow and messy way, adding another layer of filtering on top of already slow and complex queries. This is always one of the first things that needs to be cached in Redis. Instead, row-level security solves the problem in a very elegant, simple and performant way, in my opinion.

Obviously it works better when all your logic is already at the DB level (e.g. PostgREST). I wouldn't imagine using DB roles and row-level security in a traditional backend where all the logic is at the application level (e.g. Django, Rails…). Edit: seems like there are workarounds to use RLS with Django [3].

[1] https://postgrest.org

[2] https://postgrest.org/en/stable/auth.html#roles-for-each-web...

[3] https://pganalyze.com/blog/postgres-row-level-security-djang...


Yeah, it's ridiculous how many features of a modern database server we are leaving on the table in favour of spending more time re-inventing these things for every new app or middle layer. Even MSSQL has row-level security; I doubt it's being used very much.


I don't see why RLS would mandate all your logic living at the DB level. Basically, what the database does when you enable RLS is add the RLS policy clause to every query you run against a table that has the policy applied. So if you have a policy saying "A = 'blah'" on table "dummy", a query like "SELECT * FROM dummy WHERE a_col = 123" becomes "SELECT * FROM dummy WHERE a_col = 123 AND A = 'blah'".
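
A minimal sketch of what that looks like (column/table names taken from the example above; note the table owner bypasses RLS by default unless you also FORCE ROW LEVEL SECURITY):

  CREATE TABLE dummy (a_col int, a text);
  ALTER TABLE dummy ENABLE ROW LEVEL SECURITY;
  CREATE POLICY blah_only ON dummy USING (a = 'blah');

  -- for a non-owner role, this now behaves as if "AND a = 'blah'" were appended:
  SELECT * FROM dummy WHERE a_col = 123;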


Indeed. I was thinking about the RLS use case where the policy is based on the current user and their role. It wasn't obvious to me at first, but you could just add a middleware to your app that dynamically sets the role in the DB for the user making the request (as in the third article I posted in my edit). Basically what PostgREST does.


Looking at that django link, rather than creating a new role for every user ID, you can set a value in each transaction that can contain whatever you like, including a user ID [1] [2]. What I don't like about that django solution is that it's very django-dependent. If you ever had another system that could create users, the django signal wouldn't fire and the new role wouldn't be created. Apart from that, you'll have a lot of unnecessary roles in postgres itself.

[1] https://news.ycombinator.com/item?id=30706295

[2] https://news.ycombinator.com/item?id=30703881


Some features just don't scale or cannot easily integrate into the app layers which need them. For example, Pg connections are expensive, so you need a pooler, and then you don't want a DB user per end user. FK constraints too can prove hard to scale, as one ends up with extra writes and contention, or has to shard.


> now you don't want a DB user per end user.

`RESET ROLE; SET ROLE app_username;` could be done for each query / transaction / when fetching the connection from the pool.


Interesting. But won't that require the connection user to be privileged enough to do that? And therefore you're still one SQL injection away from someone taking on another role for escalation or impersonation?


You can create a role whose sole job is to switch to the roles needed. Doesn’t require you to escalate to superuser-level privileges that way. But still, if SQL injections aren’t properly considered then it’s possible for a user to gain more privileges than planned. Although SQL injections are usually mitigated by the DB libraries these days.

Also, it’s more convenient to use SET LOCAL ROLE <ROLE_NAME>, since that only keeps the role for the transaction. Manually resetting it is error prone (IME), and forgetting will have the supposedly “temporary” role bleed to the next transaction.
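
A rough sketch of that setup (role and table names are made up):

  CREATE ROLE tenant_a NOLOGIN;
  CREATE ROLE tenant_b NOLOGIN;

  -- the pooled login role can only switch into them; NOINHERIT keeps it from
  -- having their privileges until it explicitly does SET ROLE
  CREATE ROLE app_pool LOGIN NOINHERIT;
  GRANT tenant_a, tenant_b TO app_pool;

  BEGIN;
  SET LOCAL ROLE tenant_a;  -- reverts automatically at COMMIT/ROLLBACK
  SELECT * FROM orders;     -- RLS policies now see current_user = 'tenant_a'
  COMMIT;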


Disclaimer: I work for Cerbos[0].

Whilst this is a very good approach when all your data is stored in a single datastore, as applications grow it is common to start breaking out into more optimised data stores, e.g. you may have a few relational databases, a fast lookup source and a search index. This presents the problem of enforcing authorization down into each system.

An alternate way to tackle this is to have the authorization system produce the conditions which need to be applied dynamically at request time (with all the relevant context), which can then be pushed down to each fetching layer as needed [1][2]. This gives far more flexibility in the sorts of authorization rules which can be applied to the data and doesn't tie it to a single bit of technology.

As a real-world example, we have an integration with Prisma[3] which maps a query plan into a Prisma query format dynamically, based on the context of the user and the currently live policies[4].

[0]: https://cerbos.dev

[1]: https://cerbos.dev/blog/filtering-data-using-authorization-l...

[2]: https://docs.cerbos.dev/cerbos/latest/api/index.html#resourc...

[3]: https://prisma.io/

[4]: https://youtu.be/lqiGj02WVqo?t=3601


ah cool, i've implemented a similar thing but baked into the app dao+authz layer (so easier to do).

I would never have guessed people would use row-level security for this, for the reasons you've outlined: rarely (?) is one database the only resource you need to authorise access to, so you will need an authz layer for all the non-DB things anyway. I always assumed row-level authz was more for data-warehouse-type applications where a user has a client directly connected to the database, not intermediated through multiple levels of abstraction.

The Cerbos approach logically makes more sense to me than the general Zanzibar-inspired methods like Authzed and others. I could never wrap my head around how they could authorise access to data (pre- and post-filtering?) without pushing conditions down into the store to join against. Actually having a third-party system like Cerbos be able to push down conditions with good ergonomics is another thing; that is a tough problem.


This is a really neat and innovative idea. Just burned about an hour going through your website and watching the YouTube demo.

One piece of feedback I have -- I wasn't entirely sure what I was looking at from the homepage; there's a lot going on messaging- and content-wise, and I had to watch the video to get it.

Maybe something more to the point, like "Takes policies, converts them to adapter-specific filter conditions that you tack on to your queries", might be helpful.


Does it provide a standalone UI, with the possibility of exposing it to end users?


Good article on what's possible and how to do it, but is row-level security scalable in any way for a production application? Not so much the performance impact of any one query, but maintaining the definition of what a role can or can't do (if a DB user = an application role). It also seems like it would complicate managing DB connections: separate pools for each DB user? If you have 10 roles, you have to open up at least 10 connections to avoid connection-opening latency.

Leveraging most RDBMS security features seems to be geared toward an ever-shrinking set of use cases where a mostly static set of users is given direct access to a SQL prompt, or a simple record-to-GUI application interface.


Disclaimer: I am a founder of Authzed (W21)[0].

It always depends on the domain. If the data model for the app is simple enough, RLS can take you pretty far. Enterprise apps that require you to support the various vague interpretations of "RBAC", or domains that have more complex data models, will eventually need some kind of more sophisticated authorization solution. There are a variety of solutions at that point (e.g. SpiceDB[1], oso[2], OPA[3]) and you'll be making your decision based not only on the implementation of the technology, but on concerns that have cropped up in your business requirements:

- "How will additional microservices check permissions?"

- "How can we test and enforce that our authorization system is correct?"

- "Can I support user-defined permissions?"

[0]: https://authzed.com

[1]: https://github.com/authzed/spicedb

[2]: https://www.osohq.com

[3]: https://www.openpolicyagent.org


And how exactly does one approach those 3 outlined questions?


These 3 questions aren't the only questions folks have, but they are ones that vary greatly depending on the solution you choose. I recommend asking the folks that work on these solutions questions like this, but because I work on SpiceDB[0], I can answer them for that.

> "How will additional microservices check permissions?"

SpiceDB is a database optimized for resolving subjects' access to resources. Being a database, it suggests storing the canonical authorization data within it and querying it from various microservices. This is the strategy employed by most hyper-scalers, but also by companies that have heavily invested in in-house authorization, like Airbnb and Carta.

> "How can we test and enforce that our authorization system is correct?"

SpiceDB has developers write schemas, but unlike other databases, it has tooling that can check assertions and audit all possible access. This tooling can be shared/explored via the Authzed Playground[1] or added to your CI/CD pipeline with GitHub Actions[2].

> "Can I support user-defined permissions?"

There are various ways to accomplish this with SpiceDB. User behavior can be used to programmatically generate schemas, or you can write very abstract schemas that push designs that are typically enforced at schema-validation/compile time (think DDL) to runtime (think DML).

[0]: https://github.com/authzed/spicedb

[1]: https://play.authzed.com

[2]: https://github.com/authzed/action-spicedb-validate


You don't need 10 different connections; you can switch roles within a transaction. You connect using a role that can impersonate other roles and then run your queries like this:

  begin;
  set local role myrole; -- the important part
  SELECT * FROM page;
  commit;


Can you refer to roles? Something like “set local role (SELECT name FROM roles WHERE id = 7)”


I'm doing something like this. It's an experiment at the moment, so I haven't used it in anger yet. Rather than having a role for each application user, I have my own application's notion of a role/account/user, with its own table in my database. For each transaction:

  SET LOCAL ROLE webuser;  -- used by all transactions that come from the web application
  SET LOCAL "request.web.sub" = '<internal application primary key for this specific user>';
Then in queries I can check for the current role (where by 'role' I mean my application's user/account/role set via "request.web.sub", not a postgres role) via:

  create or replace function auth.fn_requesting_role()
    RETURNS uuid LANGUAGE sql AS
  $func$
    with crole as (
      select coalesce(
        nullif(current_setting('request.web.sub', true), ''),
        nullif(current_setting('request.jwt.sub', true), '')
      )::uuid as role_id
    )
    select crole.role_id::uuid from crole
    join auth.role on role.role_id = crole.role_id::uuid;
  $func$;

You can then find out the current requesting user/account/role ID in RLS policies and other functions, and apply whatever permissions you like there.

The reason I have 'request.jwt.sub' is just for future if I want to allow requests to come from PostgREST as well and use the same authorisation checks.


I'm sure you can but I'm not smart enough to figure that out (I think some quoting is missing):

EXECUTE 'SET ROLE ' || (SELECT rolname FROM pg_roles WHERE oid = 17026)

But I wouldn't use those OIDs, because you are leaking an implementation detail; it's probably best to just stick with the actual role names instead of a reference.
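
FWIW, a sketch of how the dynamic version could look inside PL/pgSQL (untested; format()'s %I takes care of the identifier quoting):

  DO $$
  BEGIN
    EXECUTE format('SET ROLE %I',
                   (SELECT rolname FROM pg_roles WHERE oid = 17026));
  END
  $$;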


Good questions! Regarding maintaining the definition of what a role can or can't do -- I think this comes down to how you organize your SQL. If you keep authz declarations in one place, it's going to be more maintainable than if they're spread across many database migrations. One way you can keep those authz declarations in one place is by doing development/maintenance on that one place then using a database-diffing tool[1] to generate migrations based on whatever changes you made.

Regarding database connections -- one way to avoid needing a connection per user is to use something like PostgREST[2] to handle incoming requests, identify the user making the request, and use an existing db pool connection to switch roles and execute whatever queries are requested. EDIT: RedShift1 beat me to this explanation by a little bit! :)

RLS certainly isn't the answer for every domain or problem size, but I've been surprised by how powerful it is compared with how relatively unknown it is.

[1]: https://supabase.com/blog/2021/03/31/supabase-cli#migrations

[2]: https://postgrest.org/en/stable/auth.html


You can get pretty far with RLS. First discovered this when I started working with Supabase.


We use RLS on a multi-tenant application in production. It's used as a secondary level of protection that ensures that one tenant cannot see another tenant's data. The system hasn't been out in production for very long, but so far so good.
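
For anyone curious what that kind of tenant isolation looks like, a minimal sketch (hypothetical table/column/setting names, not the poster's actual schema):

  ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
  ALTER TABLE invoices FORCE ROW LEVEL SECURITY;  -- apply it to the table owner too

  CREATE POLICY tenant_isolation ON invoices
    USING (tenant_id = current_setting('app.tenant_id')::uuid)
    WITH CHECK (tenant_id = current_setting('app.tenant_id')::uuid);

  -- set once per request/transaction by the application:
  SET LOCAL app.tenant_id = '<tenant uuid here>';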


Each time I read about RLS in PostgreSQL, they leave out how to actually get the `user` into the query.

You need to use:

  SET "my.user" = 'user1';
  SELECT * FROM todos;

And in your RLS policy you can then use:

  CREATE POLICY owner ON todos USING ("user" = current_setting('my.user'));


I first realized the usefulness/minimalism of row-level security when playing with the ihp-backend package (https://ihpbackend.digitallyinduced.com/). It's a really lean way of moving straight from your data definitions in a schema to your application logic written in React.

I thought it was interesting because it was a change from the usual authentication cycle of storing some session information and handling all the authentication through sessions and restricted queries.


Is this a viable/scalable method for setting up a multi-tenant DB?


Yes


PostgreSQL RLS used to have problems with UPDATEs. Does it still?


Could you elaborate? I can update things just fine with a combination of USING and WITH CHECK. USING can imply WITH CHECK too, IIRC.
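
For reference, a sketch of an UPDATE-specific policy (made-up table/column names): USING filters which existing rows the UPDATE may touch, WITH CHECK validates the new row values, and if WITH CHECK is omitted the USING expression is applied to the new rows as well.

  CREATE POLICY todo_owner_update ON todos
    FOR UPDATE
    USING (owner = current_setting('my.user'))        -- rows you may target
    WITH CHECK (owner = current_setting('my.user'));  -- what the updated row must satisfy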


Great article!

Although, I would say that this merely shows what is possible with database-level security. It might be useful for an internal DB with a less complicated permission system.

Authorisation libraries at the application level are more scalable and more maintainable than this database-level security. Also, just by reading the application code you can tell the expected behaviour...


I don’t understand what metrics you use for “more scalable” and “more maintainable”. If your application’s needs are sufficiently fulfilled by RLS, you don’t have to reinvent the wheel at the application level. Less code there to maintain is good.

Plus, I don’t see how “just by reading application code you can tell the expected behavior” doesn’t apply to RLS. Policies are written in a consistent format: USING for visibility, WITH CHECK for altering. I only have to keep an eye out for these, and I’ll already get a good summary of what it does, no?



