> In Keycloak nothing made sense to me until I got myself familiar with OAuth 2.0 and OpenID Connect.
Hot take: OAuth2 is a really shitty protocol. It is one of those technologies that get a lot of good press because they enable you to do things in a standardized manner that you otherwise couldn't without resorting to abysmal alternatives (SAML, in this case). Because of that, it shines by comparison. But looked at from a secure protocol design perspective, it is riddled with accidental complexity that produces unnecessary footguns.
The main culprit is the idea of transferring security-critical data over URLs. IIUC this was done to reduce state on the involved servers, but that advantage has completely vanished if you follow today's best practices and use the PKCE, state, and nonce parameters (together with the authorization code flow). And more than half of the attacks you need to prevent or mitigate with the modern extensions to the original OAuth concepts are possible because grabbing data from URLs is so easy. An attacker can trick you into using a malicious redirect URL? Lock down the possible redirects with an explicitly managed URL allow-list. URLs can be cached and later accessed by malicious parties? Don't transmit the main secret (the bearer token) via URL parameters, but instead transmit an authorization code which you can exchange (exactly) once for the real bearer token. A malicious app can register your URL scheme in your smartphone OS? Add PKCE via server-side state to prove that the second request really comes from the same party as the first request...
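For concreteness, the "exchange the code exactly once" step looks roughly like this; the endpoint URL, client_id and redirect_uri are placeholders, and error handling is left out:

```ts
// Sketch of the one-time code-for-token exchange (authorization code flow + PKCE).
// The endpoint URL, client_id and redirect_uri are placeholders; no error handling.
async function exchangeCode(code: string, codeVerifier: string): Promise<string> {
  const resp = await fetch("https://auth.example.com/oauth/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "authorization_code",
      code,                        // short-lived and single-use
      redirect_uri: "https://app.example.com/callback",
      client_id: "my-client",
      code_verifier: codeVerifier, // PKCE: proves the same party started the flow
    }),
  });
  const { access_token } = await resp.json();
  return access_token;             // the real bearer token never appears in a URL
}
```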
It could have been so simple (see [1] for the OAuth2 roles): The client (third-party application) opens a session at the authorization server, detailing the requested rights and scopes. The authorization server returns two random IDs – a public session identifier, and a secret session identifier for the client – and stores everything in its database. The client directs the user (resource owner) to the authorization server, giving them the public session identifier (thus the user, and any possible attacker, only ever gets to see the public session identifier). The authorization server uses the public session identifier to look up all the details of the session (the requested rights and scopes, and who wants access) and presents them to the user (resource owner) for approval. Once that is given, the user is directed back to the client carrying only the public session identifier (potentially not even that is necessary, if the user can be identified via cookies), and the client can fetch the bearer token from the authorization server using the secret session identifier. That would be so much easier...
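A rough sketch of what such an API could look like, with entirely invented endpoint and field names (this is not real OAuth, just the shape of the proposal above):

```ts
// Hypothetical API for the simplified flow sketched above (NOT real OAuth).
// All endpoint and field names are invented for illustration.
interface SessionHandles {
  publicSessionId: string; // the only identifier the user/browser ever sees
  secretSessionId: string; // kept by the client, only used on the back channel
}

// 1. The client opens a session, detailing the requested rights and scopes.
async function startSession(scopes: string[]): Promise<SessionHandles> {
  const resp = await fetch("https://auth.example.com/sessions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ client: "my-client", scopes }),
  });
  return resp.json();
}

// 2. The user is sent to the authorization server with only the public ID...
const approvalUrl = (publicSessionId: string) =>
  `https://auth.example.com/approve?session=${publicSessionId}`;

// 3. ...and after approval the client fetches the token with the secret ID.
async function fetchToken(secretSessionId: string): Promise<string> {
  const resp = await fetch("https://auth.example.com/token", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ secret: secretSessionId }),
  });
  const { bearerToken } = await resp.json();
  return bearerToken;
}
```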
Alas, we are stuck with OAuth2 for historic reasons.
You're right about the complexity and the steep learning curve, but there's hope that OAuth 2.1 will simplify this mess by forcing almost everyone to use a simple setup: authorization code + PKCE + DPoP. No "implicit flow" madness.
Another big problem with OAuth is the lack of quality client/server libraries. For example, in JS/Node, there's just one lone hero (https://github.com/panva) doing great work against an army of rubbish JWT/OAuth libs.
The problem with the authorization code flow is that it was not built with SPAs in mind, i.e. you always need a server-side component that obtains those tokens.
So a 100% client/FE solution based on Next.js/React/Angular/Vue etc. cannot simply be deployed to a CDN and then use Auth0/AWS Cognito/Azure AD or whatever without running and hosting your own server-side component.
The catch is that since the client web origin and the AS web origin are often different sites, the AS has to actually implement CORS on its token endpoint.
Some implementations unfortunately (perhaps due to a misunderstanding about what CORS is meant to accomplish) make this a per-tenant/per-installation allowlist of origins on the AS.
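For reference, this is roughly the kind of CORS handling the AS needs on its token endpoint so a cross-origin SPA can POST to it; the origin below is a placeholder, and this shows the per-tenant allowlist variant described above:

```ts
// Rough sketch of the CORS handling an AS needs on its token endpoint so a
// cross-origin SPA can POST to it. The origin is a placeholder; this is the
// per-tenant allowlist variant described above.
import { createServer } from "node:http";

const allowedOrigins = new Set(["https://spa.example.com"]);

createServer((req, res) => {
  const origin = req.headers.origin ?? "";
  if (allowedOrigins.has(origin)) {
    res.setHeader("Access-Control-Allow-Origin", origin);
    res.setHeader("Access-Control-Allow-Methods", "POST");
    res.setHeader("Access-Control-Allow-Headers", "Content-Type");
  }
  if (req.method === "OPTIONS") { // answer the CORS preflight, if any
    res.writeHead(204).end();
    return;
  }
  // ... the actual token endpoint logic would go here ...
  res.writeHead(501).end();
}).listen(8080);
```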
Auth0 and Ping Identity (my employer) document CORS settings for their products. I'm not sure about AWS; you might need to add CORS via API Gateway. Azure AD supports CORS for the token endpoint, but they may limit domains in some manner (such as to the redirect URIs of registered clients).
FWIW, I created a demo ages ago (at https://github.com/pingidentity/angular-spa-sample), which by default is configured to target Google for OpenID Connect and uses localhost for local development/testing. It hasn't aged particularly well in terms of library choices, but I do keep it running.
A deployment based on an older Angular version is also at https://angular-appauth.herokuapp.com to try - IIRC I used a node server just to deal with wildcard path resolution of the index file, but there's otherwise no server-side logic.
I appreciate your work on clarifying the situation. But my statement still stands, and you seem to back it up in your draft:
> The JavaScript application is then responsible for storing the access token (and optional refresh token) as securely as possible using appropriate browser APIs. As of the date of this publication there is no browser API that allows to store tokens in a completely secure way.
So with OAuth 2.0 + PKCE and no BE component, the tokens are directly exposed to the client, just as they were with the implicit flow. Also, if I'm not mistaken, the PKCE extension is optional in OAuth 2.0, and without it you cannot securely use the code flow (as you would have to expose the client secret).
Storing access tokens in JavaScript and storing them in a native application offer roughly equal protections - but by far most JavaScript apps are left far more susceptible to third-party code execution.
The answer is typically to make such credentials incapable of being exfiltrated by adding proof-of-possession, such as the use of mTLS or the upcoming DPoP mechanism.
Note that preventing exfiltration doesn't prevent a third party from injecting logic to remotely drive the use of those tokens, gaining their access without any exfiltration.
While access tokens can be requested with specifically limited scopes of access, a backend server could potentially further restrict the level of access a front-end has. The problem is that the backend and frontend are typically defined in terms of business requirements, so there hasn't been a clear opportunity to standardize such approaches.
When using a backend, my advice is to be sure you don't just have your API accept a session cookie in lieu of an access token. APIs are typically not constructed with protections against XSRF and the like (or rather, an access token header serves as XSRF protection, while a bare session cookie does not).
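A minimal sketch of that point, assuming an Express-style API (the middleware and routes are made up): a cross-site form post or image request can't set custom headers, so requiring an Authorization header is itself a CSRF barrier, whereas an ambient session cookie would ride along automatically.

```ts
// Sketch (Express assumed): the API only accepts explicit bearer tokens, so a
// cross-site form post or <img> request, which can't set custom headers, never
// gets past this check. An API trusting a bare session cookie would need its
// own XSRF defences instead.
import express from "express";

const api = express();

api.use((req, res, next) => {
  const auth = req.headers.authorization;
  if (!auth?.startsWith("Bearer ")) {
    // Requests riding only on an ambient session cookie end up here.
    res.status(401).json({ error: "access token required" });
    return;
  }
  // ...validate the token (signature, expiry, audience) before trusting it...
  next();
});

api.get("/api/orders", (_req, res) => {
  res.json([]); // placeholder resource
});

api.listen(3000);
```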
> Also, if I'm not mistaken, the PKCE extension is optional in OAuth 2.0
Correct - although it is strongly recommended by best current practices, including the recommendation that deployments limit/block access for clients which do not use PKCE.
> …and without it you cannot securely use the code flow (as you would have to expose the client secret).
PKCE has nothing to do with client secrets or client authentication. It provides an additional strong correlation between the initial front-end request and the subsequent code exchange request.
It was written to support native apps, as many such apps use the system browser for the authorization step and then redirect back into a custom URL scheme. Since custom URL scheme registrations are not regulated, malicious apps could attempt to catch these redirects. PKCE provides verification that the same client software created both the redirect to the authorization endpoint and the request to the token endpoint. Even if a malicious piece of software got hold of the code, it wouldn't have a way to exchange it for an access token.
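For anyone who hasn't seen it, the client side of PKCE is tiny; here is a browser sketch using the Web Crypto API (S256 method, parameter names per the PKCE spec):

```ts
// Client side of PKCE (S256), sketched with the browser's Web Crypto API:
// only the hashed code_challenge travels through the front channel; the
// code_verifier is revealed solely in the back-channel token request.
function base64UrlEncode(bytes: Uint8Array): string {
  return btoa(String.fromCharCode(...bytes))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Random 32-byte verifier, kept in memory (or sessionStorage) by the client.
function makeCodeVerifier(): string {
  return base64UrlEncode(crypto.getRandomValues(new Uint8Array(32)));
}

// SHA-256 hash of the verifier, sent as code_challenge (with
// code_challenge_method=S256) in the redirect to the authorization endpoint.
async function makeCodeChallenge(verifier: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(verifier));
  return base64UrlEncode(new Uint8Array(digest));
}
```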
Some of the original OAuth security requirements for clients have been found to be poorly implemented, and PKCE provides equivalent protections against these particular issues. Unlike client-only implementation logic, PKCE support is something an AS can audit. Hence it is likely that PKCE will be a requirement in future versions of OAuth.
I really appreciate your effort, but somehow you just produce a lot of text without arguing against my point: OAuth 2.0 in a FE SPA is just broken, or "difficult" at best.
a) The text explains how it's not broken.
b) It's not difficult if you use a library that ensures this is done correctly. Otherwise, yes, secure auth is difficult, just like it's difficult anywhere else.
Even for 100% FE solutions, the current best practice from the OAuth authors [1][2] is to use authorization code + PKCE (optionally + DPoP). The implicit flow has been deprecated (since PKCE), and from OAuth 2.1 onwards it will be removed entirely.
It depends on the provider. For Mastodon and Pleroma, there's an endpoint to generate a client ID/secret that you can call from the client. The flow is basically (a sketch follows the list):
1. Prompt for an instance name
2. Get a client id/secret from the instance and put it in localStorage
3. Redirect to the login page
4. Once you get the callback, exchange the code for a token using the client ID/secret from localStorage
5. You're done. No server needed.
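Sketched out, the whole thing fits in a few functions. The endpoint paths (/api/v1/apps, /oauth/authorize, /oauth/token) are the Mastodon ones as I remember them, so double-check against your instance's API docs:

```ts
// Sketch of the list above. Endpoint paths are the Mastodon ones as I recall
// them; verify against your instance's API docs. Everything runs in the
// browser, no server component.
const redirectUri = `${location.origin}/callback`;

// Steps 1-2: register a throwaway client on the chosen instance.
async function registerClient(instance: string): Promise<void> {
  const resp = await fetch(`https://${instance}/api/v1/apps`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ client_name: "my-spa", redirect_uris: redirectUri, scopes: "read" }),
  });
  const { client_id, client_secret } = await resp.json();
  localStorage.setItem("oauth", JSON.stringify({ instance, client_id, client_secret }));
}

// Step 3: redirect to the instance's login/consent page.
function login(): void {
  const { instance, client_id } = JSON.parse(localStorage.getItem("oauth")!);
  location.href =
    `https://${instance}/oauth/authorize?response_type=code&client_id=${client_id}` +
    `&redirect_uri=${encodeURIComponent(redirectUri)}&scope=read`;
}

// Step 4: back on /callback, exchange the code for a token. Step 5: done.
async function handleCallback(code: string): Promise<string> {
  const { instance, client_id, client_secret } = JSON.parse(localStorage.getItem("oauth")!);
  const resp = await fetch(`https://${instance}/oauth/token`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "authorization_code", code, client_id, client_secret,
      redirect_uri: redirectUri,
    }),
  });
  const { access_token } = await resp.json();
  return access_token;
}
```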
An SPA is HTML/JS served by the server. We don't need client-only solutions. We need devs to understand how HTTP and browsers work.
It means that we simply keep using what actually works, i.e. a server-side component that obtains authorization, and we use simple mechanisms to ensure the token stays on the server: the FE speaks to the server, which in turn speaks to the target app. Proxying is not that difficult a problem, and we don't have to run in circles inventing different flows only to cater to devs who can't learn their field.
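A minimal sketch of that proxy pattern, assuming an Express backend (the session lookup and upstream URL are placeholders):

```ts
// Minimal sketch of the "token stays on the server" pattern (Express assumed;
// the session lookup and upstream URL are placeholders). The SPA talks to its
// own backend with a session cookie; the backend attaches the access token
// when proxying to the target app, so the token never reaches the browser.
import express from "express";

const app = express();
const tokensBySession = new Map<string, string>(); // filled after the code exchange

app.use("/api", async (req, res) => {
  const sessionId = req.headers.cookie?.match(/sid=([^;]+)/)?.[1];
  const token = sessionId ? tokensBySession.get(sessionId) : undefined;
  if (!token) {
    res.status(401).end();
    return;
  }
  // Forward the request upstream with the server-held token attached.
  const upstream = await fetch(`https://api.example.com${req.originalUrl}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  res.status(upstream.status).send(await upstream.text());
});

app.listen(3000);
```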
We need CDN solutions for front-ends because that's the best way to deliver great, scalable performance for complex SPAs.
We also need a purely client-side flow for mobile (native) apps.
Additionally, the authorization code flow (with PKCE) in Keycloak still supports pure client-side authorization. It's more complex than the implicit flow, but that doesn't really matter, as any library (including keycloak-js) will take care to ensure it's done correctly.
Honestly, DPoP [1] is pretty horrible. It is a partial re-implementation of TLS client authentication inside the TLS connection. What's wrong with it:
- No mandatory liveness check. That means you don't know whether the proof of possession was really issued just now or was pre-issued by an attacker with past access. Quoting from the spec [2]: """Malicious XSS code executed in the context of the browser-based client application is also in a position to create DPoP proofs with timestamp values in the future and exfiltrate them in conjunction with a token. These stolen artifacts can later be used together independent of the client application to access protected resources. To prevent this, servers can optionally require clients to include a server-chosen value into the proof that cannot be predicted by an attacker (nonce).""" This is a solved problem in TLS.
- The proof of possession doesn't cover much of the HTTP request, just "The HTTP method of the request to which the JWT is attached" and "The HTTP request URI [...], without query and fragment parts." It doesn't even cover the query parameters or the POST body (the payload sketch after this list shows how little is bound). The given rationale: """The idea is sign just enough of the HTTP data to provide reasonable proof-of-possession with respect to the HTTP request. But that it be a minimal subset of the HTTP data so as to avoid the substantial difficulties inherent in attempting to normalize HTTP messages.""" In short: because it is so damn difficult to do at this layer. Of course, TLS covers everything in the connection.
- Validating the proofs, i.e. implementing the server side of the spec, is super complicated, see [3]. To do it right you also need to check the uniqueness of the provided nonce (see [4]), which brings its own potential attack vectors. And to actually provide liveness checks (see above) you have to implement a whole extra machinery to hand out server-chosen nonces (see [5]). I expect it will take several years until implementations are sufficiently bug-free. Again, TLS has battle-tested implementations ready.
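For reference, this is roughly what a DPoP proof JWT payload carries (claim names per the DPoP spec), sketched as a TypeScript type; note how little of the request is actually bound:

```ts
// Roughly the payload of a DPoP proof JWT (claim names from the DPoP spec).
// Note how little of the request is actually bound: just method and bare URI.
interface DpopProofPayload {
  jti: string;    // unique ID for this proof; servers should reject replays
  htm: string;    // HTTP method, e.g. "POST"
  htu: string;    // request URI *without* query and fragment
  iat: number;    // issue time; XSS code can forge this into the future
  nonce?: string; // optional server-chosen value: the only real freshness check
  ath?: string;   // hash of the access token, when one is presented
}

const exampleProofPayload: DpopProofPayload = {
  jti: "per-request-random-id",
  htm: "POST",
  htu: "https://api.example.com/resource", // query string and body are NOT covered
  iat: Math.floor(Date.now() / 1000),
};
```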
Best of all? There is already a spec for certificate-based proof of possession using mutual TLS! See [6]. We really should invest our time in fixing our software stack to use that (e.g. by adding JavaScript-initiated mTLS in browsers) instead of adding yet another band-aid at the wrong protocol layer.
For most Keycloak users, only a very tiny subset of OIDC is being used too. Usually there is no three-way relationship between a third-party developer, an API provider, and a user anymore. You could rip scopes out of Keycloak and few users would be unable to cover their use cases. Rarely is there more than one set of scopes being used with the same client.
Keycloak also supports some very obscure specs, my favourite probably being "Client Initiated Backchannel Authentication", which can enable a "push message sent to an authenticator app" style of authentication flow using a lot of polling and/or webhooks.
[1] https://aaronparecki.com/oauth-2-simplified/#roles