feat: saml federation #2653

Open
wants to merge 8 commits into
base: master

@sebferrer sebferrer commented Aug 9, 2022

The original PR #2148 has been accidentally closed, so here's a new one.

Completion Progress

  • Main endpoints
  • SDK adaptation
  • Responses consumption
  • RelayState continuity
  • UI adaptation
  • YAML Configuration

Main endpoints

The first part covers the two main endpoints:

  • /metadata (GET): generates the metadata of the SP (Kratos)
  • /acs (POST): handles the SAML response sent by the IdP

The Crewjam/saml library already implements most of what these endpoints need. It can generate a metadata file with little effort, so we integrate it into Kratos to back the /metadata endpoint. For the /acs endpoint, the library receives the SAML messages, parses them and processes them accordingly.
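
As a rough illustration of the two routes described above, here is a minimal Go sketch that uses only the standard library. It is not the PR's implementation, which delegates metadata generation and response handling to the Crewjam/saml library. The names newSPMux, statusFor and entityDescriptor, the example entity ID and the heavily reduced metadata document are all assumptions made for this example. The /acs handler only checks that a SAMLResponse form field is present; it performs no signature or assertion validation.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// entityDescriptor is a hypothetical, heavily reduced stand-in for SP metadata.
type entityDescriptor struct {
	XMLName  xml.Name `xml:"urn:oasis:names:tc:SAML:2.0:metadata EntityDescriptor"`
	EntityID string   `xml:"entityID,attr"`
}

// newSPMux registers the two endpoints on a fresh ServeMux.
func newSPMux(entityID string) *http.ServeMux {
	mux := http.NewServeMux()

	// GET /metadata: publish the SP metadata as XML.
	mux.HandleFunc("/metadata", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/samlmetadata+xml")
		_ = xml.NewEncoder(w).Encode(entityDescriptor{EntityID: entityID})
	})

	// POST /acs: receive the message posted by the IdP.
	mux.HandleFunc("/acs", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		if err := r.ParseForm(); err != nil || r.PostForm.Get("SAMLResponse") == "" {
			http.Error(w, "missing SAMLResponse", http.StatusBadRequest)
			return
		}
		// A real handler would validate the response here before creating a session.
		w.WriteHeader(http.StatusNoContent)
	})
	return mux
}

// statusFor sends an in-memory request to h and returns the HTTP status code.
func statusFor(h http.Handler, method, path string, form url.Values) int {
	req := httptest.NewRequest(method, path, strings.NewReader(form.Encode()))
	if method == http.MethodPost {
		req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	mux := newSPMux("https://kratos.example.org/saml")
	fmt.Println(statusFor(mux, "GET", "/metadata", nil))
	fmt.Println(statusFor(mux, "POST", "/acs", url.Values{"SAMLResponse": {"PHNhbWw+"}}))
	fmt.Println(statusFor(mux, "POST", "/acs", nil))
}
```

Keeping the routing separate from the SAML processing makes it easy to exercise both endpoints with in-memory requests, as statusFor does, without a running IdP.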

SDK adaptation

The goal here is to let the SDK call our SAML methods. Currently, the SDK can protect a route by redirecting to the login page. We want a similar mechanism that protects a route via SAML by redirecting to the IdP. After authentication, the IdP redirects the user to the requested page. A very important related problem is converting the session created by the Crewjam/saml library into a Kratos session, so that session handling stays homogeneous.

Responses consumption

Now that the endpoints are created, Kratos must process the SAML responses. The /acs endpoint must receive the SAML responses, consume them and translate them into something Kratos can understand. In other words, this endpoint lets Kratos accept SAML messages and perform the actions associated with them.

This is also where we check whether the session has expired (according to the duration set in the options). If it has, a SAML request must be sent to the IdP.

RelayState continuity

Kratos has a continuity management system that validates the continuity of a login flow (in particular to prevent forged login requests). It relies on a continuity cookie that travels along the login flow. With SAML, the cookie is lost at the end of the chain because the IdP sends the response to the SP via a cross-site POST, and the cookie's SameSite attribute is set to "Lax". The idea is to keep the existing pattern, which starts and ends with the same continuity cookie, by rebuilding the lost cookie from a value carried in the RelayState. This requires a continuity manager that follows the same principles as the current one but also relies on passing the continuity value through the RelayState. More information here: #2486
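
To make the RelayState idea more concrete, here is a minimal Go sketch of one way a continuity value could be protected before it leaves the SP and recovered when the IdP posts it back. It is an assumption-laden illustration, not the code from this PR: the function names sealRelayState and openRelayState, the choice of AES-GCM and the demo key are all hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from a 16, 24 or 32 byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealRelayState encrypts the continuity ID so it can travel through the IdP
// as the RelayState parameter (hypothetical helper).
func sealRelayState(key []byte, continuityID string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// The nonce is prepended to the ciphertext so that openRelayState can find it.
	sealed := gcm.Seal(nonce, nonce, []byte(continuityID), nil)
	return base64.RawURLEncoding.EncodeToString(sealed), nil
}

// openRelayState reverses sealRelayState and fails if the value was
// tampered with or sealed under a different key (hypothetical helper).
func openRelayState(key []byte, relayState string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	raw, err := base64.RawURLEncoding.DecodeString(relayState)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("relay state too short")
	}
	nonce, ciphertext := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	key := []byte("0123456789abcdef0123456789abcdef") // demo key only
	relayState, err := sealRelayState(key, "continuity-id-1234")
	if err != nil {
		panic(err)
	}
	id, err := openRelayState(key, relayState)
	fmt.Println(id, err)
}
```

On the way back, the handler for the IdP's POST would read the RelayState form field, recover the continuity value with something like openRelayState, and use it in place of the cookie that the browser did not send.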

UI adaptation

Now the buttons corresponding to our SAML configuration need to appear in the UI. This requires adapting the Nodes in Kratos, but also in Elements.
Ory moved all UI rendering and handling of the flow object to Ory Elements so that it is reusable across many examples as well as the kratos-selfservice-ui-node repository.
For now, Elements expects certain flow nodes in a specific structure. Here is the related PR in ory/elements: ory/elements#54

YAML Configuration

The last part concerns the configuration. Not everyone wants to use SAML, so we extend the YAML and Kratos configuration system with new options that indicate whether SAML is enabled and define the endpoints. The objective is to make the final link between Kratos and SAML so that Kratos instances with SAML support can be created.

The following options can be configured:

  • Bindings
  • Session duration
  • Level of security (whether to call the RequireAccount method every time a route protected by Kratos is accessed)
  • Traits update

Related issue(s)

Design Document

Checklist

  • I have read the contributing guidelines.
  • I have referenced an issue containing the design document if my change
    introduces a new feature.
  • I am following the
    contributing code guidelines.
  • I have read the security policy.
  • I confirm that this pull request does not address a security
    vulnerability. If this pull request addresses a security vulnerability, I
    confirm that I got the green light (please contact
    [email protected]) from the maintainers to push
    the changes.
  • I have added tests that prove my fix is effective or that my feature
    works.
  • I have added or changed the documentation.

Disclaimer

At the moment, this is only a first version which is not intended to be merged. All the documentation and tests are still to be done.

@sebferrer sebferrer mentioned this pull request Aug 9, 2022
12 tasks
@alexGNX alexGNX force-pushed the saml branch 2 times, most recently from 8f81269 to 3e5ad65 Compare August 12, 2022 10:18
@kmherrmann
Contributor

Hello again - just wanted to check in on this: When would you like us to take a look at this PR?

@psauvage0

psauvage0 commented Aug 25, 2022

Hi @kmherrmann, I think there is still some minor tidying up to do on the branch, but the people in our team working on it are currently on vacation, and should come back next week. However, as the PR is pretty large, I'd say you can start reviewing now.

@codecov

codecov bot commented Sep 1, 2022

Codecov Report

Merging #2653 (6cf0778) into master (2d489e7) will decrease coverage by 4.05%.
The diff coverage is 37.91%.

❗ Current head 6cf0778 differs from pull request most recent head 7a827c6. Consider uploading reports for the commit 7a827c6 to get more accurate results

@@            Coverage Diff             @@
##           master    #2653      +/-   ##
==========================================
- Coverage   77.50%   73.46%   -4.05%     
==========================================
  Files         314      305       -9     
  Lines       19897    17517    -2380     
==========================================
- Hits        15421    12868    -2553     
- Misses       3285     3638     +353     
+ Partials     1191     1011     -180     
Impacted Files                                         Coverage Δ
continuity/manager.go                                  72.41% <ø> (ø)
continuity/manager_relaystate.go                       0.00% <0.00%> (ø)
identity/credentials.go                                75.00% <ø> (-5.00%) ⬇️
...elfservice/strategy/saml/strategy/strategy_auth.go  0.00% <0.00%> (ø)
ui/node/node.go                                        91.20% <ø> (-1.10%) ⬇️
x/provider.go                                          75.00% <ø> (+25.00%) ⬆️
x/relaystate.go                                        0.00% <0.00%> (ø)
...lfservice/strategy/saml/strategy/strategy_login.go  18.00% <18.00%> (ø)
selfservice/strategy/saml/strategy/strategy.go         24.87% <24.87%> (ø)
...ce/strategy/saml/strategy/strategy_registration.go  32.14% <32.14%> (ø)
... and 327 more

Member

@aeneasr aeneasr left a comment

Awesome, thank you for working on this feature! There are a couple of topics I'd like to address, please see my comments. Thanks! :)

@@ -0,0 +1,336 @@
package saml
Member

To follow the rest of the design pattern, this handler should move into the strategy. I don't think that it belongs into the flow as it doesn't have its own flow object nor any other properties of a flow (hooks, ui, ...) as far as I can tell.

}

// Key pair to encrypt and sign SAML requests
keyPair, err := tls.LoadX509KeyPair(strings.Replace(c.SAMLProviders[len(c.SAMLProviders)-1].PublicCertPath, "file://", "", 1), strings.Replace(c.SAMLProviders[len(c.SAMLProviders)-1].PrivateKeyPath, "file://", "", 1))
Member

This "use the last entry" functionality is copy & pasted across several lines. Do we need it everywhere? And why do we care only about the last entry?

Author

@sebferrer sebferrer Sep 5, 2022

Good point. This is a leftover from the very beginning of our work on SAML implementation. We implemented the Provider() method incorrectly (provider_config.go), I'll fix that quickly :)
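
One possible shape for a lookup that selects a provider by its ID instead of always taking the last configured entry is sketched below. This is only an illustration under assumptions: the types samlProvider and samlProviders, their fields and the error value are hypothetical and do not reflect the actual contents of provider_config.go.

```go
package main

import (
	"errors"
	"fmt"
)

// samlProvider is a hypothetical, reduced provider configuration.
type samlProvider struct {
	ID    string
	Label string
}

// samlProviders is a hypothetical collection of configured providers.
type samlProviders struct {
	Providers []samlProvider
}

var errUnknownProvider = errors.New("unknown SAML provider")

// Provider returns the configuration whose ID matches id,
// rather than relying on the position of the entry in the list.
func (c samlProviders) Provider(id string) (*samlProvider, error) {
	for i := range c.Providers {
		if c.Providers[i].ID == id {
			return &c.Providers[i], nil
		}
	}
	return nil, fmt.Errorf("%w: %q", errUnknownProvider, id)
}

func main() {
	cfg := samlProviders{Providers: []samlProvider{
		{ID: "corp", Label: "Corporate IdP"},
		{ID: "partner", Label: "Partner IdP"},
	}}

	p, err := cfg.Provider("partner")
	fmt.Println(p.Label, err)

	_, err = cfg.Provider("unknown")
	fmt.Println(errors.Is(err, errUnknownProvider))
}
```

With a lookup like this, call sites can resolve the provider once from the request's provider ID and pass the result along, which removes the repeated indexing expression the review points at.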

}
keyPair.Leaf, err = x509.ParseCertificate(keyPair.Certificate[0])
if err != nil {
return err
Member

Please wrap errors in descriptive error messages using the herodot package (e.g. herodot.ErrBadRequest) and use errors.WithStack() to include stack traces


metadataURL := c.SAMLProviders[len(c.SAMLProviders)-1].IDPInformation["idp_metadata_url"]
// The metadata file is provided
if strings.HasPrefix(metadataURL, "file://") {
Member

We have a library for that in ory/x called fetcher, please use that one!

Author

Hey, we've done the file reorganization, it should fit your design pattern better now. We are now about to take into account the rest of your review :)

@splaunov
Contributor

splaunov commented Sep 8, 2022

How could we design end-to-end tests?
Would this be a good choice of IdP for the tests?
https://gluu.org/docs/gluu-server/4.1/admin-guide/saml/

Or are there any simpler solutions? Some SAML IdP reference implementation?

@aeneasr
Member

aeneasr commented Sep 8, 2022

I think there is an online SAML test server, I saw it somewhere on Slack. I'd like to avoid setting up Gluu, I've tried before but it was really difficult to get it working (it was ~1 year ago).

@splaunov
Contributor

splaunov commented Sep 8, 2022

@sebferrer
Author

Will look into these: https://samltest.id https://www.samltool.com https://mocksaml.com https://stackoverflow.com/questions/1125915/can-you-recommend-a-saml-2-0-identity-provider-for-test

We used samltest.id and samltool a lot, I recommend these :)

@sebferrer sebferrer marked this pull request as ready for review September 8, 2022 12:58
@sebferrer
Author

Good news, this PR is ready for review!

We will soon address the first round of feedback and then take care of the missing tests.

Thank you!

// We have to get the SessionID from the cookie to inject it into the context to ensure continuity
cookie, err := r.Cookie(continuity.CookieName)
if err != nil {
h.d.SelfServiceErrorManager().Forward(r.Context(), w, r, err)
Contributor

If the cookie is not set, a nil pointer dereference occurs further down in the code.
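
The failure mode can be illustrated with a small standard-library sketch. http.Request.Cookie returns a nil cookie together with http.ErrNoCookie when the cookie is absent, so the handler has to stop after reporting the error. The cookie name, the helper names and the plain http.Error response below are assumptions for this example; the PR itself forwards the error through Kratos' error manager.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// continuityCookieName is a hypothetical cookie name used only in this sketch.
const continuityCookieName = "continuity_demo"

// continuityID returns the cookie value, or an error if the cookie is missing.
func continuityID(r *http.Request) (string, error) {
	cookie, err := r.Cookie(continuityCookieName)
	if err != nil {
		// cookie is nil here, so return before it is dereferenced.
		return "", fmt.Errorf("continuity cookie missing: %w", err)
	}
	return cookie.Value, nil
}

func handler(w http.ResponseWriter, r *http.Request) {
	id, err := continuityID(r)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return // without this return, the code below would run without a session ID
	}
	fmt.Fprintf(w, "continuing flow %s", id)
}

// serve runs handler against an in-memory request and optionally sets the cookie.
func serve(cookieValue string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/acs", nil)
	if cookieValue != "" {
		req.AddCookie(&http.Cookie{Name: continuityCookieName, Value: cookieValue})
	}
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(serve(""))
	fmt.Println(serve("abc123"))
}
```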

@splaunov
Contributor

Have added registration test: ovh#4

@sebferrer
Author

Awesome, thank you for working on this feature! There are a couple of topics I'd like to address, please see my comments. Thanks! :)

Hey, I think we've taken into account all the points of your review, don't hesitate to take a look!

@CLAassistant

CLAassistant commented Dec 14, 2022

CLA assistant check
All committers have signed the CLA.

@sebferrer
Author

sebferrer commented Dec 14, 2022

Hey @aeneasr, small update: we added unit tests to cover our continuity manager based on RelayState (essential to the SAML flow, more information here #2486).
Feel free to have a look at it, as well as at the previous modifications which take into account your last review :)

@aeneasr
Member

aeneasr commented Jan 5, 2023

I tried pushing some changes required for merging the PR to your fork & branch, but it appears that I am not allowed to do so 😕

% git push ...
ERROR: Permission to push denied to aeneasr.
fatal: could not read from the remote repository.

Please make sure that you have the correct access rights
and the repository exists.

But the good news is, giving access is easy! ☺️ All you need to do is enable write access for maintainers. Thank you! 😄

If the repository belongs to an organization, please add me for the project as a collaborator!

@aeneasr
Member

aeneasr commented Jan 5, 2023

I wanted to fix the formatter error as well as the SDK generation error :)

@sebferrer
Author

@aeneasr thanks a lot for your help! You should have access to push on ovh/kratos now :)

@aeneasr
Member

aeneasr commented Jan 5, 2023

Great, thanks! I have allocated some time on monday to look into the PR :)

Member

@aeneasr aeneasr left a comment

Thank you! This is already in a great state. I've reviewed everything outside the SAML module for now; I still need to review the SAML module itself.

@@ -0,0 +1,151 @@
package continuity
Member

This file is almost a 1:1 copy of manager_cookie. Can you please remove it and use manager_cookie instead? If you need to parametrize things, feel free to adjust the manager_cookie file :)

Author

@sebferrer sebferrer Feb 22, 2023

Done! We now use the manager_cookie and make the use of the RelayState parametrizable :)

@@ -1,3 +1,6 @@
// 20221125150145
Member

Remove please - invalid json

Author

Done

Comment on lines 672 to 696
	// If m.continuityManager is nil or not a continuity.ManagerCookie
	switch m.continuityManager.(type) {
	case *continuity.ManagerCookie:
	default:
		m.continuityManager = continuity.NewManagerCookie(m)
	}
	return m.continuityManager
}

func (m *RegistryDefault) RelayStateContinuityManager() continuity.Manager {
	// If m.continuityManager is nil or not a continuity.ManagerRelayState
	switch m.continuityManager.(type) {
	case *continuity.ManagerRelayState:
	default:
		m.continuityManager = continuity.NewManagerRelayState(m, m)
	}
	return m.continuityManager
}
Member

This looks incorrect and racy. If you need two separate managers, save them in separate variables

Author

We don't need separate managers anymore, so problem solved :)

// swagger:model identityCredentialsSamlProvider
type CredentialsSAMLProvider struct {
Subject string `json:"subject"`
Provider string `json:"samlProvider"`
Member

We use underscore :)

saml_provider

Author

Done!

}

// Create an uniq identifier for user in database. Its look like "id + the id of the saml provider"
func NewCredentialsSAML(subject string, provider string) (*Credentials, error) {
Member

Please add tests for this

Author

Done!

@@ -0,0 +1 @@
INSERT INTO identity_credential_types (id, name) SELECT 'ff5a1823-8b47-4255-860f-4b70ed122740', 'saml' WHERE NOT EXISTS ( SELECT * FROM identity_credential_types WHERE name = 'saml');
Member

How did you generate this UUID? :)

I generated it by hand because I still didn't understand how it all worked, how am I supposed to do that?

x/provider.go Outdated
@@ -29,6 +29,11 @@ type CookieProvider interface {
ContinuityCookieManager(ctx context.Context) sessions.StoreExact
}

type RelayStateProvider interface {
Member

We potentially don't need this

Author

Indeed, we removed it

)

// SessionGetRelayState returns a string of the content of the relaystate for the current session.
func SessionGetStringRelayState(r *http.Request, s sessions.StoreExact, id string, key interface{}) (string, error) {
Member

This looks quite complex - which means that it needs a test :)

Author

It's covered by the test in continuity/manager_relaystate_test.go :)

// SessionGetRelayState returns a string of the content of the relaystate for the current session.
func SessionGetStringRelayState(r *http.Request, s sessions.StoreExact, id string, key interface{}) (string, error) {

cipherRelayState := r.PostForm.Get("RelayState")
Member

This should be underscored and lowercase:

cipherRelayState := r.PostForm.Get("relay_state")

We have to use this "RelayState" notation because it is a SAML standard

@ThibHrrd

ThibHrrd commented Feb 15, 2023

Hey @aeneasr, thank you very much for your review. We wanted to finish our current task before taking it on, so here is some information about it:
You had some concerns about the security of our SAML implementation. So we decided to set up some security tests. There are 27 new tests to check the main known SAML vulnerabilities. There is also a README summarizing what we did: https://github.com/ovh/kratos/blob/saml/selfservice/strategy/saml/saml_vulnerabilities_check.md

Don't hesitate if you have any questions or feedback!

@lol768

lol768 commented Feb 18, 2023

I don't have much technical to contribute here, but just wanted to say this is really exciting to see being developed and it'll be great to see Kratos get SAML SP support when this is merged 🚀

ThibHrrd and others added 8 commits February 22, 2023 17:02
Signed-off-by: ThibaultHerard <[email protected]>

Co-authored-by: sebferrer <[email protected]>
Co-authored-by: psauvage <[email protected]>
Co-authored-by: alexGNX <[email protected]>
Co-authored-by: Stoakes <[email protected]>
Signed-off-by: ThibaultHerard <[email protected]>

Co-authored-by: sebferrer <[email protected]>
Signed-off-by: ThibaultHerard <[email protected]>

Co-authored-by: sebferrer <[email protected]>
Signed-off-by: sebferrer <[email protected]>

Co-authored-by: ThibaultHerard <[email protected]>
+ update saml tests
Signed-off-by: ThibaultHerard <[email protected]>

Co-authored-by: sebferrer <[email protected]>
Signed-off-by: sebferrer <[email protected]>

Co-authored-by: ThibaultHerard <[email protected]>
Signed-off-by: ThibaultHerard <[email protected]>

Co-authored-by: sebferrer <[email protected]>
+ test new credentials saml
+ resolving conflicts with master
Signed-off-by: sebferrer <[email protected]>

Co-authored-by: ThibaultHerard <[email protected]>
@sebferrer
Author

sebferrer commented Feb 22, 2023

Hey @aeneasr! I think all the points from your last review have been taken into account. We refactored the continuity manager for the case where RelayState is used.
Feel free to give us feedback on it, we can't wait to move forward with the Merge! :)

@mbessette-cleo

Conflicts @sebferrer

And thank you for all the work on this! Definitely excited to see this get in!

Member

@aeneasr aeneasr left a comment

I finally had some time to look more into the code base - thank you very much. It's a lot of comments and I think there are several areas that have to be addressed before this rolls out in any production system.

As it stands currently, there are several areas for improvement. These range from code path issues (a forgotten return) to incorrect function use that would result in responses that clients can't understand (e.g. sending an HTTP redirect instead of XML).

The traits pull-model on login is not working at the moment, as the identity is not updated as part of login currently.

Further, the traits mapping is naive in the sense that it just copies anything the SAML client sends into the traits, and it does not run the vital validation pipeline. As far as I can tell, identities will not be able to perform e.g. account recovery or other basic flows (adding 2FA for example).

There seems to be the concept of JsonNet in the test data, but it is - as far as I can tell - not being used by the code base.

There are a couple more topics such as missing thread safety which will eventually end up with panics and invalid configuration. The problem is that we're using a global map-based cache for the configs. This will not work with hot-reloading.

Lastly, there are no functional / integration / e2e tests that perform the complete flow from a-z and prove that it works / test for regression.

In summary, it is great how far this has come! But there are still miles to go before this can run in a production system.

config := h.d.Config()
pid := ps.ByName("provider")

if samlMiddlewares[pid] == nil {
Member

This is not thread safe - need to address

config := h.d.Config()
pid := ps.ByName("provider")

if samlMiddlewares[pid] == nil {
Member

Same here


// Checks if the user already have an active session
if e := new(session.ErrNoActiveSessionFound); errors.As(e, &e) {
// No session exists yet, we start the auth flow and create the session
Member

Is this enough? What if we are refreshing the flow?

}

func DestroyMiddlewareIfExists(pid string) {
if samlMiddlewares[pid] != nil {
Member

Not thread-safe
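
A common way to make this kind of per-provider cache safe for concurrent requests is to guard the map with a mutex. The sketch below shows that technique with a hypothetical middlewareCache type and a placeholder middleware struct; it is not the PR's code. It also does not address the hot-reloading concern beyond offering an explicit Destroy.

```go
package main

import (
	"fmt"
	"sync"
)

// middleware is a hypothetical stand-in for the per-provider SAML middleware.
type middleware struct {
	ProviderID string
}

// middlewareCache guards the provider map with a read/write mutex.
type middlewareCache struct {
	mu    sync.RWMutex
	items map[string]*middleware
}

func newMiddlewareCache() *middlewareCache {
	return &middlewareCache{items: map[string]*middleware{}}
}

// GetOrCreate returns the cached middleware for pid, building it at most once.
func (c *middlewareCache) GetOrCreate(pid string, build func(string) (*middleware, error)) (*middleware, error) {
	c.mu.RLock()
	m, ok := c.items[pid]
	c.mu.RUnlock()
	if ok {
		return m, nil
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	// Re-check under the write lock: another goroutine may have built it meanwhile.
	if existing, ok := c.items[pid]; ok {
		return existing, nil
	}
	m, err := build(pid)
	if err != nil {
		return nil, err
	}
	c.items[pid] = m
	return m, nil
}

// Destroy removes the cached middleware for pid, if any.
func (c *middlewareCache) Destroy(pid string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.items, pid)
}

func main() {
	cache := newMiddlewareCache()
	builds := 0 // only touched inside build, which runs under the write lock
	build := func(pid string) (*middleware, error) {
		builds++
		return &middleware{ProviderID: pid}, nil
	}

	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = cache.GetOrCreate("samlProvider", build)
		}()
	}
	wg.Wait()
	fmt.Println("builds:", builds)
}
```

Holding the cache in a struct that is owned by the registry, rather than in a package-level variable, would also make it easier to drop or rebuild entries when the configuration changes.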

}

// Key pair to encrypt and sign SAML requests
keyPair, err := tls.LoadX509KeyPair(strings.Replace(providerConfig.PublicCertPath, "file://", "", 1), strings.Replace(providerConfig.PrivateKeyPath, "file://", "", 1))
Member

Missing: base64 / https source. Should use fetcherx instead

if err != nil {
return err
}
i.Traits = identity.Traits(traits)
Member

Not sure what this is doing - it is updating the traits of the user with the data from the SAML provider. Is that intentional?

Comment on lines +62 to +67
s.d.Logger().
WithRequest(r).
WithField("oidc_provider", provider.Config().ID).
WithSensitiveField("identity_traits", i.Traits).
WithField("mapper_jsonnet_url", provider.Config().Mapper).
Debug("Merged form values and OpenID Connect Jsonnet output.")
Member

This is an incorrect log

Comment on lines +53 to +55
delete(traitsMap, "iss")
delete(traitsMap, "email_verified")
delete(traitsMap, "sub")
Member

Not sure why we delete these fields?

}

i := identity.NewIdentity(s.d.Config().DefaultIdentityTraitsSchemaID(r.Context()))
if err := s.setTraits(w, r, claims, provider, jsonClaims, i); err != nil {
Member

Naive approach to traits mapping, will not work in practice. Need to add JsonNet parsing and more

dsig "github.com/russellhaering/goxmldsig"
"gotest.tools/assert"
)

Member

Question to self: are we testing the underlying library here?

@sebferrer
Author

Hey @aeneasr, thanks a lot for your review!

The points of improvement are very clear to us and will keep us moving in the right direction.

We had to move on to other priority tasks so we won't be making the changes right away. We can't make any commitment on a specific date to consider your review, but this PR is definitely still on the table for us.

If it takes too long for you, feel free to take over the code yourself, from our side we will keep you informed when we resume our work on this project.

Thank you again! 🙂

@Satish-Karunanithi

Satish-Karunanithi commented Apr 2, 2024

Hi @aeneasr, we have been looking forward to this SAML support for quite some time. Do you know when this feature will be merged and available? Is the merge of this PR blocked by a code issue? Thanks.
