constraints need id's #2049

Open
3 tasks
wandmagic opened this issue Sep 26, 2024 · 11 comments

Comments

@wandmagic
Collaborator

wandmagic commented Sep 26, 2024

User Story

As an OSCAL developer building tools around validation and automation, it is important that the SARIF output contains IDs for all constraints.

Goals

All constraints in OSCAL have logical names.

Dependencies

Possible dependency on #2050.

Acceptance Criteria

  • All OSCAL website and readme documentation affected by the changes in this issue have been updated. Changes to the OSCAL website can be made in the docs/content directory of your branch.
  • A Pull Request (PR) is submitted that fully addresses the goals of this User Story. This issue is referenced in the PR.
  • The CI-CD build process runs without any reported errors on the PR. This can be confirmed by reviewing that all checks have passed in the PR.

(For reviewers: The wiki has guidance on code review and overall issue review for completeness.)

Revisions

No response

@aj-stein-gsa
Contributor

As a consumer of upstream NIST OSCAL models and a developer of derived models and generic and FedRAMP-custom constraints, I want to strongly encourage and support the backporting of IDs into OSCAL models.

@iMichaela
Contributor

@wandmagic - Thank you for submitting this issue. IDs for constraints can only be considered if implemented as optional, due to backwards compatibility. Otherwise, the issue will be considered during a major release of OSCAL.
@aj-stein-gsa - can you elaborate more on the "derived models" that need IDs for constraints.

@aj-stein-gsa
Contributor

@aj-stein-gsa - can you elaborate more on the "derived models" that need IDs for constraints.

I would like to evaluate and propose changes to constraints and perhaps to model(s) themselves, perhaps the inclusion of new models, with NIST and the community if possible.

@aj-stein-gsa
Contributor

Also, @iMichaela, we may want to link another issue I added to the backlog, which is arguably a dependency of this one, to further prepare for this change, if you would accept it. Would you or other members of this team be amenable to adding #2050 as a dependency to this issue? I do not have the ability to edit wandmagic's issue here after the fact.

Thanks in advance for your help, understanding, and cooperation.

@iMichaela
Contributor

@aj-stein-gsa - can you elaborate more on the "derived models" that need IDs for constraints.

I would like to evaluate and propose changes to constraints and perhaps to model(s) themselves, perhaps the inclusion of new models, with NIST and the community if possible.

Sure, everything is possible, as long as we follow the same process as with the prototype models we currently have (Shared Responsibility and Controls Mapping), which have been reviewed by the community (CNSC WG) but are still waiting, for just a little longer, on FedRAMP's review. So what new models do you think OSCAL is missing - just curious?

@aj-stein-gsa
Contributor

So what new models do you think OSCAL is missing - just curious?

I have no specific model recommendations at this time. Many of them could be as simple as identifying and relocating constraints. Others may require relocating or creating new data structures. We will let you know as the need arises, but without constraint IDs and the refactors in #2050, any such change will require much more effort to develop and test before final consensus is reached.

@wendellpiez
Contributor

IDs can be added. I recommend descriptive/mnemonic IDs - @wandmagic @aj-stein-gsa please feel free to suggest names or a rule for how to coin them. Or if you trust me I can make them up and you can critique.

As for refactoring constraints, I recommend doing that separately, so as to enable expediting the first task.

As for backward compatibility, this is what we can do without breaking it:

  • Adding IDs to constraints
  • Removing constraints
  • Relaxing constraints - either relaxing rules, or tightening context (scope of application)

We can't add or tighten constraints (or broaden scope) without risk of invalidating data that is currently valid. I would also recommend against changing any IDs already assigned.
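
The compatibility rule above can be sketched as a small check. This is only an illustration under simplifying assumptions: each constraint is modeled as a set of allowed values keyed by a target path, which is far simpler than real Metaschema constraints, and the function name and sample targets are invented here.

```python
def breaking_changes(old, new):
    """Compare two constraint sets (target path -> set of allowed values).

    Removing a constraint or allowing more values is safe; adding a
    constraint or allowing fewer values may invalidate existing data.
    """
    problems = []
    for target, allowed in new.items():
        if target not in old:
            problems.append(f"added constraint on {target}")
        elif not old[target] <= allowed:
            removed = sorted(old[target] - allowed)
            problems.append(f"tightened {target}: no longer allows {removed}")
    return problems


# Hypothetical constraint sets, for illustration only.
old = {
    "prop/@name": {"label", "sort-id"},
    "party/@type": {"person", "organization"},
}
relaxed = {"prop/@name": {"label", "sort-id", "alt-label"}}  # one removed, one relaxed
tightened = {"party/@type": {"person"}, "link/@rel": {"reference"}}

print(breaking_changes(old, relaxed))
print(breaking_changes(old, tightened))
```

Adding an ID does not appear in this check at all, because it changes how a constraint is named and not which data it accepts.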

@iMichaela if you agree with above, I suggest two separate Issues:

  • Add IDs to constraints (requiring definition of naming convention if any, plus assigning IDs)
  • Refactor constraints (requiring discussion and probably testing)
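
As a starting point for the naming discussion, here is a minimal sketch of one possible rule for coining mnemonic IDs, with a uniqueness check. The convention itself (model name, then the target path segments, then the constraint kind) is purely an assumption made for illustration and has not been agreed by anyone in this thread.

```python
import re


def coin_id(model, target, kind):
    """Coin a mnemonic ID from model name, target path, and constraint kind."""
    segments = re.findall(r"[A-Za-z0-9]+", target)
    return "-".join(part.lower() for part in [model, *segments, kind])


def assign_ids(constraints):
    """Assign an ID to each (model, target, kind) triple, rejecting duplicates."""
    assigned = {}
    for model, target, kind in constraints:
        cid = coin_id(model, target, kind)
        if cid in assigned:
            raise ValueError(f"duplicate constraint ID: {cid}")
        assigned[cid] = (model, target, kind)
    return assigned


# Hypothetical constraints, for illustration only.
ids = assign_ids([
    ("catalog", "control/prop/@name", "allowed-values"),
    ("catalog", "control/@id", "is-unique"),
])
print(sorted(ids))
```

A real convention would also need a tie-breaker for two constraints of the same kind on the same target, which this sketch simply rejects.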

Happy for help with either of these - across all the models these are not small tasks.

But at least the first could be done with a single clean PR, right?

As for #2050 - that's a complication but if we can figure out how to expedite the work, I am not against it.

@RS-Credentive
Contributor

There are (possibly) two problems here:

  1. How to unambiguously identify the externally defined constraint that should apply to a given element in a Metaschema model (such as OSCAL) when validating a model; in other words, saying inside an element, "please apply constraint X when validating".
  2. When validation of a constraint fails, it is important to unambiguously identify which constraint was violated in the output of the tooling (SARIF or otherwise)

I haven't studied the external constraint model in depth, so it may solve (1) somehow.

Is it primarily (2) that we're worried about? Is it more useful to say "validation failed because this string was not 'x'" vs. "validation failed because of constraint X". Is this something that the tooling should handle vs defining it in the specification?

@wendellpiez
Contributor

wendellpiez commented Sep 27, 2024

We are worried about both, but problem no. 1 is a problem with a solution. For the Schematron or XSLT analogue to this, see https://www.w3.org/TR/xslt-30/#patterns, specifically "Selection patterns".

Examples - XSLT match='p' matches any element named p (in whatever namespace is bound to unprefixed names). In XSLT such a match is used to define the applicability of a template or a key declaration. Schematron context='p' similarly matches p elements (or ns:p for a p in a namespace bound to prefix ns) - in Schematron this means to apply the rules given (here) to all p elements (or ns:p elements) in a document.

But p//p matches only p elements that appear inside other p elements (nested inside at any level). A rule targeted at p//p elements will not be applied to any control/part/p (which can't validly appear inside a p).
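
A minimal sketch of what a p//p pattern selects, using only the Python standard library. The sample document is invented and deliberately not valid OSCAL (it nests a p inside a p), and the helper emulates the pattern by walking ancestors; it is not an XSLT or Metapath engine.

```python
import xml.etree.ElementTree as ET

# Invented sample: one p nested in another, plus a standalone p.
DOC = """
<control id="ac-1">
  <part>
    <p id="a">outer <p id="b">inner</p></p>
    <p id="c">standalone</p>
  </part>
</control>
"""


def nested_paragraphs(root):
    """Return p elements that have a p ancestor, i.e. what p//p would match."""
    parent = {child: par for par in root.iter() for child in par}
    hits = []
    for el in root.iter("p"):
        ancestor = parent.get(el)
        while ancestor is not None:
            if ancestor.tag == "p":
                hits.append(el)
                break
            ancestor = parent.get(ancestor)
    return hits


root = ET.fromstring(DOC)
hits = [el.get("id") for el in nested_paragraphs(root)]
print(hits)
```

Only the inner paragraph is selected; the outer one and the standalone control/part/p are left alone.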

This makes room not only for @ns-based qualification but much else, for example target=".[@class='mine']//prop", which would apply to any prop element contained in the definition context iff the containing element has @class='mine' (and not otherwise).

To be unsparing, here is .[@class='mine']//prop using XPath full (not abbreviated) syntax (will also work in Metapath):

self::node()[attribute::class='mine']/descendant-or-self::node()/child::prop
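
To make the effect of that target concrete, here is a small sketch that emulates it with Python's standard library. ElementTree implements only a subset of XPath, so the predicate on the context element is checked by hand; the sample document and the function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample data; element and attribute values are illustrative only.
DOC = """
<catalog>
  <control id="ac-1" class="mine">
    <prop name="label" value="AC-1"/>
    <part><prop name="sort-id" value="ac-01"/></part>
  </control>
  <control id="ac-2" class="other">
    <prop name="label" value="AC-2"/>
  </control>
</catalog>
"""


def targets(context):
    """Emulate .[@class='mine']//prop relative to a context element."""
    if context.get("class") != "mine":
        return []  # the predicate on the context fails, so nothing is targeted
    return context.findall(".//prop")  # every descendant prop, at any depth


root = ET.fromstring(DOC)
matched = {
    control.get("id"): [prop.get("name") for prop in targets(control)]
    for control in root.iter("control")
}
print(matched)
```

The first control contributes both of its prop descendants, while the second contributes none because its class attribute differs.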

For an XPath overview, https://en.wikipedia.org/wiki/XPath is actually not a bad place to start.

Now as for IDs for the sake of reporting - an error can be reported with a path -

ERROR: constraint violation in element /catalog/group[3]/control[5]/prop[2]

or by a smarter system,

ERROR: constraint violation in element //control[@id='au.5']/prop[@name='label'][1]

But these of course tell where the rule is violated, not which rule it was. That info can be given in a hand-written error message, but that can also be botched, while giving a path to the rule's definition (in a metaschema module?) is next to useless ... hence an ID to enable easier documentation, indexing, analysis and traceability.
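
The sketch below shows how an ID and a path can travel together in one finding. It builds a dictionary in the general shape of a SARIF 2.1.0 result, with ruleId naming the rule and a logical location naming the place. The constraint ID and message are invented, and nothing here reflects what oscal-cli or any other tool actually emits.

```python
import json


def sarif_result(constraint_id, path, message):
    """Build one SARIF-style result: ruleId says which rule, location says where."""
    return {
        "ruleId": constraint_id,
        "level": "error",
        "message": {"text": message},
        "locations": [{"logicalLocations": [{"fullyQualifiedName": path}]}],
    }


result = sarif_result(
    "catalog-control-prop-label-unique",  # hypothetical constraint ID
    "//control[@id='au.5']/prop[@name='label'][1]",
    "duplicate label property",  # hypothetical message
)
print(json.dumps(result, indent=2))
```

With a stable ruleId, downstream tooling can group, suppress, or document findings per constraint without parsing the message text.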

@iMichaela
Contributor

Thank you, @wendellpiez and @RS-Credentive .

  1. In addition to the technical approaches discussed here, we need to acknowledge that our documentation generation pipeline will break when constraints are externalized, and that is not an easy lift. It impacts all adopters!

  2. I would like to know how many solutions exist today out there that are implementing OSCAL based on the metaschema definitions of OSCAL and which will have to be updated. We cannot ignore the impact on their work.

@wendellpiez
Contributor

Excellent point @iMichaela.

Dropping constraints from the present modules will drop those constraints from the documentation - so that is not 'breaking' but it is opening a big hole.

Externalized constraints do have to be documented, of course - perhaps with a specific pipeline designed to produce HTML docs from metaschema modules. Additionally, anyone developing their own constraints (module) will want a way to document those.

How to integrate documentation across multiple layers is another question -- although not a blocker for documenting constraint sets separately.

An implementation of a constraints model such as oscal-cli or InspectorXSLT could offer a 'verbose' mode that would trace wherever rules are tested (and pass), over and above errors. This could also be useful.
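
A rough sketch of what such a verbose mode could look like, written against Python's standard library and not against oscal-cli or InspectorXSLT. The constraint IDs, the rules themselves, and the validate function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical constraints: (ID, target path, test applied to each matched element).
CONSTRAINTS = [
    ("prop-has-name", ".//prop", lambda el: bool(el.get("name"))),
    ("control-has-id", ".//control", lambda el: bool(el.get("id"))),
]


def validate(root, verbose=False):
    """Return (status, constraint ID, element tag) events.

    By default only failures are reported; with verbose=True every
    evaluation is traced, including the ones that pass.
    """
    events = []
    for constraint_id, target, test in CONSTRAINTS:
        for el in root.findall(target):
            ok = test(el)
            if verbose or not ok:
                events.append(("PASS" if ok else "FAIL", constraint_id, el.tag))
    return events


root = ET.fromstring(
    '<catalog><control id="ac-1"><prop name="label"/><prop/></control></catalog>'
)
quiet = validate(root)
traced = validate(root, verbose=True)
print(quiet)
print(traced)
```

The trace only becomes readable once each constraint carries an ID, which is one more argument for assigning them.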

Projects
Status: Needs Triage
Development

No branches or pull requests

5 participants