Add initial draft spec for version ranges #11

Merged
merged 13 commits into from
Nov 30, 2021

Conversation

pombredanne
Member

@pombredanne pombredanne commented Oct 29, 2021

Collaborator

@Hritik14 Hritik14 left a comment

There are a few things that caught my eye

VERSION-RANGE-SPEC.rst (outdated)
Comment on lines 209 to 212
- "!=": Version exclusion or inequality comparator. This means a version must
not be equal to the provided version and this version must be excluded from
the range. For example: "!=1.2.3" means that version "1.2.3" is not part of
the range.
Collaborator

this version must be excluded from the range

Should we say "excluded from the version-constraint" rather than "from the range"? In an "OR of ANDs", a version cannot be excluded explicitly (in the version range specifier) when a looser constraint is also present.
E.g. "!=1.2.3, >1.0.0" means "greater than 1.0.0 or not equal to 1.2.3", which is the set of every possible version.

Member Author

E.g. "!=1.2.3, >1.0.0" means "greater than 1.0.0 or not equal to 1.2.3", which is the set of every possible version.

Yes, but you may want to write "!=1.2.3 & >1.0.0" instead, which means what you intended: "greater than 1.0.0 and not equal to 1.2.3".

Would you have an example of something that cannot be expressed with this notation?

Member Author

Could a "vers" be easier to read if it were in CNF, i.e. an AND of ORs?
... possibly using a comma for AND (as is done in PyPI) and a pipe for OR?

  • AND is the default for PyPI (there is no OR), written with a comma; same for Debian, RPM and Rust: AND comes first, and is a comma
  • same for node-semver and Composer, where AND is the default

With that said, the two use cases may be conflicting:

  • for vulnerabilities, it is common to have the same vulnerability affect multiple branches, such as httpd 1.x and 2.x. Here a version that's within any of these ranges would be vulnerable, i.e. "any" would mean an OR first.

  • for dependencies, we are trying to resolve and pick one version, and in the common case all the constraints need to be satisfied. Some notations do not even support an OR.

So for vulnerabilities, OR comes first; for dependencies, AND comes first.

In order to adopt the most common notations in use, it may be better to use AND first as this would be more natural.
But this would mean that for vulnerabilities multiple "vers" might need to be provided, essentially one for each "branch"; it does not seem completely crazy, but I am not sure I like it.

For now, I think keeping OR as the default is best. But adapting the notation to have a comma mean "AND" may be easier since that's what a comma means elsewhere.
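
To make the OR-versus-AND difference in this thread concrete, here is a minimal sketch. It assumes dotted numeric versions compared as integer tuples, which is not how any real versioning scheme must work, and the names `COMPARATORS`, `as_tuple` and `satisfies` are hypothetical helpers, not part of univers or the spec.

```python
import operator

# Hypothetical lookup table from comparator text to a Python operator.
COMPARATORS = {
    "<=": operator.le,
    ">=": operator.ge,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
}


def as_tuple(version):
    """Turn "1.2.3" into (1, 2, 3). Assumption: purely numeric versions."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, or_of_ands):
    """Return True if at least one AND-group has all constraints satisfied."""
    v = as_tuple(version)
    return any(
        all(COMPARATORS[op](v, as_tuple(bound)) for op, bound in group)
        for group in or_of_ands
    )


# "!=1.2.3, >1.0.0" read as an OR: two groups of one constraint each.
as_or = [[("!=", "1.2.3")], [(">", "1.0.0")]]
# "!=1.2.3 & >1.0.0" read as an AND: one group of two constraints.
as_and = [[("!=", "1.2.3"), (">", "1.0.0")]]

print(satisfies("1.2.3", as_or))   # True: 1.2.3 is greater than 1.0.0
print(satisfies("1.2.3", as_and))  # False: the exclusion now applies
```

With the OR reading, the exclusion of 1.2.3 is cancelled by the looser ">1.0.0" alternative; with the AND reading it is kept.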

Comment on lines 310 to 312
- If the constraint contains a star "*", validate that it is equal to "*".
<version-constraint> is "*". Parsing is done and no further processing is
needed for this ``vers``.
Collaborator

If there is something after the "*", should an exception be raised rather than silently ignoring the rest?

Member Author

That's a tool implementation issue. A tool may prefer to be strict and report an error, or be lenient and silently ignore the rest. Spec-wise, what is important is what is valid or not valid. That said, let me add exactly this phrase.
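
The strict and lenient behaviours could look like the following sketch. The function name `parse_star` and its `strict` flag are hypothetical, chosen only for illustration; the spec text itself only says what is valid.

```python
def parse_star(constraints, strict=True):
    """Return "*" if the constraints are a star range, else None.

    Hypothetical helper: in strict mode, anything besides a lone "*" is
    reported as an error; in lenient mode the extra text is ignored.
    """
    text = constraints.strip()
    if "*" not in text:
        return None  # not a star range, normal parsing continues
    if text == "*":
        return "*"
    if strict:
        raise ValueError(f"invalid star constraint: {constraints!r}")
    return "*"  # lenient: silently drop whatever surrounds the star


print(parse_star("*"))                       # *
print(parse_star("*, >1.0", strict=False))   # *
```

Either way, a string such as "*, >1.0" is not valid; the two modes differ only in how a tool reacts to it.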

- Yield the accumulated list of (comparator and version) that must apply for
this constraint

- Finally return the <versioning-scheme> and the list of <version-constraint>
Collaborator

So an input like

vers: semver / <=1.2.3, >2.2 & <2.5

turns into

("semver", ["<=1.2.3", ">2.2&<2.5"])
or better
("semver", ["<=1.2.3", (">2.2","<2.5")])

Right?
It would be nice to have short examples of input and output when this is implemented in Python (we can probably add other languages later).

Member Author

I would rather keep the spec free of code snippets. But to your point, a tool could use a list of lists to represent the constraints... I am working on a rev of univers in a branch to match this spec, and this comes out as a rather natural way to model the constraints.
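
For readers of this thread, a list-of-lists parse of the example above might be sketched as follows. It follows the draft notation used in the comment (comma for OR, ampersand for AND), which a later commit in this PR replaces with pipe and comma. `parse_vers` and `split_comparator` are hypothetical names and this is not the univers implementation.

```python
def split_comparator(constraint):
    """Split "<=1.2.3" into ("<=", "1.2.3")."""
    # Try two-character comparators first so "<=" is not read as "<".
    for comparator in ("<=", ">=", "!=", "<", ">", "="):
        if constraint.startswith(comparator):
            return comparator, constraint[len(comparator):].strip()
    raise ValueError(f"missing comparator in {constraint!r}")


def parse_vers(vers):
    """Parse a draft-notation vers string into (scheme, list of AND-groups)."""
    prefix, _, rest = vers.partition(":")
    if prefix.strip() != "vers":
        raise ValueError(f"not a vers string: {vers!r}")
    scheme, sep, constraints = rest.partition("/")
    if not sep:
        raise ValueError(f"missing '/' separator: {vers!r}")

    groups = []
    for alternative in constraints.split(","):       # comma separates ORs
        group = [
            split_comparator(constraint.strip())
            for constraint in alternative.split("&")  # ampersand separates ANDs
        ]
        groups.append(group)
    return scheme.strip(), groups


print(parse_vers("vers: semver / <=1.2.3, >2.2 & <2.5"))
# ('semver', [[('<=', '1.2.3')], [('>', '2.2'), ('<', '2.5')]])
```

Each inner list is one AND-group, and the outer list is the OR of those groups.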

VERSION-RANGE-SPEC.rst (outdated)
comparison semantics for dependencies. And for security advisories, the lack of
portable notation for vulnerable package version ranges means that these ranges
may be either ambiguous or hard to compute and are typically replaced by
complete enumerations of all versions, such as the NVD CPEs.
Collaborator

I don't think NVD CPEs provide complete enumerations.

Member Author

You are right, this is a bit more complex indeed.

  1. there are version ranges as part of the configuration https://github.com/nexB/univers/pull/11/files#diff-dda3c48de6da2f63bfa5902e41755b27a2f1d8c186a21dcc2641c42d4f5cfaebR450
  2. and there are daily recomputed concrete ranges available as a feed at https://nvd.nist.gov/vuln/data-feeds#cpeMatch ... which look to me like complete enumerations.

This feed contains both ranges and the enumerations. I wonder if they compute these or add them by hand?

VERSION-RANGE-SPEC.rst
such as star and caret.


Note that there is a closely related problem: the way two versions are compared
Collaborator

Nit: Maybe break down the problems into a bullet list? This sentence is hard to follow.

Member Author

Tell me if this is more readable with the latest push.

VERSION-RANGE-SPEC.rst (outdated)

The ``<versioning-scheme>`` (such as ``semver``,
``debian``, etc.) determines:

To my understanding, many of the version range schemas are not standards, they are policies or conventions. What happens when those conventions change over time? For example, if in a year, Debian changes how they represent relationships between packages, will that alter the meaning of every vers:debian being used?

Additionally, what happens when one of the schemes adopts another scheme? For example, if Maven were to adopt semver in the future?

Member Author

@stevespringett hey!
Thank you ++ for the feedback. That's a good point.

Even though there may not always be a spec or clear documentation, there is always code in the package management tools that provides something IMHO as good as a spec.

That said, to your point we cannot trust that upstream ecosystems will not change their ways.

A few thoughts:

  1. If they do this in a non-compatible way, this will wreak havoc on their ecosystem. Not a reassuring thought, but at least this makes these events less likely. But they can happen. See Go for instance.
  2. When I helped clean up the Linux kernel licensing using SPDX identifiers, there was a concern about what would happen if SPDX were to change. The solution was simply to incorporate the few bits we depended on in the kernel process documentation, thus making the doc self-standing (and this was a good decision, as some key license identifiers were deprecated midway through this work). To make it short, we can incorporate these external specs as needed to "freeze" them in this spec.
  3. And if they change in the future in a non-compatible way, they would essentially create a new versioning scheme, and we could also create a new identifier for this versioning scheme such as vers:debian2 or vers:maven-semver.

Unrelated, this makes me think that vers:semver is not a versioning scheme IMHO ... I need to update the text: node-semver is rather what is used in a package.json, i.e. the versions are semver but the version ranges are node-specific.

Member Author

Unrelated, this makes me think that vers:semver is not a versioning scheme IMHO ... I need to update the text: node-semver is rather what is used in a package.json, i.e. the versions are semver but the version ranges are node-specific.

I evolved this, and now there is almost a 1-to-1 mapping between a purl type and a vers scheme.

Just as another example of a convention that changed over time, take the most widely used database in the world (probably billions of instances, if only on our phones, including those from the pre-smartphone era): sqlite.

In 2015 it changed its versioning scheme to semver. And with a twist to make matters worse: the downloadable archives are not named sqlite-MAJOR.MINOR.PATCHLEVEL, but sqliteMAJORMINORPATCHLEVELWITHPADDINGZEROS. For example 3.37.0 is 3370000. (It also uses its own VCS, Fossil, not git, but that's another concern.)
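
The archive naming can be illustrated with a small sketch. It assumes each component after the major number is zero-padded to two digits, with a trailing two-digit field that is zero unless a fourth component is given; that layout is inferred from the single example above and should be treated as an assumption. `sqlite_archive_version` is a hypothetical name.

```python
def sqlite_archive_version(version):
    """Encode "3.37.0" as "3370000" (assumed 2-digit zero padding per field)."""
    parts = [int(part) for part in version.split(".")]
    if not 3 <= len(parts) <= 4:
        raise ValueError(f"expected 3 or 4 components: {version!r}")
    major, minor, patch = parts[:3]
    extra = parts[3] if len(parts) == 4 else 0  # assumed trailing field
    return f"{major}{minor:02d}{patch:02d}{extra:02d}"


print(sqlite_archive_version("3.37.0"))  # 3370000
```

A tool comparing such archive names with dotted versions would need this kind of mapping in both directions, which is one more ecosystem-specific variation.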

Clarify that extra characters in a star "*" range can be treated by
tools strictly or not.

Add URL.

Reported-by: Hritik Vijay <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
or feeds under "configurations"::

"versionStartIncluding": "7.3.0",
"versionEndExcluding": "7.3.31",
Member Author

While looking into https://nvd.nist.gov/vuln/data-feeds#cpeMatch there are also:

    "versionStartExcluding" : "9.0.0",
    "versionEndIncluding" : "9.0.46",

and concrete enumeration of versions updated daily.
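
As a hedged illustration of how these four NVD fields relate to comparators, a tool could map them as below. The field names come from the NVD data quoted here; `NVD_COMPARATORS` and `nvd_match_to_constraints` are hypothetical names, and the input is assumed to be a plain dict taken from one match entry.

```python
# Assumed mapping from NVD range fields to comparator strings.
NVD_COMPARATORS = {
    "versionStartIncluding": ">=",
    "versionStartExcluding": ">",
    "versionEndIncluding": "<=",
    "versionEndExcluding": "<",
}


def nvd_match_to_constraints(cpe_match):
    """Return the (comparator, version) pairs present in one match entry."""
    return [
        (comparator, cpe_match[field])
        for field, comparator in NVD_COMPARATORS.items()
        if field in cpe_match
    ]


print(nvd_match_to_constraints(
    {"versionStartIncluding": "7.3.0", "versionEndExcluding": "7.3.31"}
))
# [('>=', '7.3.0'), ('<', '7.3.31')]
```

The pairs from one entry all have to hold together, so they form a single AND-group.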

Reported-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Reported-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Reported-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
node-semver is a range notation, while semver is a version string
syntax.

Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
These recent and closely related specs (they share authors) provide
another approach to documenting vulnerable ranges.

Signed-off-by: Philippe Ombredanne <[email protected]>
@pombredanne
Member Author

I have been playing more with the code to implement this, and what transpires is that there seems to be no such thing as a "versioning scheme" distinct from a purl package type. Even where I thought that semver might apply to several ecosystems, each seems to have its own small variations. Most code would be shared, but each package type may still define its own overall scheme... so I am inclined to reuse the purl type instead of the versioning scheme specified here.

Use pipe and comma separators and drop using ampersand.
This avoids a possible ambiguity when using a vers as a query string
argument in a URL (such as with a Package URL qualifier).

Signed-off-by: Philippe Ombredanne <[email protected]>
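
The ambiguity mentioned in this commit message can be seen with Python's standard `urllib.parse.parse_qs`. This is only a sketch: the query strings are made-up examples, left unencoded on purpose, and the `vers` argument name is an assumption.

```python
from urllib.parse import parse_qs

# An ampersand inside the value is read as a query-argument separator,
# so the "<2.5" part is split off and dropped by the parser.
with_ampersand = parse_qs("vers=>2.2&<2.5")

# A comma is not a separator, so the whole value survives.
with_comma = parse_qs("vers=>2.2,<2.5")

print(with_ampersand)  # {'vers': ['>2.2']}
print(with_comma)      # {'vers': ['>2.2,<2.5']}
```

Percent-encoding the ampersand would also avoid the split, but a notation without it does not depend on every producer encoding correctly.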
Improve documentation for existing versioning schemes
Explain why we do not use mathematical intervals
Fix minor typos and improve formatting

Signed-off-by: Philippe Ombredanne <[email protected]>
@pombredanne
Member Author

I am merging this now here to get closure and will start a new PR at https://github.com/package-url/purl-spec for this spec

5 participants