From 69b67709958d1fcb3598530a75b680b099b832ff Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Tue, 30 Nov 2021 15:08:38 +0100 Subject: [PATCH 01/21] Add mostly universal version range spec draft Originally at https://github.com/nexB/univers/pull/11 Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 656 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 656 insertions(+) create mode 100644 VERSION-RANGE-SPEC.rst diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst new file mode 100644 index 0000000..9ebc644 --- /dev/null +++ b/VERSION-RANGE-SPEC.rst @@ -0,0 +1,656 @@ +====================================================== +vers: a mostly universal version range specifier +====================================================== + +This specification is a new syntax for dependency and vulnerable version ranges. + + +Context +-------- + +Software package version ranges and version constraints are essential: + +- When resolving the dependencies of a package to express which subset of the + versions are supported. For instance a dependency or requirement statement + such as "I require package foo, version 2.0 or later versions" defines a + range of acceptable foo versions. + +- When stating that a known vulnerability or bug affects a range of package + versions. For instance a security advisory such as "vulnerability 123 affects + package bar, version 3.1 and version 4.2 but not version 5" defines a range of + vulnerable "bar" package versions. + +Version ranges can be replaced by a list enumerating all the versions of +interest. But in practice, all the versions may not yet exist when defining an +open version range such as "v2.0 or later". + +Therefore, a version range is a necessary, compact and practical way to +reference multiple versions rather than listing all the versions. 
+ + +Problem +-------- + +Several version range notations exist and have evolved separately to serve the +specific needs of each package ecosystem, vulnerability databases and tools. + +There is no (mostly) universal notation for version ranges and there is no +universal way to compare two versions, even though the concepts that exist in +most version range notations are similar. + +Each package type or ecosystem may define their own ranges notation and version +comparison semantics for dependencies. And for security advisories, the lack of +a portable and compact notation for vulnerable package version ranges means that +these ranges may be either ambiguous or hard to compute and may be best replaced +by complete enumerations of all impacted versions, such as in the `NVD CPE Match +feed `_. + +Because of this, expressing and resolving a version range is often a complex, or +error prone task. + +In particular the need for common notation for version has emerged based on the +usage of Package URLs referencing vulnerable package version ranges such as in +vulnerability databases like `VulnerableCode +`_. + +To better understand the problem, here are some of the notations and conventions +in use: + +- ``semver`` https://semver.org/ is a popular specification to structure version + strings, but does not provide a way to express version ranges. + +- Rubygems strongly suggest using ``semver`` for version but does not enforce it. + As a result some use semver and several popular package do not use strict + semver. Rubygems use their own notation for version ranges which ressembles + the ``node-semver`` notation with some subtle differences. + See https://guides.rubygems.org/patterns/#semantic-versioning + +- ``node-semver`` ranges are used in npm at https://github.com/npm/node-semver#ranges + with range semantics that are specific to ``semver`` and npm. 
+ +- Dart pub versioning scheme is similar to ``node-semver`` and the documentation + at https://dart.dev/tools/pub/versioning provides a comprehensive coverage of + the topic of versioning. Version resolution uses its own algorithm. + +- Python uses its own version and version ranges notation with notable + specificities on how pre- and post-release suffixes are used + https://www.python.org/dev/peps/pep-0440/ + +- Debian and Ubuntu use their own notation and are remarkable for their use of + ``epochs`` to disambiguate versions. + https://www.debian.org/doc/debian-policy/ch-relationships.html + +- RPM distros use their own range notation and use epochs like Debian. + https://rpm-software-management.github.io/rpm/manual/dependencies.html + +- Perl CPAN defines its own version range notation similar to this specification + and uses two-segment versions. https://metacpan.org/pod/CPAN::Meta::Spec#Version-Ranges + +- Apache Maven and NuGet use similar math intervals notation using brackets + https://en.wikipedia.org/wiki/Interval_(mathematics) + + - Apache Maven http://maven.apache.org/enforcer/enforcer-rules/versionRanges.html + - NuGet https://docs.microsoft.com/en-us/nuget/concepts/package-versioning#version-ranges + +- gradle uses Apache Maven notation with some extensions + https://docs.gradle.org/current/userguide/single_versions.html + +- Gentoo and Alpine Linux use comparison operators similar to this specification: + - Gentoo https://wiki.gentoo.org/wiki/Version_specifier + - Alpine linux https://gitlab.alpinelinux.org/alpine/apk-tools/-/blob/master/src/version.c + +- Arch Linux https://wiki.archlinux.org/title/PKGBUILD#Dependencies use its + own simplified notation for its PKGBUILD depends array. + +- Go modules https://golang.org/ref/mod#versions use semver versions with + specific version resolution algorithms.
+ +- Haskell Package Versioning Policy https://pvp.haskell.org/ provides a notation + similar to this specification based on a modified semver with extra notations + such as star and caret. + +- The NVD https://nvd.nist.gov/vuln/data-feeds#cpeMatch defines CPE ranges as + lists of version start and end either including or excluding the start or end + version. And also provides a concrete enumeration of the available ranges as + a daily feed. + +- The version 5 of the NVD CVE JSON data format at + https://github.com/CVEProject/cve-schema/blob/master/schema/v5.0/CVE_JSON_5.0.schema#L303 + defines version ranges with a starting version, a versionType, and an upper + limit for the version range as lessThan or lessThanOrEqual. Or an enumeration + of versions. The versionType is defined as ``"The version numbering system + used for specifying the range. This defines the exact semantics of the + comparison (less-than) operation on versions, which is required to understand + the range itself"``. + +- The OSSF OSV schema https://ossf.github.io/osv-schema/ defines vulnerable + ranges with version events using "introduced" and "limit" fields and an + enumeration of all the versions in these ranges, except for semver-based + versions. A range may be ecosystem-specific based on a provided package + "ecosystem" value that ressembles closely the Package URL package "type". + + +The way two versions are compared as equal, lesser or greater is a closely +related topic: + +- Each package ecosystem may have evolved its own peculiar version string + conventions, semantics and comparison procedure. + +- For instance, ``semver`` is a prominent specification in this domain but this is + just one of the many ways to structure a version string. + +- Debian, RPM, PyPI, Rubygems, and Composer have their own subtly different + approach on how to determine which version is greater or lesser. 
+ + +Solution +--------- + +A solution to the many version range syntaxes is to design a new notation to +unify them all with: + +- a mostly universal and minimalist, compact notation to express version ranges + from many different package types and ecosystems. + +- the package type-specific definitions to normalize existing range expressions + to this common notation. + +- the designation of which algorithm or procedure to use when comparing two + versions such that it is possible to resolve if a version is within or + outside of a version range. + +We call this solution "version range specifier" or "vers" and it is described +in this document. + + +Version range specifier +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A version range specifier (aka. "vers") is a URI string using the ``vers`` +URI-scheme with this syntax:: + + vers:/|,... + +For example to define a set of versions that contains either version ``1.2.3``, +or any versions greater than or equal to ``2.0.0`` but less than ``5.0.0`` using +the ``node-semver`` versioning scheme, the version range specifier will be:: + + vers:npm/1.2.3|>=2.0.0,<5.0.0 + +Each ```` in the pipe-separated list is either a simple +constraint such as:: + + + +Or a composite constraint grouping multiple ```` joined by +a comma such as:: + + ,... + +The pipe is a logical `OR` and the comma is a logical `AND`. + +A version range specifier is therefore an "OR of ANDs" where there are two +levels of constraints that a version should satisfy to be part of the range: + +- At the first level, anyone of the constraints should be satisfied +- At the second level, all of the constraints must be satisfied + +This is also called a "disjunctive normal form" in boolean logic. +See https://en.wikipedia.org/wiki/Disjunctive_normal_form for details. + +``vers`` is the URI-scheme and is an acronym for "VErsion Range Specifier". 
It +has been selected because it is short, obviously about version and available +for a future formal registration for this URI-scheme at the IANA registry. + + +```` +------------------------ + +The ```` (such as ``npm``, ``deb``, etc.) determines: + +- the specific notation and conventions used for a version string encoded in + this scheme. Versioning schemes often specify a version segments separator and + the meaning of each version segments, such as [major.minor.patch] in semver. + +- how two versions are compared as greater or lesser to determine if a version + is within or outside a range. + +- how a versioning scheme-specific range notation can be transformed in the + ``vers`` simplified notation defined here. + +- by convention the versioning scheme should be the same string as the Package + URL package type for a given package ecosystem. It is OK to have other schemes + beyond the purl type and even schemes that are specific to a single package. + +The ```` is followed by a slash "/". + + +```` +---------------------------- + +After the ```` and "/" there are one or more +```` separated by a pipe "|". The pipe "|" means that +**any** of these constraints must be satisfied for a version to be resolved as +within this version range. + +Each ```` of this pipe-separated list can be either a +single constraint or a list of constraints separated in turn by an comma "," as +in ``1.2.3|>=2.0.0,<5.0.0``. + +Multiple ```` combined with a comma means that **all** these +constraints must be satisfied for a version to be resolved as contained in this +range. + +Each simple version constraint has this syntax:: + + + +The ```` is one of these comparison operators: + +- "=": Version equality comparator. It is the default and implied if not + present and means that a version must be equal to the provided version. + For example: "=1.2.3". It must be omitted in the canonical representation. 
+ Equality is based on the equality of two lower-cased and normalized version + strings and is typically not versioning scheme-specific, though some + scheme such as pypi PEP440 may apply some version string normalization + before testing for equality. + +- "!=": Version exclusion or inequality comparator. This means a version must + not be equal to the provided version and this version must be excluded from + the range. For example: "!=1.2.3" means that version "1.2.3" is excluded. + +- "<", "<=": Less than or less-or-equal version comparators points to all + versions less than or equal to the provided version. For example "<=1.2.3" + means less than or equal to "1.2.3". + +- ">", ">=": Greater than or greater-or-equal version comparators points to + all versions greater than or equal to the provided version. For example + ">=1.2.3" means greater than or equal to "1.2.3". + +- The way two version strings are compared using these comparators is defined + by the ````. + +- The structure and meaning of a version string such as "1.2.3" is defined by + the ````. For instance, ``semver`` defines three + dot-separated segments name major, minor and patch. + +- The special star "*" ```` matches any version. This star + constraint must be used **alone** in a version range, exclusive of any other + constraint. For example "vers:deb/\*" resolves to any version of a Debian + package. + +- The way each of these comparators work when doing a version comparison is + specific to a versioning scheme. + + +Examples +~~~~~~~~~ + +TODO. + + +Normalized or canonical representation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- A version range specifier contains only printable ASCII letters, digits and + punctuation. + +- Spaces are not significant and are removed in the canonical form. For example + "!=1.2.3" and " ! = 1.2. 3" are equivalent. And so are "1.2.3 & < = 2.0.0" and + "1.2.3&<=2.0.0" + +- A version range specifier is case-insensitive and lowercase in canonical form. 
+ +- The ordering of multiple ```` in a range specifier is not + significant. The canonical ordering is by sorting these by lexicographical + order applied with this two-step approach: + + - first to each sub-list of comma-separated ````. + - then to the top level list of pipe-separated ````. + +- A version in a ```` can only contain printable ASCII + characters excluding the special characters used as separators and comparators + ``><=!,&*|``. If it contains special characters (which should be rare in + practice) the version string in a constraint must be quoted using the URL + quoting rules. + + +Using version range specifiers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``vers`` primary usage is to test if a version is within or outside a range. + +A version is within a version range if it satisfies or is "contained" in +**any one** of the first level of constraints. To satisfy or be "contained" in +a first level constraint, a version must satisfy or be "contained" in +**all** the second level of constraints. Otherwise, the input version is outside +of the version range. + +Some important usages derived from this primary usage include: + +- **Resolving a version range specifier to a list of concrete versions.** + In this case, the input is the set of known versions of a package (typically + obtained from some package repository or registry). Each version is then + tested individually to check if it is within or outside the range. For + example, this can be used to determine which existing package versions are + affected by a known vulnerability if they match the vulnerability version + range specifier. + +- **Selecting one of several versions that are within a range.** + In this case, given several versions that are within a range and several + packages that each expression inter dependencies together with version ranges, + package management tools need to determine and select a set of package versions + that satisfy all the version ranges of all dependencies.
This usually requires + deploying heuristics and algorithms (possibly complex such as sat solvers) + that are ecosystem- and tool-specific and outside of the scope for this + specification; yet ``vers`` could be used in tandem with ``purl`` to provide + an input to a dependencies resolution process. + + +Parsing version range specifiers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To parse a version range specifier string: + +- Remove all spaces and tabs. +- Start from left, and split once on colon ":". +- The left hand side is the URI-scheme that must be lowercase. + - Verify that the URI-scheme value is ``vers``. +- The right hand side is the specifier. + +- Split the specifier from left once on a slash "/". +- The left hand side is the that must be lowercase. +- The right hand side is a list of one or more constraints. + +- If the constraints string is equal to "*", the is "*". + Parsing is done and no further processing is needed for this ``vers``. A tool + may be strict and report an error if there are extra characters beyond "*" or + be lenient. + +- Split the constraints on pipe "|". The result is a list of top-level + lists. Consecutive pipes should be treated as one. + +- For each list: + + - Split on comma ",". Consecutive commas should be treated as one. The result + is a sub-list of . + + - For each in this sub-list: + + - Identify the comparator and version based on the + start of the in this sequence: + + - If it starts with "=", then the comparator is "=" + - If it starts with "!=", then the comparator is "!=". + - If it starts with "<=", then the comparator is "<=". + - If it starts with ">=", then the comparator is ">=". + - If it starts with "<", then the comparator is "<". + - If it starts with ">", then the comparator is ">". + - Else the comparator is "=" (default) and the + version is the full string. + + - After the operation and removing the comparator from + string, the remaining string is the version. 
+ + - Validate that the version is not empty. + + - If the version contains a percent "%" character, apply URL quoting rules + to unquote this string. + + - Append the comparator and version of this constraint to the inner list + of constraints. + + - Append the accumulated list of (comparator and version) that must apply to + the top level list of constraints. + +- Finally return the and the nested list of + + +Notes and caveats +~~~~~~~~~~~~~~~~~~~ + +- Comparing versions from two different versioning schemes is unspecified. Even + though there may be some similarities between the ``semver`` version of an npm + package and the `debian` version of its Debian packaging, these similarities are + specific to each versioning scheme. Tools should report an error in these + cases as it does not make sense to perform these comparisons. + + +Some of the known versioning schemes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +TODO: add details on how to convert to and from ``vers`` for a given versioning +scheme and package type. + +- **deb**: Debian and Ubuntu https://www.debian.org/doc/debian-policy/ch-relationships.html + The comparators are <<, <=, =, >= and >>. + +- **rpm**: RPM distros https://rpm-software-management.github.io/rpm/manual/dependencies.html + The version comparison routine of rpmvercmp is also used by archlinux Pacman. + +- **gem**: Rubygems https://guides.rubygems.org/patterns/#semantic-versioning + which is almost but not exactly ``node-semver``. + +- **npm**: npm uses node-semver which is based on semver with its own range + notation https://github.com/npm/node-semver#ranges + A similar but different scheme is used by Rust + https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html + and several other package types may use ``node-semver``-like ranges. But most + of these related schemes are not strictly the same as what is implemented in + ``node-semver``.
For instance PHP ``composer`` may need its own scheme as this + is not strictly ``node-semver``. + +- **pypi**: Python https://www.python.org/dev/peps/pep-0440/ + +- **cpan**: Perl https://perlmaven.com/how-to-compare-version-numbers-in-perl-and-for-cpan-modules + +- **go**: Go modules https://golang.org/ref/mod#versions use semver versions + with a specific minimum version resolution algorithm. + +- **maven**: Apache Maven http://maven.apache.org/enforcer/enforcer-rules/versionRanges.html + +- **nuget**: NuGet https://docs.microsoft.com/en-us/nuget/concepts/package-versioning#version-ranges + Note that Apache Maven and NuGet are following a similar approach with a + math-derived intervals syntax as in https://en.wikipedia.org/wiki/Interval_(mathematics) + +- **gentoo**: Gentoo https://wiki.gentoo.org/wiki/Version_specifier + +- **alpine**: Alpine linux https://gitlab.alpinelinux.org/alpine/apk-tools/-/blob/master/src/version.c + which is using Gentoo-like conventions. + +- **generic**: a generic version comparison algorithm (which is TBD, likely a + split on punctuation and dealing with digit vs. strings comparisons, like is + done in libversion) + +TODO: add Rust, composer and archlinux + + +Implementations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- Python: https://github.com/nexB/univers +- Yours! + + +Why not reuse existing version range notations? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Most existing version range notations are tied to a specific version string +syntax and are therefore not readily applicable to other contexts. For example, +the use of elements such as tilde and caret ranges in Rubygems, npm or Dart +notations implies that a certain structure exists in the version string (semver +or semver- like). The inclusion of these additional comparators is a result of +the history and evolution in a given package ecosystem to address specific needs. 
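To illustrate why a caret range presupposes a semver-like structure, here is a minimal sketch that rewrites a ``node-semver`` style caret range as two simple comparators joined by a comma. The function name ``expand_caret`` is hypothetical and not part of any library; the sketch assumes a full three-segment numeric version with no pre-release or build suffix.

```python
def expand_caret(caret_range: str) -> str:
    """Rewrite a caret range such as "^1.2.3" as ">=lower,<upper".

    Assumption: the version has exactly three numeric, dot-separated
    segments (major.minor.patch). Anything else raises ValueError,
    which is the point: the shorthand only makes sense for versions
    that have this structure.
    """
    version = caret_range.lstrip("^")
    major, minor, patch = (int(part) for part in version.split("."))

    # The upper bound bumps the left-most non-zero segment.
    if major > 0:
        upper = f"{major + 1}.0.0"
    elif minor > 0:
        upper = f"0.{minor + 1}.0"
    else:
        upper = f"0.0.{patch + 1}"

    return f">={major}.{minor}.{patch},<{upper}"


if __name__ == "__main__":
    for example in ("^1.2.3", "^0.2.3", "^0.0.3"):
        print(example, "->", expand_caret(example))
```

A version string that does not split into three numeric segments cannot be expanded this way, whereas the resulting ``>=`` and ``<`` constraints only need an ordering on versions.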
+ +In practice, the unified and reduced set of comparators and syntax defined for +``vers`` has been designed such that all these notations can be converted to a +``vers`` and back from a ``vers`` to the original notation. + +In contrast, this would not be possible with existing notations. For instance, +the Python notation may not work with npm semver versions and vice versa. + +There are likely to be a few rare cases where round tripping from and to +``vers`` may not be possible, and in any case round tripping to and from ``vers`` +should produce equivalent results even if not strictly the same original +strings. + +Another issue with existing version range notations is that they are primarily +meant to be used for dependency constraints and may not readily be reusable for +the definitions of vulnerable ranges. In particular, a vulnerability may exist +for multiple "version branches" of a given package such as with Django 2.x and +3.x. Several version range notations have difficulty expressing these +as typically all the version constraints must be satisfied. In contrast, +a vulnerability can affect multiple disjoint version ranges of a package and any +version satisfying these constraints would be vulnerable: it may not be possible +to express this with a notation designed exclusively for dependency version +resolution. + + +Why not use the NVD CPE Ranges? +############################### + +See: + +- https://nvd.nist.gov/vuln/vulnerability-detail-pages#divRange +- https://nvd.nist.gov/developers/vulnerabilities#divResponse +- https://csrc.nist.gov/schema/nvd/feed/1.1/nvd_cve_feed_json_1.1.schema + +The version ranges notation defined in the JSON schema of the CVE API payload +uses these four fields: ``versionStartIncluding``, ``versionStartExcluding``, +``versionEndIncluding`` and ``versionEndExcluding``.
For example:: + + "versionStartIncluding": "7.3.0", + "versionEndExcluding": "7.3.31", + "versionStartExcluding" : "9.0.0", + "versionEndIncluding" : "9.0.46", + +In addition to these ranges, the NVD publishes a list of concrete CPE with +versions resolved for a range with daily updates at +https://nvd.nist.gov/vuln/data-feeds#cpeMatch + +Note that the NVD CVE configuration is a complex specification that goes well +beyond version ranges and is used to match comprehensive configurations across +multiple products and version ranges. ``vers`` focus is exclusively versions. + +The NVD JSON notation is verbose in contrast with ``vers`` that attempts to +provide a compact notation. It provides the same =, <=, < and > comparators +specified in ``vers`` and found in other notations. + + +Why not use node-semver ranges? +############################### + +See: + +- https://github.com/npm/node-semver#ranges + +The node-semver spec is similar to this spec but is an AND of ORs constraints +with a few practical issues: + +- The space means "AND", therefore whitespaces are significant. Having + significant whitespaces in a string makes normalization more complicated and + may be a source of confusion if you remove the spaces from the string. Using + a comma as an "AND" operator in ``vers`` makes this explicit and avoids the + ambiguity of a space. + +- There is no negation "!=" operator meaning that some version constraints are + difficult to express and require combining < and > comparators. For instance + stating that a vulnerability affects babel 6.2 or later but not babel 7.0 is + possible but complicated. + +- The advanced range syntax has grown to be rather complex using hyphen, stars, + carets and tilde constructs that are all tied to the JavaScript and npm ways + of handling versions in their ecosystem and are bound furthermore to the + semver semantics and its npm implementation. 
These are not readily reusable + elsewhere and these extended multiple comparators and modifiers make the + notation grammar more complex to parse for a machine and harder to read for + human. + +Notations that are directly derived from node-semver as used in Rust and PHP +Composer have the same issues. + + +Why not use Python pep-0440 ranges? +##################################### + +See: + +- https://www.python.org/dev/peps/pep-0440/#version-specifiers + +The Python pep-0440 "Version Identification and Dependency Specification" +provides a comprehensive specification for Python package versioning and a +notation for "version specifiers" to express the version constraints of +dependencies. + +This specification is similar to this ``vers`` spec, but has a richer notation +with some aspects specific to the versions used only in the Python ecosystem. + +- In particular pep-0440 uses tilde, triple equal and wildcard star operators + that are specific to how two Python versions are compared. + +- The comma separator between constraints is a logical "AND" rather than an + "OR". The "OR" does not exist in the syntax making some version ranges + harder to express, in particular for vulnerabilities that may affect several + exact versions or version ranges such as when there are multiple release + branches that exist in parallel. For instance a statement such as: Django 1.2 + or later, or Django 2.2 or later or Django 3.2 or later is difficult to + express without an "OR" logic. + + +Why not use Rubygems requirements notation? +############################################### + +See: + +- https://guides.rubygems.org/patterns/#declaring-dependencies + +The rubygems specification suggests but does not enforce using semver. It is +similar to this spec's operators with the addition of the "~>" aka. pessimistic +operator or tilde-wakka which is similar to the "tilde" used in node-semver and +implies semver versioning. 
This makes the notation impractical to reuse +in places that do not use the same semver-like semantics. + + +Why not use fancier comparators such as a tilde, caret and star? +################################################################## + +Several existing notations such as used with npm, gem or python or composer +provide syntactic shorthands such as: + +- a tilde prefix or ~> prefix or ~= as in "~1.3" or "~>1.2.3" +- a caret ^ prefix as in "^ 1.2" +- using a star in a segment of a version as in "1.2.*" +- dash-separated ranges as in "1.2 - 1.4" + +These range syntaxes can typically be reduced to a set of simpler operators. +Furthermore they are designed for the structure of a version string (most often +semver) as used in one ecosystem and therefore are not reusable in another +ecosystem that does not use the same version string conventions. + + +Why not use mathematical interval notation for ranges? +####################################################### + +Apache Maven and NuGet make use of a mathematical interval with "[" and ")" as a +syntax for version ranges. + +All other notations use >, <, and = as base symbols for ranges. ``vers`` +reuses this approach because it is more common across package ecosystems.
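The parsing steps and the "OR of ANDs" containment test described earlier in this draft can be sketched in Python as follows. This is a minimal, non-normative sketch written against the pipe and comma syntax of this draft. The names ``parse_vers``, ``contains`` and ``dotted_key`` are hypothetical and are not the univers API. ``dotted_key`` is a stand-in for a real versioning scheme-specific comparison: it assumes purely numeric, dot-separated versions, and equality is tested on that key rather than on the raw version string.

```python
import operator
from urllib.parse import unquote

# Two-character comparators are listed first so that "<=" is not read as "<".
COMPARATORS = ("!=", "<=", ">=", "=", "<", ">")

TESTS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def parse_vers(vers):
    """Return (versioning_scheme, constraints).

    constraints is "*" or a list (OR) of lists (AND) of
    (comparator, version) tuples.
    """
    vers = vers.replace(" ", "").replace("\t", "")
    uri_scheme, _, specifier = vers.partition(":")
    if uri_scheme.lower() != "vers":
        raise ValueError(f"not a vers string: {vers!r}")

    scheme, _, constraints = specifier.partition("/")
    scheme = scheme.lower()
    if constraints == "*":
        return scheme, "*"

    parsed = []
    # filter(None, ...) drops empty items, so consecutive separators act as one.
    for group in filter(None, constraints.split("|")):
        inner = []
        for constraint in filter(None, group.split(",")):
            comparator = next(
                (c for c in COMPARATORS if constraint.startswith(c)), "")
            version = constraint[len(comparator):]
            if not version:
                raise ValueError(f"empty version in {constraint!r}")
            if "%" in version:
                version = unquote(version)
            # "=" is the default comparator when none is present.
            inner.append((comparator or "=", version))
        if inner:
            parsed.append(inner)
    if not parsed:
        raise ValueError(f"no constraints in {vers!r}")
    return scheme, parsed


def dotted_key(version):
    """Assumed comparison key: numeric dot-separated segments only."""
    return tuple(int(segment) for segment in version.split("."))


def contains(constraints, version, key=dotted_key):
    """True if version satisfies all constraints of at least one group."""
    if constraints == "*":
        return True
    candidate = key(version)
    return any(
        all(TESTS[comparator](candidate, key(bound))
            for comparator, bound in group)
        for group in constraints
    )


if __name__ == "__main__":
    scheme, constraints = parse_vers("vers:npm/1.2.3|>=2.0.0,<5.0.0")
    print(scheme, constraints)
    for candidate in ("1.2.3", "1.2.4", "3.1.4", "5.0.0"):
        print(candidate, contains(constraints, candidate))
```

Passing the comparison key as a parameter keeps the range logic independent of any one versioning scheme, which mirrors how the draft delegates version comparison to the versioning scheme.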
+ + +References +~~~~~~~~~~~~~~~~~~~~ + +Here are some of the discussions that led to the creation of this specification: + +- https://github.com/package-url/purl-spec/issues/66 +- https://github.com/package-url/purl-spec/issues/84 +- https://github.com/package-url/purl-spec/pull/93 +- https://github.com/nexB/vulnerablecode/issues/119 +- https://github.com/nexB/vulnerablecode/issues/140 +- https://github.com/nexB/univers/pull/11 + +License +~~~~~~~ + +This document is licensed under the MIT license From dceee6c3714318bec7dd599881336543f2212138 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Tue, 30 Nov 2021 17:43:03 +0100 Subject: [PATCH 02/21] Update VERSION-RANGE-SPEC.rst Correct typo and meaning Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 9ebc644..a4d042c 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -338,7 +338,7 @@ Some important usages derived from this primary usage include: - **Selecting one of several versions that are within a range.** In this case, given several versions that are within a range and several - packages that each expression inter dependencies together with version ranges, + packages that express dependencies on other packages qualified by version ranges, package management tools need to determine and select a set of package versions that satisfy all the version ranges of all dependencies. This usually requires deploying heuristics and algorithms (possibly complex such as sat solvers) From 69b5dd2fe73dda3e0156d7c2d4eef226b66adb6d Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Tue, 30 Nov 2021 17:55:33 +0100 Subject: [PATCH 03/21] Clarify NVD CVE 5.0 versionType relationship This is the same concept as a Package URL type.
Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index a4d042c..c24877c 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -121,7 +121,7 @@ in use: of versions. The versionType is defined as ``"The version numbering system used for specifying the range. This defines the exact semantics of the comparison (less-than) operation on versions, which is required to understand - the range itself"``. + the range itself"``. A versionType ressembles closely the Package URL package "type". - The OSSF OSV schema https://ossf.github.io/osv-schema/ defines vulnerable ranges with version events using "introduced" and "limit" fields and an From ca1aba63431cd052cad13a41af1f9c881a467f31 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Thu, 2 Dec 2021 12:12:09 +0100 Subject: [PATCH 04/21] Remove leftover ampersands These are no longer used: we now use a comma. Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index c24877c..4c1d407 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -296,8 +296,8 @@ Normalized or canonical representation punctuation. - Spaces are not significant and are removed in the canonical form. For example - "!=1.2.3" and " ! = 1.2. 3" are equivalent. And so are "1.2.3 & < = 2.0.0" and - "1.2.3&<=2.0.0" + "!=1.2.3" and " ! = 1.2. 3" are equivalent. And so are "1.2.3 , < = 2.0.0" and + "1.2.3,<=2.0.0" - A version range specifier is case-insensitive and lowercase in canonical form. From c146f1690b73b99a9e1658069278b3afc7a67dfe Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Fri, 3 Dec 2021 11:34:40 +0100 Subject: [PATCH 05/21] Do not lowercase versions This may be a problem in some cases. Best is to keep the version as-is even in canonical form.
Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 4c1d407..292d942 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -250,10 +250,9 @@ The ```` is one of these comparison operators: - "=": Version equality comparator. It is the default and implied if not present and means that a version must be equal to the provided version. For example: "=1.2.3". It must be omitted in the canonical representation. - Equality is based on the equality of two lower-cased and normalized version - strings and is typically not versioning scheme-specific, though some - scheme such as pypi PEP440 may apply some version string normalization - before testing for equality. + Equality is based on the equality of two version strings and is typically + not versioning scheme-specific, though some schemes such as pypi PEP440 + may require a version string normalization before testing for equality. - "!=": Version exclusion or inequality comparator. This means a version must not be equal to the provided version and this version must be excluded from @@ -299,7 +298,10 @@ Normalized or canonical representation "!=1.2.3" and " ! = 1.2. 3" are equivalent. And so are "1.2.3 , < = 2.0.0" and "1.2.3,<=2.0.0" -- A version range specifier is case-insensitive and lowercase in canonical form. +- The case sensitivity of a version in a version range specifier is defined by + its versioning scheme. In canonical form, a version is case-sensitive. + +- The URI scheme and versioning scheme are always lowercase as in ``vers:npm``. - The ordering of multiple ```` in a range specifier is not significant. The canonical ordering is by sorting these by lexicographical @@ -354,12 +356,12 @@ To parse a version range specifier string: - Remove all spaces and tabs. - Start from left, and split once on colon ":". 
-- The left hand side is the URI-scheme that must be lowercase. +- The left hand side is the URI-scheme that must be lowercased. - Verify that the URI-scheme value is ``vers``. - The right hand side is the specifier. - Split the specifier from left once on a slash "/". -- The left hand side is the that must be lowercase. +- The left hand side is the that must be lowercased. - The right hand side is a list of one or more constraints. - If the constraints string is equal to "*", the is "*". From 2145b735c7820875a9a2c0491817940b1e29409e Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Fri, 10 Dec 2021 00:43:55 +0100 Subject: [PATCH 06/21] Adopt simpler semantics This new take is based on the proposal from Oliver Chang. The syntax is simpler and much improved. Reference: https://github.com/package-url/purl-spec/pull/139#discussion_r764551670 Reported-by: Oliver Chang Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 431 ++++++++++++++++++++++------------------- 1 file changed, 230 insertions(+), 201 deletions(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 292d942..5fe8983 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -60,9 +60,9 @@ in use: strings, but does not provide a way to express version ranges. - Rubygems strongly suggest using ``semver`` for version but does not enforce it. - As a result some use semver and several popular package do not use strict - semver. Rubygems use their own notation for version ranges which ressembles - the ``node-semver`` notation with some subtle differences. + As a result some gem use semver while several popular package do not use + strict semver. Rubygems use their own notation for version ranges which + ressembles the ``node-semver`` notation with some subtle differences. 
See https://guides.rubygems.org/patterns/#semantic-versioning - ``node-semver`` ranges are used in npm at https://github.com/npm/node-semver#ranges @@ -100,9 +100,10 @@ in use: - Alpine linux https://gitlab.alpinelinux.org/alpine/apk-tools/-/blob/master/src/version.c - Arch Linux https://wiki.archlinux.org/title/PKGBUILD#Dependencies use its - own simplified notation for its PKGBUILD depends array. + own simplified notation for its PKGBUILD depends array and use a modified + RPM version comparison. -- Go modules https://golang.org/ref/mod#versions use semver versions with +- Go modules https://golang.org/ref/mod#versions use ``semver`` versions with specific version resolution algorithms. - Haskell Package Versioning Policy https://pvp.haskell.org/ provides a notation @@ -117,17 +118,19 @@ in use: - The version 5 of the NVD CVE JSON data format at https://github.com/CVEProject/cve-schema/blob/master/schema/v5.0/CVE_JSON_5.0.schema#L303 defines version ranges with a starting version, a versionType, and an upper - limit for the version range as lessThan or lessThanOrEqual. Or an enumeration + limit for the version range as lessThan or lessThanOrEqual; or an enumeration of versions. The versionType is defined as ``"The version numbering system used for specifying the range. This defines the exact semantics of the comparison (less-than) operation on versions, which is required to understand - the range itself"``. A versionType ressembles closely the Package URL package "type". + the range itself"``. A "versionType" ressembles closely the Package URL package + "type". - The OSSF OSV schema https://ossf.github.io/osv-schema/ defines vulnerable - ranges with version events using "introduced" and "limit" fields and an - enumeration of all the versions in these ranges, except for semver-based - versions. A range may be ecosystem-specific based on a provided package - "ecosystem" value that ressembles closely the Package URL package "type". 
+ ranges with version events using "introduced", "fixed" and "limit" fields and + an optional enumeration of all the versions in these ranges, except for + semver-based versions. A range may be ecosystem-specific based on a provided + package "ecosystem" value that ressembles closely the Package URL package + "type". The way two versions are compared as equal, lesser or greater is a closely @@ -136,8 +139,8 @@ related topic: - Each package ecosystem may have evolved its own peculiar version string conventions, semantics and comparison procedure. -- For instance, ``semver`` is a prominent specification in this domain but this is - just one of the many ways to structure a version string. +- For instance, ``semver`` is a prominent specification in this domain but this + is just one of the many ways to structure a version string. - Debian, RPM, PyPI, Rubygems, and Composer have their own subtly different approach on how to determine which version is greater or lesser. @@ -146,14 +149,14 @@ related topic: Solution --------- -A solution to the many version range syntaxes is to design a new notation to -unify them all with: +A solution to the many version range syntaxes is to design a new simplified +notation to unify them all with: - a mostly universal and minimalist, compact notation to express version ranges from many different package types and ecosystems. - the package type-specific definitions to normalize existing range expressions - to this common notation. + in this common notation. - the designation of which algorithm or procedure to use when comparing two versions such that it is possible to resolve if a version is within or @@ -164,44 +167,47 @@ in this document. Version range specifier -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +~~~~~~~~~~~~~~~~~~~~~~~~ A version range specifier (aka. "vers") is a URI string using the ``vers`` URI-scheme with this syntax:: - vers:/|,... + vers:/||... 
-For example to define a set of versions that contains either version ``1.2.3``, +For example ,to define a set of versions that contains either version ``1.2.3``, or any versions greater than or equal to ``2.0.0`` but less than ``5.0.0`` using -the ``node-semver`` versioning scheme, the version range specifier will be:: +the ``node-semver`` versioning scheme used with the ``npm`` Package URL type, +the version range specifier will be:: - vers:npm/1.2.3|>=2.0.0,<5.0.0 + vers:npm/1.2.3|>=2.0.0|<5.0.0 -Each ```` in the pipe-separated list is either a simple -constraint such as:: - +``vers`` is the URI-scheme and is an acronym for "VErsion Range Specifier". It +has been selected because it is short, obviously about version and available +for a future formal URI-scheme registration at IANA. -Or a composite constraint grouping multiple ```` joined by -a comma such as:: +The pipe "|" is used as a simple separator between ````. +Each ```` in this pipe-separated list contains a comparator +and a version:: - ,... + -The pipe is a logical `OR` and the comma is a logical `AND`. +These ```` are signposts in the version timeline of a +package that specify precisely non-overlaping version intervals. -A version range specifier is therefore an "OR of ANDs" where there are two -levels of constraints that a version should satisfy to be part of the range: +A ```` satisfies a version range specifier if it part of any of +intervals defined by these ordered ````. -- At the first level, anyone of the constraints should be satisfied -- At the second level, all of the constraints must be satisfied -This is also called a "disjunctive normal form" in boolean logic. -See https://en.wikipedia.org/wiki/Disjunctive_normal_form for details. +URI scheme +------------- -``vers`` is the URI-scheme and is an acronym for "VErsion Range Specifier". It -has been selected because it is short, obviously about version and available +The ``vers`` URI scheme is an acronym for "VErsion Range Specifier". 
+It has been selected because it is short, obviously about version and available for a future formal registration for this URI-scheme at the IANA registry. +The URI scheme is followed by a colon ":". + ```` ------------------------ @@ -218,9 +224,10 @@ The ```` (such as ``npm``, ``deb``, etc.) determines: - how a versioning scheme-specific range notation can be transformed in the ``vers`` simplified notation defined here. -- by convention the versioning scheme should be the same string as the Package - URL package type for a given package ecosystem. It is OK to have other schemes - beyond the purl type and even schemes that are specific to a single package. +- by convention the versioning scheme **should** be the same as the ``Package + URL`` package type for a given package ecosystem. It is OK to have other + schemes beyond the purl type. A scheme could be specific to a single package + name. The ```` is followed by a slash "/". @@ -229,57 +236,44 @@ The ```` is followed by a slash "/". ---------------------------- After the ```` and "/" there are one or more -```` separated by a pipe "|". The pipe "|" means that -**any** of these constraints must be satisfied for a version to be resolved as -within this version range. +```` each separated by a pipe "|". The pipe "|" has no +special meaning beside being a separator. -Each ```` of this pipe-separated list can be either a -single constraint or a list of constraints separated in turn by an comma "," as -in ``1.2.3|>=2.0.0,<5.0.0``. +Each ```` of this pipe-separated list is either a +single ```` as in ``1.2.3 or the combination of a ```` and +a ```` as in ``>=2.0.0`` with this syntax:: -Multiple ```` combined with a comma means that **all** these -constraints must be satisfied for a version to be resolved as contained in this -range. + -Each simple version constraint has this syntax:: +A single version that means that a version can be equal to the provided version +to satisfy the range spec. 
Equality is based on the equality of two normalized +version strings according to their versioning scheme. For most schemes, this is +a simple string equality. But some scheme -- such as ``pypi`` (e.g., PEP440)-- +specify some normalization before testing for equality. - -The ```` is one of these comparison operators: +The ```` is one of these **two** comparison operators: -- "=": Version equality comparator. It is the default and implied if not - present and means that a version must be equal to the provided version. - For example: "=1.2.3". It must be omitted in the canonical representation. - Equality is based on the equality of two version strings and is typically - not versioning scheme-specific, though some schemes such as pypi PEP440 - may require a version string normalization before testing for equality. +- ">=": Greater-or-equal version comparator points to all versions greater than + or equal to the provided version. For example ">=1.2.3" means greater than or + equal to version "1.2.3". -- "!=": Version exclusion or inequality comparator. This means a version must - not be equal to the provided version and this version must be excluded from - the range. For example: "!=1.2.3" means that version "1.2.3" is excluded. +- "<": Less than version comparator points to all versions strictly less than a + provided version . For example "<1.2.3" means less than version "1.2.3". -- "<", "<=": Less than or less-or-equal version comparators points to all - versions less than or equal to the provided version. For example "<=1.2.3" - means less than or equal to "1.2.3". -- ">", ">=": Greater than or greater-or-equal version comparators points to - all versions greater than or equal to the provided version. For example - ">=1.2.3" means greater than or equal to "1.2.3". +The ```` defines: -- The way two version strings are compared using these comparators is defined - by the ````. 
+- how to compare two version strings using these comparators, and -- The structure and meaning of a version string such as "1.2.3" is defined by - the ````. For instance, ``semver`` defines three - dot-separated segments name major, minor and patch. +- the structure of a version string such as "1.2.3" if any. For instance, the + ``semver`` specification for version numbers defines a version as composed + primarily of three dot-separated numeric segments named major, minor and patch. -- The special star "*" ```` matches any version. This star - constraint must be used **alone** in a version range, exclusive of any other - constraint. For example "vers:deb/\*" resolves to any version of a Debian - package. -- The way each of these comparators work when doing a version comparison is - specific to a versioning scheme. +The special star "*" ```` matches any version. This star +constraint must be used **alone** exclusive of any other constraint. For example +"vers:deb/\*" resolves to any version of a Debian package. Examples @@ -294,59 +288,54 @@ Normalized or canonical representation - A version range specifier contains only printable ASCII letters, digits and punctuation. -- Spaces are not significant and are removed in the canonical form. For example - "!=1.2.3" and " ! = 1.2. 3" are equivalent. And so are "1.2.3 , < = 2.0.0" and - "1.2.3,<=2.0.0" +- Spaces are not significant and removed in a canonical form. For example + "<1.2.3|>=2.0" and " < 1.2. 3 | > = 2 . 0" are equivalent. -- The case sensitivity of a version in a version range specifier is defined by - its versioning scheme. In canonical form, a version is case-sensitive. - - The URI scheme and versioning scheme are always lowercase as in ``vers:npm``. +- The versions are case-sensitive, and a versioning scheme may specify its own + case sensitivity. 
+ +- A ``version`` in a ```` can only contain printable ASCII + characters excluding some special characters and the characters used as + separators and comparators ``><=!,*|``. If it contains special characters + (which should be rare in practice) a version string in a constraint must be + quoted using the URL quoting rules. + - The ordering of multiple ```` in a range specifier is not - significant. The canonical ordering is by sorting these by lexicographical - order applied with this two steps approach: + significant. The canonical ordering is the versions order. - - first to each sub-list of comma-separated ````. - - then to the top level list of pipe-separated ````. +- Each ```` is unique and can occur only once in a range + specifier. Duplicates must be removed in the canonical representation. -- A version in a ```` can only contain printable ASCII - characters excluding the special characters used as separators and comparators - ``><=!,&*|``. If it contains special characters (which should be rare in - practice) the version string in a constraint must be quoted using the URL - quoting rules. Using version range specifiers ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -``vers`` primary usage is to test if a version is within or outside a range. +``vers`` primary usage is to test if a version is within a range. -An version is within a version range if satisfies or is "contained" in -**any one** of the first level of constraints. To satisfy or be "contained" in -a first level constraint, a version must satisfy or be "contained" in -**all** the second level of constraints. Otherwise, the input version is outside -of the version range. +An version is within a version range if falls in any of the intervals defined +by a range. Otherwise, the version is outside of the version range. 
-Some important usages derived from this primary usage include: +Some important usages derived from thisinclude: - **Resolving a version range specifier to a list of concrete versions.** - In this case, the input is the set of known versions of a package (typically - obtained from some package repository or registry). Each version is then - tested individually to check if it is within or outside the range. For - example, this can be used to determine which existing package versions are - affected by a known vulnerability if they match the vulnerability version - range specifier. + In this case, the input is one or more known versions of a package. Each + version is then tested to check if it lies within or outside the range. For + example, given a vulnerability and the ``vers`` describing the affected and + fixed versions of a package, this process is used to determine if an existing + package version is within the known vulnerable version range specifier. - **Selecting one of several versions that are within a range.** In this case, given several versions that are within a range and several - packages that express dependencies to other other packages qualified by a version ranges, - package management tools need to determine and select a set of package versions - that satify all the version ranges of all dependencies. This usually requires - deploying heuristics and algorithms (possibly complex such as sat solvers) - that are ecosystem- and tool-specific and outside of the scope for this - specification; yet ``vers`` could be used in tandem with ``purl`` to provide - an input to a dependencies resolution process. + packages that express package dependencies qualified by a version range, + a package management tools will determine and select the set of package + versions that satify all the version ranges constraints of all dependencies. 
+ This usually requires deploying heuristics and algorithms (possibly complex + such as sat solvers) that are ecosystem- and tool-specific and outside of the + scope for this specification; yet ``vers`` could be used in tandem with + ``purl`` to provide an input to this dependencies resolution process. Parsing version range specifiers @@ -369,43 +358,34 @@ To parse a version range specifier string: may be strict and report an error if there are extra characters beyond "*" or be lenient. -- Split the constraints on pipe "|". The result is a list of top-level - lists. Consecutive pipes should be treated as one. +- Strip leading adn trailing pipes "|" from the constraints string. +- Split the constraints on pipe "|". The result is a list of . + Consecutive pipes should be must as one and leading and trailing pipes ignored. -- For each list: +- For each : - - Split on comma ",". Consecutive commas should be treated as one. The result - is a sub-list of . + - Determine if the starts with one of the two comparators: - - For each in this sub-list: + - If it starts with ">=", then the comparator is ">=". + - If it starts with "<", then the comparator is "<". - - Identify the comparator and version based on the - start of the in this sequence: + - Remove the comparator from string, and the + remaining string is the version. - - If it starts with "=", then the comparator is "=" - - If it starts with "!=", then the comparator is "!=". - - If it starts with "<=", then the comparator is "<=". - - If it starts with ">=", then the comparator is ">=". - - If it starts with "<", then the comparator is "<". - - If it starts with ">", then the comparator is ">". - - Else the comparator is "=" (default) and the - version is the full string. + - Otherwise the version is the full string (which implies + and equality comparator of "=") - - After the operation and removing the comparator from - string, the remaining string is the version. 
+ - A tool should validate and report an error if the version is empty. - - Validate that the version is not empty. + - If the version contains a percent "%" character, apply URL quoting rules + to unquote this string. - - If the version contains a percent "%" character, apply URL quoting rules - to unquote this string. + - Append the parsed (comparator, version) to the constraints list. - - Append the comparator and version of this constraint to the inner list - of constraints. +Finally: - - Append the accumulated list of (comparator and version) that must apply to - the top level list of constraints. - -- Finally return the and the nested list of +- deduplicate and sort the list of (comparator, version) using the canonical order. +- return the and this list of constraints. Notes and caveats @@ -413,9 +393,9 @@ Notes and caveats - Comparing versions from two different versioning schemes is unspecified. Even though there may be some similarities between the ``semver`` version of an npm - and the `debian` version of its Debian packaging, these similarities are - specific to each versioning scheme. Tools should report an error in these - cases as it does not make sense to perform these comparisons. + and the ``deb`` version of its Debian packaging, these similarities are + specific to each versioning scheme and. Tools should report an error in these + cases as it does not make sense to compare such unrelated versions. Some of the known versioning schemes @@ -428,10 +408,11 @@ scheme and package type. The comparators are <<, <=, =, >= and >>. - **rpm**: RPM distros https://rpm-software-management.github.io/rpm/manual/dependencies.html - The version comparison routine of rmpvercmp is also used by archlinux Pacman. + The a simplified rmpvercmp version comparison routine is used by archlinux Pacman. - **gem**: Rubygems https://guides.rubygems.org/patterns/#semantic-versioning - which is almost but not exactly ``node-semver``. 
+ which is similar to ``node-semver`` for its syntax, but does not use semver + versions. - **npm**: npm uses node-semver which is based on semver with its own range notation https://github.com/npm/node-semver#ranges @@ -446,10 +427,11 @@ scheme and package type. - **cpan**: Perl https://perlmaven.com/how-to-compare-version-numbers-in-perl-and-for-cpan-modules -- **go**: Go modules https://golang.org/ref/mod#versions use semver versions +- **go**: Go modules https://golang.org/ref/mod#versions use ``semver`` versions with a specific minimum version resolution algorithm. -- **maven**: Apache Maven http://maven.apache.org/enforcer/enforcer-rules/versionRanges.html +- **maven**: Apache Maven supports a math interval notation which is rarely seen + in practice http://maven.apache.org/enforcer/enforcer-rules/versionRanges.html - **nuget**: NuGet https://docs.microsoft.com/en-us/nuget/concepts/package-versioning#version-ranges Note that Apache Maven and NuGet are following a similar approach with a @@ -460,9 +442,9 @@ scheme and package type. - **alpine**: Alpine linux https://gitlab.alpinelinux.org/alpine/apk-tools/-/blob/master/src/version.c which is using Gentoo-like conventions. -- **generic**: a generic version comparison algorithm (which is TBD, likely a - split on punctuation and dealing with digit vs. strings comparisons, like is - done in libversion) +- **generic**: a generic version comparison algorithm (which will be specified + later, likely based on a split on any wholly alpha or wholly numeric segments + and dealing with digit and string comparisons, like is done in libversion) TODO: add Rust, composer and archlinux @@ -497,15 +479,67 @@ should produce equivalent results and even if not strictly the same original strings. Another issue with existing version range notations is that they are primarily -meant to be used for dependency constraints and may not readily be reusable for -the definitions of vulnerable ranges. 
In particular, a vulnerability may exist -for multiple "version branches" of a given package such as with Django 2.x and -3.x. Several version range notations have difficulties to communicate these -as typically all the version constraints must be satisfied. In constrast, -a vulnerability can affect multiple disjoint version ranges of a package and any -version satisfying these constraints would be vulnerable: it may not be possible -to express this with a notation designed exclusively for dependent versions -resolution. +designed for dependencies and not for vulnerable ranges. In particular, a +vulnerability may exist for multiple "version branches" of a given package such +as with Django 2.x and 3.x. Several version range notations have difficulties to +communicate these as typically all the version constraints must be satisfied. +In constrast, a vulnerability can affect multiple disjoint version ranges of a +package and any version satisfying these constraints would be vulnerable: it +may not be possible to express this with a notation designed exclusively for +dependent versions resolution. + +Finally, one of the goals of this spec is to be a compact yet obvious Package +URL companion for version ranges. Several existing and closely related notations +designed for vulnerable ranges are verbose specifications designed for use +in API with larger JSON documents. + + +Why not use the OSV Ranges? +############################### + +See: + +- https://ossf.github.io/osv-schema/ + +``vers`` and the OSSF OSV schema vulnerable ranges are strictly equivalent and +``vers`` provides a compact range notation while OSV provides more verbose +JSON notation. + +``vers`` borrows the design from and was informed by the OSV schema spec and its +authors. + +The only high level difference between the two specifications are the +codes used to qualify a range package "ecosystem" value that ressembles closely +the Package URL package "type" used in ``vers``. 
This spec will provide a strict +mapping between the OSV ecosystem and the ``vers`` versioning schemes values. + + +Why not use the NVD CVE v5 API Ranges? +############################################ + +See: + +- https://github.com/CVEProject/cve-schema/blob/master/schema/v5.0/CVE_JSON_5.0_schema.json#L303 +- https://github.com/CVEProject/cve-schema/blob/master/schema/v5.0/CVE_JSON_5.0_schema.json#L123 + +The version 5 of the NVD CVE JSON data format defines version ranges with a +starting version, a versionType, and an upper limit for the version range as +lessThan or lessThanOrEqual or as an enumeration of versions. The versionType +and the package collectionURL possible values are inidicative and left outside +of that specification and both seem strictly equivalent to the Package URL purl +"type" on the one hand and the ``vers`` versioning scheme on the other hand. + +The semantics and expressiveness of each range are similar and ``vers`` provides +a compact notation rather than a more verbose JSON notation. ``vers`` supports +strictly the conversion of any NVD v5 range to its notation and further +provides a concrete list of well known versioning schemes. ``vers`` design was +informed by the NVD CVE v5 API schema spec and its authors. + + +When NVD v5 becomes active, this spec will provide a strict mapping between the +NVD versionType and the ``vers`` versioning schemes values. Futhermore, this +spec and the Package URL "types" should be updated accordingly to provide +a mapping with the upcoming NVD collectionURL that will be effectively used. Why not use the NVD CPE Ranges? @@ -534,9 +568,8 @@ Note that the NVD CVE configuration is a complex specification that goes well beyond version ranges and is used to match comprehensive configurations across multiple products and version ranges. ``vers`` focus is exclusively versions. -The NVD JSON notation is verbose in contrast with ``vers`` that attempts to -provide a compact notation. 
It provides the same =, <=, < and > comparators -specified in ``vers`` and found in other notations. +In contrast with ``vers`` compact notation, the NVD JSON notation is more +verbose, yet ``vers`` supports strictly the conversion of any CPE range. Why not use node-semver ranges? @@ -546,26 +579,20 @@ See: - https://github.com/npm/node-semver#ranges -The node-semver spec is similar to this spec but is an AND of ORs constraints -with a few practical issues: +The node-semver spec is similar but much more complex than this spec. This is +an AND of ORs constraints with a few practical issues: -- The space means "AND", therefore whitespaces are significant. Having +- A space means "AND", therefore whitespaces are significant. Having significant whitespaces in a string makes normalization more complicated and - may be a source of confusion if you remove the spaces from the string. Using - a comma as an "AND" operator in ``vers`` makes this explicit and avoids the - ambiguity of a space. - -- There is no negation "!=" operator meaning that some version constraints are - difficult to express and require combining < and > comparators. For instance - stating that a vulnerability affects babel 6.2 or later but not babel 7.0 is - possible but complicated. - -- The advanced range syntax has grown to be rather complex using hyphen, stars, - carets and tilde constructs that are all tied to the JavaScript and npm ways - of handling versions in their ecosystem and are bound furthermore to the - semver semantics and its npm implementation. These are not readily reusable - elsewhere and these extended multiple comparators and modifiers make the - notation grammar more complex to parse for a machine and harder to read for + may be a source of confusion if you remove the spaces from the string. + ``vers`` avoids the ambiguity of spaces by ignoring them. 
+ +- The advanced range syntax has grown to be rather complex using hyphen ranges, + stars ranges, carets and tilde constructs that are all tied to the JavaScript + and npm ways of handling versions in their ecosystem and are bound furthermore + to the semver semantics and its npm implementation. These are not readily + reusable elsewhere. The multiple comparators and modifiers make the notation + grammar more complex to parse and process for a machine and harder to read for human. Notations that are directly derived from node-semver as used in Rust and PHP @@ -584,8 +611,8 @@ provides a comprehensive specification for Python package versioning and a notation for "version specifiers" to express the version constraints of dependencies. -This specification is similar to this ``vers`` spec, but has a richer notation -with some aspects specific to the versions used only in the Python ecosystem. +This specification is similar to this ``vers`` spec, with more operators and +aspects specific to the versions used only in the Python ecosystem. - In particular pep-0440 uses tilde, triple equal and wildcard star operators that are specific to how two Python versions are compared. @@ -593,10 +620,9 @@ with some aspects specific to the versions used only in the Python ecosystem. - The comma separator between constraints is a logical "AND" rather than an "OR". The "OR" does not exist in the syntax making some version ranges harder to express, in particular for vulnerabilities that may affect several - exact versions or version ranges such as when there are multiple release - branches that exist in parallel. For instance a statement such as: Django 1.2 - or later, or Django 2.2 or later or Django 3.2 or later is difficult to - express without an "OR" logic. + exact versions or ranges for multiple parallel release branches. Ranges such as + "Django 1.2 or later, or Django 2.2 or later or Django 3.2 or later" are + difficult to express without an "OR" logic. 
Why not use Rubygems requirements notation? @@ -606,38 +632,41 @@ See: - https://guides.rubygems.org/patterns/#declaring-dependencies -The rubygems specification suggests but does not enforce using semver. It is -similar to this spec's operators with the addition of the "~>" aka. pessimistic -operator or tilde-wakka which is similar to the "tilde" used in node-semver and -implies semver versioning. This makes the notation impractical to reuse -in places that do not use the same semver-like semantics. +The Rubygems specification suggests but does not enforce using semver. It uses +operators similar to the ``node-semver`` spec with the different of the "~>" +aka. pessimistic operator vs. a plain "~" tilde used in node-semver. This +operator implies some semver-like versioning, yet gem version are not strictly +semver. This makes the notation complex to implment and impractical to reuse +in places that do not use the same Ruby-specific semver-like semantics. -Why not use fancier comparators such as a tilde, caret and star? -################################################################## +Why not use richers comparators such as >, <=, != a tilde, caret and 1.star? +############################################################################ -Several existing notations such as used with npm, gem or python or composer +Several existing notations such as used with npm, gem, python, or composer provide syntactic shorthands such as: +- negation with != +- richer comparisons with <= and >. - a tilde prefix or ~> prefix or =~ as in "~1.3" or "~>1.2.3" - a caret ^ prefix as in "^ 1.2" - using a star in a segment of a version as in "1.2.*" - dash-separated ranges as in "1.2 - 1.4" -These range syntaxes can typcially be reduced to a set of simpler operators. -Furthermore they are designed for the structure of a version string (most often -semver) as used in one ecosystem and therefore are not reusable in another -ecosystem that would not use the version string conventions. 
+These range syntaxes can **always** be reduced to a set of simpler operators +defined in ``vers``. Furthermore they assumed a certain structure in a version +string (most often semver or semver-like) as used in one ecosystem and therefore +are not reusable in another ecosystem that would not use the version conventions. Why not use mathematical interval notation for ranges? ####################################################### -Apache Maven and NuGet make use of a mathematical interval with "[" and ")" as a -syntax for version ranges. +Apache Maven and NuGet make use of a mathematical interval with "[", "]", "(" +and ")" with commas as a syntax for version ranges. -All other notations are using >, <, and = as base symbols for ranges. ``vers`` -reuses this approach because it is more common across package ecosystems. +All other known range notations use a more common set of ">", "<", and "=" as +range symbols. ``vers`` adopts this more common approach. References From d4a69619fb5a5a1d6706af450ac7042a703cbeec Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Fri, 10 Dec 2021 10:31:51 +0100 Subject: [PATCH 07/21] Fix minor typos Reported-by: Oliver Chang Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 5fe8983..67c3fee 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -174,14 +174,13 @@ URI-scheme with this syntax:: vers:/||... -For example ,to define a set of versions that contains either version ``1.2.3``, +For example, to define a set of versions that contains either version ``1.2.3``, or any versions greater than or equal to ``2.0.0`` but less than ``5.0.0`` using the ``node-semver`` versioning scheme used with the ``npm`` Package URL type, the version range specifier will be:: vers:npm/1.2.3|>=2.0.0|<5.0.0 - ``vers`` is the URI-scheme and is an acronym for "VErsion Range Specifier". 
It has been selected because it is short, obviously about version and available for a future formal URI-scheme registration at IANA. @@ -318,7 +317,7 @@ Using version range specifiers An version is within a version range if falls in any of the intervals defined by a range. Otherwise, the version is outside of the version range. -Some important usages derived from thisinclude: +Some important usages derived from this include: - **Resolving a version range specifier to a list of concrete versions.** In this case, the input is one or more known versions of a package. Each @@ -358,7 +357,7 @@ To parse a version range specifier string: may be strict and report an error if there are extra characters beyond "*" or be lenient. -- Strip leading adn trailing pipes "|" from the constraints string. +- Strip leading and trailing pipes "|" from the constraints string. - Split the constraints on pipe "|". The result is a list of . Consecutive pipes should be must as one and leading and trailing pipes ignored. From d8402e57ce920e6b3ba1271d583d15bd372d69e7 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Wed, 15 Dec 2021 22:03:02 +0100 Subject: [PATCH 08/21] Revert to use richer set of comparators Revert to use the original richer set of comparators: >,>=,<,<=,!= The OSV and Why not ...? sections explain why we need this rciher set of comparators. Add example section. Add a section on related efforts. Add constraint validation section. Add range containtment procedure pseudo code narrative. Add constraints deduplication procedure pseudo code narrative. Fix multiple typos. 
Reported-by: Stephen Milner @ashcrow Reported-by: Patrick Dwyer @coderpatros Reported-by: Oliver Chang @oliverchang Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 515 +++++++++++++++++++++++++++++++---------- 1 file changed, 393 insertions(+), 122 deletions(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 67c3fee..1c6cb22 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -53,8 +53,8 @@ usage of Package URLs referencing vulnerable package version ranges such as in vulnerability databases like `VulnerableCode `_. -To better understand the problem, here are some of the notations and conventions -in use: +To better understand the problem, here are some of the many notations and +conventions in use: - ``semver`` https://semver.org/ is a popular specification to structure version strings, but does not provide a way to express version ranges. @@ -62,21 +62,21 @@ in use: - Rubygems strongly suggest using ``semver`` for version but does not enforce it. As a result some gem use semver while several popular package do not use strict semver. Rubygems use their own notation for version ranges which - ressembles the ``node-semver`` notation with some subtle differences. + looks like the ``node-semver`` notation with some subtle differences. See https://guides.rubygems.org/patterns/#semantic-versioning - ``node-semver`` ranges are used in npm at https://github.com/npm/node-semver#ranges - with range semantics that are specific to ``semver`` and npm. + with range semantics that are specific to ``semver`` versions and npm. - Dart pub versioning scheme is similar to ``node-semver`` and the documentation at https://dart.dev/tools/pub/versioning provides a comprehensive coverage of the topic of versioning. Version resolution uses its own algorithm. 
- Python uses its own version and version ranges notation with notable - specificities on how how pre- and post-release suffixes are used + peculiarities on how pre-release and post-release suffixes are used https://www.python.org/dev/peps/pep-0440/ -- Debian and Ubuntu use their own notation and are remarkabel for their use of +- Debian and Ubuntu use their own notation and are remarkable for their use of ``epochs`` to disambiguate versions. https://www.debian.org/doc/debian-policy/ch-relationships.html @@ -122,14 +122,14 @@ in use: of versions. The versionType is defined as ``"The version numbering system used for specifying the range. This defines the exact semantics of the comparison (less-than) operation on versions, which is required to understand - the range itself"``. A "versionType" ressembles closely the Package URL package + the range itself"``. A "versionType" resembles closely the Package URL package "type". - The OSSF OSV schema https://ossf.github.io/osv-schema/ defines vulnerable ranges with version events using "introduced", "fixed" and "limit" fields and an optional enumeration of all the versions in these ranges, except for semver-based versions. A range may be ecosystem-specific based on a provided - package "ecosystem" value that ressembles closely the Package URL package + package "ecosystem" value that resembles closely the Package URL package "type". @@ -143,7 +143,8 @@ related topic: is just one of the many ways to structure a version string. - Debian, RPM, PyPI, Rubygems, and Composer have their own subtly different - approach on how to determine which version is greater or lesser. + approach on how to determine how two versions are compared as equal, greater + or lesser. Solution @@ -191,11 +192,11 @@ and a version:: -These ```` are signposts in the version timeline of a -package that specify precisely non-overlaping version intervals. 
+This list of ```` are signposts in the version timeline of +a package that specify version intervals. -A ```` satisfies a version range specifier if it part of any of -intervals defined by these ordered ````. +A ```` satisfies a version range specifier if it is contained within +any of the intervals defined by these ````. URI scheme @@ -223,10 +224,9 @@ The ```` (such as ``npm``, ``deb``, etc.) determines: - how a versioning scheme-specific range notation can be transformed in the ``vers`` simplified notation defined here. -- by convention the versioning scheme **should** be the same as the ``Package - URL`` package type for a given package ecosystem. It is OK to have other - schemes beyond the purl type. A scheme could be specific to a single package - name. +By convention the versioning scheme **should** be the same as the ``Package URL`` +package type for a given package ecosystem. It is OK to have other schemes +beyond the purl type. A scheme could be specific to a single package name. The ```` is followed by a slash "/". @@ -235,30 +235,41 @@ The ```` is followed by a slash "/". ---------------------------- After the ```` and "/" there are one or more -```` each separated by a pipe "|". The pipe "|" has no -special meaning beside being a separator. +```` separated by a pipe "|". The pipe "|" has no special +meaning beside being a separator. -Each ```` of this pipe-separated list is either a -single ```` as in ``1.2.3 or the combination of a ```` and -a ```` as in ``>=2.0.0`` with this syntax:: +Each ```` of this list is either a single ```` as +in ``1.2.3`` or the combination of a ```` and a ```` as in +``>=2.0.0`` using this syntax:: -A single version that means that a version can be equal to the provided version -to satisfy the range spec. Equality is based on the equality of two normalized -version strings according to their versioning scheme. For most schemes, this is -a simple string equality. 
But some scheme -- such as ``pypi`` (e.g., PEP440)-- -specify some normalization before testing for equality. +A single version means that a version equal to this version satisfies the +range spec. Equality is based on the equality of two normalized version strings +according to their versioning scheme. For most schemes, this is a simple string +equality. But schemes can specify normalization and rules for equality such as +``pypi`` with PEP440. -The ``<comparator>`` is one of these **two** comparison operators: +The special star "*" comparator matches any version. It must be used **alone** +exclusive of any other constraint and must not be followed by a version. For +example "vers:deb/\*" represents all the versions of a Debian package. This +includes past, current and possible future versions. -- ">=": Greater-or-equal version comparator points to all versions greater than - or equal to the provided version. For example ">=1.2.3" means greater than or - equal to version "1.2.3". -- "<": Less than version comparator points to all versions strictly less than a - provided version . For example "<1.2.3" means less than version "1.2.3". +Otherwise, the ``<comparator>`` is one of these comparison operators: + +- "!=": Version exclusion or inequality comparator. This means a version must + not be equal to the provided version that must be excluded from the range. + For example: "!=1.2.3" means that version "1.2.3" is excluded. + +- "<", "<=": Lesser than or lesser-or-equal version comparators point to all + versions less than or equal to the provided version. + For example "<=1.2.3" means less than or equal to "1.2.3". + +- ">", ">=": Greater than or greater-or-equal version comparators point to + all versions greater than or equal to the provided version. + For example ">=1.2.3" means greater than or equal to "1.2.3". The ```` defines: @@ -270,19 +281,52 @@ The ```` defines: primarily of three dot-separated numeric segments named major, minor and patch. 
-The special star "*" ```` matches any version. This star -constraint must be used **alone** exclusive of any other constraint. For example -"vers:deb/\*" resolves to any version of a Debian package. +Examples +------------- +Single version in an npm package dependency: -Examples -~~~~~~~~~ +- originally seen as a dependency on version "1.2.3" in a package.json manifest +- the version range spec is: ``vers:npm/1.2.3`` + + +Versions enumeration: + +- ``vers:pypi/0.0.0|0.0.1|0.0.2|0.0.3|1.0|2.0pre1`` + + +Complex statement about a vulnerability in a "maven" package that affects +multiple branches each with their own fixed versions at +https://repo1.maven.org/maven2/org/apache/tomee/apache-tomee/ +Note how the constraints are sorted: + + +- "affects Apache TomEE 8.0.0-M1 - 8.0.1, Apache TomEE 7.1.0 - 7.1.2, + Apache TomEE 7.0.0-M1 - 7.0.7, Apache TomEE 1.0.0 - 1.7.5." + +- a normalized version range spec is: + ``vers:tomee/>=1.0.0-beta1|<=1.7.5|>=7.0.0-M1|<=7.0.7|>=7.1.0|<=7.1.2|>=8.0.0-M1|<=8.0.1`` -TODO. +- alternatively, four ``vers`` express the same range, using one ``vers`` for + each vulnerable "branch": + - ``vers:tomee/>=1.0.0-beta1|<=1.7.5`` + - ``vers:tomee/>=7.0.0-M1|<=7.0.7`` + - ``vers:tomee/>=7.1.0|<=7.1.2`` + - ``vers:tomee/>=8.0.0-M1|<=8.0.1`` +Rubygems custom syntax for dependency on gem. Note how the pessimistic version +constraint is expanded: -Normalized or canonical representation -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +- ``'library', '~> 2.2.0', '!= 2.2.1'`` +- the version range spec is: ``vers:gem/>=2.2.0|!=2.2.1|<2.3.0`` + + +Normalized, canonical representation and validation +----------------------------------------------------- + +These construction and validation rules are designed such that a ``vers`` is +easier to read and understand by humans and straightforward to process by tools, +attempting to avoid the creation of empty or impossible version ranges. 
- A version range specifier contains only printable ASCII letters, digits and punctuation. @@ -295,22 +339,48 @@ Normalized or canonical representation - The versions are case-sensitive, and a versioning scheme may specify its own case sensitivity. -- A ``version`` in a ```` can only contain printable ASCII - characters excluding some special characters and the characters used as - separators and comparators ``><=!,*|``. If it contains special characters - (which should be rare in practice) a version string in a constraint must be - quoted using the URL quoting rules. +- If a ``version`` in a ```` contains separator or + comparator characters (i.e. ``><=!*|``), it must be quoted using the URL + quoting rules. This should be rare in practice. + +The list of ``s`` of a range are signposts in the version +timeline of a package. With these few and simple validation rules, we can avoid +the creation of most empty or impossible version ranges: + +- **Constraints are sorted by version**. The canonical ordering is the versions + order. The ordering of ```` is not significant otherwise + but this sort order is needed when check if a version is contained in a range. + +- **Versions are unique**. Each ``version`` must be unique in a range and can + occur only once in any ```` of a range specifier, + irrespective of its comparators. Tools must report an error for duplicated + versions. -- The ordering of multiple ```` in a range specifier is not - significant. The canonical ordering is the versions order. +- **There is only one star**: "*" must only occur once and alone in a range, + without any other constraint or version. -- Each ```` is unique and can occur only once in a range - specifier. Duplicates must be removed in the canonical representation. 
+Starting from a de-duplicated and sorted list of constraints, these extra rules +apply to the comparators of any two contiguous constraints to be valid: +- "!=" constraint can be followed by a constraint using any comparator, i.e., + any of "=", "!=", ">", ">=", "<", "<=" as comparator (or no constraint). + +Ignoring all constraints with "!=" comparators: + +- A "=" constraint must be followed only by a constraint with one of "=", ">", + ">=" as comparator (or no constraint). + +And ignoring all constraints with "=" or "!=" comparators, the sequence of +constraint comparators must be an alternation of greater and lesser comparators: + +- "<" and "<=" must be followed by one of ">", ">=" (or no constraint). +- ">" and ">=" must be followed by one of "<", "<=" (or no constraint). + +Tools must report an error for such invalid ranges. Using version range specifiers -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +------------------------------ ``vers`` primary usage is to test if a version is within a range. @@ -322,59 +392,64 @@ Some important usages derived from this include: - **Resolving a version range specifier to a list of concrete versions.** In this case, the input is one or more known versions of a package. Each version is then tested to check if it lies within or outside the range. For - example, given a vulnerability and the ``vers`` describing the affected and - fixed versions of a package, this process is used to determine if an existing - package version is within the known vulnerable version range specifier. + example, given a vulnerability and the ``vers`` describing the vulnerable + versions of a package, this process is used to determine if an existing + package version is vulnerable. 
- **Selecting one of several versions that are within a range.** In this case, given several versions that are within a range and several packages that express package dependencies qualified by a version range, a package management tool will determine and select the set of package - versions that satify all the version ranges constraints of all dependencies. + versions that satisfy all the version ranges constraints of all dependencies. This usually requires deploying heuristics and algorithms (possibly complex such as sat solvers) that are ecosystem- and tool-specific and outside of the scope for this specification; yet ``vers`` could be used in tandem with ``purl`` to provide an input to this dependencies resolution process. -Parsing version range specifiers -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Parsing and validating version range specifiers +------------------------------------------------- To parse a version range specifier string: - Remove all spaces and tabs. - Start from left, and split once on colon ":". -- The left hand side is the URI-scheme that must be lowercased. - - Verify that the URI-scheme value is ``vers``. +- The left hand side is the URI-scheme that must be lowercase. + - Tools must validate that the URI-scheme value is ``vers``. - The right hand side is the specifier. - Split the specifier from left once on a slash "/". -- The left hand side is the <versioning-scheme> that must be lowercased. + +- The left hand side is the <versioning-scheme> that must be lowercase. + Tools should validate that the <versioning-scheme> is a known scheme. + - The right hand side is a list of one or more constraints. + Tools must validate that this constraints string is not empty ignoring spaces. -- If the constraints string is equal to "*", the <version-constraint> is "*". +- If the constraints string is equal to "*", the <version-constraint> is "*". Parsing is done and no further processing is needed for this ``vers``. A tool - may be strict and report an error if there are extra characters beyond "*" or - be lenient. 
+ should report an error if there are extra characters beyond "*". - Strip leading and trailing pipes "|" from the constraints string. - Split the constraints on pipe "|". The result is a list of <version-constraint>. - Consecutive pipes should be must as one and leading and trailing pipes ignored. + Consecutive pipes must be treated as one and leading and trailing pipes ignored. - For each <version-constraint>: - - Determine if the <version-constraint> starts with one of the two comparators: - - If it starts with ">=", then the comparator is ">=". - - If it starts with "<", then the comparator is "<". + - If it starts with ">=", then the comparator is ">=". + - If it starts with "<=", then the comparator is "<=". + - If it starts with "!=", then the comparator is "!=". + - If it starts with "<", then the comparator is "<". + - If it starts with ">", then the comparator is ">". - - Remove the comparator from <version-constraint> string, and the - remaining string is the version. + - Remove the comparator from <version-constraint> string start. The + remaining string is the version. - Otherwise the version is the full <version-constraint> string (which implies - and equality comparator of "=") + an equality comparator of "=") - - A tool should validate and report an error if the version is empty. + - Tools should validate and report an error if the version is empty. - If the version contains a percent "%" character, apply URL quoting rules to unquote this string. @@ -383,25 +458,144 @@ To parse a version range specifier string: Finally: -- deduplicate and sort the list of (comparator, version) using the canonical order. -- return the <versioning-scheme> and this list of constraints. +- The results are the <versioning-scheme> and the list of <comparator, version> + constraints. + +Tools should optionally validate and normalize the list of <comparator, version> +constraints once parsing is complete: + +- Sort and validate the list of constraints. +- De-duplicate the list of constraints. 
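The parsing steps above can be sketched in Python as follows. This is a minimal illustration and not the reference implementation: the function name ``parse_vers`` and the ``("*", "")`` pair returned for the star constraint are assumptions made for this example. It does not check that the versioning scheme is a known one, and it does not sort, validate or de-duplicate the resulting constraints.

```python
from urllib.parse import unquote

# Two-character comparators come first so that ">=" is not misread as ">".
COMPARATORS = (">=", "<=", "!=", "<", ">")


def parse_vers(text):
    """Return (versioning_scheme, [(comparator, version), ...]).

    Hypothetical helper for illustration only.
    """
    # Remove all spaces and tabs.
    text = text.replace(" ", "").replace("\t", "")

    # Split once on ":"; the left hand side is the URI-scheme.
    # Lenient choice (an assumption): accept any case and lowercase it.
    uri_scheme, colon, specifier = text.partition(":")
    if not colon or uri_scheme.lower() != "vers":
        raise ValueError(f"not a vers string: {text!r}")

    # Split the specifier once on "/"; the left hand side is the scheme.
    scheme, slash, constraints = specifier.partition("/")
    scheme = scheme.lower()
    if not slash or not scheme:
        raise ValueError(f"missing versioning scheme: {text!r}")
    if not constraints:
        raise ValueError(f"empty constraints: {text!r}")

    # The star must be used alone; a literal "*" inside a version would
    # have to be URL-quoted, so any other use of "*" is reported.
    if "*" in constraints:
        if constraints != "*":
            raise ValueError(f"star must be used alone: {text!r}")
        return scheme, [("*", "")]

    parsed = []
    # Skipping empty items treats consecutive pipes as one and ignores
    # leading and trailing pipes.
    for item in constraints.split("|"):
        if not item:
            continue
        for comparator in COMPARATORS:
            if item.startswith(comparator):
                version = item[len(comparator):]
                break
        else:
            # No comparator prefix implies equality.
            comparator, version = "=", item
        if not version:
            raise ValueError(f"empty version in constraint: {item!r}")
        if "%" in version:
            version = unquote(version)
        parsed.append((comparator, version))

    if not parsed:
        raise ValueError(f"no constraints found: {text!r}")
    return scheme, parsed
```

For instance, under these assumptions ``parse_vers("vers:npm/1.2.3|>=2.0.0|<5.0.0")`` yields the scheme ``npm`` with three constraints using the ``=``, ``>=`` and ``<`` comparators. Sorting is left out on purpose because it requires the version comparison procedure of each versioning scheme.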
+ + +Version constraints de-duplication +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Tools can simplify a list of ``<version-constraint>`` using this approach: + +These pairs of contiguous constraints with these comparators are valid: + +- != followed by anything +- =, <, or <= followed by =, !=, >, or >= +- >, or >= followed by !=, <, or <= + +These pairs of contiguous constraints with these comparators are redundant and +invalid (ignoring any != since they can show up anywhere): + +- =, < or <= followed by < or <=: this is the same as < or <= +- > or >= followed by =, > or >=: this is the same as > or >= + + +A procedure to remove redundant constraints can be: + +- Start from a list of constraints of comparator and version, sorted by version + and where each version occurs only once in any constraint. + +- If the constraints list contains only one item and the comparator is "*", + return this list and simplification is finished. + +- Split the constraints list in two sub lists: + + - a list of "unequal constraints" where the comparator is "!=" + - a remainder list of "constraints" where the comparator is not "!=" + +- If the remainder list of "constraints" is empty, return the "unequal constraints" + list and simplification is finished. + +- Loop while the "constraints" list length diminishes with each iteration: + + - Save the starting length of "constraints" list + + - For each contiguous current constraint and next constraint of this list: + + - If current comparator is "=", "<" or "<=" and next comparator is "<" or "<=", + discard current constraint, keep next constraint + + - Else if current comparator is ">" or ">=" and next comparator is "=", ">" or ">=", + keep current constraint, discard next constraint + + - Else keep current constraint and next constraint + + - If the starting length of "constraints" list is the same as the current + length of the "constraints" list, stop iterating. 
+ +- Concatenate the "unequal constraints" list and the filtered "constraints" list +- Sort by version and return. + + +Checking if a version is contained within a range +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To check if a "tested version" is contained within a version range: + +- Start from a parsed version range specifier with: + + - a versioning scheme + - a list of constraints of comparator and version, sorted by version + and where each version occurs only once in any constraint. + +- If the constraint list contains only one item and the comparator is "*", + then the "tested version" is IN the range. Check is finished. + +- Select the version equality and comparison procedures suitable for this + versioning scheme and use these for all version comparisons performed below. + +- If the "tested version" is equal to any of the constraint versions + where the constraint comparator is for equality (any of "=", "<=", or ">=") + then the "tested version" is IN the range. Check is finished. + +- If the "tested version" is equal to any of the constraint versions where + the constraint comparator is "!=" then the "tested version" is NOT in the + range. Check is finished. + +- Split the constraint list in two sub lists: + + - a first list where the comparator is "=" or "!=" + - a second list where the comparator is neither "=" nor "!=" + +- Iterate over the current and next contiguous constraints pairs (aka. pairwise) + in the second list. + +- For each current and next constraint: + + - If this is the first iteration and current comparator is "<" or "<=" + and the "tested version" is less than the current version + then the "tested version" is IN the range. Check is finished. + + - If this is the last iteration and next comparator is ">" or ">=" + and the "tested version" is greater than the next version + then the "tested version" is IN the range. Check is finished. 
+ + - If current comparator is ">" or ">=" and next comparator is "<" or "<=" + and the "tested version" is greater than the current version + and the "tested version" is less than the next version + then the "tested version" is IN the range. Check is finished. + + - If current comparator is "<" or "<=" and next comparator is ">" or ">=" + then these versions are out of the range. Continue to the next iteration. + +- Reaching here without having finished the check before means that the + "tested version" is NOT in the range. Notes and caveats ~~~~~~~~~~~~~~~~~~~ -- Comparing versions from two different versioning schemes is unspecified. Even +- Comparing versions from two different versioning schemes is an error. Even though there may be some similarities between the ``semver`` version of an npm and the ``deb`` version of its Debian packaging, these similarities are - specific to each versioning scheme and. Tools should report an error in these - cases as it does not make sense to compare such unrelated versions. + specific to each versioning scheme. Tools should report an error in this case. + +- All references to sorting or ordering of version constraints mean sorting + by version. And sorting by versions always implies using the versioning + scheme-specified version comparison and ordering. Some of the known versioning schemes -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +---------------------------------------- -TODO: add details on how to convert to and from ``vers`` for a given versioning -scheme and package type. +These are a few known versioning schemes for some common Package URL +`types` (aka. ``ecosystem``). - **deb**: Debian and Ubuntu https://www.debian.org/doc/debian-policy/ch-relationships.html The comparators are <<, <=, =, >= and >>. @@ -426,7 +620,7 @@ scheme and package type. 
- **cpan**: Perl https://perlmaven.com/how-to-compare-version-numbers-in-perl-and-for-cpan-modules -- **go**: Go modules https://golang.org/ref/mod#versions use ``semver`` versions +- **golang**: Go modules https://golang.org/ref/mod#versions use ``semver`` versions with a specific minimum version resolution algorithm. - **maven**: Apache Maven supports a math interval notation which is rarely seen @@ -445,18 +639,52 @@ scheme and package type. later, likely based on a split on any wholly alpha or wholly numeric segments and dealing with digit and string comparisons, like is done in libversion) -TODO: add Rust, composer and archlinux + +TODO: add Rust, composer and archlinux, nginx, tomcat, apache. + +A separate document will provide details for each versioning scheme and: + +- how to convert its native range notation to the ``vers`` notation and back. +- how to compare and sort two versions in a range. + +This versioning schemes document will also explain how to convert CVE and OSV +ranges to ``vers``. Implementations -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +----------------------- - Python: https://github.com/nexB/univers - Yours! + +Related efforts and alternative +------------------------------------ + +- CUDF defines a generic range notation similar to Debian and integer version + numbers from the sequence of versions for universal dependencies resolution + https://www.mancoosi.org/cudf/primer/ + +- OSV is an "Open source vulnerability DB and triage service." It defines + vulnerable version range semantics using a minimal set of comparators for use + with package "ecosystem" and version range "type". + https://github.com/google/osv + +- libversion is a library for general purpose version comparison using a + unified procedure designed to work with many package types. + https://github.com/repology/libversion + +- unified-range is a library for uniform version ranges based on the Maven + version range spec. 
It supports Apache Maven and npm ranges + https://github.com/snyk/unified-range + +- dephell specifier is a library to parse and evaluate version ranges and + "work with version specifiers (can parse PEP-440, SemVer, Ruby, NPM, Maven)" + https://github.com/dephell/dephell_specifier + Why not reuse existing version range notations? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +----------------------------------------------------- Most existing version range notations are tied to a specific version string syntax and are therefore not readily applicable to other contexts. For example, @@ -482,7 +710,7 @@ designed for dependencies and not for vulnerable ranges. In particular, a vulnerability may exist for multiple "version branches" of a given package such as with Django 2.x and 3.x. Several version range notations have difficulties to communicate these as typically all the version constraints must be satisfied. -In constrast, a vulnerability can affect multiple disjoint version ranges of a +In contrast, a vulnerability can affect multiple disjoint version ranges of a package and any version satisfying these constraints would be vulnerable: it may not be possible to express this with a notation designed exclusively for dependent versions resolution. @@ -494,27 +722,43 @@ in API with larger JSON documents. Why not use the OSV Ranges? -############################### +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ See: - https://ossf.github.io/osv-schema/ -``vers`` and the OSSF OSV schema vulnerable ranges are strictly equivalent and -``vers`` provides a compact range notation while OSV provides more verbose -JSON notation. +``vers`` and the OSSF OSV schema vulnerable ranges are equivalent and ``vers`` +provides a compact range notation while OSV provides more verbose JSON notation. ``vers`` borrows the design from and was informed by the OSV schema spec and its authors. 
-The only high level difference between the two specifications are the -codes used to qualify a range package "ecosystem" value that ressembles closely +OSV uses a minimalist set of only three comparators: + +- "=" to enumerate versions, +- ">=" for the version that introduced a vulnerability, and +- "<" for the version that fixed a vulnerability. + +OSV Ranges support neither ">" nor "!=" comparators making it difficult to +express some ranges that must exclude a version. This may not be an issue for +most vulnerable ranges yet: + +- this makes it difficult or impossible to precisely express certain dependency + and vulnerable ranges when a version must be excluded and the set of existing + versions is not yet known, + +- this makes some ranges more verbose such as with the NVD CVE v5 API ranges + notation that can include their upper limit and would need two constraints. + +Another high level difference between the two specifications is the +codes used to qualify a range package "ecosystem" value that resembles closely the Package URL package "type" used in ``vers``. This spec will provide a strict mapping between the OSV ecosystem and the ``vers`` versioning schemes values. Why not use the NVD CVE v5 API Ranges? -############################################ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ See: @@ -524,8 +768,8 @@ See: The version 5 of the NVD CVE JSON data format defines version ranges with a starting version, a versionType, and an upper limit for the version range as lessThan or lessThanOrEqual or as an enumeration of versions. The versionType -and the package collectionURL possible values are inidicative and left outside -of that specification and both seem strictly equivalent to the Package URL purl +and the package collectionURL possible values are only indicative and left out +of this specification and both seem strictly equivalent to the Package URL "type" on the one hand and the ``vers`` versioning scheme on the other hand. 
The semantics and expressiveness of each range are similar and ``vers`` provides @@ -534,15 +778,14 @@ strictly the conversion of any NVD v5 range to its notation and further provides a concrete list of well known versioning schemes. ``vers`` design was informed by the NVD CVE v5 API schema spec and its authors. - When NVD v5 becomes active, this spec will provide a strict mapping between the -NVD versionType and the ``vers`` versioning schemes values. Futhermore, this +NVD versionType and the ``vers`` versioning schemes values. Furthermore, this spec and the Package URL "types" should be updated accordingly to provide a mapping with the upcoming NVD collectionURL that will be effectively used. Why not use the NVD CPE Ranges? -############################### +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ See: @@ -572,7 +815,7 @@ verbose, yet ``vers`` supports strictly the conversion of any CPE range. Why not use node-semver ranges? -############################### +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ See: @@ -581,8 +824,8 @@ See: The node-semver spec is similar but much more complex than this spec. This is an AND of ORs constraints with a few practical issues: -- A space means "AND", therefore whitespaces are significant. Having - significant whitespaces in a string makes normalization more complicated and +- A space means "AND", therefore white spaces are significant. Having + significant white spaces in a string makes normalization more complicated and may be a source of confusion if you remove the spaces from the string. ``vers`` avoids the ambiguity of spaces by ignoring them. @@ -598,8 +841,8 @@ Notations that are directly derived from node-semver as used in Rust and PHP Composer have the same issues. -Why not use Python pep-0440 ranges? -##################################### +Why not use Python PEP-0440 ranges? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ See: @@ -625,7 +868,7 @@ aspects specific to the versions used only in the Python ecosystem. 
Why not use Rubygems requirements notation? -############################################### +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ See: @@ -635,41 +878,69 @@ The Rubygems specification suggests but does not enforce using semver. It uses operators similar to the ``node-semver`` spec with the different of the "~>" aka. pessimistic operator vs. a plain "~" tilde used in node-semver. This operator implies some semver-like versioning, yet gem version are not strictly -semver. This makes the notation complex to implment and impractical to reuse +semver. This makes the notation complex to implement and impractical to reuse in places that do not use the same Ruby-specific semver-like semantics. -Why not use richers comparators such as >, <=, != a tilde, caret and 1.star? -############################################################################ +Why not use fewer comparators with only =, >= and =" (greater or equal) is for the version that introduces a vulnerability +- "<" (lesser) is for the version that fixes a vulnerability + +This approach is simpler and works well for most vulnerable ranges but it faces +limitations when converting from other notations: + +- ">" cannot be converted reliably to ">=" unless you know all the versions and + these will never change. -Several existing notations such as used with npm, gem, python, or composer -provide syntactic shorthands such as: +- "<=" cannot be converted reliably to "<" unless you know all the versions and + these will never change. -- negation with != -- richer comparisons with <= and >. -- a tilde prefix or ~> prefix or =~ as in "~1.3" or "~>1.2.3" +- "!=" cannot be converted reliably: there is no ">" comparator to create an + unequal equivalent of "><"; and a combo of ">=" and "<" is not equivalent + to inequality unless you know all the versions and these will never change. + + +Why not use richer comparators such as tilde, caret and star? 
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Some existing notations such as used with npm, gem, python, or composer +provide syntactic shorthand such as: + +- a "pessimistic operator" using tilde, ~> or =~ as in "~1.3" or "~>1.2.3" - a caret ^ prefix as in "^ 1.2" -- using a star in a segment of a version as in "1.2.*" +- using a star in a version segment as in "1.2.*" - dash-separated ranges as in "1.2 - 1.4" +- arbitrary string equality such as "===1.2" + +Most of these notations can be converted without loss to the ``vers`` notation. +Furthermore these notations typically assume a well defined version string +structure specific to their package ecosystem and are not reusable in another +ecosystem that would not use the exact same version conventions. -These range syntaxes can **always** be reduced to a set of simpler operators -defined in ``vers``. Furthermore they assumed a certain structure in a version -string (most often semver or semver-like) as used in one ecosystem and therefore -are not reusable in another ecosystem that would not use the version conventions. +For instance, the tilde and caret notations demand that you can reliably +infer the next version (aka. "bump") from a given version; this is possible +only if the versioning scheme supports this operation reliably for all its +accepted versions. Why not use mathematical interval notation for ranges? -####################################################### +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Apache Maven and NuGet make use of a mathematical interval with "[", "]", "(" -and ")" with commas as a syntax for version ranges. +Apache Maven and NuGet use a mathematical interval notation with comma-separated +"[", "]", "(" and ")" to declare version ranges. -All other known range notations use a more common set of ">", "<", and "=" as -range symbols. ``vers`` adopts this more common approach. 
+All other known range notations use the more common ">", "<", and "=" as +comparators. ``vers`` adopts this familiar approach. References -~~~~~~~~~~~~~~~~~~~~ +--------------------- + Here are some of the discussions that led to the creation of this specification: @@ -681,6 +952,6 @@ Here are some of the discussions that led to the creation of this specification: - https://github.com/nexB/univers/pull/11 License -~~~~~~~ +--------------------- This document is licensed under the MIT license From c0487ad0ebdd0f80dcbb1ac911b8cb519a542b0d Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Wed, 15 Dec 2021 22:09:38 +0100 Subject: [PATCH 09/21] Re-organize sections in a more natural order Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 146 +++++++++++++++++++++-------------------- 1 file changed, 74 insertions(+), 72 deletions(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 1c6cb22..97b9c81 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -168,7 +168,7 @@ in this document. Version range specifier -~~~~~~~~~~~~~~~~~~~~~~~~ +------------------------ A version range specifier (aka. "vers") is a URI string using the ``vers`` URI-scheme with this syntax:: @@ -199,8 +199,76 @@ A ``<version>`` satisfies a version range specifier if it is contained within any of the intervals defined by these ``<version-constraint>``. +Using version range specifiers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``vers`` primary usage is to test if a version is within a range. + +An version is within a version range if falls in any of the intervals defined +by a range. Otherwise, the version is outside of the version range. + +Some important usages derived from this include: + +- **Resolving a version range specifier to a list of concrete versions.** + In this case, the input is one or more known versions of a package. Each + version is then tested to check if it lies within or outside the range.
For + example, given a vulnerability and the ``vers`` describing the vulnerable + versions of a package, this process is used to determine if an existing + package version is vulnerable. + +- **Selecting one of several versions that are within a range.** + In this case, given several versions that are within a range and several + packages that express package dependencies qualified by a version range, + a package management tools will determine and select the set of package + versions that satisfy all the version ranges constraints of all dependencies. + This usually requires deploying heuristics and algorithms (possibly complex + such as sat solvers) that are ecosystem- and tool-specific and outside of the + scope for this specification; yet ``vers`` could be used in tandem with + ``purl`` to provide an input to this dependencies resolution process. + + +Examples +~~~~~~~~~ + +A single version in an npm package dependency: + +- originally seen as a dependency on version "1.2.3" in a package.json manifest +- the version range spec is: ``vers:npm/1.2.3`` + + +A list of versions, enumerated: + +- ``vers:pypi/0.0.0|0.0.1|0.0.2|0.0.3|1.0|2.0pre1`` + + +A complex statement about a vulnerability in a "maven" package that affects +multiple branches each with their own fixed versions at +https://repo1.maven.org/maven2/org/apache/tomee/apache-tomee/ +Note how the constraints are sorted: + + +- "affects Apache TomEE 8.0.0-M1 - 8.0.1, Apache TomEE 7.1.0 - 7.1.2, + Apache TomEE 7.0.0-M1 - 7.0.7, Apache TomEE 1.0.0 - 1.7.5." + +- a normalized version range spec is: + ``vers:tomee/>=1.0.0-beta1|<=1.7.5|>=7.0.0-M1|<=7.0.7|>=7.1.0|<=7.1.2|>=8.0.0-M1|<=8.0.1`` + +- alternatively, four ``vers`` express the same range, using one ``vers`` for + each vulnerable "branches": + - ``vers:tomee/>=1.0.0-beta1|<=1.7.5`` + - ``vers:tomee/>=7.0.0-M1|<=7.0.7`` + - ``vers:tomee/>=7.1.0|<=7.1.2`` + - ``vers:tomee/>=8.0.0-M1|<=8.0.1`` + +Conversing Rubygems custom syntax for dependency on gem. 
Note how the +pessimistic version constraint is expanded: + +- ``'library', '~> 2.2.0', '!= 2.2.1'`` +- the version range spec is: ``vers:gem/>=2.2.0|!= 2.2.1!<2.3.0`` + + URI scheme -------------- +~~~~~~~~~~ The ``vers`` URI scheme is an acronym for "VErsion Range Specifier". It has been selected because it is short, obviously about version and available @@ -210,7 +278,7 @@ The URI scheme is followed by a colon ":". ``<versioning-scheme>`` ------------------------- +~~~~~~~~~~~~~~~~~~~~~~~ The ``<versioning-scheme>`` (such as ``npm``, ``deb``, etc.) determines: @@ -232,7 +300,7 @@ The ``<versioning-scheme>`` is followed by a slash "/". ``<version-constraint>`` ----------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~ After the ``<versioning-scheme>`` and "/" there are one or more ``<version-constraint>`` separated by a pipe "|". The pipe "|" has no special @@ -281,50 +349,11 @@ The ``<versioning-scheme>`` defines: primarily of three dot-separated numeric segments named major, minor and patch. -Examples -------------- - -Single version in an npm package dependency: - -- originally seen as a dependency on version "1.2.3" in a package.json manifest -- the version range spec is: ``vers:npm/1.2.3`` - - -Versions enumeration: - -- ``vers:pypi/0.0.0|0.0.1|0.0.2|0.0.3|1.0|2.0pre1`` - - -Complex statement about a vulnerability in a "maven" package that affects -multiple branches each with their own fixed versions at -https://repo1.maven.org/maven2/org/apache/tomee/apache-tomee/ -Note how the constraints are sorted: - - -- "affects Apache TomEE 8.0.0-M1 - 8.0.1, Apache TomEE 7.1.0 - 7.1.2, - Apache TomEE 7.0.0-M1 - 7.0.7, Apache TomEE 1.0.0 - 1.7.5." - -- a normalized version range spec is: - ``vers:tomee/>=1.0.0-beta1|<=1.7.5|>=7.0.0-M1|<=7.0.7|>=7.1.0|<=7.1.2|>=8.0.0-M1|<=8.0.1`` - -- alternatively, four ``vers`` express the same range, using one ``vers`` for - each vulnerable "branches": - - ``vers:tomee/>=1.0.0-beta1|<=1.7.5`` - - ``vers:tomee/>=7.0.0-M1|<=7.0.7`` - - ``vers:tomee/>=7.1.0|<=7.1.2`` - - ``vers:tomee/>=8.0.0-M1|<=8.0.1`` - -Rubygems custom syntax for dependency on gem.
Note how the pessimistic version -constraint is expanded: - -- ``'library', '~> 2.2.0', '!= 2.2.1'`` -- the version range spec is: ``vers:gem/>=2.2.0|!= 2.2.1!<2.3.0`` - Normalized, canonical representation and validation ----------------------------------------------------- -These construction and validation rules are designed such that a ``vers`` is +The construction and validation rules are designed such that a ``vers`` is easier to read and understand by human and straight forward to process by tools, attempting to avoid the creation of empty or impossible version ranges. @@ -379,34 +408,6 @@ constraint comparators must be an alternation of greater and lesser comparators: Tools must report an error for such invalid ranges. -Using version range specifiers ------------------------------- - -``vers`` primary usage is to test if a version is within a range. - -An version is within a version range if falls in any of the intervals defined -by a range. Otherwise, the version is outside of the version range. - -Some important usages derived from this include: - -- **Resolving a version range specifier to a list of concrete versions.** - In this case, the input is one or more known versions of a package. Each - version is then tested to check if it lies within or outside the range. For - example, given a vulnerability and the ``vers`` describing the vulnerable - versions of a package, this process is used to determine if an existing - package version is vulnerable. - -- **Selecting one of several versions that are within a range.** - In this case, given several versions that are within a range and several - packages that express package dependencies qualified by a version range, - a package management tools will determine and select the set of package - versions that satisfy all the version ranges constraints of all dependencies. 
- This usually requires deploying heuristics and algorithms (possibly complex - such as sat solvers) that are ecosystem- and tool-specific and outside of the - scope for this specification; yet ``vers`` could be used in tandem with - ``purl`` to provide an input to this dependencies resolution process. - - Parsing and validating version range specifiers ------------------------------------------------- @@ -683,6 +684,7 @@ Related efforts and alternative "work with version specifiers (can parse PEP-440, SemVer, Ruby, NPM, Maven)" https://github.com/dephell/dephell_specifier + Why not reuse existing version range notations? ----------------------------------------------------- From 5430731ece32b790c047b71185fc6adca0f88656 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Thu, 16 Dec 2021 00:01:11 +0100 Subject: [PATCH 10/21] Correct the deduplication procedure Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 97b9c81..474711f 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -492,7 +492,7 @@ A procedure to remove redundant constraints can be: - Start from a list of constraints of comparator and version, sorted by version and where each version occurs only once in any constraint. -- If the constraints list contains only one item and the comparator is "*", +- If the constraints list contains only one item, return this list and simplification is finished. - Split the constraints list in two sub lists: From 88fdcac6d16fa7fda1221db0092ed3c4820bc6e4 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Sat, 18 Dec 2021 17:01:33 +0100 Subject: [PATCH 11/21] Rename dedupe to simplification and make it work The constraints de-duplication is now called simplification. 
The pseudocode has been corrected to actually work when implemented in Python Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 33 +++++++++++++++++---------------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 474711f..63f2177 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -469,8 +469,8 @@ constraints once parsing is complete: - De-duplicate the list of constraints. -Version constraints de-duplication -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Version constraints simplification +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Tools can simplify a list of ``<version-constraint>`` using this approach: @@ -492,7 +492,7 @@ A procedure to remove redundant constraints can be: - Start from a list of constraints of comparator and version, sorted by version and where each version occurs only once in any constraint. -- If the constraints list contains only one item, +- If the constraints list contains a single constraint (star, equal or anything) return this list and simplification is finished. - Split the constraints list in two sub lists: @@ -503,22 +503,22 @@ A procedure to remove redundant constraints can be: - If the remainder list of "constraints" is empty, return the "unequal constraints" list and simplification is finished. -- Loop while the "constraints" list length diminishes with each iteration: +- Iterate over the constraints list, considering the current and next contiguous + constraints, and the previous constraint (e.g., before current) if it exists: - - Save the starting length of "constraints" list + - If current comparator is ">" or ">=" and next comparator is "=", ">" or ">=", + discard next constraint - - For each contiguous current constraint and next constraint of this list: + - If current comparator is "=", "<" or "<=" and next comparator is <" or <=", + discard current constraint. Previous constraint becomes current if it exists.
- - If current comparator is "=", "<" or "<=" and next comparator is <" or <=", - discard current constraint, keep next constraint + - If there is a previous constraint: - - Else if current comparator is ">" or ">=" and next comparator is "=", ">" or ">=", - keep current constraint, discard next constraint + - If previous comparator is ">" or ">=" and current comparator is "=", ">" or ">=", + discard current constraint - - Else keep current constraint and next constraint - - - If the starting length of "constraints" list is the same as the current - length of the "constraints" list, stop iterating. + - If previous comparator is "=", "<" or "<=" and current comparator is <" or <=", + discard previous constraint - Concatenate the "unequal constraints" list and the filtered "constraints" list - Sort by version and return. @@ -584,8 +584,9 @@ Notes and caveats - Comparing versions from two different versioning schemes is an error. Even though there may be some similarities between the ``semver`` version of an npm - and the ``deb`` version of its Debian packaging, these similarities are - specific to each versioning scheme. Tools should report an error in this case. + and the ``deb`` version of its Debian packaging, the way versions are compared + specific to each versioning scheme and may be different. Tools should report + an error in this case. - All references to sorting or ordering of version constraints means sorting by version. 
And sorting by versions always implies using the versioning From c96df26aaba60fdbcf8ae5d291eb94bd22f0a5b3 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Mon, 20 Dec 2021 22:26:28 +0100 Subject: [PATCH 12/21] Add section on NVD 5.0 star notation Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 63f2177..fdd4d18 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -786,6 +786,35 @@ NVD versionType and the ``vers`` versioning schemes values. Furthermore, this spec and the Package URL "types" should be updated accordingly to provide a mapping with the upcoming NVD collectionURL that will be effectively used. +There is one issue with NVD v5: it introduces a new trailing "*" notation that +does not exists in most version ranges notations and may not be computable +easily in many cases. The description of the "lessThan" property is: + + The non-inclusive upper limit of the range. This is the least version NOT + in the range. The usual version syntax is expanded to allow a pattern to end + in an asterisk `(*)`, indicating an arbitrarily large number in the version + ordering. For example, `{version: 1.0 lessThan: 1.*}` would describe the + entire 1.X branch for most range kinds, and `{version: 2.0, lessThan: *}` + describes all versions starting at 2.0, including 3.0, 5.1, and so on. + +The conversion to ``vers`` range should be: + +- with a version 1.0 and `"lessThan": "*"`, the ``vers`` equivalent is: ``>=1.0``. + +- with a version 1.0 and `"lessThan": "2.*"`, the ``vers`` equivalent can be + computed for ``semver`` versions as ``>=1.0|<2`` but is not accurate unless + as versioning schemes have different rules. For instance, pre-release may be + treated in some case as part of the v1. branch and in some other cases as part + of the v2. branch. 
It is not clear if with "2.*" the NVD spec means: + + - ``<2`` + - or something that excludes any version string that starts with ``2.`` + +And in this case, with the expression `"lessThan": "2.*"` using a ``semver`` +version, it is not clear if ``2.0.0-alpha`` is "lessThan"; semver sorts it +before ``2.0`` and after ``1.0``, e.g., in ``semver`` ``2.0.0-alpha`` is +"less than" ``2``. + Why not use the NVD CPE Ranges? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From 935c5ab1a498be61977bcb1ca65cfa8836bd0d39 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Tue, 21 Dec 2021 10:07:51 +0100 Subject: [PATCH 13/21] Remove spaces before mandating ASCII Also use simplify rather than "normalize" Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index fdd4d18..e44ad42 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -357,12 +357,12 @@ The construction and validation rules are designed such that a ``vers`` is easier to read and understand by human and straight forward to process by tools, attempting to avoid the creation of empty or impossible version ranges. -- A version range specifier contains only printable ASCII letters, digits and - punctuation. - - Spaces are not significant and removed in a canonical form. For example "<1.2.3|>=2.0" and " < 1.2. 3 | > = 2 . 0" are equivalent. +- A version range specifier contains only printable ASCII letters, digits and + punctuation. + - The URI scheme and versioning scheme are always lowercase as in ``vers:npm``. - The versions are case-sensitive, and a versioning scheme may specify its own @@ -462,11 +462,11 @@ Finally: - The results are the and the list of constraints. 
-Tools should optionally validate and normalize the list of +Tools should optionally validate and simplify the list of constraints once parsing is complete: - Sort and validate the list of constraints. -- De-duplicate the list of constraints. +- Simplify the list of constraints. Version constraints simplification From 66875a509bf0478393df916989dcf49da9ad54a2 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Thu, 23 Dec 2021 23:33:20 +0100 Subject: [PATCH 14/21] Fix word repetition Reported-by: @tschmidtb51 Signed-off-by: Philippe Ombredanne Co-authored-by: tschmidtb51 <65305130+tschmidtb51@users.noreply.github.com> --- VERSION-RANGE-SPEC.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index e44ad42..7f1250f 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -73,7 +73,7 @@ conventions in use: the topic of versioning. Version resolution uses its own algorithm. - Python uses its own version and version ranges notation with notable - peculiarities on how how pre-release and post-release suffixes are used + peculiarities on how pre-release and post-release suffixes are used https://www.python.org/dev/peps/pep-0440/ - Debian and Ubuntu use their own notation and are remarkable for their use of From 2328f78f7a286c83601a9bdf7c45cff55e09a4f7 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Thu, 23 Dec 2021 23:39:27 +0100 Subject: [PATCH 15/21] Fix vers example syntax Reported-by: @tschmidtb51 Signed-off-by: Philippe Ombredanne Co-authored-by: tschmidtb51 <65305130+tschmidtb51@users.noreply.github.com> --- VERSION-RANGE-SPEC.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 7f1250f..b01966b 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -264,7 +264,7 @@ Conversing Rubygems custom syntax for dependency on gem. 
Note how the pessimistic version constraint is expanded: - ``'library', '~> 2.2.0', '!= 2.2.1'`` -- the version range spec is: ``vers:gem/>=2.2.0|!= 2.2.1!<2.3.0`` +- the version range spec is: ``vers:gem/>=2.2.0|!= 2.2.1|<2.3.0`` URI scheme From 75b3b5b223dfaf32d3985846319ad81fde6c1610 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Thu, 23 Dec 2021 23:42:38 +0100 Subject: [PATCH 16/21] Clarify that a version is an example Reported-by: @tschmidtb51 Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index b01966b..c6a01b9 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -307,7 +307,7 @@ After the ``<versioning-scheme>`` and "/" there are one or more meaning beside being a separator. Each ``<version-constraint>`` of this list is either a single ``<version>`` as -in ``1.2.3`` or the combination of a ``<comparator>`` and a ``<version>`` as in +in ``1.2.3`` for example or the combination of a ``<comparator>`` and a ``<version>`` as in ``>=2.0.0`` using this syntax:: From 013eb68c496eee6c1d6ac3fe5042a46b8017ef68 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Fri, 24 Dec 2021 00:29:17 +0100 Subject: [PATCH 17/21] Improve quotes and correct meaning Reported-by: @tscmidtb51 Signed-off-by: Philippe Ombredanne Co-authored-by: tschmidtb51 <65305130+tschmidtb51@users.noreply.github.com> --- VERSION-RANGE-SPEC.rst | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index c6a01b9..83ee857 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -427,16 +427,16 @@ To parse a version range specifier string: - The right hand side is a list of one or more constraints. Tools must validate that this constraints string is not empty ignoring spaces. -- If the constraints string is equal to "*", the is "*". +- If the constraints string is equal to "*", the ``<version-constraint>`` is "*".
Parsing is done and no further processing is needed for this ``vers``. A tool should report an error if there are extra characters beyond "*". - Strip leading and trailing pipes "|" from the constraints string. -- Split the constraints on pipe "|". The result is a list of . +- Split the constraints on pipe "|". The result is a list of ``<version-constraint>``. Consecutive pipes must be treated as one and leading and trailing pipes ignored. -- For each : - - Determine if the starts with one of the two comparators: +- For each ``<version-constraint>``: + - Determine if the ``<version-constraint>`` starts with one of the two comparators: - If it starts with ">=", then the comparator is ">=". - If it starts with "<=", then the comparator is "<=". @@ -444,10 +444,10 @@ To parse a version range specifier string: - If it starts with "<", then the comparator is "<". - If it starts with ">", then the comparator is ">". - - Remove the comparator from string start. The + - Remove the comparator from ``<version-constraint>`` string start. The remaining string is the version. - - Otherwise the version is the full string (which implies + - Otherwise the version is the full ``<version-constraint>`` string (which implies an equality comparator of "=") - Tools should validate and report an error if the version is empty. @@ -459,10 +459,10 @@ To parse a version range specifier string: Finally: -- The results are the and the list of +- The results are the ``<versioning-scheme>`` and the list of ``<comparator, version>`` constraints. -Tools should optionally validate and simplify the list of +Tools should optionally validate and simplify the list of ``<comparator, version>`` constraints once parsing is complete: - Sort and validate the list of constraints.
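The parsing steps touched by this patch can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the code of any listed implementation: ``parse_vers`` and ``COMPARATORS`` are hypothetical names, errors are reported as ``ValueError``, and the optional sorting, validation and simplification of constraints are left out.

```python
# Hypothetical helper names; a sketch of the parsing steps only.
# Two-character comparators come first so ">=" is not mistaken for ">".
COMPARATORS = (">=", "<=", "!=", "<", ">")


def parse_vers(vers):
    """Return (versioning_scheme, constraints) for a vers string.

    constraints is a list of (comparator, version) tuples, or the
    string "*" when the range matches any version.
    """
    # Spaces are not significant: remove them (and tabs) before parsing.
    vers = vers.replace(" ", "").replace("\t", "")

    uri_scheme, colon, specifier = vers.partition(":")
    if not colon or uri_scheme != "vers":
        raise ValueError(f"not a vers string: {vers!r}")

    scheme, slash, constraints_str = specifier.partition("/")
    if not slash or not scheme or scheme != scheme.lower():
        raise ValueError(f"missing or non-lowercase versioning scheme: {vers!r}")
    if not constraints_str:
        raise ValueError(f"empty constraints: {vers!r}")

    if constraints_str == "*":
        return scheme, "*"
    if constraints_str.startswith("*"):
        raise ValueError(f"extra characters after '*': {vers!r}")

    constraints = []
    for item in constraints_str.split("|"):
        if not item:
            # Leading, trailing and consecutive pipes produce empty items.
            continue
        for comparator in COMPARATORS:
            if item.startswith(comparator):
                version = item[len(comparator):]
                break
        else:
            # No comparator prefix: a bare version implies equality.
            comparator, version = "=", item
        if not version:
            raise ValueError(f"empty version in constraint: {item!r}")
        constraints.append((comparator, version))

    if not constraints:
        raise ValueError(f"no constraints found: {vers!r}")
    return scheme, constraints
```

With the corrected Rubygems example from earlier in this series, ``parse_vers("vers:gem/>=2.2.0|!= 2.2.1|<2.3.0")`` returns ``("gem", [(">=", "2.2.0"), ("!=", "2.2.1"), ("<", "2.3.0")])``. Versions are kept as plain strings here because comparing them is specific to each versioning scheme.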
From 19fa4cb1683acbe1db2981b7c9dfa55369c5844e Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Thu, 17 Feb 2022 14:42:27 +0100 Subject: [PATCH 18/21] Use correct license example Reported-by: tschmidtb51 @tschmidtb51 Signed-off-by: Philippe Ombredanne Co-authored-by: tschmidtb51 <65305130+tschmidtb51@users.noreply.github.com> --- VERSION-RANGE-SPEC.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 83ee857..5358b94 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -248,7 +248,7 @@ Note how the constraints are sorted: - "affects Apache TomEE 8.0.0-M1 - 8.0.1, Apache TomEE 7.1.0 - 7.1.2, - Apache TomEE 7.0.0-M1 - 7.0.7, Apache TomEE 1.0.0 - 1.7.5." + Apache TomEE 7.0.0-M1 - 7.0.7, Apache TomEE 1.0.0-beta1 - 1.7.5." - a normalized version range spec is: ``vers:tomee/>=1.0.0-beta1|<=1.7.5|>=7.0.0-M1|<=7.0.7|>=7.1.0|<=7.1.2|>=8.0.0-M1|<=8.0.1`` From 63846effc636f5e92f83e28774cdbb0bc14c9f23 Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Mon, 12 Feb 2024 19:05:53 +0100 Subject: [PATCH 19/21] Update VERSION-RANGE-SPEC.rst Add Java implementation reference by @nscuro Reference: https://github.com/nscuro/versatile Signed-off-by: Philippe Ombredanne Co-authored-by: Niklas --- VERSION-RANGE-SPEC.rst | 1 + 1 file changed, 1 insertion(+) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 5358b94..fcf2db4 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -657,6 +657,7 @@ Implementations ----------------------- - Python: https://github.com/nexB/univers +- Java: https://github.com/nscuro/versatile - Yours! 
From 378636f33d0342eb4d42a49359dd5c475d80d89c Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Fri, 27 Sep 2024 15:39:21 +0200 Subject: [PATCH 20/21] Improve wording for Debian comparators Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index fcf2db4..02ef769 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -600,7 +600,7 @@ These are a few known versioning schemes for some common Package URL `types` (aka. ``ecosystem``). - **deb**: Debian and Ubuntu https://www.debian.org/doc/debian-policy/ch-relationships.html - The comparators are <<, <=, =, >= and >>. + Debian uses these comparators: <<, <=, =, >= and >>. - **rpm**: RPM distros https://rpm-software-management.github.io/rpm/manual/dependencies.html The a simplified rmpvercmp version comparison routine is used by archlinux Pacman. From c7389ae7f826f7c04ecf84b6497ca7aa56fb1c2e Mon Sep 17 00:00:00 2001 From: Philippe Ombredanne Date: Fri, 27 Sep 2024 15:56:10 +0200 Subject: [PATCH 21/21] Update VERSION-RANGE-SPEC.rst Use correct scheme Signed-off-by: Philippe Ombredanne --- VERSION-RANGE-SPEC.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/VERSION-RANGE-SPEC.rst b/VERSION-RANGE-SPEC.rst index 02ef769..945fd43 100644 --- a/VERSION-RANGE-SPEC.rst +++ b/VERSION-RANGE-SPEC.rst @@ -251,7 +251,7 @@ Note how the constraints are sorted: Apache TomEE 7.0.0-M1 - 7.0.7, Apache TomEE 1.0.0-beta1 - 1.7.5." - a normalized version range spec is: - ``vers:tomee/>=1.0.0-beta1|<=1.7.5|>=7.0.0-M1|<=7.0.7|>=7.1.0|<=7.1.2|>=8.0.0-M1|<=8.0.1`` + ``vers:maven/>=1.0.0-beta1|<=1.7.5|>=7.0.0-M1|<=7.0.7|>=7.1.0|<=7.1.2|>=8.0.0-M1|<=8.0.1`` - alternatively, four ``vers`` express the same range, using one ``vers`` for each vulnerable "branches":