Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

container component purl format not consistent with CycloneDX specifications #112

Closed
kevinlinglesas opened this issue Dec 11, 2023 · 7 comments

Comments

@kevinlinglesas
Copy link

The Scenario

Given a kbom (v0.2.4) generated bom file with a set of 'container' type components that include package URL (purl) fields reflecting, presumably, docker images that are resources in the kubernetes cluster.

The bom is CycloneDX json format. I am also using another open source tool: https://github.com/CycloneDX/cyclonedx-python-lib. Specifically, I am using the from_json() method available with the project's Bom class to deserialize from a CycloneDX JSON BOM (ie, the afore mentioned kbom bom file).

The Issue

At the point where the first purl in the bom is encountered, and exception is thrown.

PURL string supplied (pkg:registry.k8s.io/kube-scheduler:v1.26.1@sha256:af0292<snip>) does not parse!
  File "/some/path/venv/lib/python3.8/site-packages/cyclonedx/serialization/__init__.py", line 66, in deserialize
    return PackageURL.from_string(purl=str(o))
  File "/some/path/venv/lib/python3.8/site-packages/packageurl/__init__.py", line 512, in from_string
    raise ValueError(msg)
ValueError: Invalid purl 'pkg:registry.k8s.io/kube-scheduler:v1.26.1@sha256:af0292<snip>' cannot contain a "user:pass@host:port" URL Authority component: ''.

This exception is coming from cyclonedx-python-lib's usage of https://github.com/package-url/packageurl-python 's PackageURL.from_string().

That usage isn't arbitrary. The CycloneDX v1.5 specification for the 'purl' indicates a purl MUST be valid and conform to the specification defined at: https://github.com/package-url/purl-spec.

It appears that a portion of the kbom generated URL is being interpreted as a "URL Authority component" and those are not allowed. See bullet 3 of section "A purl is a URL" in the purl-spec.

The Ask

Do you agree that the format of purl fields currently generated by kbom do not adhere to the required format as prescribed by CycloneDX specificications?

It would be desirable to be able to interoperate with another emerging open source toolset like cyclonedx-python-lib.

Potential Format Changes

If the kbom generated purl is altered as follows, packageurl is happy. Note: sha shortened for brevity.

"pkg:docker/registry.k8s.io/kube-scheduler@sha256:af029?tag=v1.26.1"

In particular, from the seven components the purl-spec defines:

  • scheme: pkg: - Per the spec, only pkg scheme defined.
  • type: docker - Added this per the pkg type that spec defines.
  • namespace: registry.k8s.io - not sure if this is best here or as a qualifier per the example in the spec.
  • name: kube-scheduler - the image name
  • version: the sha digest value - moved around a bit
  • qualifier: tag=v1.26.1 - moved to a qualifier so as to not lose this information

I am uncertain on whether my qualifiers are good our not. It was an attempt to retain the tag.
I was attempting to craft a purl where all the information needed to pull the image was there, even if not available
in a direct copy/paste form. Not that there is a requirement for that perse, but it seems to me it would be potentially useful to the consumers of SBOMs (automated or otherwise). The docker image can still be pulled using 'docker pull registry.k8s.io/kube-scheduler@sha256:af0292c...' (not including qualifiers). Automations should be able to further parse and craft pull commands.

References

@mateuszdyminski
Copy link
Contributor

Hi @kevinlinglesas

Thanks for your detailed analysis and finding this bug - Yes our cyclonedx support was broken.

I have fix the issue with following PR: https://github.com/ksoclabs/kbom/pull/117

we think that we should use purl in form:

"pkg:oci/kube-scheduler@sha256:af029?repository_url=registry.k8s.io/kube-scheduler&tag=v1.26.1"

but any feedback is more than welcome!

@kevinlinglesas
Copy link
Author

Hi @mateuszdyminski,

Thank you for the quick turn around on this issue. I had another comment based on something I've noticed while using the #117 changes.

For components like the ingress-nginx/controller, the new changes result in the component name being just "controller" while prior it was "registry.k8s.io/ingress-nginx/controller". The "ingress-nginx" portion that is now missing was very pertinent to the identify of this component. Simply referring to it as "controller" isn't very helpful.

Now, one will be required to parser further into the qualifiers to pull out the repository_url for a fully qualified name.

Is it possible for the "name:" and "cdx:k8s:component:name" to include the "namespace/name" again? The changes required for purl format compliance are a subset of the changes seen in #117 and is it possible to limit the changes just to the purl field?

@mateuszdyminski
Copy link
Contributor

mateuszdyminski commented Dec 19, 2023

I have reverted the name to the old version in https://github.com/ksoclabs/kbom/pull/117. Now the output looks like:

{
      "bom-ref": "pkg:oci/pilot@sha256%3Ad5132b1588c057121715d42ee33f85caeb9d3c1deff3e25730ac15a72af4465d?repository_url=docker.io%2Fistio%2Fpilot&tag=1.19.0",
      "type": "container",
      "name": "docker.io/istio/pilot",
      "version": "sha256:d5132b1588c057121715d42ee33f85caeb9d3c1deff3e25730ac15a72af4465d",
      "purl": "pkg:oci/pilot@sha256%3Ad5132b1588c057121715d42ee33f85caeb9d3c1deff3e25730ac15a72af4465d?repository_url=docker.io%2Fistio%2Fpilot&tag=1.19.0",
      "properties": [
        {
          "name": "cdx:k8s:component:type",
          "value": "container"
        },
        {
          "name": "cdx:k8s:component:name",
          "value": "docker.io/istio/pilot"
        },
        {
          "name": "ksoc:kbom:pkg:type",
          "value": "oci"
        },
        {
          "name": "ksoc:kbom:pkg:name",
          "value": "docker.io/istio/pilot"
        },
        {
          "name": "ksoc:kbom:pkg:version",
          "value": "1.19.0"
        },
        {
          "name": "ksoc:kbom:pkg:digest",
          "value": "sha256:d5132b1588c057121715d42ee33f85caeb9d3c1deff3e25730ac15a72af4465d"
        }
      ]
    }

Please @kevinlinglesas let us know if it works for you.

Thanks again for your help!

@kevinlinglesas
Copy link
Author

@mateuszdyminski,
I noticed that I needed to move to go v1.21 to pick up slices package. Does your go.mod need updating as well?
Thanks for the name changes. That is helpful!

@mateuszdyminski
Copy link
Contributor

@kevinlinglesas
Yeah - you're right with go.mod - bumped it now.

Please let me know if this cyclonexdx file format works for you, and I'll merge the PR and release the changes

@kevinlinglesas
Copy link
Author

@mateuszdyminski
Yes, the format works for me, so merge and release at your convenience.
Thank you!

@mateuszdyminski
Copy link
Contributor

#117 has been merged

Changes released in v0.2.5, details: https://github.com/ksoclabs/kbom/releases/tag/v0.2.5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants