Introduce "collection" projects for better usage of hierarchical view #2041 #3258

Merged


rkg-mm
Contributor

@rkg-mm rkg-mm commented Dec 2, 2023

Description

This change introduces logic for "collection projects". These are projects that act as parents for other projects and hold no component or vulnerability data of their own. Instead, their data is calculated from their child projects using one of several configurable aggregation logics.

Corresponding Frontend PR with Screenshots: DependencyTrack/frontend#658

Addressed Issue

resolves #2041
resolves #2410

Additional Details

Checklist

  • I have read and understand the contributing guidelines
  • This PR fixes a defect, and I have provided tests to verify that the fix is effective
  • This PR implements an enhancement, and I have provided tests to verify that it works as intended
  • This PR introduces changes to the database model, and I have added corresponding update logic
  • This PR introduces new or alters existing behavior, and I have updated the documentation accordingly

@cheapshot2000

Have there been any updates or progress on getting this feature completed for the 4.11 release?

@TWpgo

TWpgo commented Apr 22, 2024

Looking forward to this feature 👍

@rkg-mm rkg-mm force-pushed the 2041-introduce-collection-projects branch from 0aa204e to ddc5c2b Compare April 23, 2024 00:50
@rkg-mm
Contributor Author

rkg-mm commented Apr 23, 2024

@nscuro I wonder why the unit test still fails. I don't see a change that could cause the parent of a project to no longer be serialized.

@nscuro
Member

nscuro commented Apr 23, 2024

@rkg-mm FWIW there can be subtle changes in how serialization libraries (e.g. Jackson) handle certain fields when those libraries are updated, or in how the persistence framework fetches fields implicitly.

@rkg-mm
Contributor Author

rkg-mm commented Apr 23, 2024

Yeah, so the backend and frontend PRs seem to be working well. But I have no clue about this unit test and will need some help here. The test explicitly covers the condition that no longer works, so I guess it's important that it passes, but I don't see a change that causes the failure.

 Error:  Tests run: 40, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 74.367 s <<< FAILURE! - in org.dependencytrack.resources.v1.ProjectResourceTest
Error:  patchProjectParentTest(org.dependencytrack.resources.v1.ProjectResourceTest)  Time elapsed: 1.483 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: 
JSON documents are different:
Different keys found in node "", missing: "parent", expected: 
<{"active":true,"collectionLogic":"NONE","name":"DEF","parent":{"name":"GHI","uuid":"${json-unit.matches:parentProjectUuid}","version":"3.0"},"properties":[],"tags":[],"uuid":"${json-unit.matches:projectUuid}","version":"2.0"}> 
but was: 
<{"active":true,"collectionLogic":"NONE","name":"DEF","properties":[],"tags":[],"uuid":"ba033fa5-aa20-44ae-85de-6a33e5cf36a3","version":"2.0"}>

	at org.dependencytrack.resources.v1.ProjectResourceTest.patchProjectParentTest(ProjectResourceTest.java:733)

@@ -538,6 +547,7 @@ public Project updateProject(Project transientProject, boolean commitIndex) {
project.setDescription(transientProject.getDescription());
project.setVersion(transientProject.getVersion());
project.setClassifier(transientProject.getClassifier());
project.setCollectionLogic(transientProject.getCollectionLogic());
Member

Changing the collection logic from say AGGREGATE_DIRECT_CHILDREN to HIGHEST_SEMVER_CHILD can drastically change metric values on existing projects. There will be no indication as to if and when the logic was changed, which can make it tricky to explain why values changed so much.
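
To make the concern concrete, here is a minimal sketch of how the same set of direct children can yield very different numbers under the two logics named above. Everything except the two logic names is an assumption for illustration: the `Child` record, the single `vulnerabilities` metric, and the simplified MAJOR.MINOR.PATCH comparison are hypothetical and not Dependency-Track's actual implementation.

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified model; not Dependency-Track's actual classes.
final class CollectionLogicSketch {

    enum Logic { AGGREGATE_DIRECT_CHILDREN, HIGHEST_SEMVER_CHILD }

    // A direct child with a plain MAJOR.MINOR.PATCH version and one metric.
    record Child(String version, int vulnerabilities) {}

    // Parses "1.2.3" into {1, 2, 3}; pre-release and build metadata are not handled.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Orders children numerically by major, then minor, then patch.
    static final Comparator<Child> BY_SEMVER = Comparator
            .comparingInt((Child c) -> parse(c.version())[0])
            .thenComparingInt(c -> parse(c.version())[1])
            .thenComparingInt(c -> parse(c.version())[2]);

    static int vulnerabilities(Logic logic, List<Child> children) {
        return switch (logic) {
            // Sum over all direct children.
            case AGGREGATE_DIRECT_CHILDREN ->
                    children.stream().mapToInt(Child::vulnerabilities).sum();
            // Take the value of the single child with the highest version.
            case HIGHEST_SEMVER_CHILD ->
                    children.stream().max(BY_SEMVER).map(Child::vulnerabilities).orElse(0);
        };
    }

    public static void main(String[] args) {
        List<Child> children = List.of(
                new Child("1.0.0", 12), new Child("1.10.0", 3), new Child("1.9.2", 7));
        System.out.println(vulnerabilities(Logic.AGGREGATE_DIRECT_CHILDREN, children)); // 22
        System.out.println(vulnerabilities(Logic.HIGHEST_SEMVER_CHILD, children));      // 3
    }
}
```

With these made-up children, switching the logic moves the parent's value from 22 to 3 without any change in the children themselves, which is the kind of unexplained jump described above.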

Contributor Author

That's correct, but it would also be an intended change. I don't know what could be done about this except writing such a change to the log file, but that would only help with deep analysis of such an issue.

Member

Fair.

Naive idea: We could add a Collection logic change flag to ProjectMetrics. We could use that in the metrics graph, in some similar way to what you can do in Grafana with annotations:

[image: example of annotations on a Grafana graph]

Contributor Author

@rkg-mm rkg-mm May 24, 2024

I like that idea, but at least at first glance the frontend chart library doesn't support this. I'm not sure how difficult it would be to add on top of it. Any experience with that? I have never used that component.
From a data perspective, it should not be too complicated in the backend.

edit: Or would adding it to the chart's tooltip be enough? It also seems there are options to style data points differently. Maybe we could give such points a different styling and add the info to the tooltip?

Member

I don't have too strong opinions on how we highlight it, but I do think that highlighting it one way or the other would be good.

A tooltip on its own might be too hidden, but styling the data points sounds good.

Contributor Author

@rkg-mm rkg-mm Aug 11, 2024

@nscuro Would you be fine with creating a new issue to add this visibility later as a further enhancement, and for now relying on the log file for visibility in case someone wonders? We could also add a hint in the project edit UI that informs users of this when they change the logic.

Member

Logging is probably fine for an MVP implementation, yes. But ultimately we will really need some kind of indicator in the UI.

Contributor Author

@nscuro I found a way to show it in the metrics chart. The only issue is that the line becomes invisible when hovered, and I have no idea why. But I think this is fine as it is; it looks like it's intended :D

Contributor Author

@rkg-mm rkg-mm Sep 16, 2024

Ok found the reason for the visibility issue and fixed it. Also extended it to the other charts now. Looks like this:

[image: screenshot of the metrics charts with the collection logic change marker]

Hint: The blue line at 0 will no longer be there. I improved the backend so that the initial entry recorded after an update to the new version is no longer flagged as a collection logic change. My DB just has these entries already.
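
As a rough illustration of that rule, the flag could be derived by comparing each new metrics entry with the previous one and treating a missing previous entry as "no change". This is only a sketch under assumptions: the `MetricsEntry` record and the `next` helper are hypothetical, and how the backend actually detects the first entry is not shown in this thread.

```java
// Hypothetical sketch; MetricsEntry and next(...) are illustrative names only.
final class CollectionLogicChangeSketch {

    enum Logic { NONE, AGGREGATE_DIRECT_CHILDREN, HIGHEST_SEMVER_CHILD }

    // One data point in a project's metrics history.
    record MetricsEntry(Logic logic, boolean collectionLogicChanged) {}

    // previous is null when there is no earlier entry to compare against,
    // for example the first entry recorded after upgrading.
    static MetricsEntry next(MetricsEntry previous, Logic current) {
        boolean changed = previous != null && previous.logic() != current;
        return new MetricsEntry(current, changed);
    }
}
```

A chart could then style only the data points whose flag is set, so the very first entry of an existing project does not show up as a marker.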

@nscuro
Member

nscuro commented Apr 28, 2024

@rkg-mm

> But I have no clue about this unit test and will need some help here. The test explicitly tests the now not working condition, so I guess its important that it works, but I don't see a change that causes the failure.

I debugged the area in question multiple times now and still don't quite understand how this is happening. The parent is part of the response when I comment out these lines: https://github.com/DependencyTrack/dependency-track/pull/3258/files#diff-75cd29a0a085d84dff3cb4794242e0d1b400385f1acfae172346c3e6780629b0R582-R590

Something about those lines causes parent to be unloaded. But they're literally just calling a setter, so I don't quite get why. The test succeeds when adding project.getParent(); after the lines mentioned above. This kind of "force loading" is used in other areas of the code base. I don't like it, but in this case I think it's OK to do it.
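
For readers unfamiliar with the pattern, the following toy model shows why touching a getter before serialization can make a field reappear in the output. It is only an imitation of a lazily loaded field: the `Project` class, the `Supplier`-based loader, and the `toJson` method are made up for this sketch and do not reflect how DataNucleus or Jackson actually work internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Toy model only: it imitates a lazily loaded field, not DataNucleus itself.
final class ForceLoadSketch {

    static final class Project {
        private final String name;
        private final Supplier<Project> parentLoader; // stands in for a datastore fetch
        private Project parent;
        private boolean parentLoaded;

        Project(String name, Supplier<Project> parentLoader) {
            this.name = name;
            this.parentLoader = parentLoader;
        }

        // Calling the getter triggers the load; this is what "force loading" relies on.
        Project getParent() {
            if (!parentLoaded) {
                parent = parentLoader.get();
                parentLoaded = true;
            }
            return parent;
        }

        // Imitates a serializer that only sees fields that are already loaded.
        Map<String, Object> toJson() {
            Map<String, Object> json = new LinkedHashMap<>();
            json.put("name", name);
            if (parentLoaded && parent != null) {
                json.put("parent", parent.toJson());
            }
            return json;
        }
    }
}
```

In this model, serializing a project before `getParent()` has been called omits `parent`, while serializing it afterwards includes it, which mirrors the symptom seen in `patchProjectParentTest`.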

@rkg-mm
Contributor Author

rkg-mm commented May 5, 2024

Oh come on... I don't know what IntelliJ is doing, but whenever I rebase this branch on your master I end up with all the changes from master in my PR again. What am I doing wrong here? I don't get it...

edit: OK, fixed it again with a hacky workaround. I still don't know what I'm doing wrong here...

@rkg-mm rkg-mm force-pushed the 2041-introduce-collection-projects branch from efe2c29 to ecfc394 Compare May 5, 2024 20:53
@nscuro
Member

nscuro commented May 5, 2024

@rkg-mm How do you do the rebase? I usually run `git pull --rebase upstream master` on the command line and then use IntelliJ to resolve any conflicts in case there are any.

I never had this particular problem happen, but can imagine IntelliJ might do some unexpected things when merging/rebasing via UI.

@rkg-mm
Contributor Author

rkg-mm commented May 5, 2024

In the IntelliJ UI I select the upstream master and choose "Rebase (my branch) onto master", then push the changes to my remote. The strange thing already happens here: suddenly my own remote has changes that my local branch does not recognize. These either need to be merged (which I tried before and ended up with duplicate changes in the same files), or I select "rebase anyway", which avoids the double merge. But in both cases I still end up with all those unwanted changes again. Maybe I should try the command-line approach next time...

@rkg-mm
Contributor Author

rkg-mm commented May 6, 2024

> @rkg-mm
>
> > But I have no clue about this unit test and will need some help here. The test explicitly tests the now not working condition, so I guess its important that it works, but I don't see a change that causes the failure.
>
> I debugged the area in question multiple times now and still don't quite understand how this is happening. The parent is part of the response when I comment out these lines: https://github.com/DependencyTrack/dependency-track/pull/3258/files#diff-75cd29a0a085d84dff3cb4794242e0d1b400385f1acfae172346c3e6780629b0R582-R590
>
> Something about those lines causes parent to be unloaded. But they're literally just calling a setter, so I don't quite get why. The test succeeds when adding project.getParent(); after the lines mentioned above. This kind of "force loading" is used in other areas of the code base. I don't like it, but in this case I think it's OK to do it.

Fixed by your suggestion


Coverage summary from Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| +0.10% (target: -1.00%) | 86.04% (target: 70.00%) |

Coverage variation details

| | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (ff22d4f) | 22069 | 16756 | 75.93% |
| Head commit (0cfdd8c) | 22259 (+190) | 16923 (+167) | 76.03% (+0.10%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

| | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#3258) | 222 | 191 | 86.04% |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

Codacy will stop sending the deprecated coverage status from June 5th, 2024.

@nscuro
Member

nscuro commented May 7, 2024

@rkg-mm We have to schedule this for 4.12 to allow us sufficient time to review and test.

@nscuro nscuro modified the milestones: 4.11, 4.12 May 7, 2024
@rkg-mm
Contributor Author

rkg-mm commented May 7, 2024

> @rkg-mm We have to schedule this for 4.12 to allow us for sufficient time to review and test.

No problem, I understand. Let's just hope the release cycle is a bit shorter this time :D

@cheapshot2000

> > @rkg-mm We have to schedule this for 4.12 to allow us for sufficient time to review and test.
>
> no problem, I understand. Lets just hope the release cycle is a bit shorter this time :D

Is there a set release cadence, or some other logic for determining when releases are completed?

@nscuro
Member

nscuro commented May 7, 2024

> Is there a set release cadence? or other logic for determining when releases are completed?

There is not at the moment. We originally planned to do monthly releases, but ended up stretching that (by far...) yet again. A problem is that the core team has certain items they want (and have to work on) in a given release, while said core team also needs to review and test community contributions. It's a luxury problem, but it means that more time is needed to meet both our own, and the community's expectations.

This particular PR was planned for inclusion in 4.11 for a long time, but it also touches on the very core of how DT behaves, in turn requiring more review and testing than other, smaller PRs. And of course contributors also don't always have time to answer our annoying requests. :)

We're still in the process of figuring out how to best plan and size releases. Going forward, we're labelling issues with T-shirt sizes, according to their effort. Aim being to limit the number of high-effort items per release, enabling faster cycles.

@msymons
Member

msymons commented Aug 8, 2024

Re-assigning to 4.13 milestone in order to reduce the pressure on contributors to get this finished whilst allowing v4.12.0 to be released quicker.

@msymons msymons modified the milestones: 4.12, 4.13 Aug 8, 2024
@rkg-mm rkg-mm force-pushed the 2041-introduce-collection-projects branch from 0cfdd8c to cb0b75b Compare September 14, 2024 15:30

codacy-production bot commented Sep 14, 2024

Coverage summary from Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| +0.09% (target: -1.00%) | 88.74% (target: 70.00%) |

Coverage variation details

| | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (6f7c49c) | 22460 | 17770 | 79.12% |
| Head commit (2bf56f0) | 22649 (+189) | 17941 (+171) | 79.21% (+0.09%) |

Diff coverage details

| | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#3258) | 222 | 197 | 88.74% |

Codacy stopped sending the deprecated coverage status on June 5th, 2024.

rkg-mm added 2 commits October 3, 2024 12:09
…nt in UI to visualize why drastic changes in numbers occurred.

Signed-off-by: Ralf King <[email protected]>
…change in UI for first new entry.

Signed-off-by: Ralf King <[email protected]>
@rkg-mm rkg-mm force-pushed the 2041-introduce-collection-projects branch from 5acc770 to 5fc4375 Compare October 3, 2024 10:09
@netomi

netomi commented Oct 3, 2024

So the behavior is just fine as it is right now. It just feels odd that, when modeling your projects with Dependency-Track, you are forced to enter some data for a project whose only purpose is to act as a container for nested projects.

Right now you have to select at least a classifier, AFAICT, so in our case (see the screenshot) we have selected Application, though it is not an application but a project that consists of various software products. When selecting that project, you should see all vulnerabilities of the child projects, as is done right now.

This was discussed yesterday in the community call. It's a really small thing, but people asked me to raise it in this ticket as well.

rkg-mm added 2 commits October 4, 2024 14:06
…_VERSION_CHILDREN to match other 2 names

* Add missing test for AGGREGATE_LATEST_VERSION_CHILDREN logic
* Update outdated tests with additional property

Signed-off-by: Ralf King <[email protected]>
@rkg-mm
Contributor Author

rkg-mm commented Oct 4, 2024

@nscuro
From my point of view this PR is good to go now. It would be great if you could review it before I have a lot of conflicts to fix again ;-)
Are you fine with making the classifier optional (always, or only in the case of collections?) as requested by @netomi? I missed the community meeting and the recording is not available yet, so I cannot check what was said there.

@rkg-mm
Contributor Author

rkg-mm commented Oct 6, 2024

> @nscuro From my point of view this PR is good to go now, would be great if you could review this before I have a lot of conflicts to fix again ;-) Are you fine with making the classifier optional (always or only in case of collections?) as requested by @netomi ? I missed the community meeting and the recording is not there yet, so I cannot check what was said there.

I have now added a NONE classifier after watching the community meeting on YouTube.
I also noticed that the frontend does not yet support all classifiers supported by the backend, so I added support for the missing classifiers in the frontend as well.


codacy-production bot commented Oct 6, 2024

Coverage summary from Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| +0.09% (target: -1.00%) | 88.34% (target: 70.00%) |

Coverage variation details

| | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (69cd540) | 22588 | 17881 | 79.16% |
| Head commit (f9769d5) | 22778 (+190) | 18051 (+170) | 79.25% (+0.09%) |

Diff coverage details

| | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#3258) | 223 | 197 | 88.34% |

@netomi

netomi commented Oct 15, 2024

fyi: a package of this PR is available at https://github.com/users/netomi/packages/container/dtrack-apiserver/289607235?tag=4.13.0-SNAPSHOT

ghcr.io/netomi/dtrack-apiserver:4.13.0-SNAPSHOT

@m4nch0t

m4nch0t commented Nov 18, 2024

Hello,
Thanks for the work. It works like a charm for metrics and it's a great feature!

It raises two questions/thoughts for me:

  • For me, tags don't really need to be limited to direct children. We could have a project "images" that contains all images used by different projects/applications, and only create a meta view of a project/application with all its dependencies based on a tag. To avoid duplicated data?
  • Is limiting the propagation of metrics to one level done on purpose?

@rkg-mm
Contributor Author

rkg-mm commented Nov 18, 2024

> • tags doesn't really need to be in direct children for me. We could have a project images who contains all images used by different projects/applications and only create a meta view of an project/application with all dependencies based on tag. To avoid duplicated data?

Looping through the complete child-subtrees (and on updates of a project through all parents to see if there is any metrics to update) would have a significant performance effect in large instances, which is why I limited possible rules to direct children only.

> • limit the propagation of metrics to one level is made on purpose?

Unless I made a mistake (but I think I tested this), the propagation should go to the direct parent only (for the reasons above). But once the parent's metrics change, this will trigger an update of the parent's parent, if that is also a calculated project, and so on, so you can build up whole calculated trees this way. You can see the logic in this screenshot: https://raw.githubusercontent.com/DependencyTrack/dependency-track/f14fc9a005743f74b5a8ae5812792eac3b956cc5/docs/images/screenshots/collection-projects-structure.png
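
The step-by-step propagation described here could be sketched roughly as follows. This is a simplified model under assumptions, not the actual task code: `Node`, `onMetricsChanged`, the single `vulnerabilities` metric, and the use of a plain sum (as in AGGREGATE_DIRECT_CHILDREN) are all chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; not Dependency-Track's actual classes or task code.
final class PropagationSketch {

    static final class Node {
        final String name;
        final boolean collection; // true if metrics are calculated from children
        final Node parent;
        final List<Node> children = new ArrayList<>();
        int vulnerabilities;

        Node(String name, boolean collection, Node parent, int vulnerabilities) {
            this.name = name;
            this.collection = collection;
            this.parent = parent;
            this.vulnerabilities = vulnerabilities;
            if (parent != null) {
                parent.children.add(this);
            }
        }
    }

    // Returns the names of the collection projects that were recalculated, bottom-up.
    static List<String> onMetricsChanged(Node changed) {
        List<String> recalculated = new ArrayList<>();
        Node current = changed.parent;
        while (current != null && current.collection) {
            int before = current.vulnerabilities;
            // Only direct children are read; no subtree traversal.
            current.vulnerabilities = current.children.stream()
                    .mapToInt(child -> child.vulnerabilities)
                    .sum();
            recalculated.add(current.name);
            if (current.vulnerabilities == before) {
                break; // unchanged, so nothing to propagate further up
            }
            current = current.parent;
        }
        return recalculated;
    }
}
```

Each step reads only direct children, so the cost per update stays proportional to the number of siblings and ancestors rather than the size of the whole subtree, which matches the performance reasoning given above.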

@nscuro nscuro added the enhancement New feature or request label Dec 4, 2024
@nscuro
Member

nscuro commented Dec 7, 2024

@rkg-mm I raised rkg-mm#1 to resolve the merge conflicts, please have a look.

rkg-mm and others added 7 commits December 7, 2024 22:48
Resolve merge conflicts in collection projects PR
Fields are no longer unloaded when a transaction commits (`DataNucleus.RetainValues` is enabled globally), as of stevespringett/Alpine#552.

Signed-off-by: nscuro <[email protected]>
The functionality has been implemented and a test was added already.

Signed-off-by: nscuro <[email protected]>
Store `ProjectCollectionLogic.NONE` as `NULL`. Reason being that the vast majority of projects will not have a collection logic, and storing `NULL` is cheaper than storing the string `NONE` over and over again. The code and REST API will still treat `NULL` as `NONE`, though.

Remove the index on `Project.collectionLogic`. The column has a low cardinality and is only ever queried in `PortfolioMetricsUpdateTask#fetchNextActiveProjectsBatch`. Since this method paginates using the `id` column, the index on `collectionLogic` is never used.

Remove the redundant upgrade item since `collectionLogic` columns are expected to remain `NULL` when a project is not a collection.

Make `ProjectMetrics.collectionLogicChanged` non-nullable and default to `false`. Prevents tri-state logic.

Signed-off-by: nscuro <[email protected]>
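
A minimal sketch of the `NULL`-as-`NONE` convention from the commit message above, assuming a plain getter/setter pair does the mapping. The `Project` class and `storedValue` helper here are hypothetical stand-ins, and the real persistence mapping may be implemented differently.

```java
// Hypothetical sketch; only ProjectCollectionLogic.NONE and collectionLogic are names from the commit message.
final class NullAsNoneSketch {

    enum ProjectCollectionLogic { NONE, AGGREGATE_DIRECT_CHILDREN, HIGHEST_SEMVER_CHILD }

    static final class Project {
        // Stands in for the persisted column; null means "not a collection".
        private ProjectCollectionLogic collectionLogic;

        // Callers and the REST API never see null, only NONE.
        ProjectCollectionLogic getCollectionLogic() {
            return collectionLogic == null ? ProjectCollectionLogic.NONE : collectionLogic;
        }

        // NONE is normalized to null so the column stays empty for ordinary projects.
        void setCollectionLogic(ProjectCollectionLogic logic) {
            this.collectionLogic = (logic == ProjectCollectionLogic.NONE) ? null : logic;
        }

        // What would be written to the column in this sketch.
        String storedValue() {
            return collectionLogic == null ? null : collectionLogic.name();
        }
    }
}
```

Because projects that are not collections never store a value, no upgrade step is needed to backfill existing rows in this model.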
Avoid repetitive queries for individual fields accessed during project metrics updates.

Signed-off-by: nscuro <[email protected]>
@nscuro
Member

nscuro commented Dec 8, 2024

@rkg-mm Had a few more suggestions / comments and just went ahead and raised a PR for them instead of just commenting: rkg-mm#2

Please have a look and let me know if the proposed changes make sense.

Collection projects suggestions and cleanup

Coverage summary from Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| +0.12% (target: -1.00%) | 89.12% (target: 70.00%) |

Coverage variation details

| | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (69cd540) | 22588 | 17881 | 79.16% |
| Head commit (b840f33) | 22790 (+202) | 18069 (+188) | 79.28% (+0.12%) |

Diff coverage details

| | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#3258) | 239 | 213 | 89.12% |

@nscuro nscuro merged commit 5501ba7 into DependencyTrack:master Dec 10, 2024
11 checks passed
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Parent shows not the sum of the childs
Introduce "collection" projects for better usage of hierarchical view
7 participants