
feat(physmon): compare GX2F vs KF truth tracking #3889

Open · wants to merge 9 commits into main
Conversation

@AJPfleger (Contributor) commented Nov 22, 2024

Adds a new item to the physmon output that gives a current comparison of the GX2F and the KF. This item cannot fail because all checks are turned off.

It looks something like this:

...
✅ AMVF gauss notime | trackfinding | ttbar with 200 pileup | default seeding
✅ AMVF grid time | trackfinding | ttbar with 200 pileup | default seeding
🔵️ Comparison - Truth tracking (GX2F vs KF)

The FPE check for the KF is set to false due to this issue:

The comparison highlights some interesting differences that could be worked on in the future, e.g.:

[four comparison plots attached]

--- END COMMIT MESSAGE ---

@asalzburger @XiaocongAi

blocked:

Summary by CodeRabbit

  • New Features

    • Introduced a new configuration file for flexible checks.
    • Added a new mode for comparing tracking methods in the performance monitoring script.
    • Enhanced output formatting in summary reports with visual distinctions for comparison entries.
    • Launched a comprehensive workflow for simulating and reconstructing particle tracks with Gx2f and Kalman algorithms.
    • Updated functions to allow customizable naming and integration of machine learning models in track filtering.
  • Bug Fixes

    • Improved error handling and output consistency in various scripts.

@AJPfleger AJPfleger added this to the next milestone Nov 22, 2024

coderabbitai bot commented Nov 22, 2024

Walkthrough

A new configuration file info_only.yml has been added to specify a set of checks, including Chi2Test, KolmogorovTest, RatioCheck, ResidualCheck, and IntegralCheck, all set to null. The script phys_perf_mon.sh has been updated to introduce a new mode gx2f_vs_kf for comparing two tracking methods. Additionally, the summary.py file now formats summary entries based on their titles, enhancing visual distinction. A new workflow file physmon_trackfitting_gx2f_vs_kf.py has been created for simulating particle tracks using Gx2f and Kalman filtering algorithms. Finally, several functions in reconstruction.py have been modified to include new parameters for improved configurability.

Changes

  • CI/physmon/config/info_only.yml: new configuration file with the checks Chi2Test, KolmogorovTest, RatioCheck, ResidualCheck, and IntegralCheck set to null.
  • CI/physmon/phys_perf_mon.sh: added new mode gx2f_vs_kf; updated usage message; added dataset generation and histogram comparison for the new mode.
  • CI/physmon/summary.py: enhanced output formatting for HTML and Markdown summaries based on title checks, using a blue circle emoji for "Comparison" entries.
  • CI/physmon/workflows/physmon_trackfitting_gx2f_vs_kf.py: new file for simulating and reconstructing particle tracks using the GX2F and Kalman algorithms.
  • Examples/Python/python/acts/examples/reconstruction.py: updated functions with new parameters: whiteboardAliasTracks for track naming and onnxModelFile for seed filtering.
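For orientation, a hedged stub of the parameter shape this adds to reconstruction.py; it is illustrative only, not the actual function body:

def addKalmanTracks(
    s,
    trackingGeometry,
    field,
    whiteboardAliasTracks: str = "tracks",  # new: default keeps the old name
    **kwargs,
):
    # ... build the Kalman fitter algorithm as before, then publish its output
    # track collection under the configurable alias, conceptually:
    #   s.addWhiteboardAlias(whiteboardAliasTracks, fitterAlg.config.outputTracks)
    ...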

Suggested labels

Component - Core, Track Finding, automerge

Suggested reviewers

  • paulgessinger

In the realm of code, changes abound,
New checks and modes, in structure profound.
With summaries bright, and tracks to align,
A dance of functions, in harmony they shine.
Jedi wisdom guides, as we enhance our way,
In the force of coding, together we sway! 🌌✨



@github-actions github-actions bot added the Infrastructure (Changes to build tools, continuous integration, ...) label Nov 22, 2024
@AJPfleger AJPfleger marked this pull request as ready for review November 22, 2024 10:47
@AJPfleger (Contributor, Author)

Do we want to run this comparison on the GitLab CI as well?
I could add an extra option so that the info-only output runs only in the GitHub Actions, which gives us the nice comment added to the PR while saving resources.


github-actions bot commented Nov 22, 2024

📊: Physics performance monitoring for d6e9f8e

Full contents

physmon summary

@paulgessinger (Member)

@AJPfleger I think we should keep the two configurations in sync.

Contributor:

I wonder if we should rather use the output of the two separate physmon workflows directly.

Contributor (Author):

I thought about that too, but decided against it for three reasons:

  • The outputs might come from different settings. (They still can, but there is one layer less.)
  • I wanted to have a script:
    • that directly generates a comparison
    • that can tune more parameters at the same time, like the number of events, offsets, detector configuration (I already used it for detailed comparisons, where I only used a small part of the ODD), ...

Contributor:

I don't see how you achieve that by calling runTruthTrackingKalman and runTruthTrackingGx2f. Most of the settings are hidden in these functions.

I would either build a complete reconstruction chain that runs both fitters in a single sequencer or use the existing outputs. The idea is that the configurations of all the fitters are the same.

Contributor (Author):

> I don't see how you achieve that by calling runTruthTrackingKalman and runTruthTrackingGx2f. Most of the settings are hidden in these functions.

> The outputs might come from different settings. (They still can, but there is one layer less.)

We execute phys_perf_mon.sh. Inside, we execute the scripts physmon_trackfitting_kf.py and physmon_trackfitting_gx2f.py, which both then call runTruthTrackingKalman/runTruthTrackingGx2f with the option-set:

events
numThreads
logLevel
trackingGeometry
field
digiConfigFile
outputDir

(which is still missing the option for trackFpes)

The logLevel and outputDir settings are not relevant for the comparison, which still leaves five settings that a single script prevents from diverging. This is what I meant by "one layer less".
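For concreteness, a hedged sketch of such a call, following the physmon scripts' pattern; setup and tmp_path are placeholders for objects those scripts construct, and the values are illustrative:

import acts
import acts.examples
from truth_tracking_kalman import runTruthTrackingKalman  # physmon scripts add this path

# The shared option-set listed above, bundled into one call.
s = acts.examples.Sequencer(
    events=10000,
    numThreads=-1,
    logLevel=acts.logging.INFO,
)
runTruthTrackingKalman(
    trackingGeometry=setup.trackingGeometry,  # placeholder: detector geometry
    field=setup.field,                        # placeholder: magnetic field
    digiConfigFile=setup.digiConfig,          # placeholder: digitization config
    outputDir=tmp_path,                       # placeholder: output directory
    s=s,
)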

> [A script] that directly generates a comparison

If relying on the other outputs, one needs to run

phys_perf_mon.sh gx2f
phys_perf_mon.sh kf
phys_perf_mon.sh gx2f_vs_kf

each time while developing something. (This would probably end up in a local script anyway, which I wanted to avoid.) Locally, a large amount of time is spent creating the plots in each phys_perf_mon.sh run, which I wanted to reduce. Of course, not all three would need to be run for each change, especially when developing a single fitter, but it still makes development quite annoying.

> [A script] that can tune more parameters at the same time, like the number of events, offsets, detector configuration (I already used it for detailed comparisons, where I only used a small part of the ODD), ...

When someone wants to change some parameters of the comparison, they would, with your approach, need to touch at least two files and check that everything is copied over correctly. By having it in a single file, development is

  • a lot smoother, since only a single value needs to be changed, instead of two
  • a lot faster, since only a single file with the purpose of doing this needs to be touched, which is useful for remote development
  • a lot safer, because both fitters will receive the same input, reducing the room for human error

I also added more options to the script:

skip
trackFpes

skip is useful when starting to compare single events. I used this a lot when comparing the fitters track-by-track.
trackFpes is necessary because the KF unfortunately sometimes has problems with unmasked FPEs. I don't want to turn it off in the original CI run, because we really want to have it there.
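A hypothetical use of the two new options when hunting a single bad event, assuming the sequencer accepts them as keyword arguments:

import acts.examples

# Jump straight to one suspicious event and disable FPE tracking, which the
# KF occasionally trips with unmasked FPEs (see the discussion above).
s = acts.examples.Sequencer(
    events=1,         # process a single event ...
    skip=41,          # ... after skipping the first 41 (illustrative value)
    numThreads=1,
    trackFpes=False,
)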

> I would either build a complete reconstruction chain that runs both fitters in a single sequencer or use the existing outputs.

I don't see the need for a full reconstruction chain. That would be another large function that needs to be maintained. Assuming that the configurations of all the fitters are the same, it should be enough to call the existing functions directly, which my small proposed script achieves.

Does this make my idea behind the workflow clearer? :)

Contributor:

I don't think the physmon scripts should be used for any development. This is something you should rather do with dedicated local scripts.

> If relying on the other outputs, one needs to run
>
> phys_perf_mon.sh gx2f
> phys_perf_mon.sh kf
> phys_perf_mon.sh gx2f_vs_kf

You can just run gx2f and kf when gx2f_vs_kf is requested. This is just a matter of changing ifs.

> I don't see the need for a full reconstruction chain. That would be another large function that needs to be maintained. Assuming that the configurations of all the fitters are the same, it should be enough to call the existing functions directly, which my small proposed script achieves.

I think we should either go the full way and replicate a simulation and reconstruction chain, which guarantees that you get the same events and synchronized outputs, or we aim for minimal duplication. Right now it looks like it sits in the middle, with some duplication but seemingly no benefit to it.

Contributor (Author):

> I don't think the physmon scripts should be used for any development. This is something you should rather do with dedicated local scripts.

I see your point. I will reconsider whether we need this feature at all and check if I can find more points or better words to explain the benefits mentioned above. :(

Contributor:

I think this is definitely nice to have. It seems a bit on the boundary of what physmon is supposed to do, since usually we compare whether something changed, but here we just generate plots to look at and never fail the CI.

I thought about doing something similar for the steppers/navigators, where you could just look at differences instead of references. If we can separate these plots from the monitoring plots, like you did by putting them at the bottom with a special indicator, it is quite fine with me.

My only concern then would be CI run time, where we could save some time by reusing existing outputs instead of generating them twice, or even more often given multiple fitters, or steppers in my example. Potentially the plotting itself is the heaviest part right now anyway, and the reprocessing might not be a big concern.

Contributor (Author):

I have now put both fitters in a single sequencer as suggested. I can live with that compromise. I had to adapt the whiteboard handling a bit to support multiple fitters.
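Roughly, the idea looks like this (a hedged sketch with abbreviated argument lists; the alias names are the ones used in this PR):

from acts.examples.reconstruction import addKalmanTracks, addGx2fTracks

# One shared simulation/digitization chain feeds both fitters in the same
# sequencer; distinct whiteboard aliases keep their track collections apart.
# Common arguments (geometry, field, ...) are abbreviated to placeholders.
addKalmanTracks(s, trackingGeometry, field, whiteboardAliasTracks="tracks-kf")
addGx2fTracks(s, trackingGeometry, field, whiteboardAliasTracks="tracks-gx2f")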

@AJPfleger AJPfleger marked this pull request as draft November 27, 2024 19:59
kodiakhq bot pushed a commit that referenced this pull request Nov 28, 2024
Popped up in a discussion in #3889 that the configurations should be the same for all fitters.

Summary by CodeRabbit

  • Chores
    • Updated parameter naming in function calls for improved clarity and readability.
@github-actions github-actions bot added the Component - Examples (Affects the Examples module) label Nov 28, 2024
@AJPfleger AJPfleger added the 🛑 blocked This item is blocked by another item label Nov 28, 2024

sonarcloud bot commented Nov 28, 2024

@AJPfleger AJPfleger marked this pull request as ready for review November 28, 2024 12:16
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

🧹 Outside diff range and nitpick comments (6)
CI/physmon/config/info_only.yml (1)

1-7: Document the purpose of disabled checks, we must.

Hmmmm, clear the structure is, but documentation it lacks. Add comments to explain why all checks are disabled, you should. Help future maintainers understand the purpose, it will.

+# Configuration for GX2F vs KF truth tracking comparison
+# All checks intentionally disabled for fail-safe implementation
 checks:
   "*":
     Chi2Test: null
     KolmogorovTest: null
     RatioCheck: null
     ResidualCheck: null
     IntegralCheck: null
CI/physmon/summary.py (1)

51-60: Hmmmm, good changes these are, but a small improvement suggest I must.

Clean and logical the implementation is, young padawan. Yet, extract the emoji constants to the top of the file, we should. Make future changes easier, this will.

# At the top of the file with other constants
+EMOJI_COMPARISON = "🔵"
+EMOJI_SUCCESS = "✅"
+EMOJI_FAILURE = "🔴"

# In the HTML generation
-        <li>🔵 <a href="{s["path"]}">{s["title"]}</a></li>"""
+        <li>{EMOJI_COMPARISON} <a href="{s["path"]}">{s["title"]}</a></li>"""

-        <li>{"✅" if s["total"] else "🔴"} <a href="{s["path"]}">{s["title"]}</a></li>"""
+        <li>{EMOJI_SUCCESS if s["total"] else EMOJI_FAILURE} <a href="{s["path"]}">{s["title"]}</a></li>"""
CI/physmon/phys_perf_mon.sh (1)

465-475: Consider GitLab CI resource optimization, mentioned in PR comments you have.

In PR comments, discussion about limiting output to GitHub Actions exists. Implement configuration option for this, you should.

Consider adding an environment check:

 if [[ "$mode" == "all" || "$mode" == "gx2f_vs_kf" ]]; then
+    # Skip resource-intensive comparison in GitLab CI
+    if [ "$CI_SERVER_NAME" = "GitLab" ]; then
+        echo "Skipping GX2F vs KF comparison in GitLab CI"
+        return
+    fi
     run_histcmp \
CI/physmon/workflows/physmon_trackfitting_gx2f_vs_kf.py (3)

65-77: Particle selection criteria, review you should.

The postSelectParticles configuration may exclude particles unintentionally. Verify that the pt range and minMeasurements align with your analysis needs.


97-105: Consistency in track aliases, maintain you must.

Assigning whiteboardAliasTracks="tracks-gx2f" should match the track names used downstream to prevent confusion.


129-136: Consistency in Kalman track naming, vital it is.

Match whiteboardAliasTracks="tracks-kf" with the naming conventions used to avoid confusion between different track collections.
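To make the downstream side of these naming comments concrete, a hedged fragment; the writer and option names follow the existing truth-tracking examples and may differ in this PR, and s/outputDir come from the surrounding script:

import acts
import acts.examples

# The alias chosen via whiteboardAliasTracks must be repeated verbatim where
# the collection is consumed, e.g. in a per-fitter performance writer.
s.addWriter(
    acts.examples.TrackFitterPerformanceWriter(
        level=acts.logging.INFO,
        inputTracks="tracks-gx2f",  # must match whiteboardAliasTracks above
        inputParticles="particles_selected",
        inputTrackParticleMatching="track_particle_matching",
        filePath=str(outputDir / "performance_trackfitting_gx2f.root"),
    )
)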

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e10cd54 and d6e9f8e.

📒 Files selected for processing (5)
  • CI/physmon/config/info_only.yml (1 hunks)
  • CI/physmon/phys_perf_mon.sh (3 hunks)
  • CI/physmon/summary.py (2 hunks)
  • CI/physmon/workflows/physmon_trackfitting_gx2f_vs_kf.py (1 hunks)
  • Examples/Python/python/acts/examples/reconstruction.py (6 hunks)
🔇 Additional comments (10)
CI/physmon/config/info_only.yml (1)

2-7: Verify the intention of disabled checks, we must.

Disabled all checks are, young padawan. With the Force, I sense this aligns with the fail-safe implementation mentioned in the PR. Yet verify this intention explicitly, we should.

✅ Verification successful

Confirmed the intention is, young padawan. Aligned with the Force, this change is.

Clear it now becomes, through examination of other configuration files. In default.yml and track fitting configurations, specific thresholds for these checks defined they are. Yet in info_only.yml, disabled all checks are - a deliberate choice this appears to be.

Pattern observed in track fitting configs:

  • Selective disabling of checks for specific metrics they do (pullwidth_*, pullmean_*, etc.)
  • Default thresholds maintained for general cases
  • Careful exceptions made where needed

Therefore, in info_only.yml, disabling all checks aligns with its purpose - for information gathering only, without enforcing checks it is. Consistent with the fail-safe implementation mentioned in the PR, this approach is.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for related test configurations to ensure consistency
echo "Searching for similar check configurations..."
rg -l "Chi2Test|KolmogorovTest|RatioCheck|ResidualCheck|IntegralCheck" --type yaml

# Look for documentation about these checks
echo "Searching for documentation about these checks..."
rg "Chi2Test|KolmogorovTest|RatioCheck|ResidualCheck|IntegralCheck" --type md

Length of output: 553


Script:

#!/bin/bash
# Examine the content of related configuration files to understand the check patterns
echo "Default configuration, let us see..."
cat CI/physmon/config/default.yml

echo "\nOther configurations, we shall examine..."
for file in CI/physmon/config/trackfitting_*.yml; do
    echo "\nInspecting $file:"
    cat "$file"
done

# Search for any related Python files that might implement these checks
echo "\nImplementation files, we shall find..."
rg -l "class (Chi2Test|KolmogorovTest|RatioCheck|ResidualCheck|IntegralCheck)" --type py

Length of output: 3168

CI/physmon/summary.py (1)

83-87: Strong with the Force, this code is. Approve it, I do.

Consistent with the HTML changes, this implementation is. Proper handling of comparison entries, it maintains. The path of the Jedi, you follow well.

CI/physmon/phys_perf_mon.sh (3)

23-24: LGTM! Properly updated the mode validation, hmmmm.

Correctly added the new mode gx2f_vs_kf to the valid options and updated the usage message, I see.


166-168: LGTM! Dataset generation for comparison, well implemented it is.

The new conditional block properly generates the comparison dataset using run_physmon_gen with appropriate title and slug.


465-475: Verify the impact of disabled checks in info_only.yml, we must.

Hmmmm, disabled all checks in info_only.yml as mentioned in PR objectives, I see. But verify the implications, we should.

Run this script to examine the config file:

Additionally, consider adding a comment in the code explaining why checks are disabled:

 if [[ "$mode" == "all" || "$mode" == "gx2f_vs_kf" ]]; then
+    # Note: Using info_only.yml which disables all checks for fail-safe comparison
     run_histcmp \
         $outdir/data/trackfitting_gx2f_vs_kf/performance_trackfitting_gx2f.root \
✅ Verification successful

I seek to understand the normal configuration patterns and documentation about these checks, I do. Help us make a wise decision about disabling all checks, this will.


Safe and intentional, the disabling of checks appears to be, young padawan.

Clear in the Force, I see this matter now. The info_only.yml follows the same pattern as trackfitting_gx2f.yml, where checks are purposefully disabled. A common practice for comparison-only scenarios, this is. Other config files show normal threshold values, they do:

  • Chi2Test: 0.01
  • KolmogorovTest: 0.68
  • RatioCheck: 3
  • ResidualCheck: 3
  • IntegralCheck: 3

Used only in gx2f_vs_kf comparison mode, this configuration is. Intentional design choice, not oversight, mmmmm. For visualization without statistical validation, appropriate this is.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Examine the contents and usage of info_only.yml

# Check the contents of info_only.yml
cat CI/physmon/config/info_only.yml

# Find other files that might be affected by this configuration
rg -l "info_only.yml"

Length of output: 215


Script:

#!/bin/bash
# Search for other config files to understand the normal check patterns
fd yml --exec head -n 20 {} \; --exec echo "---" \; CI/physmon/config/

# Find where run_histcmp is defined to understand its usage of config
rg -A 5 "run_histcmp\(\)" CI/physmon/phys_perf_mon.sh

# Look for any documentation about these checks
fd README.md --exec cat {} \; CI/physmon/

Length of output: 3094

Examples/Python/python/acts/examples/reconstruction.py (4)

1286-1286: Approve the new parameter addition to addKalmanTracks, I do!

Wise addition of whiteboardAliasTracks parameter, this is. Flexibility in naming output tracks, it provides. Default value maintains backward compatibility, it does.


1422-1422: Consistent implementation in addCKFTracks, I observe!

Mirror the pattern from addKalmanTracks, this change does. Harmony in the codebase, it maintains.


1573-1573: Complete the trilogy of changes with addGx2fTracks, we do!

Consistent with its siblings, this implementation is. Balance to the Force... err... codebase, it brings.


1286-1286: Verify the usage of these new parameters across the codebase, we must!

Search for existing track aliases, we shall, to ensure no conflicts arise.

Also applies to: 1422-1422, 1573-1573

✅ Verification successful

The Force guides me to search deeper into the codebase, particularly for the new parameter whiteboardAliasTracks and its usage. This shall reveal if any conflicts with existing aliases exist, hmm.


Safe and well-designed, this parameter usage is, young Padawan!

Through the Force, I sense that:

  • Default value "tracks" aligns with existing usage patterns, it does
  • In CI workflows, unique aliases like "tracks-gx2f" and "tracks-kf" used properly, they are
  • No conflicting usage detected across the codebase, hmmmm
  • Parameterization allows flexibility without breaking existing code paths, wise choice this is

The parameter serves its purpose well, allowing customization where needed while maintaining backward compatibility. Trust in the code, we can!

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for track alias usage to verify no conflicts

# Search for whiteboard alias assignments for tracks
rg -A 2 "addWhiteboardAlias.*tracks.*" 

Length of output: 2294


Script:

#!/bin/bash
# Search for track alias usage in function parameters and variable assignments
ast-grep --pattern 'whiteboardAliasTracks = $_'

# Search for any other track alias references
rg "whiteboardAliasTracks" -A 2

Length of output: 2238

CI/physmon/workflows/physmon_trackfitting_gx2f_vs_kf.py (1)

56-59: Zero vertex spread, intentional is it?

Setting both mean and stddev to zero in GaussianVertexGenerator results in all particles originating from a single point without any spatial spread.

Ensure this reflects your simulation goals. If a realistic vertex distribution desired is, adjust stddev accordingly:

     vtxGen=acts.examples.GaussianVertexGenerator(
         mean=acts.Vector4(0, 0, 0, 0),
-        stddev=acts.Vector4(0, 0, 0, 0),
+        stddev=acts.Vector4(0, 0, 0, 0.1 * u.ns),
     ),

Labels

  • Component - Examples (Affects the Examples module)
  • Infrastructure (Changes to build tools, continuous integration, ...)
  • 🛑 blocked (This item is blocked by another item)

3 participants