
NLL Terms #8

Merged: denehoffman merged 32 commits into main from nll-terms on Oct 30, 2024
Conversation

denehoffman (Owner)
Adds a feature that allows NLL-like terms to be added and multiplied together. This can be used to fit the same model over multiple datasets, or to fit multiple models over multiple datasets simultaneously. The interface is very similar to the one that allows Amplitudes to be registered and formed into Expressions. It should also let users introduce their own likelihood terms, such as a LASSO penalty.

Todos:

  1. A Python-side interface for making LikelihoodTerms.
  2. LASSO implementation
  3. Constants in LikelihoodExpressions (this should be easy)

Also changes the K-matrix benchmark to collect more samples
This allows `Evaluator` and `NLL` to be safely cloned without duplicating precomputed data, which is just really nice in general but also lets us copy them around as likelihood terms. The way it's currently implemented doesn't seem to change performance.
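A minimal sketch of how this kind of cheap cloning typically works in Rust, assuming the shared state lives behind an `Arc` (the field names here are hypothetical, not laddu's actual layout):

```rust
use std::sync::Arc;

// Cloning only bumps a reference count, so the precomputed cache is shared,
// never duplicated. (Hypothetical struct; laddu's actual types may differ.)
#[derive(Clone)]
struct Evaluator {
    precomputed: Arc<Vec<f64>>, // read-only precomputed data
}

fn main() {
    let eval = Evaluator { precomputed: Arc::new(vec![1.0, 2.0, 3.0]) };
    let copy = eval.clone(); // O(1): no data is copied
    assert!(Arc::ptr_eq(&eval.precomputed, &copy.precomputed)); // same allocation
}
```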
7000 evals takes too long, let's just shoot for 30 seconds
`LikelihoodTerm`s can be registered in a `LikelihoodManager` just like `Amplitude`s get registered in `Manager`s. Registration returns a `LikelihoodID`, which behaves the same way as an `AmplitudeID` and can be added to or multiplied with other `LikelihoodExpression`s. These `LikelihoodExpression`s can then be loaded by the `LikelihoodManager` to create a `LikelihoodEvaluator`, which can be minimized in the same way as an `NLL`. Because this is an extended implementation, we also keep the minimization method on `NLL` itself, since not all `LikelihoodTerm`s *should* be minimized.
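As a rough usage sketch (the constructors and the method names `register`, `load`, and `minimize` are assumed for illustration and may not match the real API):

```rust
// Hypothetical sketch of the registration interface described above.
let mut manager = LikelihoodManager::default();

// Register two NLL-like terms, e.g. the same model fit to two datasets:
let id_a: LikelihoodID = manager.register(nll_a); // nll_a: some LikelihoodTerm
let id_b: LikelihoodID = manager.register(nll_b);

// LikelihoodIDs combine into LikelihoodExpressions just like AmplitudeIDs:
let expression = &id_a + &id_b; // or &id_a * &id_b

// Load the expression to get an evaluator that minimizes like an NLL:
let evaluator = manager.load(&expression);
let result = evaluator.minimize(&initial_parameters);
```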
I honestly don't think the measurement time does anything, maybe it takes the maximum of either of these settings?
codecov bot commented Oct 23, 2024

Codecov Report

Attention: Patch coverage is 0% with 1070 lines in your changes missing coverage. Please review.

Project coverage is 10.36%. Comparing base (b646360) to head (04cc7b8).
Report is 33 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/likelihoods.rs | 0.00% | 464 Missing ⚠️ |
| src/python.rs | 0.00% | 273 Missing ⚠️ |
| src/amplitudes/mod.rs | 0.00% | 138 Missing ⚠️ |
| src/amplitudes/kmatrix.rs | 0.00% | 117 Missing ⚠️ |
| src/amplitudes/common.rs | 0.00% | 41 Missing ⚠️ |
| src/amplitudes/breit_wigner.rs | 0.00% | 16 Missing ⚠️ |
| src/amplitudes/ylm.rs | 0.00% | 9 Missing ⚠️ |
| src/amplitudes/zlm.rs | 0.00% | 9 Missing ⚠️ |
| src/resources.rs | 0.00% | 3 Missing ⚠️ |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main       #8      +/-   ##
==========================================
- Coverage   12.69%   10.36%   -2.33%     
==========================================
  Files          14       15       +1     
  Lines        3213     3935     +722     
  Branches     3213     3935     +722     
==========================================
  Hits          408      408              
- Misses       2805     3527     +722     
```


Benchmark for 545d5f8

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 2.9±0.02ms | 3.0±0.16ms | +3.45% |

Benchmark for 1fce241

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.0±0.29ms | 3.0±0.09ms | 0.00% |

Benchmark for 5370a2c

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.1±0.24ms | 3.0±0.16ms | -3.23% |

Benchmark for 6d37313

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.0±0.25ms | 3.3±0.16ms | +10.00% |

denehoffman (Owner, Author)

This is a huge step forward in terms of efficiency. Since gradient-requiring methods like the L-BFGS-B optimizer spend a lot of time calculating gradients, any improvement in the speed of this calculation is greatly beneficial.

The gradient commit (6af103d) does this in a very easy-to-use way for end users. First, ganesh provides a gradient method which can be overridden. By default, it uses a central finite difference, which means that calculating a gradient requires evaluating the function twice for every free parameter. This is about the best accuracy you can hope for if you don't know the analytic gradient.
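For reference, the central finite difference approximates each gradient component as

$$\frac{\partial f}{\partial p_k} \approx \frac{f(\vec{p} + h\,\hat{e}_k) - f(\vec{p} - h\,\hat{e}_k)}{2h},$$

which costs $2N$ function evaluations for $N$ free parameters.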

However, we can be a bit clever here. Since there are only a few ways we allow users to operate on `Amplitude`s, we can let each amplitude calculate its own central finite difference and then apply the chain rule across each of the standard operations (the `Expression` enum) to build the entire gradient from those pieces. But we are not done just yet! Since not every `Amplitude` relies on every free parameter (unless your expression is just one `Amplitude`), we only have to calculate the finite difference for the free parameters used by that `Amplitude`, setting the other gradient coordinates to zero. Furthermore, if we know the analytic form of the gradient for an `Amplitude`, we can just use that (and this commit provides convenience methods for both). For example, the gradient of the `Scalar` amplitude is just a vector of zeros with a one in the position of the corresponding parameter. The same goes for the `ComplexScalar`, except there are two parameters, and the imaginary one gets an additional factor of $\imath$. Finally, there are some `Amplitude`s, like the `Ylm` and `Zlm`, which are actually constants with respect to the free parameters in the fit. While we need their values to evaluate the gradient of more complicated expressions involving them, their own gradient is just zero, which simplifies the chain rule. There is probably some additional optimization we could do with products to avoid multiplying through this zero gradient, but for right now, this works incredibly well.
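Here is a self-contained sketch of that chain-rule propagation over an expression tree, simplified to real-valued amplitudes (laddu's actual `Expression` type and complex-valued amplitudes differ):

```rust
// Toy expression tree: amplitudes combined by addition and multiplication.
enum Expr {
    Amp(usize), // index of a registered amplitude
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

// `vals[i]` and `grads[i]` hold each amplitude's value and gradient, computed
// per-amplitude (analytically or by a local central finite difference).
fn eval_grad(e: &Expr, vals: &[f64], grads: &[Vec<f64>]) -> (f64, Vec<f64>) {
    match e {
        Expr::Amp(i) => (vals[*i], grads[*i].clone()),
        Expr::Add(a, b) => {
            let (fa, ga) = eval_grad(a, vals, grads);
            let (fb, gb) = eval_grad(b, vals, grads);
            // sum rule: d(f + g) = df + dg
            (fa + fb, ga.iter().zip(&gb).map(|(x, y)| x + y).collect())
        }
        Expr::Mul(a, b) => {
            let (fa, ga) = eval_grad(a, vals, grads);
            let (fb, gb) = eval_grad(b, vals, grads);
            // product rule: d(f * g) = g*df + f*dg
            (fa * fb, ga.iter().zip(&gb).map(|(x, y)| fb * x + fa * y).collect())
        }
    }
}
```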

In summary, if you don't implement a gradient for an `Amplitude`, you'll get the same central finite difference that ganesh uses anyway. You can speed that up by only considering the parameters used by your `Amplitude`, and there's a convenience method (`Amplitude::central_difference_with_indices`) for doing this without much additional code. In that case, or in the case where you do know the analytic gradient, you just have to override the `Amplitude::compute_gradient` method (which takes the gradient as a mutable reference to a `nalgebra::DVector` for memory reasons).
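For instance, a hedged sketch of what an analytic override might look like for the `Scalar` case described above (illustrative free function; the real `compute_gradient` signature on the `Amplitude` trait may differ):

```rust
use nalgebra::DVector;
use num_complex::Complex;

// The Scalar amplitude's value is just its one parameter, so its gradient is
// zero everywhere except a one at that parameter's index.
fn scalar_compute_gradient(param_index: usize, gradient: &mut DVector<Complex<f64>>) {
    gradient.fill(Complex::new(0.0, 0.0));
    gradient[param_index] = Complex::new(1.0, 0.0);
}
```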

In my own tests, this speeds up the L-BFGS-B algorithm a ton (I don't have a benchmark for this, but it's actually quite shocking in practice). Of course, this will have no effect on gradient-free methods like Nelder-Mead, although it will speed up the calculation of the Hessian/error matrix at the end of both algorithms. Since computing the Hessian is like taking the gradient of the gradient, and the gradient takes twice as many function evaluations as free parameters ($2N$), the central finite difference Hessian would take the square of this ($4N^2$), which scales horribly if $N$ is large. For example, one of my test calculations has on the order of 40 free parameters and the objective function takes around 700ms to compute. This means it takes around 1.25 hours to compute the Hessian. However, all of the gradients for the Amplitudes being used are known analytically, so this same calculation just takes $2N$ computations of the gradient, or less than a minute (or somewhere around this, since the gradient doesn't take exactly as much time to calculate as the function itself).
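Concretely, with the numbers above: $4N^2 = 4(40)^2 = 6400$ function evaluations at roughly $0.7\,\mathrm{s}$ each is about $4480\,\mathrm{s} \approx 1.25$ hours, while $2N = 80$ gradient evaluations at a comparable per-call cost comes out to around a minute.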

…m_amptools` method in the main python package

Now users can convert files either via the command-line `amptools-to-laddu` script installed automatically with `laddu` or via the `laddu.convert_from_amptools` function directly in their Python code. This greatly simplifies the pipeline for using `AmpTools`-formatted data in `laddu` without much additional effort for users (and hopefully aids in adoption).
Benchmark for 092c0ea

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.0±0.13ms | 3.3±0.33ms | +10.00% |

…thon API for `NLL` and add method to turn an `NLL` into a `LikelihoodTerm`
…e the same length

Previously, if two `LikelihoodTerm`s had a different number or layout of parameters, their gradients wouldn't be compatible. This should fix that by using the parameter layouts to map values from each term's local gradient into the global gradient for the `LikelihoodExpression`.
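A hypothetical sketch of that local-to-global mapping (names assumed; `layout[i]` is taken to be the global index of a term's i-th local parameter):

```rust
use nalgebra::DVector;

// Accumulate a term's local gradient into the expression's global gradient.
fn accumulate_gradient(
    local_gradient: &DVector<f64>,
    layout: &[usize],
    global_gradient: &mut DVector<f64>,
) {
    for (local_idx, &global_idx) in layout.iter().enumerate() {
        global_gradient[global_idx] += local_gradient[local_idx];
    }
}
```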
Benchmark for 1690076

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.1±0.24ms | 3.3±0.16ms | +6.45% |

Benchmark for 8a8f70d

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.1±0.28ms | 3.3±0.17ms | +6.45% |

Benchmark for 83543b7

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.1±0.47ms | 3.2±0.19ms | +3.23% |

Floating point addition is not associative, and the standard parallel summation was causing non-deterministic results. This should be fixed with the inclusion of the `accurate` crate in this commit, although I've included a temporary hack for the gradients since you cannot sum `DVector`s using `accurate`. This should eventually be fixed, since I think it might cause a slight unneeded bump in memory (although maybe the compiler optimizes this out?).
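For reference, a minimal sketch of compensated (Kahan) summation, the kind of technique the `accurate` crate builds on (the crate itself uses related, more sophisticated algorithms):

```rust
// Kahan summation: carry a running compensation term so that low-order bits
// lost in each addition are fed back into the next one.
fn kahan_sum(values: &[f64]) -> f64 {
    let mut sum = 0.0;
    let mut c = 0.0; // compensation for lost low-order bits
    for &v in values {
        let y = v - c;     // correct the next addend
        let t = sum + y;   // add; low-order bits of y may be lost here
        c = (t - sum) - y; // recover what was lost
        sum = t;
    }
    sum
}
```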
…om MC

AmpTools doesn't scale the MC term by the size of the "data" dataset, so I've chosen not to either. This seems to give better results and was probably the correct way to do it in the first place.
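For context, a sketch of the form being compared, assuming the usual extended maximum-likelihood setup (notation mine, not taken from either codebase): the Monte Carlo normalization term enters as

$$-\ln \mathcal{L}(\vec{p}) = -\sum_{i \in \text{data}} \ln I(\vec{p}; e_i) + \frac{1}{N_{\text{MC}}} \sum_{j \in \text{MC}} I(\vec{p}; e_j),$$

rather than with an extra factor of $N_{\text{data}}$ multiplying the second sum.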
This is a simple, self-contained example script that should "just work". The data and MC have been generated with [`gen_amp`](https://github.com/JeffersonLab/halld_sim/tree/962c1fffc29eb4801b146d0a7f1e9aecb417374a/src/programs/Simulation/gen_amp) with the same amplitudes as in the example. Note that the overall scaling of parameters seems to be different: the generated S-wave had magnitude `100+0i` and the D-wave had `50+50i`, but the fit values from `laddu` aren't near those. However, the S/D ratio and relative phase are very close to correct, so this can probably be chalked up to some overall difference in how amplitudes are scaled in the two frameworks.
Benchmark for 4a8984f

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.0±0.02ms | 3.3±0.16ms | +10.00% |

Benchmark for 0c1cc69

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.1±0.27ms | 3.3±0.19ms | +6.45% |

Benchmark for 6b7faf9

| Test | Base | PR | % |
|---|---|---|---|
| kmatrix benchmark (nll) | 3.0±0.04ms | 3.3±0.17ms | +10.00% |

denehoffman merged commit 9121511 into main on Oct 30, 2024 (20 of 22 checks passed).
denehoffman deleted the nll-terms branch on October 30, 2024 at 01:08.