NLE with multiple iid conditions #1331

janfb · 2024-12-12T14:40:32Z

Scenario: we are using NLE in a use-case where x_o consists of i.i.d. trials, and our simulator has both parameters and experimental conditions (both contained in theta). Thus, we are training an emulator on the full simulator. Then, at inference time, we want to additionally condition on a set of experimental conditions and perform inference only over the parameters. To do so, we define a new potential function that fixes those columns in theta that correspond to the observed experimental conditions and leaves the rest of theta free for inference.
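To make the idea of fixing some columns of theta concrete, here is a minimal sketch in plain Python. It is not the sbi implementation: `gaussian_log_likelihood` is a toy stand-in for the trained emulator, `condition_potential` and its arguments are hypothetical names, and the log-prior term that a posterior potential would add is left out.

```python
import math


def gaussian_log_likelihood(x, theta):
    """Toy stand-in for a trained likelihood emulator (hypothetical).

    theta = [mu, difficulty]: `mu` is a parameter, `difficulty` is an
    experimental condition that scales the noise standard deviation.
    """
    mu, difficulty = theta
    sigma = 1.0 + difficulty
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def condition_potential(log_likelihood, x_o, condition, condition_dims, theta_len):
    """Return a potential over the free entries of theta only.

    The entries listed in `condition_dims` are fixed to `condition`;
    the remaining entries are filled from the argument of the potential.
    """
    free_dims = [i for i in range(theta_len) if i not in condition_dims]

    def potential(free_theta):
        theta = [0.0] * theta_len
        for dim, value in zip(condition_dims, condition):
            theta[dim] = value
        for dim, value in zip(free_dims, free_theta):
            theta[dim] = value
        # i.i.d. trials: log-likelihoods add up across trials.
        return sum(log_likelihood(x, theta) for x in x_o)

    return potential


# Fix the condition (column 1) to 0.5 and infer only mu (column 0).
potential = condition_potential(
    gaussian_log_likelihood,
    x_o=[0.1, -0.3, 0.4],
    condition=[0.5],
    condition_dims=[1],
    theta_len=2,
)
```

Calling `potential([0.0])` then evaluates the summed log-likelihood of all trials under `mu = 0.0` with the single fixed condition, which is the single-condition case described above.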

Problem: So far, this was implemented only for a single experimental condition, i.e., a batch size of one. In practice, however, we often have a batch of i.i.d. trials in x_o and a matching batch of i.i.d. conditions (e.g., varying difficulty as the experimental condition in a decision-making experiment). This scenario is not supported in the current implementation because the ConditionedPotential allows only a single value as a condition.

Solution: This PR introduces a new method to the LikelihoodBasedPotential (because this case is specific to NLE) that allows conditioning on a batch of conditions matching the batch of i.i.d. x_o. It uses the underlying density estimator to evaluate the log-likelihood $\log L(\theta \mid [x_0, \ldots, x_N], [c_0, \ldots, c_N])$, which is needed to obtain the posterior over the parameters given observed i.i.d. trials and matching i.i.d. experimental conditions.
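For reference, assuming each trial $x_i$ is independent of the others given the parameters $\theta$ and its own condition $c_i$, the log-likelihood factorizes into a sum over trials. Here $p$ denotes the emulator density and $p(\theta)$ the prior; both symbols are notation introduced for this note.

$$
\log L(\theta \mid [x_0, \ldots, x_N], [c_0, \ldots, c_N]) = \sum_{i=0}^{N} \log p(x_i \mid \theta, c_i)
$$

$$
p(\theta \mid [x_0, \ldots, x_N], [c_0, \ldots, c_N]) \propto p(\theta) \prod_{i=0}^{N} p(x_i \mid \theta, c_i)
$$

Each trial is paired with exactly one condition, so the sum has one term per trial and there is no cross term between trial $i$ and condition $j \neq i$.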

- introduce condition_on_theta method for likelihood-estimator-based potential to condition on a batch of theta values (experimental conditions) that match the current batch of iid x_o.
- deprecate MNLE-based potential (can be nle-based)
- adapt tests for conditioned mnle.
- adapt Example notebook for decision-making experiments.

codecov · 2024-12-12T15:29:45Z

Codecov Report

Attention: Patch coverage is 62.06897% with 11 lines in your changes missing coverage. Please review.

Project coverage is 78.41%. Comparing base (06890eb) to head (737764d).
Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...inference/potentials/likelihood_based_potential.py | 59.25% | 11 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1331       +/-   ##
===========================================
- Coverage   89.39%   78.41%   -10.98%     
===========================================
  Files         118      118               
  Lines        8709     8748       +39     
===========================================
- Hits         7785     6860      -925     
- Misses        924     1888      +964

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `78.41% <62.06%> (-10.98%)` | ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ | |
|---|---|---|
| sbi/inference/trainers/nle/mnle.py | `85.36% <100.00%> (-7.32%)` | ⬇️ |
| sbi/utils/conditional_density_utils.py | `73.46% <100.00%> (-21.09%)` | ⬇️ |
| sbi/utils/sbiutils.py | `78.11% <ø> (-8.68%)` | ⬇️ |
| ...inference/potentials/likelihood_based_potential.py | `69.76% <59.25%> (-30.24%)` | ⬇️ |

... and 31 files with indirect coverage changes

michaeldeistler

I am not sure I fully understand tbh.

IIUC, then there are four different dimensions across which we could vectorize:

- Across the x batch-dimension, in case we want to amortize.
- Across the x iid-dimension, in case we have iid samples.
- Across a batch of theta conditions, which act as different experimental conditions.
- Across thetas, in case we want to run multi-chain.

Which one of these does this PR solve? And which ones does it not yet solve?

Can we maybe define names somewhere for each of these four dimensions and specify which ones are handled by which function (or which ones are currently assumed to be the same dimension and therefore do not have a Cartesian product applied to them)?
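One way to picture the four dimensions is the toy sketch below. The labels `x_batch`, `iid`, `condition` and `theta_batch` are hypothetical names chosen for illustration, not names used in sbi, and `log_prob` is a toy Gaussian rather than a trained estimator. The sketch assumes that the iid and condition dimensions are paired and summed out, while the x batch and the theta batch are crossed. That crossing is an assumption of this example, not a statement about what the PR implements.

```python
import math


def log_prob(x, theta, condition):
    """Toy Gaussian likelihood: mean `theta`, noise grows with `condition`."""
    sigma = 1.0 + condition
    return -0.5 * ((x - theta) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def potential_table(xs, conditions, thetas):
    """Evaluate log-likelihoods over the four dimensions.

    xs:         [x_batch][iid]  observations
    conditions: [x_batch][iid]  one condition per trial (paired with xs)
    thetas:     [theta_batch]   candidate parameters, e.g. one per MCMC chain
    returns:    [x_batch][theta_batch]
    """
    table = []
    for x_trials, c_trials in zip(xs, conditions):  # x batch dimension
        if len(x_trials) != len(c_trials):
            raise ValueError("each trial needs exactly one condition")
        row = []
        for theta in thetas:  # theta batch dimension
            # iid and condition dimensions are paired (zip) and summed out.
            row.append(
                sum(log_prob(x, theta, c) for x, c in zip(x_trials, c_trials))
            )
        table.append(row)
    return table
```

With two observations of two trials each and three candidate thetas, `potential_table` returns a 2 x 3 nested list: one row per x, one column per theta.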

sbi/inference/potentials/likelihood_based_potential.py

janfb · 2024-12-13T09:58:38Z

As discussed in the call, the plan is to

- introduce `(sample, batch, *event)` for the theta_condition at inference time. This will enable sampling for multiple xs (batched x), each with multiple iid observations, while simultaneously passing a corresponding batch of conditions for each x. For example, in a decision-making use case, it will be possible to sample in one call for multiple subjects (a batch of xs), where each subject performed N trials under N (different) experimental conditions. N needs to be the same for all subjects, though.
- improve naming and docs to avoid confusion between condition and experimental_condition, e.g., use theta_condition
- add tests for new conditioning functions
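The shape convention in the plan above can be sketched as a small check on shape tuples. `check_condition_shape` and its arguments are hypothetical and only illustrate the intended layout, in which x and the condition agree on their leading `(sample, batch)` dimensions and may differ in their event dimensions.

```python
def check_condition_shape(x_shape, condition_shape, x_event_ndim=1, condition_event_ndim=1):
    """Check that x and theta_condition agree on (sample, batch).

    Both shapes are read as (sample, batch, *event), where `sample`
    indexes iid trials and `batch` indexes independent observations
    (e.g. subjects). Returns the shared (sample, batch) pair.
    """
    x_lead = tuple(x_shape[: len(x_shape) - x_event_ndim])
    c_lead = tuple(condition_shape[: len(condition_shape) - condition_event_ndim])
    if len(x_lead) != 2 or len(c_lead) != 2:
        raise ValueError("expected shapes of the form (sample, batch, *event)")
    if x_lead != c_lead:
        raise ValueError(f"x has (sample, batch)={x_lead}, condition has {c_lead}")
    return x_lead


# 100 trials for each of 5 subjects; x has 2 event entries, the condition has 1.
sample_batch = check_condition_shape((100, 5, 2), (100, 5, 1))
```

Because the leading dimensions form a rectangular array, every subject necessarily has the same number of trials N, which matches the restriction stated in the plan.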

janfb · 2024-12-18T10:11:46Z

@dgedon this is ready for your review. Do let me know if anything is unclear. Thanks!

sbi/inference/potentials/likelihood_based_potential.py

tests/mnle_test.py

manuelgloeckler

Had a bit of a look into it. The test should be fixed, but otherwise looks great. Only left some minor comments on naming and docstrings.

add method for iid-batched conditioning.

b2fe636

- deprecate MNLE-based potential (can be nle-based)
- adapt tests for conditioned mnle.

janfb requested a review from michaeldeistler December 12, 2024 14:40

update notebook, bugfixes

074efa0

michaeldeistler reviewed Dec 12, 2024

janfb added 2 commits December 17, 2024 13:49

add batch dim for x, add test.

8a95c8f

fix shape handling, adapt tutorial.

ced7e15

janfb changed the title ~~add method for iid-batched conditioning.~~ allow multiple iid conditions Dec 17, 2024

janfb changed the title ~~allow multiple iid conditions~~ NLE with multiple iid conditions Dec 17, 2024

fix test

b058b1c

janfb requested a review from dgedon December 18, 2024 10:11

janfb mentioned this pull request Dec 19, 2024

Add sample_dim for condition in conditional estimator #1339

Open

manuelgloeckler reviewed Dec 20, 2024

sbi/inference/potentials/likelihood_based_potential.py Outdated

manuelgloeckler reviewed Dec 20, 2024

sbi/inference/potentials/likelihood_based_potential.py Outdated

manuelgloeckler reviewed Dec 20, 2024

sbi/inference/potentials/likelihood_based_potential.py

manuelgloeckler reviewed Dec 20, 2024

tests/mnle_test.py Outdated

manuelgloeckler approved these changes Dec 20, 2024

feedback; fix texts

737764d

janfb merged commit e7940dc into main Dec 21, 2024
6 checks passed

janfb deleted the multiple-conditions-in-iid-sampling branch December 21, 2024 13:04
