Add usage of `EMParams` #1050

johschnee · 2023-05-04T16:19:00Z

Changes

Class EMParams is now used by the ExpectationMaximizationModel in tools/_em_model.py.

Related issues

Closes #1026, #1025.

johschnee · 2023-05-04T16:45:03Z

@WeilerP Currently, I have the following questions about the future design of the API:

1. Should all the fitted parameters still be written to the AnnData object by default when calling `ExpectationMaximizationModel.fit()`? Or should the `EMParams` class become the new way of handling the inferred parameters, with results only written to the AnnData object when calling `ExpectationMaximizationModel.export_results_adata()` (not yet implemented)?
2. My current implementation relies on the methods `_read_pars()` and `_write_pars()` in tools/_em_model_core.py. Should these methods be kept at all, or do you aim to remove them with this restructuring?
3. In the old method `recover_dynamics()`, the parameter `copy` is `False` by default. In the new implementation, `ExpectationMaximizationModel.fit()`, it is `True` by default. Is this on purpose?

WeilerP

Thanks, @johschnee. I added two comments which should also answer two of your questions. Regarding the third: Since we'll not be writing to AnnData in the fit method, the copy argument will become redundant. So it doesn't really matter what the default argument is, ATM (although it's a good point).

scvelo/tools/_em_model.py

WeilerP · 2023-05-15T11:09:14Z

@johschnee, is there an update on this PR?

johschnee · 2023-05-15T20:07:28Z

Yes, I was actually working on it today. I added the implementation of the abstract methods to this PR because `export_results_adata` is now used instead of `_write_pars`.

WeilerP

@johschnee, I left another few comments. Mainly two things:

1. I updated/refactored the `_initialize_state_dict` function. Can you please double-check that it is still working?
2. Can you please undo all changes irrelevant to this PR (i.e., anything other than refactoring the code to rely on `EMParams`)? These changes (e.g., function implementations) should be their own PR, accompanied by an issue.

scvelo/tools/_em_model.py

WeilerP · 2023-05-23T02:41:44Z

FYI, @johschnee, the error `ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part.` is caused by a new NumPy version (see #1058).
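
For context, a minimal sketch of how this kind of error typically arises: building an array from sequences of unequal length. The input below is made up for illustration and is not taken from scvelo. To my knowledge, NumPy 1.24 and later raise `ValueError` here, while older versions only emit a deprecation warning and fall back to an object array.

```python
import warnings

import numpy as np

# Hypothetical ragged input: four sequences of unequal length.
ragged = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0], [7.0]]

try:
    with warnings.catch_warnings():
        # Older NumPy versions only warn; turn the warning into an error so
        # that old and new versions take the same code path in this sketch.
        warnings.simplefilter("error")
        np.array(ragged)
    raised = False
except (ValueError, Warning) as error:
    raised = True
    print(type(error).__name__, error)

# Requesting an object array explicitly avoids the problem.
as_objects = np.array(ragged, dtype=object)
print(as_objects.shape)
```

Passing `dtype=object` is only one possible workaround; whether it is appropriate depends on how the array is used afterwards.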

* Add metadata to the attributes of `EMParams`
* Add methods to read/write `EMParams` from/to AnnData
* Adapt the method `fit()` such that it uses the class `EMParams`
* Include `_align_dynamics()` as a method in `ExpectationMaximizationModel` and adapt it to also use `EMParams`
* Include `_flatten()` from `_em_model_core.py`

johschnee · 2023-08-22T18:27:44Z

Hi @WeilerP,
this branch now contains the changes we discussed. In particular, I changed the following:

1. Added metadata to the attributes of `EMParams` and read/write methods: The metadata allows a different handling of vector and matrix parameters without hardcoding them in the read/write methods. I decided to add the read/write methods to `EMParams` because they rely a lot on the attributes of `EMParams`.
2. In the method `fit()`, I used the class `EMParams`. Since the AnnData object should not be modified during `fit()`, I defined some private attributes to store results and additional settings: `_loss`, `_fit_connected_states`, and `_use_raw`. They are written to an AnnData object when `export_results_adata()` is called.
3. As we discussed, I copied `_align_dynamics()` to the class `ExpectationMaximizationModel` and adapted it slightly to use `EMParams`. Now, `ExpectationMaximizationModel` only uses the class `DynamicsRecovery` from `_em_model_core`, but no other methods.
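
To illustrate the metadata idea, here is a minimal sketch using dataclass field metadata. `ParamsSketch`, the `is_matrix` key, the `export()` method, and the two plain dictionaries standing in for AnnData slots are all assumptions made for this example; the actual `EMParams` implementation may look different.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ParamsSketch:
    """Hypothetical stand-in for EMParams; names and layout are assumptions."""

    # One value per gene: treated as a vector parameter.
    alpha: list = field(default_factory=list, metadata={"is_matrix": False})
    beta: list = field(default_factory=list, metadata={"is_matrix": False})
    # One row per gene with several columns: treated as a matrix parameter.
    loss: list = field(default_factory=list, metadata={"is_matrix": True})

    def export(self, key="fit"):
        """Sort parameters into vector and matrix containers via field metadata."""
        vectors, matrices = {}, {}
        for f in fields(self):
            # The metadata decides the target, so no parameter name is hardcoded.
            target = matrices if f.metadata["is_matrix"] else vectors
            target[f"{key}_{f.name}"] = getattr(self, f.name)
        return vectors, matrices


params = ParamsSketch(alpha=[0.5, 1.2], beta=[0.1, 0.3], loss=[[3.0, 2.0], [4.0, 1.5]])
vectors, matrices = params.export()
print(sorted(vectors))   # ['fit_alpha', 'fit_beta']
print(sorted(matrices))  # ['fit_loss']
```

Adding a new parameter then only requires declaring a field with the right metadata; the export logic stays unchanged.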

Overall,

model = ExpectationMaximizationModel(adata)
model.fit()
adata_return = model.export_results_adata()

and

scv.tl.recover_dynamics(adata)

should lead to the same results. I tested it with simulated data and the pancreas dataset.
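
The exact comparison is not shown in this thread. One way such a check could be written is sketched below, with plain dictionaries of arrays standing in for the fitted parameters of both runs; `same_results` and the example values are hypothetical.

```python
import numpy as np


def same_results(old, new):
    """Return True if both dicts have the same keys and equal arrays (NaN equals NaN)."""
    if old.keys() != new.keys():
        return False
    return all(
        np.array_equal(np.asarray(old[name]), np.asarray(new[name]), equal_nan=True)
        for name in old
    )


# Made-up values standing in for parameters from the old and the new API.
old = {"fit_alpha": np.array([0.5, np.nan]), "fit_beta": np.array([0.1, 0.3])}
new = {"fit_alpha": np.array([0.5, np.nan]), "fit_beta": np.array([0.1, 0.3])}
changed = {"fit_alpha": np.array([0.5, np.nan]), "fit_beta": np.array([0.1, 0.4])}

print(same_results(old, new))
print(same_results(old, changed))
```

Treating NaN entries as equal matters here because genes that were not fitted carry NaN in both results.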

johschnee · 2023-09-20T18:00:08Z

Hi @WeilerP,
I changed the behavior of export_to_adata() such that arrays containing only nan values are not written anymore.
I also compared the results of the old and the new API for the simulated data and the pancreas dataset using numpy. As expected, they were identical.
Hence, it would be great if you could review this PR.
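
A minimal sketch of the skipping behavior described above, assuming the parameters are available as a mapping from name to array. The function name `export_non_nan`, the `key` prefix, and the dictionary used as export target are hypothetical and only serve this illustration.

```python
import numpy as np


def export_non_nan(parameters, target, key="fit"):
    """Copy parameters into target, skipping arrays that contain only NaN."""
    for name, values in parameters.items():
        values = np.asarray(values, dtype=float)
        if np.isnan(values).all():
            # Nothing was inferred for this parameter, so it is not written.
            continue
        target[f"{key}_{name}"] = values
    return target


parameters = {
    "alpha": [1.0, np.nan],      # partially fitted: written
    "t_": [np.nan, np.nan],      # only NaN: skipped
}
exported = export_non_nan(parameters, target={})
print(sorted(exported))  # ['fit_alpha']
```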

WeilerP

Thanks, @johschnee!

WeilerP reviewed May 8, 2023

WeilerP reviewed May 23, 2023

johschnee and others added 6 commits August 22, 2023 19:52

Add usage of EMParams

5019ac4

Class `EMParams` is now used by the `ExpectationMaximizationModel`.

Remove use of _read_pars() from _em_model.py

2bd5f42

Since _read_pars() should be deleted in the future, _initialize_state_dict() does not rely on it anymore.

Implement abstract methods

36de253

Add implementation of state_dict() and export_results_adata() and remove use of _write_pars() in ExpectationMaximizationModel.fit().

Add type hints to _initialize_state_dict

3ac2c3b

Refactor _initialize_state_dict

b4c27d3

* Rename variables (`pars_names` to `parameters`, `pars` to `parameter_dict`).
* Restructure code to reduce number of variables.
* Update definition of parameters initialized as nan.

johschnee force-pushed the feat/EMParams branch from 76e15a1 to 15808e0 August 22, 2023 18:02

Modify export to AnnData

b2c8258

After the change, parameters are not written if they contain only nan values.

johschnee requested a review from WeilerP September 20, 2023 18:00

WeilerP added 2 commits December 1, 2023 14:29

Update _em_model.py

d120e50

Add argument `show_progress_bar` to `fit` method.

Update tests/test_basic.py

60f94cc

WeilerP approved these changes Dec 1, 2023

WeilerP merged commit 43faf36 into theislab:master Dec 1, 2023
4 of 6 checks passed

WeilerP mentioned this pull request Dec 3, 2023

Implement abstract methods in ExpectationMaximizationModel #1025

Closed