Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add usage of EMParams #1050

Merged
merged 9 commits into from
Dec 1, 2023
Merged

Add usage of EMParams #1050

merged 9 commits into from
Dec 1, 2023

Conversation

johschnee
Copy link
Contributor

@johschnee johschnee commented May 4, 2023

Changes

  • Class EMParams is now used by the ExpectationMaximizationModel in tools/_em_model.py.

Related issues

Closes #1026, #1025.

@johschnee
Copy link
Contributor Author

@WeilerP Currently, I have the following questions about the future design of the API:

  1. Should all the fitted parameters still be written in the AnnData object by default when calling ExpectationMaximiziationModel.fit()? Or should the EMParams class become the new way of handling the inferred parameters and only be written to the AnnData object when calling ExpectationMaximiziationModel.export_results_adata() (not yet implemented)?
  2. My current implementation relies on the methods _read_pars() and _write_pars() in tools/_em_model_core.py. Should these methods be kept in the implementation at all or do you aim to remove them with this restructuring?
  3. In the old method recover_dynamics() the parameter copy is false by default. In the new implementation ExpectationMaximizationModel.fit() it is true by default. Is this on purpose?

Copy link
Member

@WeilerP WeilerP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @johschnee. I added two comments which should also answer two of your questions. Regarding the third: Since we'll not be writing to AnnData in the fit method, the copy argument will become redundant. So it doesn't really matter what the default argument is, ATM (although it's a good point).

scvelo/tools/_em_model.py Outdated Show resolved Hide resolved
scvelo/tools/_em_model.py Outdated Show resolved Hide resolved
@WeilerP
Copy link
Member

WeilerP commented May 15, 2023

@johschnee, is there an update on this PR?

@johschnee
Copy link
Contributor Author

Yes, I was working on it actually today. I added the implementation of the abstract methods to this PR, because export_results_adata is used instead of _write_pars.

Copy link
Member

@WeilerP WeilerP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johschnee, I left another few comments. Mainly two things:

  1. I updated/refactored the _initialize_state_dict function. Can you please double check that it is still working?
  2. Can you please und all changes irrelevant to this PR (i.e., refactoring the code to rely on EMParams)? These changes (e.g., function implementations) should be there own PR accompanied by an issue.

scvelo/tools/_em_model.py Show resolved Hide resolved
scvelo/tools/_em_model.py Outdated Show resolved Hide resolved
scvelo/tools/_em_model.py Show resolved Hide resolved
scvelo/tools/_em_model.py Outdated Show resolved Hide resolved
@WeilerP
Copy link
Member

WeilerP commented May 23, 2023

FYI, @johschnee, the ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (4,) + inhomogeneous part. error is caused by a new Numpy version (see #1058).

johschnee and others added 6 commits August 22, 2023 19:52
Class `EMParams` is now used by the `ExpectationMaximizationModel`.
Since _read_pars() should be deleted in the future,
_initialize_state_dict() does not rely on it anymore.
Add implementation of state_dict() and export_results_adata() and
remove use of _write_pars() in ExpectationMaximizationModel.fit().
* Rename variables (`pars_names` to `parameters`, `pars` to
`parameter_dict`).
* Restructure code to reduce number of variables.
* Update definition of parameters initialized with as nan.
* Add metadata the attributes to `EMParams`
* Add methods read/write `EMParams` from/to AnnData
* Adapt the method `fit()` such that it uses the class `EMParams`
* Include `_align_dynamics()` as a method in
  `ExpectationMaximizationModel` and adapt it to also use `EMParams`
* Include `_flatten()` from `_em_model_core.py`
@johschnee
Copy link
Contributor Author

Hi @WeilerP,
this branch now contains the changes we discussed. In particular, I changed the following:

  • Added metadata to the attributes of EMParams and read/write methods: The metadata allows a different handling of vector and matrix parameters without hardcoding them in the read/write methods. I decided to add the read/write methods to EMParams because they rely a lot on the attributes of EMParams.
  • In the method fit(), I used the class EMParams. Since the AnnData object should not be modified during fit(), I defined some private attributes to store results and additional settings: _loss, _fit_connected_states, and _use_raw. They are written to an AnnData object when export_results_adata() is called.
  • As we discussed, I copied _align_dynamics() to the class ExpectationMaximizationModel and adapted it slightly to use the EMParams. Now, ExpectationMaximizationModel only uses the class DynamicsRecovery from _em_model_core, but no other methods.

Overall,

model = ExpectationMaximizationModel(adata)
model.fit()
adata_return = model.export_results_adata()

and

scv.tl.recover_dynamics(adata)

should lead to the same results. I tested it with simulated data and the pancreas dataset.

After the change, parameters are not written if they contain only
nan values.
@johschnee
Copy link
Contributor Author

Hi @WeilerP,
I changed the behavior of export_to_adata() such that arrays containing only nan values are not written anymore.
I also compared the results of the old and the new API for the simulated data and the pancreas dataset using numpy. As expected, they were identical.
Hence, it would be great if you could review this PR.

@johschnee johschnee requested a review from WeilerP September 20, 2023 18:00
Add argument `show_progress_bar` to `fit` method.
Copy link
Member

@WeilerP WeilerP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @johschnee!

@WeilerP WeilerP merged commit 43faf36 into theislab:master Dec 1, 2023
4 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use EMParams in ExpectationMaximizationModel
2 participants