Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subsetting a SingleCellExperiment breaks the ScpModel object #58

Open
cvanderaa opened this issue Mar 22, 2024 · 8 comments
Open

Subsetting a SingleCellExperiment breaks the ScpModel object #58

cvanderaa opened this issue Mar 22, 2024 · 8 comments

Comments

@cvanderaa
Copy link
Member

The issues has been highlighted in: #56 (comment)

The reason is that the subsetting functionality in QFeatures does not know it should also subset the ScpModel objects contained in metadata(qfObject).

A solution I see is to overwrite the subsetting functionality of QFeatures (with call to nextMethod()), but also including a substerring of the metadata.

@cvanderaa
Copy link
Member Author

This also implies creating subsetting functionality for ScpModel objects, which should be rather straightforward.

@lgatto
Copy link
Member

lgatto commented Mar 22, 2024

Wait, I'm confused... an ScpModel is computed on a SingleCellExperiment. Why is the QFeatures subsetting and issue?

Also, having a quick look at the issue, isn't one problem that the sce variable wasn't created with getWithColData()?

@cvanderaa cvanderaa changed the title Subsetting a QFeatures breaks the ScpModel object Subsetting a SingleCellExperiment breaks the ScpModel object Mar 22, 2024
@cvanderaa
Copy link
Member Author

Sorry, you're right! It's subsetting the SingleCellExperiment 😅

@lgatto
Copy link
Member

lgatto commented Mar 22, 2024

Ok, then sub-setting the ScpModel would be the best approach.

On the other hand, from a user's perspective, what about sub-setting features at the model results data.frames level? Sub-setting cells or cell types kind of invalidates the results, as the model was produced with all cells.

@lgatto
Copy link
Member

lgatto commented Mar 22, 2024

Also regarding this

the model was produced with all cells

do we want to allow sub-setting columns/cells?

@cvanderaa
Copy link
Member Author

It makes indeed no sense to subset an ScpModel by cell as the estimated results depend on all cells. So when sub-setting an SCE by column, all ScpModel objects in the metadata() should be removed.

what about sub-setting features at the model results data.frames level

The output tables are disconnected from the SCE object, so it is not our responsibility and up to the user to subset the data tables as they wish. But every ScpModel object depends on an SCE object, so we must ensure that the functionality does not break upon subsetting, or at least explicitly forbid it. In practice, I see 3 possible approach:

  1. Add a check (in scpVariance*, scpDifferential*, scpComponent*) to detect whether the SCE has been subset and throw an error, that is forbid subsetting
  2. Add a check (in scpVariance*, scpDifferential*, scpComponent*) to detect whether the SCE has been subset and throw an error that suggests using a function (eg scpModelUpdate()). scpModelUpdate() should detect changes in cells (if change, remove scpModel) and changes in features (if change, subset scpModel).
  3. Overwrite sub-setting method of SCE so that is uses scpModelUpdate(). However, I'm not convinced this is a good idea to mess up with SCE's methods.

What do you think?

@lgatto
Copy link
Member

lgatto commented Mar 22, 2024

The output tables are disconnected from the SCE object, so it is not our responsibility and up to the user to subset the data tables as they wish.

Of course, it not our responsibility. My point was rather that sub-setting by feature can be done on that level, so do we really need support sub-setting the SCE?

But now that I think a bit more about it, doesn't sub-setting by features invalidates some of the ScpModel results? Are the component and variance analyses still relevant? Would these be preformed on the sub-setted modelled data?

@lgatto
Copy link
Member

lgatto commented Mar 22, 2024

In other words, and following up on your suggestions, do we want to support sub-setting an SCE after modelling? Shouldn't we not simply warn that this will lead to broken/invalid ScpModel results, and suggest to manipulate the individual dataframe results of re-run a new model on a subset of the data?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants