I ran scpModelWorkflow() on a small SingleCellExperiment object (I only have 20 cells).
The scpModelFilterPlot() looks like this:
I'm not surprised that I only have a few estimated features as I only have a few cells/observations. However, I'm puzzled by two things:
Why is the bar corresponding to features with an n/p ratio of 1 colored as "inestimable"? According to the legend (and what I checked), features with an n/p ratio >= 1 are considered to be estimated.
How can I have features with a n/p ratio of 0?
I thought that n could never be equal to 0 and checked that this was the case.
summary(sapply(metadata(sce)$model@scpModelFitList, "slot", "n"))
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.000 1.000 3.000 3.695 5.000 21.000
Indeed, n/p ratio is never less than 0.5
np <-
sapply(metadata(sce)$model@scpModelFitList, "slot", "n") /
sapply(metadata(sce)$model@scpModelFitList, "slot", "p")
summary(np)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.5000 0.6667 1.1250 Inf Inf Inf
However, I was surprised to see that a large number of the n/p ratios were infinite, which means that p is equal to 0.
summary(sapply(metadata(sce)$model@scpModelFitList, "slot", "p"))
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000 0.000 5.000 3.905 7.000 14.000
On further investigation, I found that this happens whenever there are only 1 or 2 observations for a specific feature.
I assume that the 3409 features with an infinite n/p ratio are plotted as 0 in the plot.
Why do the features with 2 observations always have p equal to 0? I suppose it's not that important, since features with only 2 observations are not very informative in bigger datasets.
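One way to confirm that the infinite ratios come only from features with 1 or 2 observations is to cross-tabulate n against whether n/p is infinite. The sketch below uses made-up toy vectors in place of the real per-feature n and p values; with the real model they would come from the sapply() calls shown earlier.

```r
# Toy stand-ins for the per-feature n and p values (made up for illustration;
# in practice they would come from the sapply() calls shown earlier).
n <- c(1, 2, 2, 3, 5, 21)
p <- c(0, 0, 0, 5, 7, 14)

np <- n / p                  # dividing by p = 0 yields Inf in R
infinite <- is.infinite(np)

# Cross-tabulate the number of observations against an infinite ratio
print(table(n = n, infinite = infinite))

# Largest n among the features with an infinite n/p ratio
max_n_infinite <- max(n[infinite])
print(max_n_infinite)
```

If the largest n among the infinite ratios is 2 on the real data as well, that would support the observation that only features with 1 or 2 observations are affected.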
Hi Sam,
Thanks for pointing out these inconsistencies.
Regarding your first point, I will fix this. The legend and docs are right, but the plot is misleading. The cause is a wrong assignment of the edge cases when I cut the histogram into estimable and non-estimable features.
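To illustrate how an edge case like this can arise, here is a generic base R sketch. It is not the actual plotting code of the package, and the break points and labels are assumptions chosen for illustration. It shows how cut() assigns a value that sits exactly on a break.

```r
# Toy n/p ratios; 1 sits exactly on the threshold between the two classes.
np <- c(0.5, 1, 1.5)
breaks <- c(0, 1, Inf)
labs <- c("inestimable", "estimable")

# Default right-closed intervals (0,1] and (1,Inf]:
# a ratio of exactly 1 falls in the lower class.
right_closed <- cut(np, breaks = breaks, labels = labs, right = TRUE)

# Left-closed intervals [0,1) and [1,Inf):
# a ratio of exactly 1 falls in the upper class, matching the rule n/p >= 1.
left_closed <- cut(np, breaks = breaks, labels = labs, right = FALSE)

print(data.frame(np, right_closed, left_closed))
```

Whether the real fix amounts to flipping which side of the interval is closed is an assumption; the sketch only shows that the boundary value changes class depending on that choice.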
Regarding your second point, you did a great investigation job! Indeed, the issue you are raising lies within these lines:
I did this intentionally because, IMHO, there is no use in modeling data with 2 or fewer data points. Hence I generate an empty model matrix, hence p = 0, hence the feature is ignored. I'm open to discussion on whether this needs more clever handling.
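The behaviour described above can be sketched as follows. This is a simplified illustration, not the package's code: toy_design() and its arguments are hypothetical, and the model is reduced to an intercept plus one numeric covariate.

```r
# Hypothetical helper: build a design matrix for one feature, returning an
# empty (zero-column) matrix when 2 or fewer values are observed.
toy_design <- function(y, x) {
  observed <- !is.na(y)
  n <- sum(observed)
  if (n <= 2) {
    return(matrix(numeric(0), nrow = n, ncol = 0))
  }
  model.matrix(~ x, data = data.frame(x = x[observed]))
}

x <- c(1, 2, 3, 4)
sparse <- toy_design(c(0.3, NA, 1.2, NA), x)   # 2 observed values
dense  <- toy_design(c(0.3, 0.8, 1.2, NA), x)  # 3 observed values

# n is the number of rows, p the number of columns
print(nrow(sparse) / ncol(sparse))  # 2 / 0
print(nrow(dense) / ncol(dense))    # 3 / 2
```

With an empty model matrix, p is 0 and the n/p ratio becomes infinite, which is consistent with the Inf values reported in the summary of the ratios.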