
Standard error of the overdispersion estimates #50

Open
anglixue opened this issue May 6, 2023 · 4 comments

Comments

@anglixue

anglixue commented May 6, 2023

Dear Constantin,

I was wondering: when estimating the overdispersion with the glmGamPoi::glm_gp() function, can we extract the standard error of the estimates?

Also, when the algorithm shrinks the dispersion, would that impact the standard error?

Thank you!

Cheers,
Angli

@const-ae
Owner

const-ae commented May 8, 2023

> I was wondering when estimating the overdispersion by glmGamPoi::glm_gp() function, can we extract the standard error of the estimates?

I have not yet implemented any way to directly extract the standard error of the overdispersion.

glmGamPoi is inspired by limma, and there was some discussion of how to estimate the standard error of the variance in limma, which might be helpful.

> Also, when the algorithm shrinks the dispersion, would that impact the standard error?

Yes, applying empirical Bayesian shrinkage to the overdispersion would also change its standard error.


I am curious: what is the use case for which you need the standard error of the variance?

@anglixue
Author

anglixue commented May 8, 2023

Thanks for your reply.

I noticed that even the Cox-Reid adjusted MLE (CR-MLE) still gives slightly inflated dispersion estimates. McCarthy et al. (2012) showed that CR-MLE is the least biased method for GLMs, but their simulation settings are specific to their data. I could not find how many replicates they simulated per gene, but when the simulated mean and the number of replicates are small, say dispersion = 0.4, mean = 0.1, and 100 replicates, CR-MLE can return very large dispersion estimates (>100).

I want to find the lower limit at which CR-MLE still provides unbiased estimates, so knowing how to estimate the standard error is essential. My gut feeling is that empirical Bayesian shrinkage won't help in this case, because the gene-wise dispersions will not be consistent in real data.
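
For readers who want to reproduce the scenario described above (dispersion = 0.4, mean = 0.1, 100 replicates), here is a minimal simulation sketch. It is plain Python, not R, and it does not call glmGamPoi or edgeR. It assumes an intercept-only design, for which the fitted mean is the sample mean and the Cox-Reid term reduces to -0.5 * log(n * mu / (1 + theta * mu)). All function names, the grid bounds, and the number of simulated genes are illustrative assumptions, so the numbers it prints need not match what either package returns.

```python
import math
import random
import statistics
from collections import Counter


def sample_nb(mu, theta, rng):
    """Draw one negative binomial count (mean mu, variance mu + theta * mu^2)."""
    lam = rng.gammavariate(1.0 / theta, mu * theta)  # gamma-distributed Poisson rate
    # Knuth's Poisson sampler; adequate for the small rates used here.
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def cr_loglik(log_theta, table, cr_adjust=True):
    """Cox-Reid adjusted profile log-likelihood, intercept-only design (assumption).

    `table` maps each distinct count to its frequency and must contain
    at least one non-zero count.
    """
    theta = math.exp(log_theta)
    r = 1.0 / theta
    n = sum(table.values())
    mu = sum(v * c for v, c in table.items()) / n  # fitted mean = sample mean
    ll = sum(
        c * (math.lgamma(v + r) - math.lgamma(r) - math.lgamma(v + 1.0)
             + r * math.log(r / (r + mu)) + v * math.log(mu / (r + mu)))
        for v, c in table.items()
    )
    if cr_adjust:
        ll -= 0.5 * math.log(n * mu / (1.0 + theta * mu))
    return ll


def estimate_dispersion(y, lo=1e-4, hi=1e4, n_grid=401):
    """Grid search for the maximiser on the log(theta) scale (hypothetical bounds)."""
    table = Counter(y)
    span = math.log(hi) - math.log(lo)
    grid = [math.log(lo) + i * span / (n_grid - 1) for i in range(n_grid)]
    return math.exp(max(grid, key=lambda lt: cr_loglik(lt, table)))


def simulate(n_genes=200, n_rep=100, mu=0.1, theta=0.4, seed=1):
    """Simulate genes and return one dispersion estimate per gene."""
    rng = random.Random(seed)
    estimates = []
    while len(estimates) < n_genes:
        y = [sample_nb(mu, theta, rng) for _ in range(n_rep)]
        if sum(y) == 0:
            continue  # all-zero genes carry no information; skip them
        estimates.append(estimate_dispersion(y))
    return estimates


if __name__ == "__main__":
    est = simulate()
    print("median estimate:", statistics.median(est))
    print("share of estimates > 100:", sum(e > 100 for e in est) / len(est))
    print("share at lower bound:", sum(e < 1.01e-4 for e in est) / len(est))
```

Varying `n_rep` and `mu` in `simulate()` is one way to look for the regime in which the estimates stop being well behaved; estimates that sit on either grid bound indicate that the likelihood had no interior maximum within the chosen range.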

@const-ae
Owner

const-ae commented May 8, 2023

Oh, interesting. It has been 3 years since I last looked deeply into this topic, so I am afraid I cannot give much useful input.

Regarding the question of how to get the variance of the overdispersion estimate, I would use conventional_deriv_score_function_fast to calculate the Hessian of the log(overdispersion) estimate (see here for an example). You just need the mean vector, the counts, and the model matrix.
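
The idea of turning a Hessian into a standard error can be sketched as follows. This is a hedged illustration in plain Python, not a call to glmGamPoi: instead of conventional_deriv_score_function_fast it uses a central finite difference for the second derivative, and it assumes an intercept-only design so that no model matrix algebra is needed. The standard error on the log scale is taken as sqrt(-1 / H), where H is the second derivative of the Cox-Reid adjusted log-likelihood at its maximum, and the delta method converts it to the original scale. All function names and the example counts are hypothetical.

```python
import math
from collections import Counter


def cr_loglik(log_theta, table):
    """Cox-Reid adjusted profile log-likelihood, intercept-only design (assumption)."""
    theta = math.exp(log_theta)
    r = 1.0 / theta
    n = sum(table.values())
    mu = sum(v * c for v, c in table.items()) / n  # fitted mean = sample mean
    ll = sum(
        c * (math.lgamma(v + r) - math.lgamma(r) - math.lgamma(v + 1.0)
             + r * math.log(r / (r + mu)) + v * math.log(mu / (r + mu)))
        for v, c in table.items()
    )
    return ll - 0.5 * math.log(n * mu / (1.0 + theta * mu))


def second_derivative(f, x, h=1e-3):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def maximize(f, lo, hi, n_grid=201, tol=1e-6):
    """Coarse grid search followed by golden-section refinement."""
    step = (hi - lo) / (n_grid - 1)
    best = max((lo + i * step for i in range(n_grid)), key=f)
    a, b = max(lo, best - step), min(hi, best + step)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return 0.5 * (a + b)


def dispersion_with_se(y, lo=1e-4, hi=1e4):
    """Return (theta_hat, SE of log(theta_hat), delta-method SE of theta_hat)."""
    table = Counter(y)

    def f(log_theta):
        return cr_loglik(log_theta, table)

    log_theta_hat = maximize(f, math.log(lo), math.log(hi))
    hess = second_derivative(f, log_theta_hat)
    # Observed information is -hess; a non-negative value means no usable curvature.
    se_log = math.sqrt(-1.0 / hess) if hess < 0 else math.inf
    theta_hat = math.exp(log_theta_hat)
    return theta_hat, se_log, theta_hat * se_log


if __name__ == "__main__":
    # Hypothetical overdispersed counts, for illustration only.
    y = [0] * 60 + [1] * 20 + [2] * 10 + [5] * 6 + [12] * 4
    theta_hat, se_log, se_theta = dispersion_with_se(y)
    print("theta_hat:", theta_hat)
    print("SE(log theta_hat):", se_log)
    print("SE(theta_hat), delta method:", se_theta)
```

This curvature-based standard error is an asymptotic approximation. When the maximum sits on a boundary of the search range, or when the likelihood is nearly flat, as it can be for low means and few replicates, it is unlikely to be reliable, and a simulation-based estimate may be more informative.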

@anglixue
Author

Thanks for your suggestion. I'll try that.
