I'm new to the group so let me know if there is a better place to write this kind of thing...
I am working on assessing whether the gene expression data provides considerably more predictive information than the metadata (samples.tsv). I created a notebook to predict TP53 mutation from the metadata alone and achieved an AUROC of 0.82. This is substantially lower than the AUROC achieved using gene expression (0.92). I have a few other ideas for what to do next, but am interested in any input. The new notebook can be found on my forked repo (4.TCGA-Metadata-MLexample). I have not submitted a pull request yet.
> I'm new to the group so let me know if there is a better place to write this kind of thing...
Nope, Issues are the right place. I'm going to tag a few related issues for convenience: #8, #21, #47.
See this notebook from #47, which looks at performance for several mutations using only the covariates (metadata). So I think the next step, based on what currently exists, will be to find a way to fit two models:
1. using covariates only
2. using covariates and gene expression

Seeing how much better model 2 performs than model 1 will then give us the marginal contribution of gene expression over sample metadata. @joshlevy89, do you want to tackle this analysis? You can make a new directory in explore and open a pull request (even if it's still a work in progress, just put WIP in the pull request title).
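To make the two-model comparison concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration, not the project's actual pipeline: the arrays `covariates` and `expression` stand in for the encoded samples.tsv metadata and the expression matrix, the label is simulated rather than real TP53 status, and a plain scikit-learn logistic regression is used as the classifier. Both models are scored on the same held-out split so that the difference in AUROC can be read as the marginal contribution of expression.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins (assumption): 3 metadata covariates, 20 expression features.
rng = np.random.default_rng(0)
n_samples = 2000
covariates = rng.normal(size=(n_samples, 3))
expression = rng.normal(size=(n_samples, 20))

# Simulated mutation status that depends on one covariate and two genes.
logit = 1.0 * covariates[:, 0] + 1.5 * expression[:, 0] + 1.5 * expression[:, 1]
y = (rng.random(n_samples) < 1.0 / (1.0 + np.exp(-logit))).astype(int)


def heldout_auroc(X, y, seed=0):
    """Fit a scaled logistic regression and return AUROC on a held-out split."""
    # The same seed and stratification labels give the same split for every X,
    # so the two models are compared on identical test samples.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    return roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])


# Model 1: covariates only. Model 2: covariates plus gene expression.
auroc_cov = heldout_auroc(covariates, y)
auroc_full = heldout_auroc(np.hstack([covariates, expression]), y)
marginal_gain = auroc_full - auroc_cov

print(f"covariates only:         AUROC = {auroc_cov:.3f}")
print(f"covariates + expression: AUROC = {auroc_full:.3f}")
print(f"marginal gain:           {marginal_gain:.3f}")
```

On real data, a single split gives a noisy estimate, so repeating this over several seeds or cross-validation folds would give a better sense of whether the gain is stable.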