How to bring in custom annotations (BSgenome, TxDb)? #19
Comments
In theory it should be possible to bring in a custom genome + annotation. However, it requires that an annotation database is available: Ularcirc first searches for annotation database libraries named org.XX.eg.db, where XX is a two-letter organism code, so for humans this is org.Hs.eg.db. The two-letter code is then used to identify matching genome and transcript databases. If an annotation database library exists for your organism, then it sounds like you are very close to having all the required items.
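The naming convention described above can be sketched roughly as follows. This is only an illustration in Python, not Ularcirc's actual lookup code: the function names are hypothetical, and the rule that a BSgenome or TxDb package "matches" when its organism field starts with the two-letter code (e.g. `Hs` and `Hsapiens`) is an assumption based on how Bioconductor packages are usually named.

```python
import re

# Naming convention from the comment above: org.<Xx>.eg.db, e.g. org.Hs.eg.db
ORG_DB_PATTERN = re.compile(r"^org\.([A-Z][a-z])\.eg\.db$")


def organism_code(package_name):
    """Return the two-letter organism code, or None if the name does not match."""
    match = ORG_DB_PATTERN.match(package_name)
    return match.group(1) if match else None


def matching_packages(installed, code):
    """Assumed rule (hypothetical): keep BSgenome/TxDb packages whose
    organism field (second dot-separated part) starts with the code."""
    hits = []
    for name in installed:
        parts = name.split(".")
        if len(parts) > 1 and parts[0] in ("BSgenome", "TxDb") and parts[1].startswith(code):
            hits.append(name)
    return hits


installed = [
    "org.Hs.eg.db",
    "BSgenome.Hsapiens.UCSC.hg38",
    "TxDb.Hsapiens.UCSC.hg38.knownGene",
    "BSgenome.Mmusculus.UCSC.mm10",
]
code = organism_code("org.Hs.eg.db")
print(code, matching_packages(installed, code))
```

Under this assumption, a custom BSgenome and TxDb would also need a matching org.XX.eg.db package before they could be picked up.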
What about the BSgenome and TxDb? They seem to be mandatory as well. Also, where does the annotation database need to be? It's checked somewhere online, right? If a local installation of the database is possible, it would be great to have a wrapper where the user provides the genome FASTA, the genome annotation file (e.g. GTF), and whatever else might be necessary, to bring in custom annotations suitable for Ularcirc.
Agreed, having a wrapper is a good idea, but I am unsure what is involved for some of those files. I have experience making a TxDb from a GTF, but have not generated a genome or annotation database. You mentioned you had generated the genome file; was that easy to do? I suspect the annotation database is the most involved. Perhaps another solution to your problem is to convert your alignment coordinates to UCSC coordinates. I could make a wrapper for that. If you could generate a small test dataset, I could write a simple method to convert it to a format that is compatible with existing databases.
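For illustration, a conversion to UCSC-style names could look something like the sketch below. It assumes the reference uses Ensembl-style sequence names (`1`, `X`, `MT`), which the thread does not state, and that the chromosome name sits in the first tab-separated column; `to_ucsc_name` and `convert_line` are hypothetical helper names, not part of Ularcirc.

```python
def to_ucsc_name(seqname):
    """Map an Ensembl-style sequence name to UCSC style (assumed mapping)."""
    if seqname.startswith("chr"):
        return seqname              # already UCSC style
    if seqname == "MT":
        return "chrM"               # mitochondrial genome is named chrM in UCSC
    if seqname in ("X", "Y") or seqname.isdigit():
        return "chr" + seqname      # autosomes and sex chromosomes
    return seqname                  # scaffolds and unknown names are left unchanged


def convert_line(line):
    """Rename the chromosome in the first tab-separated column of one record."""
    fields = line.rstrip("\n").split("\t")
    fields[0] = to_ucsc_name(fields[0])
    return "\t".join(fields) + "\n"


for record in ["1\t100\t200\n", "MT\t5\t50\n", "KI270728.1\t1\t10\n"]:
    print(convert_line(record), end="")
```

Unplaced scaffolds are deliberately passed through untouched here, since their UCSC names cannot be derived from the Ensembl name alone.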
I thought about converting the alignments, or even remapping, but the downstream effects of the conversion would be too costly for me, as I am using more tools for circRNA prediction and quantification (mostly from the CIRI world). Thank you for the offer, though. Regarding the BSgenome, I think it's not too tricky and believe it can be automated (in a wrapper). The BSgenome package has some documentation on how to forge a new one. In brief, you create sort of a dictionary ( This is what the
Genome FASTA to 2bit conversion
Hi,
thanks for this interesting tool. I am currently trying to get Ularcirc to run with some of my data.
Unfortunately, the reference genome used for my alignments doesn't match the UCSC chromosome naming conventions, so I thought of creating my own BSgenome and TxDb. I have already forged the BSgenome; the TxDb is yet to come.
For now, with the BSgenome loaded into the namespace, I tried to find it in the Shiny app under Setup configuration. My custom BSgenome was not listed. I could imagine that this is due to my missing TxDb (yet to be produced).
My question for you:
Is it already possible to bring in a custom genome + annotation, and if so, how can I achieve that?
best,
-Michael