snp extract "Invalid format" #98
The dbSNP index used by gemBS has its own format, which is quite unlike the
formats from the public databases. You need to download the VCF or BED
files from dbSNP and then use the index subcommand in gemBS to generate the
index.
Simon
I downloaded dbSNP from NCBI and built the index using gemBS index. Can you give me an example dbSNP file or a URL?
I would recommend downloading the VCF format files from dbSNP, for example:
https://ftp.ncbi.nih.gov/snp/latest_release/VCF/GCF_000001405.40.gz
It is simplest to then add the following to the config file (and re-run
gemBS prepare):
dbSNP_files = /path/to/file/GCF_000001405.40.gz
and then run gemBS index.
To get homozygous reference SNPs it is necessary to rerun the calling step
after you have configured gemBS for dbSNP, otherwise only variant SNPs (or
SNPs with C or G alleles) will be included. If you have already generated
the BCF files you should move or remove them so that gemBS will redo the
calling step.
Simon
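Before pointing the config at a downloaded file, it can help to confirm that the file really is a gzip-compressed VCF and not some other index or table. The following is a minimal Python sketch of such a check. It is not part of gemBS, the function name `looks_like_vcf_gz` is hypothetical, and it only inspects the gzip magic bytes and the first header line, so it does not validate the rest of the file.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip (or bgzip) file


def looks_like_vcf_gz(path):
    """Return True if `path` is gzip-compressed and starts with a VCF header line.

    Hypothetical helper for illustration only; it checks just the first line.
    """
    # Cheap check first: is the file gzip-compressed at all?
    with open(path, "rb") as raw:
        if raw.read(2) != GZIP_MAGIC:
            return False
    # Decompress only as much as needed to read the first line.
    try:
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
            first_line = fh.readline()
    except (OSError, EOFError, zlib.error):
        return False
    # VCF files begin with a "##fileformat=VCFv..." meta line.
    return first_line.startswith("##fileformat=VCF")
```

A file that fails this check is unlikely to be a usable VCF input for the indexing step. A file that passes may still need to match the reference contig names, which this sketch does not examine.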
…On Thu, Feb 2, 2023 at 9:13 AM xiaoqiwang19 ***@***.***> wrote:
I downloaded dbSNP from NCBI and built the index using gemBS index. Can
you give me an example dbSNP file or a URL?
By the way, does the SNP result include homozygous reference SNP calls?
Thank you very much.
When I first ran gemBS extract, I did not see any variants in the _snps.txt.gz files. The _snps.txt.gz file is empty.
I then ran the following command to extract SNP information for the WGBS samples:
gemBS extract --snp-db /public/backup/users/wangxq/software/annovar/humandb/dbsnp_138.hg38.vcf.gz.idx -S -c -N -B -t 32
But I get the following error:
"Loading dbSNP header from /public/backup/users/wangxq/software/annovar/humandb/dbsnp_138.hg38.vcf.gz.idx
Invalid format"
This happens whether the dbSNP file was downloaded from NCBI or taken from humandb. In both cases I first built the index with gemBS, and the format error then appeared when running the gemBS extract command. I don't know what format is required or why the error is reported. I need your help, thank you.