Issue with covariate-file #337
What's the header of your pc file?
On Fri, Aug 18, 2023, 2:58 AM LarsOstman ***@***.***> wrote:
Hello,
I am trying to calculate a PRS with PRSice2 for a case-control cohort,
based on summary statistics from a larger GWAS. I have calculated
principal components and want to use the first 6 PCs as covariates in
the analysis. However, when I run the analysis I get the following
error message:
Error: All samples removed due to missingness in covariate
file!
I have made sure there aren't any hidden spaces in the covariates-file, I
have tried to delimit with both tabs and spaces, and I have checked (and
re-checked) that the path and the file-name are correct. However the same
error-message keeps showing up.
Any help would be greatly appreciated, I will paste in the whole process
below.
Thanks for a great product,
Lars
***@***.***:/fenix/users/laros/ALF/Genetics/scripts$ ./ALF_PRS_by_group.sh
PRSice 2.3.5 (2021-09-20)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2023-08-17 13:54:23
/home/laros/PRSice2/PRSice_linux
--a1 A1
--a2 A2
--bar-levels 1e-05,5e-05,0.0001,0.0005,0.001,0.005,0.01,0.05,1
--base
/fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt
--binary-target T
--clump-kb 250kb
--clump-p 1.000000
--clump-r2 0.100000
--cov /fenix/users/laros/ALF/Genetics/data/ALF_gene.PCs
--ignore-fid
--interval 5e-05
--keep-ambig
--ld /fenix/users/laros/Elefanten_gene/LD-data/1kg_phase3.AllChr
--ld-keep
/fenix/users/laros/Elefanten_gene/LD-data/1000genomes/1000Genomes_EURListPhase3.txt
--lower 1e-11
--num-auto 22
--or
--out /fenix/users/laros/Elefanten_gene/results/ALF_gene_by_group
--pheno /fenix/users/laros/ALF/Genetics/data/ALF_gene.pheno
--pheno-col MDD
--pvalue P
--score std
--seed 3270214622
--snp MarkerName
--stat LogOR
--target /fenix/users/laros/ALF/Genetics/data/ALF_gene.QC
--thread 1
--upper 0.05
Warning: By selecting --keep-ambig, PRSice assume the base
and target are reporting alleles on the same
strand and will therefore only perform dosage flip
for the ambiguous SNPs. If you are unsure of what
the strand is, then you should not select the
--keep-ambig option
Initializing Genotype file:
/fenix/users/laros/ALF/Genetics/data/ALF_gene.QC (bed)
Start processing PGC_UKB_depression_genome-wide
Base file:
/fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt
Header of file is:
MarkerName A1 A2 Freq LogOR StdErrLogOR P
Reading 100.00%
8483301 variant(s) observed in base file, with:
39487 NA stat/p-value observed
4210543 negative statistic observed. Maybe you have
forgotten the --beta flag?
646120 ambiguous variant(s)
4233271 total variant(s) included from base file
Loading Genotype info from target
92 people (0 male(s), 0 female(s)) observed
92 founder(s) included
4112097 variant(s) not found in previous data
43 variant(s) with mismatch information
522636 ambiguous variant(s) kept
3460831 variant(s) included
Initializing Genotype file:
/fenix/users/laros/Elefanten_gene/LD-data/1kg_phase3.AllChr
(bed)
Loading Genotype info from reference
2504 people (0 male(s), 0 female(s)) observed
503 founder(s) included
10540328 variant(s) not found in previous data
149 variant(s) with mismatch information
469778 ambiguous variant(s) kept
3104546 variant(s) included
Phenotype file:
/fenix/users/laros/ALF/Genetics/data/ALF_gene.pheno
Column Name of Sample ID: FID
Note: If the phenotype file does not contain a header, the
column name will be displayed as the Sample ID which is
expected.
There are a total of 1 phenotype to process
Start performing clumping
Clumping Progress: 100.00%
Number of variant(s) after clumping : 188356
Processing the 1 th phenotype
MDD is a binary phenotype
35 control(s)
57 case(s)
Processing the covariate file:
/fenix/users/laros/ALF/Genetics/data/ALF_gene.PCs
Error: All samples removed due to missingness in covariate
file!
|
Hi,
Thank you for getting back to me!
The headers (and format) are as follows:
FID IID PC1 PC2 PC3 PC4 PC5 PC6
F1 F1 -0.0488942 -0.00648387 0.0119713 0.0394345 -0.0165522 0.0617235
F2 F2 -0.0499371 0.0127898 0.0426918 0.0412524 -0.0538963 0.0342523
F4 F4 0.0154813 0.0156588 0.0044783 -0.00596863 -0.023635 0.00985086
F5 F5 -0.0147007 0.00670695 0.0355421 0.00302993 -0.0671668 -0.00930397
F6 F6 -0.0259049 -0.0069673 -0.0347271 -0.0398622 0.015978 0.0781486
F8 F8 -0.0345881 0.0205085 -0.0136661 0.0191272 -0.0209368 0.0631035
F9 F9 -0.0259158 0.0119127 0.0224861 0.0451637 -0.0516346 0.0112552
The columns are tab-delimited in the file, but I’ve tried space-delimited as well and get the same error message.
Thanks again,
Lars
|
I should add that the covariate file is just the .eigenvec output file from the PC analysis, which I haven't modified.
Lars
|
You used --ignore-fid, but your covariate file still has the FID column.
With --ignore-fid, the default is that the first column is the sample ID.
In addition, because you did not specify which covariates to use, PRSice
uses all non-ID fields, which in this case includes the IID column. An
easy fix is --cov-col @PC[1-6]
Sam
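For anyone hitting the same error, here is a rough Python sketch of the column logic described in the reply above. It is only an illustration, not PRSice's actual code: the function name `non_numeric_covariates` is made up, and it assumes that with --ignore-fid the first column is the sample ID, that without it the first two columns are FID and IID, and that every remaining column is treated as a covariate when --cov-col is not given. Under those assumptions, the IID column (values like F1) ends up among the covariates and cannot be read as a number for any sample.

```python
def non_numeric_covariates(text, ignore_fid=True):
    """Return names of would-be covariate columns holding non-numeric values.

    Assumption taken from the thread, not from PRSice's source: ID columns
    come first (one with --ignore-fid, two otherwise) and, without
    --cov-col, all remaining columns are used as covariates.
    """
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    n_id = 1 if ignore_fid else 2
    bad = []
    for j in range(n_id, len(header)):
        for row in rows:
            try:
                float(row[j])
            except ValueError:
                # First unparseable value is enough to flag the column.
                bad.append(header[j])
                break
    return bad


# Two rows in the same layout as the .eigenvec file posted above.
sample = """FID IID PC1 PC2
F1 F1 -0.0488942 -0.00648387
F2 F2 -0.0499371 0.0127898
"""

print(non_numeric_covariates(sample))                    # ['IID']
print(non_numeric_covariates(sample, ignore_fid=False))  # []
```

Running a check like this on the covariate file before calling PRSice shows quickly whether an ID column would be picked up as a covariate.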
|
Thank you so much, and I apologize for taking up your time with such a simple question. I'll fix it straight away. Just to check that I understand: would another solution be to remove the FID column from the covariate file, since that would make IID the first column and thus the default ID column?
Thank you once again!
Lars
|
Yes
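A minimal sketch of that alternative, assuming the covariate file is whitespace-delimited with FID as its first column, as in the header posted earlier. The helper name `drop_first_column` is hypothetical, and the snippet works on an in-memory string rather than the real file:

```python
def drop_first_column(text):
    """Remove the first whitespace-delimited column (FID here) from each line."""
    out = []
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            out.append("\t".join(fields[1:]))
    return "\n".join(out) + "\n"


# One header line and one sample line in the posted .eigenvec layout.
eigenvec = "FID IID PC1 PC2\nF1 F1 -0.0488942 -0.00648387\n"
print(drop_first_column(eigenvec), end="")
```

The result is tab-delimited with IID as the first column, so IID becomes the default ID column when --ignore-fid is used.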
…On Fri, Aug 18, 2023, 10:06 AM LarsOstman ***@***.***> wrote:
Thank you so much, and I apologize for taking your time with such a simple
answer. I'll fix it straight away. And just to see if I understand, would
another solution be to remove the FID-column from the covariates-file?
Since they would make IID the first column, and thus the default one?
Thank you once again!
Lars
Den 18 aug. 2023 15:46 skrev Shing Wan Choi ***@***.***>:
You used ignore fid, and you have the fid column in your covariate file.
In
addition, as you did not specify the covariates, PRSice will use all
non-ID
fields, in this case the IID (default is the first column is id). Easy fix
will be --cov-col @pc[1-6]
Sam
On Fri, Aug 18, 2023, 9:32 AM LarsOstman ***@***.***> wrote:
> Thought I'd add that it is just the .eigenvec output-file from the
> PC-analysis, which I haven't done any changes to.
>
> Lars
>
> Den 18 aug. 2023 14:04 skrev Shing Wan Choi ***@***.***>:
>
> What's the header of your pc file?
>
> On Fri, Aug 18, 2023, 2:58 AM LarsOstman ***@***.***> wrote:
>
> > Hello,
> > I am trying to calculate a PRS-score, with PRSice2, on a
> > case-control-cohort based on summary statistics from a larger
> GWAS-study. I
> > have calculated principal components and want to use the first 6 PCs
as
> > covariates for the analysis. However, when I run the analysis I get
the
> > following error message:
> >
> > Error: All samples removed due to missingness in covariate
> > file!
> >
> > I have made sure there aren't any hidden spaces in the
covariates-file,
> I
> > have tried to delimit with both tabs and spaces, and I have checked
(and
> > re-checked) that the path and the file-name are correct. However the
> same
> > error-message keeps showing up.
> >
> > Any help would be greatly appreciated, I will paste in the whole
process
> > below.
> >
> > Thanks for a great product,
> > Lars
> >
> > ***@***.***:/fenix/users/laros/ALF/Genetics/scripts$
> ./ALF_PRS_by_group.sh
> > PRSice 2.3.5 (2021-09-20)
> > https://github.com/choishingwan/PRSice
> > (C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
> > GNU General Public License v3
> > If you use PRSice in any published work, please cite:
> > Choi SW, O'Reilly PF.
> > PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
> > GigaScience 8, no. 7 (July 1, 2019)
> > 2023-08-17 13:54:23
> > /home/laros/PRSice2/PRSice_linux
> > --a1 A1
> > --a2 A2
> > --bar-levels 1e-05,5e-05,0.0001,0.0005,0.001,0.005,0.01,0.05,1
> > --base
> >
>
/fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt
>
Hello,
I am trying to calculate a polygenic risk score (PRS) with PRSice-2 on a case-control cohort, based on summary statistics from a larger GWAS. I have calculated principal components and want to use the first 6 PCs as covariates in the analysis. However, when I run the analysis I get the following error message:
Error: All samples removed due to missingness in covariate file!
I have made sure there aren't any hidden spaces in the covariate file, I have tried delimiting with both tabs and spaces, and I have checked (and re-checked) that the path and the file name are correct. However, the same error message keeps showing up.
Any help would be greatly appreciated. I will paste the full output below.
Thanks for a great product,
Lars
laros@maul:/fenix/users/laros/ALF/Genetics/scripts$ ./ALF_PRS_by_group.sh
PRSice 2.3.5 (2021-09-20)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2023-08-17 13:54:23
/home/laros/PRSice2/PRSice_linux
--a1 A1
--a2 A2
--bar-levels 1e-05,5e-05,0.0001,0.0005,0.001,0.005,0.01,0.05,1
--base /fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt
--binary-target T
--clump-kb 250kb
--clump-p 1.000000
--clump-r2 0.100000
--cov /fenix/users/laros/ALF/Genetics/data/ALF_gene.PCs
--ignore-fid
--interval 5e-05
--keep-ambig
--ld /fenix/users/laros/Elefanten_gene/LD-data/1kg_phase3.AllChr
--ld-keep /fenix/users/laros/Elefanten_gene/LD-data/1000genomes/1000Genomes_EURListPhase3.txt
--lower 1e-11
--num-auto 22
--or
--out /fenix/users/laros/Elefanten_gene/results/ALF_gene_by_group
--pheno /fenix/users/laros/ALF/Genetics/data/ALF_gene.pheno
--pheno-col MDD
--pvalue P
--score std
--seed 3270214622
--snp MarkerName
--stat LogOR
--target /fenix/users/laros/ALF/Genetics/data/ALF_gene.QC
--thread 1
--upper 0.05
Warning: By selecting --keep-ambig, PRSice assume the base
and target are reporting alleles on the same
strand and will therefore only perform dosage flip
for the ambiguous SNPs. If you are unsure of what
the strand is, then you should not select the
--keep-ambig option
Initializing Genotype file:
/fenix/users/laros/ALF/Genetics/data/ALF_gene.QC (bed)
Start processing PGC_UKB_depression_genome-wide
Base file:
/fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt
Header of file is:
MarkerName A1 A2 Freq LogOR StdErrLogOR P
Reading 100.00%
8483301 variant(s) observed in base file, with:
39487 NA stat/p-value observed
4210543 negative statistic observed. Maybe you have
forgotten the --beta flag?
646120 ambiguous variant(s)
4233271 total variant(s) included from base file
Loading Genotype info from target
92 people (0 male(s), 0 female(s)) observed
92 founder(s) included
4112097 variant(s) not found in previous data
43 variant(s) with mismatch information
522636 ambiguous variant(s) kept
3460831 variant(s) included
Initializing Genotype file:
/fenix/users/laros/Elefanten_gene/LD-data/1kg_phase3.AllChr
(bed)
Loading Genotype info from reference
2504 people (0 male(s), 0 female(s)) observed
503 founder(s) included
10540328 variant(s) not found in previous data
149 variant(s) with mismatch information
469778 ambiguous variant(s) kept
3104546 variant(s) included
Phenotype file:
/fenix/users/laros/ALF/Genetics/data/ALF_gene.pheno
Column Name of Sample ID: FID
Note: If the phenotype file does not contain a header, the
column name will be displayed as the Sample ID which is
expected.
There are a total of 1 phenotype to process
Start performing clumping
Clumping Progress: 100.00%
Number of variant(s) after clumping : 188356
Processing the 1 th phenotype
MDD is a binary phenotype
35 control(s)
57 case(s)
Processing the covariate file:
/fenix/users/laros/ALF/Genetics/data/ALF_gene.PCs
Error: All samples removed due to missingness in covariate
file!
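One way to narrow down an error like this is to check, outside PRSice, how many covariate rows can actually be matched to the target samples and how many of those have fully numeric covariates. The Python sketch below is only a diagnostic aid and is not part of PRSice; the function names (`is_number`, `diagnose_covariates`) and the assumed layout `FID IID PC1 PC2 ...` are hypothetical. The log above reports `Column Name of Sample ID: FID` while `--ignore-fid` is set, so, as far as I understand the PRSice documentation, the first column of each file is being treated as the sample ID. If the covariate file starts with an FID column that differs from the IID, that could plausibly leave no matching samples, which is what `id_col=0` versus `id_col=1` is meant to test.

```python
import math


def is_number(token):
    """Return True if token parses as a finite float (so 'NA', 'nan', '' fail)."""
    try:
        return math.isfinite(float(token))
    except ValueError:
        return False


def diagnose_covariates(cov_lines, target_ids, id_col=0, cov_start=2):
    """Summarise how many covariate rows can be matched and used.

    cov_lines  : iterable of text lines; the first line is assumed to be a header
    target_ids : set of sample IDs taken from the target (.fam) or phenotype file
    id_col     : column compared against target_ids (0 = first column)
    cov_start  : index of the first covariate column (assumes FID IID PC1 ...)
    """
    rows = [line.split() for line in cov_lines if line.strip()]
    if not rows:
        return {"header": [], "rows": 0, "ragged": 0, "matched": 0, "usable": 0}
    header, body = rows[0], rows[1:]
    report = {"header": header, "rows": len(body),
              "ragged": 0, "matched": 0, "usable": 0}
    for row in body:
        # Rows with a different number of fields than the header are suspicious.
        if len(row) != len(header):
            report["ragged"] += 1
            continue
        if row[id_col] not in target_ids:
            continue
        report["matched"] += 1
        # A row is usable only if every covariate value is a finite number.
        if all(is_number(tok) for tok in row[cov_start:]):
            report["usable"] += 1
    return report


if __name__ == "__main__":
    # Made-up demo data, not taken from the files in this issue.
    demo_cov = [
        "FID IID PC1 PC2",
        "F1 S1 0.01 -0.02",
        "F2 S2 NA 0.03",
        "F3 S3 0.05 0.04",
    ]
    demo_ids = {"S1", "S2", "S3"}
    print(diagnose_covariates(demo_cov, demo_ids, id_col=0))
    print(diagnose_covariates(demo_cov, demo_ids, id_col=1))
```

To use it on real data, pass the lines of the covariate file (for example `open(path).read().splitlines()`) and a set of sample IDs read from the target `.fam` or phenotype file. If `matched` is 0 for the ID column PRSice is using, the problem is more likely an ID mismatch than missing covariate values.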