Issue with covariate-file #337

LarsOstman · 2023-08-18T06:58:37Z

Hello,
I am trying to calculate a PRS with PRSice-2 on a case-control cohort, based on summary statistics from a larger GWAS. I have calculated principal components and want to use the first 6 PCs as covariates in the analysis. However, when I run the analysis I get the following error message:

Error: All samples removed due to missingness in covariate
file!

I have made sure there aren't any hidden spaces in the covariate file, I have tried delimiting it with both tabs and spaces, and I have checked (and re-checked) that the path and the file name are correct. However, the same error message keeps showing up.

Any help would be greatly appreciated. I have pasted the full output below.

Thanks for a great product,
Lars

laros@maul:/fenix/users/laros/ALF/Genetics/scripts$ ./ALF_PRS_by_group.sh
PRSice 2.3.5 (2021-09-20)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2023-08-17 13:54:23
/home/laros/PRSice2/PRSice_linux
--a1 A1
--a2 A2
--bar-levels 1e-05,5e-05,0.0001,0.0005,0.001,0.005,0.01,0.05,1
--base /fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt
--binary-target T
--clump-kb 250kb
--clump-p 1.000000
--clump-r2 0.100000
--cov /fenix/users/laros/ALF/Genetics/data/ALF_gene.PCs
--ignore-fid
--interval 5e-05
--keep-ambig
--ld /fenix/users/laros/Elefanten_gene/LD-data/1kg_phase3.AllChr
--ld-keep /fenix/users/laros/Elefanten_gene/LD-data/1000genomes/1000Genomes_EURListPhase3.txt
--lower 1e-11
--num-auto 22
--or
--out /fenix/users/laros/Elefanten_gene/results/ALF_gene_by_group
--pheno /fenix/users/laros/ALF/Genetics/data/ALF_gene.pheno
--pheno-col MDD
--pvalue P
--score std
--seed 3270214622
--snp MarkerName
--stat LogOR
--target /fenix/users/laros/ALF/Genetics/data/ALF_gene.QC
--thread 1
--upper 0.05

Warning: By selecting --keep-ambig, PRSice assume the base
and target are reporting alleles on the same
strand and will therefore only perform dosage flip
for the ambiguous SNPs. If you are unsure of what
the strand is, then you should not select the
--keep-ambig option

Initializing Genotype file:
/fenix/users/laros/ALF/Genetics/data/ALF_gene.QC (bed)

Start processing PGC_UKB_depression_genome-wide

Base file:
/fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt
Header of file is:
MarkerName A1 A2 Freq LogOR StdErrLogOR P

Reading 100.00%
8483301 variant(s) observed in base file, with:
39487 NA stat/p-value observed
4210543 negative statistic observed. Maybe you have
forgotten the --beta flag?
646120 ambiguous variant(s)
4233271 total variant(s) included from base file

Loading Genotype info from target

92 people (0 male(s), 0 female(s)) observed
92 founder(s) included

4112097 variant(s) not found in previous data
43 variant(s) with mismatch information
522636 ambiguous variant(s) kept
3460831 variant(s) included

Initializing Genotype file:
/fenix/users/laros/Elefanten_gene/LD-data/1kg_phase3.AllChr
(bed)

Loading Genotype info from reference

2504 people (0 male(s), 0 female(s)) observed
503 founder(s) included

10540328 variant(s) not found in previous data
149 variant(s) with mismatch information
469778 ambiguous variant(s) kept
3104546 variant(s) included

Phenotype file:
/fenix/users/laros/ALF/Genetics/data/ALF_gene.pheno
Column Name of Sample ID: FID
Note: If the phenotype file does not contain a header, the
column name will be displayed as the Sample ID which is
expected.

There are a total of 1 phenotype to process

Start performing clumping

Clumping Progress: 100.00%
Number of variant(s) after clumping : 188356

Processing the 1 th phenotype

MDD is a binary phenotype
35 control(s)
57 case(s)

Processing the covariate file:
/fenix/users/laros/ALF/Genetics/data/ALF_gene.PCs

Error: All samples removed due to missingness in covariate
file!

choishingwan · 2023-08-18T12:03:53Z

What's the header of your pc file?

LarsOstman · 2023-08-18T13:22:30Z

Hi,

Thank you for getting back to me! The header (and format) is as follows:

FID IID PC1 PC2 PC3 PC4 PC5 PC6
F1 F1 -0.0488942 -0.00648387 0.0119713 0.0394345 -0.0165522 0.0617235
F2 F2 -0.0499371 0.0127898 0.0426918 0.0412524 -0.0538963 0.0342523
F4 F4 0.0154813 0.0156588 0.0044783 -0.00596863 -0.023635 0.00985086
F5 F5 -0.0147007 0.00670695 0.0355421 0.00302993 -0.0671668 -0.00930397
F6 F6 -0.0259049 -0.0069673 -0.0347271 -0.0398622 0.015978 0.0781486
F8 F8 -0.0345881 0.0205085 -0.0136661 0.0191272 -0.0209368 0.0631035
F9 F9 -0.0259158 0.0119127 0.0224861 0.0451637 -0.0516346 0.0112552

The columns are tab-delimited in the file, but I've tried with spaces as well and get the same error message.

Thanks again,
Lars
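One way to sanity-check a covariate file with this layout is to see which columns parse as numbers in every row, since a column that does not would presumably count as missing if it were ever treated as a covariate. The following is only a minimal Python sketch with two of the rows above inlined; the function name `non_numeric_columns` is made up for illustration, and this is not how PRSice itself parses the file.

```python
import io


def non_numeric_columns(handle):
    """Return the names of columns holding at least one value that is not a float."""
    header = handle.readline().split()
    bad = set()
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for name, value in zip(header, fields):
            try:
                float(value)
            except ValueError:
                bad.add(name)
    # Report the offending columns in header order.
    return [name for name in header if name in bad]


# Two rows copied from the covariate file shown above.
sample = io.StringIO(
    "FID\tIID\tPC1\tPC2\tPC3\tPC4\tPC5\tPC6\n"
    "F1\tF1\t-0.0488942\t-0.00648387\t0.0119713\t0.0394345\t-0.0165522\t0.0617235\n"
    "F2\tF2\t-0.0499371\t0.0127898\t0.0426918\t0.0412524\t-0.0538963\t0.0342523\n"
)
flagged = non_numeric_columns(sample)
print(flagged)
```

For a real file, the same function can be called on `open(path)` instead of the inlined sample.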

LarsOstman · 2023-08-18T13:32:11Z

Thought I'd add that it is just the .eigenvec output file from the PC analysis, which I haven't made any changes to.

Lars

choishingwan · 2023-08-18T13:46:22Z

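You used --ignore-fid, but your covariate file still has an FID column, so PRSice takes the first column (FID) as the sample ID. In addition, as you did not specify the covariates, PRSice uses every non-ID field as a covariate, which here includes the IID column. Its values (F1, F2, ...) are not numeric, so they are presumably read as missing for every sample. An easy fix is --cov-col @PC[1-6], which selects only PC1 to PC6.

Sam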
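The suggested fix amounts to adding one option to the original command. As a rough sketch, the Python snippet below assembles the relevant part of that command as an argument list and prints it in shell-quoted form, without running anything. The paths and flags are copied from the log earlier in the thread; only `--cov-col` is new. It is written as `@PC[1-6]` to match the capitalisation of the header, since this thread does not settle whether PRSice matches column names case-insensitively. The remaining options from the original script (clumping, LD reference, statistic columns and so on) are left out here and would be appended unchanged.

```python
import shlex

# Covariate-related part of the original call, plus the suggested --cov-col.
cmd = [
    "/home/laros/PRSice2/PRSice_linux",
    "--base", "/fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt",
    "--target", "/fenix/users/laros/ALF/Genetics/data/ALF_gene.QC",
    "--pheno", "/fenix/users/laros/ALF/Genetics/data/ALF_gene.pheno",
    "--pheno-col", "MDD",
    "--binary-target", "T",
    "--ignore-fid",
    "--cov", "/fenix/users/laros/ALF/Genetics/data/ALF_gene.PCs",
    "--cov-col", "@PC[1-6]",  # the fix: use only PC1..PC6 as covariates
    "--out", "/fenix/users/laros/Elefanten_gene/results/ALF_gene_by_group",
]

# shlex.join quotes arguments that contain shell-special characters such as
# the square brackets, so the printed line can be pasted into a shell script.
command_line = shlex.join(cmd)
print(command_line)
```

Quoting the `@PC[1-6]` argument matters when the command is typed by hand, because an unquoted bracket expression can be interpreted by the shell as a filename pattern.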

LarsOstman · 2023-08-18T14:06:14Z

Thank you so much, and I apologize for taking your time with a question that has such a simple answer. I'll fix it straight away.

And just to see if I understand: would another solution be to remove the FID column from the covariate file? That would make IID the first column, and thus the default ID column.

Thank you once again!
Lars

choishingwan · 2023-08-18T15:24:54Z

Yes
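For reference, the alternative confirmed here (dropping the FID column so that IID becomes the first column) could be sketched in Python as follows. This is an illustration only: the helper name `drop_first_column` and the file names are hypothetical, and the demo writes a two-line example file to a temporary directory rather than touching any real data.

```python
import tempfile
from pathlib import Path


def drop_first_column(src, dst):
    """Copy a whitespace-delimited table from src to dst without its first column."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fields = line.split()
            if fields:  # skip blank lines
                fout.write("\t".join(fields[1:]) + "\n")


# Demo on a tiny stand-in for the .eigenvec-style covariate file.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example.PCs"        # hypothetical input name
    dst = Path(tmp) / "example.noFID.PCs"  # hypothetical output name
    src.write_text("FID\tIID\tPC1\tPC2\nF1\tF1\t-0.0488942\t-0.00648387\n")
    drop_first_column(src, dst)
    result = dst.read_text()

print(result)
```

The output file is tab-delimited with IID first, followed by the PC columns.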

