pyEffGenomeSize throws error - single positional indexer is out-of-bounds #18

Open
jpcartailler opened this issue Aug 3, 2023 · 0 comments
Labels
bug Something isn't working

Greetings,

Command used:

pyEffGenomeSize.py \
  --gtf gencode.v20.annotation.sorted.gtf \
  --filterNonCoding \
  --bed Twist_ILMN_Exome_2.0_Plus_Panel.hg38.sorted.bed

Input files (prior to sorting)

Both the original GENCODE GTF and the BED file were sorted with bedtools (as required by pyEffGenomeSize.py):

bedtools sort -i gencode.v20.annotation.gtf > gencode.v20.annotation.sorted.gtf
bedtools sort -i Twist_ILMN_Exome_2.0_Plus_Panel.hg38.bed > Twist_ILMN_Exome_2.0_Plus_Panel.hg38.sorted.bed
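To rule out a sorting problem before looking further, the sorted files can be checked independently. The following is a minimal sketch, assuming the default `bedtools sort` order (chromosome name compared as a string, then start position ascending); `sort_keys` and `is_sorted` are hypothetical helper names and are not part of pyEffGenomeSize.py.

```python
from typing import Iterable, Iterator, Optional, Tuple


def sort_keys(lines: Iterable[str], start_col: int) -> Iterator[Tuple[str, int]]:
    """Yield (chromosome, start) for every data line of a tab-separated file."""
    for line in lines:
        # Skip blank lines and common header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[0], int(fields[start_col])


def is_sorted(lines: Iterable[str], start_col: int) -> bool:
    """Return True if the lines are ordered by chromosome, then by start."""
    previous: Optional[Tuple[str, int]] = None
    for key in sort_keys(lines, start_col):
        if previous is not None and key < previous:
            return False
        previous = key
    return True
```

The start coordinate is the second column in BED (`start_col=1`) and the fourth column in GTF (`start_col=3`), so a check would look like `is_sorted(open("Twist_ILMN_Exome_2.0_Plus_Panel.hg38.sorted.bed"), 1)`.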

The script runs for about a minute, prints the output below, and then fails with an IndexError:

chr1    HAVANA  exon    69091   70008   .       +       .       gene_id "ENSG00000186092.4";transcript_id "ENST00000335137.3";gene_type "protein_coding";gene_status "KNOWN";gene_name "OR4F5";transcript_type "protein_coding";transcript_status "KNOWN";transcript_name "OR4F5-001";exon_number 1;exon_id "ENSE00002319515.1";level 2;protein_id "ENSP00000334393.3";tag "CCDS";ccdsid "CCDS30547.1";havana_gene "OTTHUMG00000001094.1";havana_transcript "OTTHUMT00000003223.1";
 chr1   HAVANA  exon    139790  139847  .       -       .       gene_id "ENSG00000239906.1";transcript_id "ENST00000493797.1";gene_type "antisense";gene_status "NOVEL";gene_name "RP11-34P13.14";transcript_type "antisense";transcript_status "KNOWN";transcript_name "RP11-34P13.14-001";exon_number 2;exon_id "ENSE00001922992.1";level 2;tag "basic";havana_gene "OTTHUMG00000002481.1";havana_transcript "OTTHUMT00000007038.1";
 chr1   HAVANA  exon    140075  140339  .       -       .       gene_id "ENSG00000239906.1";transcript_id "ENST00000493797.1";gene_type "antisense";gene_status "NOVEL";gene_name "RP11-34P13.14";transcript_type "antisense";transcript_status "KNOWN";transcript_name "RP11-34P13.14-001";exon_number 1;exon_id "ENSE00001913281.1";level 2;tag "basic";havana_gene "OTTHUMG00000002481.1";havana_transcript "OTTHUMT00000007038.1";
 chr1   HAVANA  exon    141474  143011  .       -       .       gene_id "ENSG00000241860.3";transcript_id "ENST00000484859.1";gene_type "processed_transcript";gene_status "NOVEL";gene_name "RP11-34P13.13";transcript_type "antisense";transcript_status "KNOWN";transcript_name "RP11-34P13.13-004";exon_number 2;exon_id "ENSE00001911218.1";level 2;tag "basic";havana_gene "OTTHUMG00000002480.3";havana_transcript "OTTHUMT00000007035.1";
 chr1   HAVANA  exon    142808  143011  .       -       .       gene_id "ENSG00000241860.3";transcript_id "ENST00000490997.2";gene_type "processed_transcript";gene_status "NOVEL";gene_name "RP11-34P13.13";transcript_type "antisense";transcript_status "KNOWN";transcript_name "RP11-34P13.13-003";exon_number 3;exon_id "ENSE00001838397.1";level 2;tag "basic";havana_gene "OTTHUMG00000002480.3";havana_transcript "OTTHUMT00000007036.1";
 chr1   HAVANA  exon    146386  149707  .       -       .       gene_id "ENSG00000241860.3";transcript_id "ENST00000484859.1";gene_type "processed_transcript";gene_status "NOVEL";gene_name "RP11-34P13.13";transcript_type "antisense";transcript_status "KNOWN";transcript_name "RP11-34P13.13-004";exon_number 1;exon_id "ENSE00001860404.1";level 2;tag "basic";havana_gene "OTTHUMG00000002480.3";havana_transcript "OTTHUMT00000007035.1";
 chr1   HAVANA  exon    146386  146509  .       -       .       gene_id "ENSG00000241860.3";transcript_id "ENST00000490997.2";gene_type "processed_transcript";gene_status "NOVEL";gene_name "RP11-34P13.13";transcript_type "antisense";transcript_status "KNOWN";transcript_name "RP11-34P13.13-003";exon_number 2;exon_id "ENSE00001853409.1";level 2;tag "basic";havana_gene "OTTHUMG00000002480.3";havana_transcript "OTTHUMT00000007036.1";
 chr1   HAVANA  exon    146642  146831  .       -       .       gene_id "ENSG00000241860.3";transcript_id "ENST00000490997.2";gene_type "processed_transcript";gene_status "NOVEL";gene_name "RP11-34P13.13";transcript_type "antisense";transcript_status "KNOWN";transcript_name "RP11-34P13.13-003";exon_number 1;exon_id "ENSE00001868647.1";level 2;tag "basic";havana_gene "OTTHUMG00000002480.3";havana_transcript "OTTHUMT00000007036.1";
 chr1   HAVANA  exon    450740  451678  .       -       .       gene_id "ENSG00000278566.1";transcript_id "ENST00000426406.2";gene_type "protein_coding";gene_status "KNOWN";gene_name "OR4F29";transcript_type "protein_coding";transcript_status "KNOWN";transcript_name "OR4F29-001";exon_number 1;exon_id "ENSE00002316283.2";level 2;protein_id "ENSP00000409316.1";tag "CCDS";ccdsid "CCDS41220.1";havana_gene "OTTHUMG00000002860.1";havana_transcript "OTTHUMT00000007999.1";
 chr1   HAVANA  exon    685716  686654  .       -       .       gene_id "ENSG00000273547.1";transcript_id "ENST00000332831.3";gene_type "protein_coding";gene_status "KNOWN";gene_name "OR4F16";transcript_type "protein_coding";transcript_status "KNOWN";transcript_name "OR4F16-001";exon_number 1;exon_id "ENSE00002324228.2";level 2;protein_id "ENSP00000329982.2";tag "CCDS";ccdsid "CCDS41221.1";havana_gene "OTTHUMG00000002581.1";havana_transcript "OTTHUMT00000007334.1";
 None
chrom   start   end     num     list    /data/cds_group/reference/Twist_ILMN_Exome_2.0_Plus_Panel.hg38.sorted.bed       filtered_gtf.gtf
 chr1   69090   70008   1       2       0       1
 chr1   139789  139847  1       2       0       1
 chr1   140074  140339  1       2       0       1
 chr1   141473  143011  1       2       0       1
 chr1   146385  149707  1       2       0       1
 chr1   450739  451678  1       2       0       1
 chr1   685715  686654  1       2       0       1
 chr1   760910  761154  1       2       0       1
 chr1   761777  761989  1       2       0       1
 None
Traceback (most recent call last):
  File "/data/p_magnuson_lab/conda/envs/pytmb/bin/pyEffGenomeSize.py", line 182, in <module>
    df.columns = df.iloc[0]
  File "/data/p_magnuson_lab/conda/envs/pytmb/lib/python3.10/site-packages/pandas/core/indexing.py", line 1073, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/data/p_magnuson_lab/conda/envs/pytmb/lib/python3.10/site-packages/pandas/core/indexing.py", line 1625, in _getitem_axis
    self._validate_integer(key, axis)
  File "/data/p_magnuson_lab/conda/envs/pytmb/lib/python3.10/site-packages/pandas/core/indexing.py", line 1557, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
(pytmb) [cartaij@horus src]$ echo "${CMD}"
pyEffGenomeSize.py     --gtf /data/cds_group/Processed-Data/2023-275-Jan_T-WGS-Cutaneous-Squamous-Cell-Carcinoma/data/gencode.v20.annotation.sorted.gtf     --filterNonCoding      --bed /data/cds_group/reference/Twist_ILMN_Exome_2.0_Plus_Panel.hg38.sorted.bed

Suggestions?
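One reading of the traceback: pandas raises this IndexError whenever `.iloc[0]` is applied to a DataFrame with zero rows, so `df` may be empty by the time line 182 runs. Why it would be empty is not visible from the output above. The sketch below reproduces the error and shows a guard that would fail with a clearer message. It is an illustration under that assumption, not the script's actual code, and `promote_first_row_to_header` is a hypothetical helper name.

```python
import pandas as pd


def promote_first_row_to_header(df: pd.DataFrame) -> pd.DataFrame:
    """Use the first row as column names, with an explicit check for an empty frame."""
    if df.empty:
        # Without this check, df.iloc[0] raises
        # "IndexError: single positional indexer is out-of-bounds".
        raise ValueError("cannot promote header: the table has no rows")
    out = df.iloc[1:].reset_index(drop=True)
    out.columns = df.iloc[0].tolist()
    return out


# Reproduce the reported error on an empty DataFrame.
try:
    pd.DataFrame().iloc[0]
except IndexError as exc:
    reproduced_message = str(exc)
```

If the guard fires on the real data, the next thing to inspect would be whichever intermediate table the script reads into `df`, for example by printing its shape just before line 182.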

@nservant added the bug label on Aug 23, 2023