STAR, CIRIquant, DCC errors during pipeline run #89
Comments
STAR
I encountered this problem before, but had no time to fix it yet. The problem is that the iGenomes STAR index is not compatible with the STAR version used in the pipeline. A workaround is manually setting
CIRIquant
This is most probably due to missing escape characters in the Nextflow scripts and will be fixed via #83.
DCC
Could potentially also be fixed via #83.
What can you do now?
I expect #83 to be merged into
Hi, just letting you know I ran with the merged bug fix #83, and also got a similar DCC error.
details:
Hey @dmgie, could you maybe check whether your errors still occur with the latest version of the pipeline? Some PRs have been merged in the meantime and I am not sure whether they also addressed your problems. The DCC issue looks like there is an additional
Hi @nictru, sorry for the late reply! I would have been happy to test it, but sadly I no longer have access to the data I initially ran the pipeline with (which had led to the errors mentioned in the first post). I could possibly test it with some other data, but I'm currently not working on circRNA-related analysis projects anymore, so I can't promise to be able to test it anytime soon. If I do, I'll try and report back here in case there are any changes. Would you want me to close the issue in the meantime, or should it be left open?
Hey, no problem - I will close the issue in this case, feel free to open a new one if you encounter new problems some time in the future :)
Description of the bug
Hiya,
thank you for the work on the pipeline!
Currently, when I try to run the pipeline on my own (paired-end) data, it fails and exits at a few steps. With the test profile, i.e.
nextflow run nf-core/circrna -c ./hpc.config -profile test,singularity -r dev -ansi-log false -resume
it works fine and the pipeline completes.
The first issue that arose concerned STAR. With the genome: GRCh37 parameter, as I understand it, the necessary files/indices are obtained from iGenomes. When the pipeline reaches the mapping step prior to DCC, it fails due to a Genome & STAR version incompatibility (STAR output below). The image used for this step seems to contain STAR version 2.7.10a, whereas the Genome was generated with 2.7.4a, so could there be a need to downgrade the image to an older STAR version? [*1]
Alternatively, I saw that I can provide my own fasta/gtf (and the required species) parameters, so I tried it using the files from Ensembl (https://grch37.ensembl.org/Homo_sapiens/Info/Index). This seemed to work fine, but DCC's execution then results in a
ValueError: invalid literal for int() with base 10: '4"'
error (more details below). From what I have found so far, the GTF doesn't get parsed correctly by the Circ_nonCirc_Exon_Match.py functions of DCC/circtools. Installing and running circtools detect/DCC with the same files seems to work fine.
There was another error I ran into when trying to add ciriquant as a tool, which errored out with
CIRIquant.utils.PipelineError: Empty hisat2 bam generated, please re-run CIRIquant with -v and check the fastq and hisat2-index
Re-running this via bash .command.run results in the same error. If, on the other hand, I launch the singularity image myself and run the commands myself, it works fine.
I have copied the errors to the box below. The command that was run (and which produced the errors) is:
nextflow run nf-core/circrna -c ./hpc.config -params-file ./params.yaml -profile singularity -r dev -ansi-log false -resume
Do let me know if there is anything I can help with.
On a side note: a comment in the targetscan_format.sh script says
Subset mature.fa according to the species provided by user to '--genome'
but from briefly looking around I wasn't able to find where this might be included in the pipeline.
[*1] Tried using a custom image with a downgraded STAR version, still get the same error
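For anyone hitting the same ValueError, here is a minimal Python sketch of one way a message like invalid literal for int() with base 10: '4"' can arise. It assumes the GTF attribute column carries a quoted value such as exon_number "4", and that a parser strips only one of the two quotes. Both function names are hypothetical and this is not DCC's actual code. It only illustrates the suspected quoting problem and a more tolerant alternative.

```python
import re


def naive_exon_number(attr_field: str) -> int:
    """Hypothetical naive parser: strips only the leading quote of the value."""
    for item in attr_field.split(";"):
        parts = item.strip().split(" ", 1)
        if len(parts) == 2 and parts[0] == "exon_number":
            # '"4"' becomes '4"', which int() cannot convert
            return int(parts[1].lstrip('"'))
    raise KeyError("exon_number")


def robust_exon_number(attr_field: str) -> int:
    """Hypothetical tolerant parser: accepts quoted and unquoted values."""
    match = re.search(r'exon_number\s+"?(\d+)"?', attr_field)
    if match is None:
        raise KeyError("exon_number")
    return int(match.group(1))


# Illustrative attribute column with a quoted exon_number (an assumption,
# not copied from the files used in this issue).
attrs = 'gene_id "G1"; transcript_id "T1"; exon_number "4"; gene_name "N1";'

try:
    naive_exon_number(attrs)
except ValueError as err:
    print("naive parser failed:", err)

print("robust parser:", robust_exon_number(attrs))
```

If the pipeline rewrites or re-quotes the GTF before DCC sees it, comparing one attribute line from the original and the staged file should show whether a stray quote is introduced.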
Command used and terminal output
STAR
CIRIquant
DCC (own fasta/gtf)
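To check which STAR version an index was built with before the mapping step fails, something like the following sketch may help. It assumes the index directory contains a genomeParameters.txt file with a versionGenome line, which is my understanding of STAR's index layout and should be verified against the actual index. The function name and the example text are hypothetical.

```python
from typing import Optional


def index_version(genome_parameters_text: str) -> Optional[str]:
    """Return the value of the versionGenome line, or None if it is absent."""
    for line in genome_parameters_text.splitlines():
        if line.startswith("#"):
            continue  # skip comment lines
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "versionGenome":
            return fields[1]
    return None


# Illustrative file content (an assumption, not taken from the iGenomes index).
example = "### STAR --runMode genomeGenerate\nversionGenome\t2.7.4a\ngenomeType\tFull\n"
print(index_version(example))
```

The value can then be compared by eye with the output of STAR --version inside the container used for the mapping step.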
Relevant files
No response
System information
Nextflow Version: 23.10.0
Hardware: HPC/Cluster
Executor: Slurm
Container: Singularity
OS: Ubuntu
nf-core/circrna version: dev