
fastq-scan fails on large FASTQs #183

Open
kapsakcj opened this issue Oct 20, 2022 · 2 comments

@kapsakcj
Contributor

2 GB of RAM isn't enough when your FASTQ files are >11 GB in size, like those from a NovaSeq.

The memory allocation on this line:

https://github.com/theiagen/public_health_viral_genomics/blob/main/tasks/quality_control/task_fastq_scan.wdl#L50

and this line:

should be increased to at least 8 GB.

Although...when I ran the 11 GB FASTQ file through the WDL on the command line, it consumed upwards of 18 GB of RAM. So if Terra's "memory retry" feature kicks in, these files should get processed fine on the 2nd or 3rd attempt.
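For illustration, a minimal sketch of the kind of change being proposed, assuming the task follows the usual WDL pattern of a `memory` attribute in the `runtime` block (the input name and default here are illustrative, not the actual task's code):

```wdl
task fastq_scan {
  input {
    File read1
    # illustrative: expose memory as an input so callers can override it,
    # with a default raised from 2 GB to 8 GB
    Int memory = 8
  }
  command <<<
    # task command omitted; see task_fastq_scan.wdl in the repo
  >>>
  runtime {
    memory: "~{memory} GB"
  }
}
```

Exposing `memory` as an input (rather than hard-coding it) also plays well with Terra's memory-retry behavior, since callers can bump it without editing the task.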

@kapsakcj
Contributor Author

For this particular failure, we are downsampling the FASTQs with RASUSA first, but it doesn't hurt to fix these potential issues anyway.
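For context, a hedged sketch of what that downsampling step could look like as a WDL task. The `--coverage`/`--genome-size` flags are taken from the rasusa CLI (check the docs for your installed version); the coverage default, genome size, and file names are placeholder assumptions, not the actual workflow's values:

```wdl
task rasusa_downsample {
  input {
    File read1
    # placeholder defaults: target coverage and genome size in bp
    Int coverage = 150
    Int genome_size = 29903
  }
  command <<<
    rasusa -i ~{read1} \
      --coverage ~{coverage} \
      --genome-size ~{genome_size} \
      -o downsampled.fastq.gz
  >>>
  output {
    File downsampled_read1 = "downsampled.fastq.gz"
  }
}
```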

@rpetit3
Contributor

rpetit3 commented Oct 30, 2022

Large FASTQs are now supported in fastq-scan (https://github.com/rpetit3/fastq-scan/releases/tag/v1.0.1). But I agree with your approach of subsampling to a reasonable coverage.
