Sherlock optimizations #264

Merged
merged 10 commits into from
Dec 3, 2024

thalassemia
Contributor

When running workflows on Sherlock, the Nextflow process that orchestrates the workflow is submitted to SLURM as a long-running, 1 CPU job that stays alive for the duration of the workflow. It submits an additional SLURM job for each task in the workflow, and each of those tasks must wait in the SLURM queue until a node frees up. All the while, the Nextflow job remains mostly idle.

This PR fills that idle CPU time by running certain short workflow tasks (build runtime image, analyses, create variants) directly in the same job as the Nextflow orchestrator, which also avoids the wait that currently comes with submitting those tasks to the SLURM queue (~15 minutes on average). I also found that, once queue wait time is included, the ParCa takes about the same time to finish whether it is submitted to the queue as a 4 CPU job or run locally in the 1 CPU Nextflow job, so I made it run locally as well.
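
For readers unfamiliar with mixing executors in Nextflow, here is a minimal config sketch of the general idea, not the exact change made in this PR. It assumes the short tasks carry a label named `short`, which is a hypothetical name chosen for illustration. The `executor` directive, the `withLabel` selector, and the `$local` executor scope are standard Nextflow configuration features.

```groovy
// Sketch only: the 'short' label is hypothetical and must match a
// `label 'short'` directive on the relevant processes.
process {
    // Default: every task is submitted to the SLURM queue.
    executor = 'slurm'

    // Tasks with the (hypothetical) 'short' label run inside the
    // orchestrator's own job instead of waiting in the queue.
    withLabel: 'short' {
        executor = 'local'
    }
}

executor {
    // Cap the local executor at the single CPU allocated to the
    // Nextflow job so local tasks run one at a time.
    $local {
        cpus = 1
    }
}
```

With a setup like this, a process joins the local group by adding `label 'short'` to its definition, and everything else continues to go through SLURM.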

Other changes:

  • Fixed a previously overlooked symlinking issue
  • Fixed chromosome_segment unique molecule that is required for superhelical_density sims

@thalassemia thalassemia merged commit f89c28c into master Dec 3, 2024
5 of 6 checks passed